Re: [Chicken-hackers] Regarding #1564: srfi-18: (mutex-unlock) Internal scheduler error

2018-12-04 Thread Jörg F . Wittenberger

On Dec 3 2018, Peter Bex wrote:


> On Mon, Dec 03, 2018 at 10:46:38AM +0100, Jörg F. Wittenberger wrote:
>> So for me the question remains: wouldn't it be much, much more
>> efficient to work sort-of hand-in-hand with one of the core developers,
>> or maybe on the list to get the remaining things (bugs and improvements)
>> fixed and reviewed?
>
> I think this would be quite helpful.  Perhaps at another hackathon we can
> sit down together (ideally with more than one core developer to ensure we
> all are on the same page and understand it).


Agreed.  Or maybe the list?  Could take time to find a chance to meet.


> This is one of the ugly truths of open source collaborative development;
> you really have to have a good plan on how to communicate the changes
> you're making back to "upstream",[...]
>
> Dropping a complex patch is generally not the way to go about adding code
> to an existing system.[...] eyeball it for obvious mistakes and
> other quality issues.


Too true. Plus: it depends on the culture of the project. Try to put 
yourself in my shoes for a moment.


Until I ported to chicken, I mostly contributed to rscheme, which was a
one-man show. When I hit an issue, I'd send a vague patch and back came a
completely rewritten one after a day or two. Those were generally small
issues, though, or rare failures of complex optimizations going wrong.



> This also means that the submitted code has to be so simple that others
> who aren't familiar with it can study it and debug it if issues crop up
> (and they will, with any sizable change).  The scheduler is a major pain
> point regarding this, since concurrency is difficult enough (or
> impossible?) to understand at all, regardless of the quality of the code
> in the scheduler (which isn't stellar to begin with).


So when I evaluated chicken, I found it to be a compiler producing slightly 
faster code than rscheme.


First tests went well. Then I invested a lot of time until I could run a 
more sizable piece of code.


Just to run into all sorts of issues.

Take #1564 as an example. It can be much worse than just killing the
program: when I ran into it, I was not always lucky enough to find the
thread left in the waiting queue in a state the consistency check would
complain about. When the thread was blocked on a different mutex (hence
sitting in two waiting queues at the same time), mutex-unlock! would
happily unblock it, thus stealing the other mutex from the third thread
holding it.


This rather made a mockery of the idea of using mutexes for
synchronization.

At that point it did _not_ occur to me that my code was an especially
complex thing. Nor did I assume nobody else had ever run more than toy
examples on load-free systems. I assumed that it was obvious how badly
broken it was. (And I did not foresee this not coming up elsewhere for a
decade.)


Sure, I felt bad about having to bring up such a huge patch. But it fixed
several interrelated bugs and reduced Big-O complexity in two places.


I might have expected some questions, comments etc. Certainly not being 
completely ignored.


So I tried to push it for a while.

The same goes for issues that literally matched the textbook examples
used to teach how not to do things, such as not using dummy-head lists in
C, something I did not believe anybody would do. At least I expected the
respective patch to be welcome, especially as it was quite a job to
actually change a large file and then test the results before submitting.


Eventually I took being ignored as unwarranted...

...and lost faith in the project.


> So at some point, merging a large change is a bit of an act of faith.
> It also requires trust, which needs to be built up over time by showing
> consistent quality patches and commitment to the project.  This is the
> really hard bit, especially if you just want one specific feature to be
> added and don't have that many other things to contribute to the system
> simply because it works for you.


Yeah, I just needed a compiler as rscheme was dead. I did not want to turn
into a teacher. I had little hope that I could ever get something fed back
upstream. Hence I no longer tried.



> I don't have a good solution for this, but your suggestion to walk
> through the code together seems like a good one to me.


Agreed.  Chicken is not that bad.  It just has a couple of rough edges.

Best

/Jörg


___
Chicken-hackers mailing list
Chicken-hackers@nongnu.org
https://lists.nongnu.org/mailman/listinfo/chicken-hackers


Re: [Chicken-hackers] Regarding #1564: srfi-18: (mutex-unlock) Internal scheduler error

2018-12-03 Thread Peter Bex
On Mon, Dec 03, 2018 at 10:46:38AM +0100, Jörg F. Wittenberger wrote:
> Whats going on here IMHO is that a lot of lifetime, your guys and mine, is
> wasted. At the same time the code quality of the result is likely worse that
> what I'm using as the source to cut out those patches. [...]
> 
> So for me the question remains: wouldn't it be much, much more efficient to
> work sort-of hand-in-hand with one of the core developers, or maybe on the
> list to get the remaining things (bugs and improvements) fixed and reviewed?

I think this would be quite helpful.  Perhaps at another hackathon we can
sit down together (ideally with more than one core developer to ensure we
all are on the same page and understand it).

This is one of the ugly truths of open source collaborative development;
you really have to have a good plan on how to communicate the changes
you're making back to "upstream", or face porting nightmares every time
you upgrade.  I've made this mistake a few times too back in the day,
some with CHICKEN (the Makefile refactor is one of those) and some with
other projects.

Dropping a complex patch is generally not the way to go about adding code
to an existing system.  On the other hand, sometimes one person or group
creates such large changes that can't be split up (for instance the
numbers stuff, or the chicken-install rewrite).  At such points there is
no realistic way to review everything, so the best the "upstream" can do
is test the code extensively and eyeball it for obvious mistakes and
other quality issues.

This also means that the submitted code has to be so simple that others
who aren't familiar with it can study it and debug it if issues crop up
(and they will, with any sizable change).  The scheduler is a major pain
point regarding this, since concurrency is difficult enough (or
impossible?) to understand at all, regardless of the quality of the code
in the scheduler (which isn't stellar to begin with).

So at some point, merging a large change is a bit of an act of faith.
It also requires trust, which needs to be built up over time by showing
consistent quality patches and commitment to the project.  This is the
really hard bit, especially if you just want one specific feature to be
added and don't have that many other things to contribute to the system
simply because it works for you.

I don't have a good solution for this, but your suggestion to walk
through the code together seems like a good one to me.

Cheers,
Peter




Re: [Chicken-hackers] Regarding #1564: srfi-18: (mutex-unlock) Internal scheduler error

2018-12-03 Thread Jörg F . Wittenberger

Thank you so much, Kon,

reviewing these logs helped to confirm my feelings.

Feelings, not findings. Yet.

Tinkering with these scheduler/srfi-18 issues again really made me feel bad
and sorry. In fact the anger cost me the better half of a night's sleep.
It still enrages me.


What's going on here, IMHO, is that a lot of lifetime, yours and mine, is
wasted. At the same time the code quality of the result is likely worse
than what I'm using as the source to cut out those patches.


As I can't outright prove this statement to you, let me recap the
background for a moment: Around a decade ago I ported a rather thread-heavy
thing from rscheme to chicken. (That thing is Askemos, which technically is
something partially inspired by Erlang, bearing similarities to Termite,
except that those processes are all made persistent and the state is
replicated and synchronized in byzantine agreement over a part of the
network; you might be able to imagine that this really stresses the
threading capabilities of the language in use.) The code had at that time
been growing for ~7 years; that's almost 100 modules, which took some
months to port. ...


...Only to learn that the threading in chicken was not at all up to the
job. Hence I spent a few more weeks fixing that. This included adding a
priority queue for the timeout and fd lists.


What I could NOT produce were test cases for each of the bugs (1231, 1232,
1255, 1564, and these are not all) I fixed in the process.


Nor was it feasible to fix them one-by-one. (Yesterday evening I failed to
properly backport the fix for 1564 into the ugly code implementing the
timeout queue -- while asking myself why the hell it is useful; this queue
should be replaced with a better version anyway.)


The result I posted on chicken-users at that time. It was a complex fix.
Sure. But those were sort of interrelated bugs.


Then for about seven years I sadly maintained a chicken fork (which I'm
still using in production) carrying these differences, in order to be able
to use chicken at all. Since 4.12 it is at least _possible_ to run this
code on stock chicken. Partly because I changed my code to avoid triggering
the remaining bugs.


So for me the question remains: wouldn't it be much, much more efficient to 
work sort-of hand-in-hand with one of the core developers, or maybe on the 
list to get the remaining things (bugs and improvements) fixed and 
reviewed?


It would be so much more satisfying to me to actually produce code I could
approve of myself than to backport yet another hotfix -- creating a result
in the process that I take issue with.


(((Going into details, I'd probably do the prio-queue differently today, as
I have learned about chicken's performance details. And I'm ready to do so.
But at least I'd like to be allowed to use a prio queue with a proper
interface rather than kludging inline handling of a linear list into well
tested code -- likely creating fresh bugs in the process.)))


Best

/Jörg

On Dec 2 2018, Kon Lovett wrote:


see attached git (C4) & svn (C5) logs

#(in C4 core local repo)
git log --follow -p -- srfi-18.scm >srfi-18.log

#(in C5 svn local repo)
svn log --diff trunk >srfi-18_trunk-diff.log

hth


Re: [Chicken-hackers] Regarding #1564: srfi-18: (mutex-unlock) Internal scheduler error

2018-12-02 Thread Jörg F . Wittenberger

Thanks for the replies,

chicken-install -r srfi-18 ;  did the trick already

I should have stated that that's what I have; what I've been looking for
was the git history. I wonder for some statements why the hell they are
there at all. Two possible reasons: a) I cleaned them up for being obsolete
(due to former changes I made) b) removed since I touched the file, which
raises the question "why were those added?".


Never mind.  I can proceed at least.



Re: [Chicken-hackers] Regarding #1564: srfi-18: (mutex-unlock) Internal scheduler error

2018-12-02 Thread Kon Lovett
well, that shows me. ;-)

trying to track down why 

#497 $ chicken-install -r srfi-18
mapped (srfi-18) to ()
retrieving ...




Re: [Chicken-hackers] Regarding #1564: srfi-18: (mutex-unlock) Internal scheduler error

2018-12-02 Thread Kon Lovett
C5 evicted srfi-18, along w/ srfi-1, 13, 14, & 69, to the egg store.

chicken-install -retrieve.



Re: [Chicken-hackers] Regarding #1564: srfi-18: (mutex-unlock) Internal scheduler error

2018-12-02 Thread Jörg F . Wittenberger

Hi all,

when I tried to reply in a timely manner I apparently sent out a link to a 
broken file. Sorry for that.


Just wanted to see if I could create a patch for the current master.

For this I need the srfi-18 egg source too.  I just can't find it.

Jörg



Re: [Chicken-hackers] Regarding #1564: srfi-18: (mutex-unlock) Internal scheduler error

2018-11-30 Thread Jörg F . Wittenberger

Hello Megane,

On Nov 30 2018, megane wrote:


> Hi,
>
> Here's another version that crashes quickly with "very high
> probability".
>
> ...
>
>    24   Error: (mutex-unlock) Internal scheduler error: unknown thread state
>    25   #
>    26   ready


This bears an uncanny resemblance to scheduler issues I was fighting long
ago.


Too long ago.


> --- A fix
>
> Just allow the 'ready state for threads in mutex-unlock!
>
> ...
> Is this a correct fix?



Too long ago.

But it feels wrong. We'd rather make sure there is no ready thread in the 
queue waiting for a mutex in the first place.
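
One way to read that invariant is sketched below in a toy model. This is
only an illustration under stated assumptions: threads and mutexes are
plain vectors rather than SRFI-18 objects, every toy-* name is made up
here, and nothing in it is taken from scheduler.scm or from the patch in
the linked tarball. The point is merely that a waiter whose timeout fires
leaves the mutex's waiting list before it becomes 'ready.

```scheme
;; Toy model, NOT the CHICKEN scheduler: all toy-* names are hypothetical.
;; A toy thread is #(name state blocked-on); a toy mutex is #(owner waiters).
(define (make-toy-thread name) (vector name 'running #f))
(define (toy-state t) (vector-ref t 1))
(define (make-toy-mutex owner) (vector owner '()))
(define (toy-waiters m) (vector-ref m 1))

;; Return lst without the elements eq? to x.
(define (toy-remq x lst)
  (cond ((null? lst) '())
        ((eq? x (car lst)) (toy-remq x (cdr lst)))
        (else (cons (car lst) (toy-remq x (cdr lst))))))

;; Block on a held mutex and remember which mutex we are queued on.
(define (toy-wait! m t)
  (vector-set! t 1 'blocked)
  (vector-set! t 2 m)
  (vector-set! m 1 (append (toy-waiters m) (list t))))

;; On timeout, leave the waiting list before becoming 'ready, so that
;; every thread still queued on the mutex is still blocked at unlock time.
(define (toy-timeout-expired! t)
  (let ((m (vector-ref t 2)))
    (if m
        (begin
          (vector-set! m 1 (toy-remq t (toy-waiters m)))
          (vector-set! t 2 #f)))
    (vector-set! t 1 'ready)))

;; B and C wait on a mutex held by another thread; B's timeout fires.
(define (demo)
  (let* ((owner (make-toy-thread 'owner))
         (b (make-toy-thread 'B))
         (c (make-toy-thread 'C))
         (m (make-toy-mutex owner)))
    (toy-wait! m b)
    (toy-wait! m c)
    (toy-timeout-expired! b)
    ;; Names and states of the threads still queued on the mutex.
    (map (lambda (t) (list (vector-ref t 0) (toy-state t)))
         (toy-waiters m))))

(display (demo))   ; prints ((C blocked))
(newline)
```

In this model an unlock never meets a 'ready waiter, so the consistency
check in the unlock path can stay as strict as it is.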


Diffing the changes I maintained quite a while back 
http://ball.askemos.org/Ad60e3fb123a79b2e5128915116b288f7/chicken-4.9.1-ball.tar.gz 
you will find that I added a


##sys#thread-clear-blocking-state!

Towards the end of scheduler.scm and used it for consistency wherever I
ran into not-so-clean unlocks.


Now this is still an invasive change. But looking at the source of 
scheduler and srfi-18 in chicken 5 right now, I can't fight the feeling 
that it is working around the missing changes at several places.


Best

/Jörg




[Chicken-hackers] Regarding #1564: srfi-18: (mutex-unlock) Internal scheduler error

2018-11-30 Thread megane
Hi,

Here's another version that crashes quickly with "very high
probability".

(cond-expand
 (chicken-5 (import (chicken base))
            (import (chicken time))
            (import srfi-18))
 (else (import chicken)
       (use srfi-18)))

(define m (make-mutex))

(print "@@ " (current-thread) " " "lock")
(mutex-lock! m)

(define t (current-milliseconds))
(define (get-tosleep)
  (/ (floor (* 1000 (- (+ t .030) (current-milliseconds)))) 1000))

(thread-start!
 (make-thread (lambda ()
                ;; (thread-sleep! .01)
                (print "@@ " (current-thread) " " "lock")
                (let lp ()
                  (when (not (mutex-lock! m (get-tosleep)))
                    (thread-yield!)
                    (lp)))
                (print "@@ " (current-thread) " " "unlock")
                (mutex-unlock! m))))
(print "@@ " (current-thread) " " "sleep")
(thread-sleep! (get-tosleep))
(print "@@ " (current-thread) " " "unlock")
(mutex-unlock! m)
(thread-yield!)
(thread-sleep! .01)
(print "All ok!!")

--- typical output of a failing execution:

$ stdbuf -oL -eL ./t |& cat -n
 1  @@ # lock
 2  #: locking #
 3  @@ # sleep
 4  # blocks for timeout 933.0
 5   scheduling, current: #, ready: (#)
 6  timeout: # -> 933.0 (now: 904)
 7  switching to #
 8  @@ # lock
 9  #: locking #
10  # blocks for timeout 933.0
11  # sleeping on mutex mutex0
12   scheduling, current: #, ready: ()
13  timeout: # -> 933.0 (now: 904)
14  timeout: # -> 933.0 (now: 934)
15  timeout expired for #
16  unblocking: #
17  timeout: # -> 933.0 (now: 934)
18  timeout expired for #
19  unblocking: #
20  switching to #
21  @@ # unlock
22  #: unlocking mutex0
23
24  Error: (mutex-unlock) Internal scheduler error: unknown thread state
25  #
26  ready
27
28  Call history:
29
30  t.scm:27: chicken.base#print
31  t.scm:28: get-tosleep
32  t.scm:15: chicken.time#current-milliseconds
33  t.scm:15: scheme#floor
34  t.scm:15: scheme#/
35  t.scm:28: srfi-18#thread-sleep!
36  t.scm:29: srfi-18#current-thread
37  t.scm:29: chicken.base#print
38  t.scm:30: srfi-18#mutex-unlock! <--

(There's an extra debug message on line 15.
 Add (dbg "timeout expired for " tto) in this true branch:

 (if (>= now tmo1) ; timeout reached?

 in ##sys#schedule)

--- The issue
mutex-unlock! assumes that a thread freed from the mutex's waiting list
cannot be in the 'ready state.

From the output above you can see how a thread waiting on a mutex
can end up in the 'ready state.

line  2: The mutex is locked by primordial thread (pt)
line  4: The pt goes to sleep until 933.0
line  7: As the pt goes to sleep thread1 is scheduled to run
line 10: thread1 tries to lock the mutex, but sets a timeout that
 happens to be at time 933.0

lines 12-14: Both threads asleep, time advances to 934
lines 15-16: pt gets put on the ready list
lines 17-19: thread1 gets put on the ready list
line 20: pt starts running
lines 21-22: pt executes mutex-unlock! while thread1 is ready to run
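
The sequence above can be replayed in a toy model. This is only a sketch
under stated assumptions: threads and mutexes are plain vectors rather
than SRFI-18 objects, every toy-* name is made up for illustration, and
none of it is the code in scheduler.scm or srfi-18.scm. It just shows how
a waiter that was woken by its timeout, but never taken off the waiting
list, trips a state check at unlock time.

```scheme
;; Toy model, NOT the CHICKEN scheduler: all toy-* names are hypothetical.
;; A toy thread is #(name state); a toy mutex is #(owner waiting-list).
(define (make-toy-thread name state) (vector name state))
(define (toy-state t) (vector-ref t 1))
(define (toy-state-set! t s) (vector-set! t 1 s))
(define (make-toy-mutex owner) (vector owner '()))
(define (toy-waiters m) (vector-ref m 1))

;; Lock attempt on a held mutex: block and join the waiting list.
(define (toy-wait! m t)
  (toy-state-set! t 'blocked)
  (vector-set! m 1 (append (toy-waiters m) (list t))))

;; Timeout expiry as seen in the trace: the thread becomes 'ready,
;; but nothing takes it off the mutex's waiting list.
(define (toy-timeout-expired! t)
  (toy-state-set! t 'ready))

;; Unlock that only accepts waiters in the 'blocked or 'sleeping state.
(define (toy-unlock! m)
  (vector-set! m 0 #f)
  (let ((ws (toy-waiters m)))
    (if (null? ws)
        'no-waiters
        (let ((wt (car ws)))
          (vector-set! m 1 (cdr ws))
          (case (toy-state wt)
            ((blocked sleeping)
             (toy-state-set! wt 'ready)
             (vector-set! m 0 wt)
             'handed-over)
            (else 'unknown-thread-state))))))

(define (replay)
  (let* ((pt (make-toy-thread 'primordial 'running))
         (t1 (make-toy-thread 'thread1 'running))
         (m  (make-toy-mutex pt)))      ; line 2: pt holds the mutex
    (toy-state-set! pt 'sleeping)       ; line 4: pt sleeps until 933.0
    (toy-wait! m t1)                    ; line 10: thread1 waits, same deadline
    (toy-timeout-expired! pt)           ; lines 15-16
    (toy-timeout-expired! t1)           ; lines 17-19
    (toy-unlock! m)))                   ; lines 21-22

(display (replay))   ; prints unknown-thread-state
(newline)
```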

--- A fix

Just allow the 'ready state for threads in mutex-unlock!

In the patch I arbitrarily call ##sys#schedule after removing a thread
from the list, but I think doing nothing would work equally well.

Is this a correct fix?
Sorry, I can't help with that one..

It may be possible that there are threads on the waiting list, but the
thread that gets removed is not going to lock the mutex:

There are 3 threads in this scenario, A, B and C.

* A locks mutex
* A sleeps until t
* B tries to lock mutex until t
* C tries to lock mutex
* A and B are woken up at t
* A unlocks mutex, frees B
* B is scheduled to run as per the patch
* B finds out about the timeout, gives up and starts doing something else
* Now thread C is waiting on the mutex but no-one is going to free it!
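
The A/B/C scenario can be played through in a toy model. Again this is
only a sketch under stated assumptions: plain vectors stand in for
SRFI-18 threads and mutexes, and all toy-* names are hypothetical. The
first unlock variant pops a single waiter and tolerates the 'ready state,
roughly as the patch does; the second is one conceivable alternative (not
something proposed in this thread) that keeps popping until it finds a
waiter that is still blocked.

```scheme
;; Toy model, NOT the CHICKEN scheduler: all toy-* names are hypothetical.
;; A toy thread is #(name state); a toy mutex is #(owner waiting-list).
(define (make-toy-thread name state) (vector name state))
(define (toy-state t) (vector-ref t 1))
(define (make-toy-mutex owner waiters) (vector owner waiters))
(define (toy-owner m) (vector-ref m 0))
(define (toy-waiters m) (vector-ref m 1))

;; Wake a blocked waiter and make it the new owner.
(define (toy-hand-over! m wt)
  (vector-set! wt 1 'ready)
  (vector-set! m 0 wt))

;; Variant 1: pop exactly one waiter. A 'ready waiter is tolerated but
;; receives nothing, because it has already timed out.
(define (toy-unlock-pop-one! m)
  (vector-set! m 0 #f)
  (let ((ws (toy-waiters m)))
    (if (pair? ws)
        (let ((wt (car ws)))
          (vector-set! m 1 (cdr ws))
          (if (memq (toy-state wt) '(blocked sleeping))
              (toy-hand-over! m wt))))))

;; Variant 2: keep popping until a waiter that is still blocked turns up.
(define (toy-unlock-skip-ready! m)
  (vector-set! m 0 #f)
  (let loop ((ws (toy-waiters m)))
    (cond ((null? ws) (vector-set! m 1 '()))
          ((memq (toy-state (car ws)) '(blocked sleeping))
           (vector-set! m 1 (cdr ws))
           (toy-hand-over! m (car ws)))
          (else (loop (cdr ws))))))

;; State right after A and B were woken at t; A then unlocks.
;; Returns (new-owner-name state-of-C number-of-remaining-waiters).
(define (scenario unlock!)
  (let* ((a (make-toy-thread 'A 'ready))     ; owner, woken from sleep
         (b (make-toy-thread 'B 'ready))     ; timed out, still queued
         (c (make-toy-thread 'C 'blocked))   ; waiting without timeout
         (m (make-toy-mutex a (list b c))))
    (unlock! m)
    (let ((o (toy-owner m)))
      (list (and o (vector-ref o 0))
            (toy-state c)
            (length (toy-waiters m))))))

(display (scenario toy-unlock-pop-one!))    ; prints (#f blocked 1)
(newline)
(display (scenario toy-unlock-skip-ready!)) ; prints (C ready 0)
(newline)
```

In the first result the mutex has no owner, yet C is still blocked on its
waiting list, which is the stranded state described in the last bullet.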


diff -r 25ced70261b2 5/srfi-18/srfi-18.scm
--- a/5/srfi-18/srfi-18.scm Fri Nov 30 14:40:00 2018 +0200
+++ b/5/srfi-18/srfi-18.scm Fri Nov 30 16:26:19 2018 +0200
@@ -420,6 +420,7 @@
 ((blocked sleeping)
  (##sys#setslot wt 11 #f)
  (##sys#add-to-ready-queue wt))
+ ((ready) (##sys#schedule))
 (else
 (##sys#error 'mutex-unlock "Internal scheduler error: unknown thread state"
   wt wts))) ) )
diff -r 25ced70261b2 5/srfi-18/tests/issue-1564.scm
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/5/srfi-18/tests/issue-1564.scm    Fri Nov 30 16:26:19 2018 +0200
@@ -0,0 +1,32 @@
+(cond-expand
+ (chicken-5 (import (chicken base))
+(import (chicken time))
+(import srfi-18))
+ (else (import chicken)
+   (use srfi-18)))
+
+(define m (make-mutex))
+
+(print "@@ " (current-thread) " " "lock")