I'm not sure this is the problem that you're seeing, but I see a
problem with the example. It boils down to the fact that futures do not
provide concurrency.

That may sound like a surprising claim, because the whole point of
futures is to run multiple things at a time. But futures merely offer
best-effort parallelism; they do not provide any guarantee of
concurrency.
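As a tiny illustration (my own sketch, not from your program): a future may or may not start running on its own, and only an explicit demand guarantees that it runs to completion:

```racket
#lang racket
;; A future is best-effort: it might run in parallel right away, or it
;; might sit idle. Only `touch` guarantees completion (and returns the
;; future's result).
(define f (future (lambda () (+ 1 2))))
(touch f)  ; => 3, and the future is now guaranteed to have finished
```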

As a consequence, trying to treat an fsemaphore as a lock can go wrong.
If a future manages to take an fsemaphore lock, but the future is not
demanded by the main thread (or is not part of a chain of future
demands rooted in the main thread), then nothing obliges the future to
continue running; it can hold the lock forever.
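Here's a minimal sketch of that failure mode (again my own example, not from your paste):

```racket
#lang racket
;; A future takes the fsemaphore "lock" and then hits an operation that
;; suspends it. Because nothing touches the future, nothing obliges it
;; to resume and release the lock.
(define lock (make-fsemaphore 1))

(define f
  (future
   (lambda ()
     (fsemaphore-wait lock)   ; the future takes the lock...
     (sleep 1)                ; ...then blocks, suspending the future
     (fsemaphore-post lock))))

;; With no (touch f) anywhere, this wait in the main thread can block
;; indefinitely, because f may never run again to post the semaphore:
;; (fsemaphore-wait lock)
```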

(I put the blame on fsemaphores. Adding fsemaphores to the future
system was something like adding mutation to a purely functional
language. The addition makes certain things possible, but it also
breaks local reasoning that the original design was supposed to
enable.)

In your example program, I see

 (define workers (do-start-workers))
 (displayln "started")
 (for ((i 10000))
   (mfqueue-enqueue! mfq 1))

where `do-start-workers` creates a chain of futures, but there's no
`touch` on the root future while the loop calls `mfqueue-enqueue!`.
Therefore, the loop can block on an fsemaphore because some future has
taken the lock but stopped running for whatever reason.

In this case, adding `(thread (lambda () (touch workers)))` after the
"started" line and before the loop might fix the example. In other
words, you can use
the `thread` concurrency construct in combination with the `future`
parallelism construct to ensure progress. I think this will work
because all futures in the program end up in a linear dependency chain.
If there were a tree of dependencies, then I think you'd need a
`thread` for each `future` to make sure that every future has an active
demand.
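Concretely, the fix would look something like this (a sketch assuming `do-start-workers`, `mfq`, and `mfqueue-enqueue!` as defined in your paste):

```racket
(define workers (do-start-workers))
(displayln "started")
;; A plain thread supplies an active demand for the chain of futures,
;; so a future that has taken an fsemaphore keeps running and can
;; eventually release it:
(thread (lambda () (touch workers)))
(for ((i 10000))
  (mfqueue-enqueue! mfq 1))
```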

If you're seeing a deadlock at the `(touch workers)`, though, my
explanation doesn't cover what you're seeing. I haven't managed to
trigger the deadlock myself.

At Sat, 23 May 2020 18:51:23 +0200, Dominik Pantůček wrote:
> Hello again with futures!
> 
> I started working on futures-based workers and got quickly stuck with a
> dead-lock I think does not originate in my code (although it is two
> semaphores, 8 futures, so I'll refrain from strong opinions here).
> 
> I implemented a very simple futures-friendly queue using mutable pairs
> and created a minimal-deadlocking-example[1]. I am running racket 3m
> 7.7.0.4 which includes fixes for the futures-related bugs I discovered
> recently.
> 
> Sometimes the code just runs fine and shows the numbers of worker
> iterations performed in different futures (as traced by the 'fid'
> argument). But sometimes it locks in a state where there is one last
> number in the queue (0 - zero) and yet the fsemaphore-count for the
> count fsemaphore returns 0. Which means the semaphore was decremented
> twice somewhere. The code is really VERY simple and I do not see a
> race-condition within the code, that would allow any code path to
> decrement the fsema-count fsemaphore twice once the worker future
> receives 0.
> 
> I am able to reproduce the behavior with racket3m running under gdb and
> get the stack traces for all the threads pretty consistently. The
> deadlock is apparently at:
> 
>   2    Thread 0x7ffff7fca700 (LWP 46368) "mfqueue.rkt"
> futex_wait_cancelable (private=<optimized out>, expected=0,
> futex_word=0x5555559d8e78) at ../sysdeps/nptl/futex-internal.h:183
> 
> But that is just where the issue is showing up. The real question is how
> the counter gets decremented twice (given that fsemaphores should be
> futures-safe).
> 
> Any hints would be VERY appreciated!
> 
> 
> Cheers,
> Dominik
> 
> [1] http://pasterack.org/pastes/28883
> 

-- 
You received this message because you are subscribed to the Google Groups 
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to racket-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/racket-users/20200523112413.15a%40sirmail.smtp.cs.utah.edu.