On Wed, Dec 20, 2017 at 10:25:33PM +0100, Aurelien Jarno wrote:
> ...
> I have also tested that the problem is still present in upstream git
> HEAD.
>
> It looks to me the best is to take this issue upstream. Do you want me
> to forward your bug report and your example in the upstream bugzilla [1]
> or do you prefer to do it yourself?

despite an honest search for an already-reported bug like this, i did
not find this one earlier, but it seems to address the same problem:

https://sourceware.org/bugzilla/show_bug.cgi?id=21422

it is marked as INVALID with the explanation that there are no "robust"
variants of the pthread condition variables.

torvalds' suggestion mentions catching signals, but as i explained this
is not really an option for us (and it also does not work if the waiter
is killed by SIGKILL, e.g. by the OOM killer...).

the other suggestion is to use other non-blocking synchronization
methods such as C11 atomics. further down it was also suggested to use
linux's futex operations directly.

that's what i actually did in the last week: signal my "subscribers" via
a FUTEX_WAKE and use non-blocking atomic memory ops. it works and will
probably yield much better real-time behaviour (not yet tested).

still, i had to implement it for our system, and my guess is that the
glibc change might break other in-production systems (because they
erroneously rely on behaviour that is considered not to be required).

so maybe debian should at least issue an explicit warning when
installing glibc >= 2.25?

-- 
Florian Schmidt
DLR German Aerospace Center - Institute of Robotics and Mechatronics
P.O.Box 1116, D-82230 Wessling