On 2/2/2017 14:45, Branko Čibej wrote:
> On 02.02.2017 13:30, Stefan Hett wrote:
>> On 2/2/2017 1:24 PM, Branko Čibej wrote:
>>> On 02.02.2017 12:49, Stefan Hett wrote:
>>>> On 2/2/2017 12:04 PM, Nick Kew wrote:
>>>>> On Wed, 2017-02-01 at 00:23 +0100, Stefan wrote:
>>>>>> Hi,
>>>>>>
>>>>>> the issue was discovered as part of tracing down a deadlock
>>>>>> condition in
>>>>>> an SVN test [1].
>>>>>>
>>>>>> As far as I understand it, the problem lies in the C-standard not
>>>>>> explicitly defining when a function registered with atexit() is
>>>>>> called
>>>>>> in the context of thread termination [2].
>>>>> I had no idea atexit() was called on thread termination.  I guess the
>>>>> manpage must be misleading in telling us it happens at process exit?
>>>> I expressed myself poorly a bit here. I didn't want to suggest
>>>> atexit()-registered functions are called on thread termination but
>>>> rather that threads can be terminated before the atexit-registered
>>>> functions are called.
>>>> So if apr_terminate is registered in a function called from a DLL via
>>>> atexit(), it can happen (and in my scenario does happen) that threads
>>>> got terminated before apr_terminate() is called.
>>> But that should not matter. apr_thread_join() does not terminate a
>>> thread, it waits for the thread to terminate. The thread handle should
>>> remain valid until apr_thread_join() is called, even if the thread
>>> terminated before the join().
>>>
>> This is correct. The problem is that thd_cnt is not decremented
>> (thd->td is valid, apr_thread_join() waits until the thread is (which
>> it is already) and then returns APR_INCOMPLETE - nothing decrements
>> the thd_cnt value for the already terminated thread).
>
> It turns out that this is a bug in APR's thread pool code, and has been
> fixed on trunk. APR_INCOMPLETE is the _expected_ return code from
> apr_thread_join() on Windows in this case, as far as I can interpret the
> code.
>
As it turned out, the issue is actually still present on trunk. The full
details behind the problem are summed up in a blog post now:
http://www.luke1410.de/blog/?p=95

What do you (or the others) think. Does this warrant the patch being
applied? I'd really not feel too good with leaving that issue unresolved
in APR, given there's a fix available; but I'd certainly not commit it,
if it's not getting support by the established community/committers.

If someone things the patch needs to be improved (or a different
approach to the problem should be taken), I'm certainly offering to come
up with alternatives.

Regards,
Stefan

Reply via email to