Re: apr 1.4: testpoll crash on OSX 10.6

Neil Conway Sat, 24 Oct 2009 23:18:11 -0700

Attached is a patch against trunk that fixes this problem by changing
the test suite. For some of the tests, it is sufficient to change the
poll() timeout from 0 (one-time poll) to -1 (blocking poll). In a few
other places, the semantics of the test needed to be changed -- e.g.
if we do a blocking poll() after sending two messages, we need to
account for seeing either 1 or 2 messages.


With these changes applied, OSX 10.6 passes testpoll reliably (for
~1000 local runs), using both the POLLSET_POLL and POLLSET_KQUEUE
methods.
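
To illustrate the pattern, here is a minimal sketch. It is not the
attached patch: it uses plain POSIX poll() and sockets instead of the
APR calls, and the function names are made up for this example. It
sends two datagrams over loopback, blocks in poll() until at least one
is readable, then drains whatever has arrived and accepts a count of
either 1 or 2.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from testpoll.c): send two UDP datagrams to a
 * loopback socket, do one blocking poll(), then count how many datagrams
 * can be read without blocking. Returns that count, or -1 on error.
 */
int count_datagrams_after_blocking_poll(void)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    struct pollfd pfd;
    char buf[16];
    int rx, tx, i, seen = 0;

    rx = socket(AF_INET, SOCK_DGRAM, 0);
    tx = socket(AF_INET, SOCK_DGRAM, 0);
    if (rx < 0 || tx < 0)
        goto fail;

    /* Bind the receiver to 127.0.0.1 on a kernel-chosen port. */
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;
    if (bind(rx, (struct sockaddr *)&addr, sizeof(addr)) != 0
        || getsockname(rx, (struct sockaddr *)&addr, &len) != 0)
        goto fail;

    for (i = 0; i < 2; i++) {
        if (sendto(tx, "hello", 5, 0,
                   (struct sockaddr *)&addr, sizeof(addr)) != 5)
            goto fail;
    }

    /* Timeout of -1: wait until at least one datagram is readable. */
    pfd.fd = rx;
    pfd.events = POLLIN;
    pfd.revents = 0;
    if (poll(&pfd, 1, -1) != 1 || !(pfd.revents & POLLIN))
        goto fail;

    /* Drain without blocking; the second datagram may not be here yet. */
    while (recv(rx, buf, sizeof(buf), MSG_DONTWAIT) > 0)
        seen++;

    close(rx);
    close(tx);
    return seen;

fail:
    if (rx >= 0)
        close(rx);
    if (tx >= 0)
        close(tx);
    return -1;
}

/* The relaxed assertion: either 1 or 2 datagrams is a pass. */
void check_one_or_two(void)
{
    int n = count_datagrams_after_blocking_poll();
    assert(n == 1 || n == 2);
}
```

The point is that the blocking poll() only guarantees "at least one",
so the check after it has to tolerate both outcomes.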

Neil

On Sat, Oct 24, 2009 at 7:30 PM, Neil Conway <[email protected]> wrote:
> On a related note, ISTM that many of the tests for the poll / pollset
> features are wrong in principle. They apparently assume that if you
> send a UDP datagram to localhost and then immediately poll() for it
> (with a timeout of zero), the poll() will pick up the UDP datagram you
> just sent. That is not a safe assumption, however (e.g. I see
> intermittent test failures due to this issue when using
> APR_POLLSET_POLL on OSX 10.6).
>
> Similarly, send_middle_pollset() assumes that if you send two
> datagrams and then poll(), the poll will return exactly two datagrams,
> whereas it might actually return 0, 1, or 2. And that's not even
> accounting for UDP packet drops, which are possible
> even on localhost if the machine is under load.
>
> Neil
>
> On Sun, Oct 18, 2009 at 4:37 AM, Ruediger Pluem <[email protected]> wrote:
>>
>>
>> On 10/17/2009 11:58 PM, Ryan Phillips wrote:
>>> On Sat, Oct 17, 2009 at 2:40 AM, Ruediger Pluem <[email protected]> wrote:
>>>>
>>>> On 10/17/2009 05:50 AM, Ryan Phillips wrote:
>>>>> On Wed, Oct 14, 2009 at 12:02 PM, Neil Conway <[email protected]> wrote:
>>>>>> "./tests/testall testpoll" segfaults for me consistently on OSX 10.6.1
>>>>>> with the latest code from the 1.4-stable branch (64-bit APR library).
>>>>>> gdb info:
>>>>>>
>>>>>> #0  0x000000010000e9b7 in send0_pollset (tc=0x7fff5fbfef80, data=0x0)
>>>>>> at testpoll.c:389
>>>>>> 389         ABTS_PTR_EQUAL(tc, s[0], descs[0].desc.s);
>>>>>> (gdb) bt
>>>>>> #0  0x000000010000e9b7 in send0_pollset (tc=0x7fff5fbfef80, data=0x0)
>>>>>> at testpoll.c:389
>>>>>> #1  0x0000000100001456 in abts_run_test (ts=0x100200190, f=0x10000e925
>>>>>> <send0_pollset>, value=0x0) at abts.c:168
>>>>>> #2  0x000000010000f713 in testpoll (suite=0x100200190) at testpoll.c:685
>>>>>> #3  0x0000000100001e35 in main (argc=2, argv=0x7fff5fbff020) at abts.c:424
>>>>>> (gdb) p descs
>>>>>> $1 = (const apr_pollfd_t *) 0x0
>>>>>> (gdb) p s[0]
>>>>>> $2 = (apr_socket_t *) 0x100804240
>>>> What is the value of num?
>>>>
>>>>>> (gdb) l
>>>>>> 384         rv = apr_pollset_poll(pollset, 0, &num, &descs);
>>>>>> 385         ABTS_INT_EQUAL(tc, APR_SUCCESS, rv);
>>>>>> 386         ABTS_INT_EQUAL(tc, 1, num);
>>>>>> 387         ABTS_PTR_NOTNULL(tc, descs);
>>>>>> 388
>>>>>> 389         ABTS_PTR_EQUAL(tc, s[0], descs[0].desc.s);
>>>>>> 390         ABTS_PTR_EQUAL(tc, s[0],  descs[0].client_data);
>>>>>> 391     }
>>>>>> 392
>>>>>> 393     static void recv0_pollset(abts_case *tc, void *data)
>>>>>>
>>>> Regards
>>>>
>>>> Rüdiger
>>>>
>>>
>>> num on the FreeBSD machine is 0.
>>>
>>
>> Thanks for that.
>>
>> I guess we have two problems here:
>>
>> 1. The crash: We simply should not execute lines 389 and 390 if descs is
>>   NULL. Similar situations occur in various other parts of the test suite:
>>   we use ABTS_PTR_NOTNULL and then continue to use the pointer that
>>   failed the check. So does this need to be fixed everywhere it occurs?
>>   I guess a crash of the test program just because ABTS_PTR_NOTNULL
>>   failed is not acceptable.
>>
>> 2. If descs is NULL it means that the test failed, as we have the
>>   ABTS_PTR_NOTNULL test in line 387. The question is: Why does this test
>>   fail?
>>
>> Regards
>>
>> Rüdiger
>>
>
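
On point 1 above, the guard could look roughly like the following. This
is only a sketch under assumptions: the struct and function names are
hypothetical stand-ins so the example compiles on its own, and they are
not the real abts.h or apr_pollfd_t definitions. The idea is just to
record the failure and return before the pointer is dereferenced.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the test-case state (not the real abts_case). */
struct sketch_case {
    int failed;
};

/* Hypothetical stand-in for a returned poll descriptor. */
struct sketch_desc {
    const void *sock;
};

/* Like a not-NULL assertion that records the failure but keeps running. */
void sketch_ptr_notnull(struct sketch_case *tc, const void *p)
{
    assert(tc != NULL);
    if (p == NULL)
        tc->failed = 1;
}

/* Compare the first descriptor with the expected socket, if there is one. */
void check_first_desc(struct sketch_case *tc, const void *expected,
                      const struct sketch_desc *descs)
{
    sketch_ptr_notnull(tc, descs);
    if (descs == NULL)
        return; /* already marked as failed; do not dereference */

    if (descs[0].sock != expected)
        tc->failed = 1;
}
```

With this shape, a NULL descs shows up as an ordinary test failure
instead of a segfault in the test program.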

test_poll_timeout_fix-1.patch
Description: Binary data
