Re: [Qemu-devel] [RFC PATCH] tests: replace rem = sleep(time) with g_timer

2019-01-11 Thread Alex Bennée


Greg Kurz writes:

> On Fri, 11 Jan 2019 16:41:41 +0100
> Paolo Bonzini wrote:
>
>> On 11/01/19 16:28, Alex Bennée wrote:
>> >> Why not g_usleep?  It already does a while loop around nanosleep (which
>> >> returns the remaining time in the wait, like select but unlike sleep and
>> >> poll).
>> > Yeah I'm testing that now. However I have managed to trigger:
>> >
>> >   ERROR:tests/test-qht-par.c:20:test_qht: assertion failed (rc == 0): (35584 == 0)
>>
>> I think that's a good old SIGSEGV (0x8B00).
>>
>
> Hmmm... system() returns a "wait status" that can be examined using the
> macros described in waitpid(2), and we have:
>
> /* If WIFEXITED(STATUS), the low-order 8 bits of the status.  */
> #define   __WEXITSTATUS(status)   (((status) & 0xff00) >> 8)
>
> So this rather looks like a 139 exit status to me... Not sure how
> this can happen though.

Yeah the child segfaulted in mcount while closing down. I've started a
new thread with the details of the remaining failure modes:

  Subject: Remaining CI failures
  Date: Fri, 11 Jan 2019 19:10:07 +
  Message-ID: <87lg3rui28@linaro.org>


--
Alex Bennée



Re: [Qemu-devel] [RFC PATCH] tests: replace rem = sleep(time) with g_timer

2019-01-11 Thread Eduardo Habkost
On Fri, Jan 11, 2019 at 04:06:54PM +, Alex Bennée wrote:
> 
> Paolo Bonzini writes:
> 
> > On 11/01/19 16:28, Alex Bennée wrote:
> >>> Why not g_usleep?  It already does a while loop around nanosleep (which
> >>> returns the remaining time in the wait, like select but unlike sleep and
> >>> poll).
> >> Yeah I'm testing that now. However I have managed to trigger:
> >>
> >>   ERROR:tests/test-qht-par.c:20:test_qht: assertion failed (rc == 0): (35584 == 0)
> >
> > I think that's a good old SIGSEGV (0x8B00).
> 
> According to the PC in the logs:
> 
>   Line 98 of "mcount.c" starts at address 0x76e15145 <__mcount_internal+69> and ends at 0x76e15148 <__mcount_internal+72>.

Was this on Travis?  Which architecture?

-- 
Eduardo



Re: [Qemu-devel] [RFC PATCH] tests: replace rem = sleep(time) with g_timer

2019-01-11 Thread Greg Kurz
On Fri, 11 Jan 2019 16:41:41 +0100
Paolo Bonzini wrote:

> On 11/01/19 16:28, Alex Bennée wrote:
> >> Why not g_usleep?  It already does a while loop around nanosleep (which
> >> returns the remaining time in the wait, like select but unlike sleep and
> >> poll).  
> > Yeah I'm testing that now. However I have managed to trigger:
> > 
> >   ERROR:tests/test-qht-par.c:20:test_qht: assertion failed (rc == 0): (35584 == 0)
> 
> I think that's a good old SIGSEGV (0x8B00).
> 

Hmmm... system() returns a "wait status" that can be examined using the
macros described in waitpid(2), and we have:

/* If WIFEXITED(STATUS), the low-order 8 bits of the status.  */
#define __WEXITSTATUS(status)   (((status) & 0xff00) >> 8)

So this rather looks like a 139 exit status to me... Not sure how
this can happen though.

> Paolo
> 




Re: [Qemu-devel] [RFC PATCH] tests: replace rem = sleep(time) with g_timer

2019-01-11 Thread Alex Bennée


Paolo Bonzini writes:

> On 11/01/19 16:28, Alex Bennée wrote:
>>> Why not g_usleep?  It already does a while loop around nanosleep (which
>>> returns the remaining time in the wait, like select but unlike sleep and
>>> poll).
>> Yeah I'm testing that now. However I have managed to trigger:
>>
>>   ERROR:tests/test-qht-par.c:20:test_qht: assertion failed (rc == 0): (35584 == 0)
>
> I think that's a good old SIGSEGV (0x8B00).

According to the PC in the logs:

  Line 98 of "mcount.c" starts at address 0x76e15145 <__mcount_internal+69> and ends at 0x76e15148 <__mcount_internal+72>.

>
> Paolo


--
Alex Bennée



Re: [Qemu-devel] [RFC PATCH] tests: replace rem = sleep(time) with g_timer

2019-01-11 Thread Alex Bennée


Paolo Bonzini writes:

> On 11/01/19 16:28, Alex Bennée wrote:
>>> Why not g_usleep?  It already does a while loop around nanosleep (which
>>> returns the remaining time in the wait, like select but unlike sleep and
>>> poll).
>> Yeah I'm testing that now. However I have managed to trigger:
>>
>>   ERROR:tests/test-qht-par.c:20:test_qht: assertion failed (rc == 0): (35584 == 0)
>
> I think that's a good old SIGSEGV (0x8B00).

Hmmm, but I haven't been able to trigger it running it directly:

  retry.py -n 30 -c -- ./tests/qht-bench 1>/dev/null 2>&1 -R -S0.1 -D1 -N1 -n 4 -u 20 -d 1

Could this be some sort of weird interaction caused by using system()?

--
Alex Bennée



Re: [Qemu-devel] [RFC PATCH] tests: replace rem = sleep(time) with g_timer

2019-01-11 Thread Paolo Bonzini
On 11/01/19 16:28, Alex Bennée wrote:
>> Why not g_usleep?  It already does a while loop around nanosleep (which
>> returns the remaining time in the wait, like select but unlike sleep and
>> poll).
> Yeah I'm testing that now. However I have managed to trigger:
> 
>   ERROR:tests/test-qht-par.c:20:test_qht: assertion failed (rc == 0): (35584 == 0)

I think that's a good old SIGSEGV (0x8B00).

Paolo



Re: [Qemu-devel] [RFC PATCH] tests: replace rem = sleep(time) with g_timer

2019-01-11 Thread Alex Bennée


Paolo Bonzini writes:

> On 11/01/19 15:38, Alex Bennée wrote:
>> Relying on sleep to always return having slept isn't safe as a signal
>> may have occurred. If signals are constantly incoming the program will
>> never reach its termination condition. This is believed to be the
>> mechanism causing timeouts for qht-test in Travis.
>>
>> Instead we use a g_timer to determine if the duration of the test has
>> passed and sleep for a second at a time. This may bias benchmark
>> results for short runs.
>
> Why not g_usleep?  It already does a while loop around nanosleep (which
> returns the remaining time in the wait, like select but unlike sleep and
> poll).

Yeah I'm testing that now. However I have managed to trigger:

  ERROR:tests/test-qht-par.c:20:test_qht: assertion failed (rc == 0): (35584 == 0)

but I'm not sure if this is some other side-effect of the
test-qht-par/qht-bench invocation dance.

--
Alex Bennée



Re: [Qemu-devel] [RFC PATCH] tests: replace rem = sleep(time) with g_timer

2019-01-11 Thread Paolo Bonzini
On 11/01/19 15:38, Alex Bennée wrote:
> Relying on sleep to always return having slept isn't safe as a signal
> may have occurred. If signals are constantly incoming the program will
> never reach its termination condition. This is believed to be the
> mechanism causing timeouts for qht-test in Travis.
> 
> Instead we use a g_timer to determine if the duration of the test has
> passed and sleep for a second at a time. This may bias benchmark
> results for short runs.

Why not g_usleep?  It already does a while loop around nanosleep (which
returns the remaining time in the wait, like select but unlike sleep and
poll).

Thanks,

Paolo