Re: [Qemu-devel] [RFC PATCH] tests: replace rem = sleep(time) with g_timer
Greg Kurz writes:

> On Fri, 11 Jan 2019 16:41:41 +0100
> Paolo Bonzini wrote:
>
>> On 11/01/19 16:28, Alex Bennée wrote:
>> >> Why not g_usleep? It already does a while loop around nanosleep (which
>> >> returns the remaining time in the wait, like select but unlike sleep and
>> >> poll).
>> > Yeah I'm testing that now. However I have managed to trigger:
>> >
>> > ERROR:tests/test-qht-par.c:20:test_qht: assertion failed (rc == 0):
>> > (35584 == 0)
>>
>> I think that's a good old SIGSEGV (0x8B00).
>>
>
> Hmmm... system() returns a "wait status" that can be examined using the
> macros described in waitpid(2), and we have:
>
> /* If WIFEXITED(STATUS), the low-order 8 bits of the status. */
> #define __WEXITSTATUS(status) (((status) & 0xff00) >> 8)
>
> So this rather looks like a 139 exit status to me... Not sure how
> this can happen though.

Yeah the child segfaulted in mcount while closing down. I've started a
new thread with the details of the remaining failure modes:

  Subject: Remaining CI failures
  Date: Fri, 11 Jan 2019 19:10:07 +
  Message-ID: <87lg3rui28@linaro.org>

--
Alex Bennée
Re: [Qemu-devel] [RFC PATCH] tests: replace rem = sleep(time) with g_timer
On Fri, Jan 11, 2019 at 04:06:54PM +, Alex Bennée wrote:
>
> Paolo Bonzini writes:
>
> > On 11/01/19 16:28, Alex Bennée wrote:
> >>> Why not g_usleep? It already does a while loop around nanosleep (which
> >>> returns the remaining time in the wait, like select but unlike sleep and
> >>> poll).
> >> Yeah I'm testing that now. However I have managed to trigger:
> >>
> >> ERROR:tests/test-qht-par.c:20:test_qht: assertion failed (rc == 0):
> >> (35584 == 0)
> >
> > I think that's a good old SIGSEGV (0x8B00).
>
> According to the PC in the logs:
>
> Line 98 of "mcount.c" starts at address 0x76e15145
> <__mcount_internal+69> and ends at 0x76e15148 <__mcount_internal+72>.

Was this on Travis? Which architecture?

--
Eduardo
Re: [Qemu-devel] [RFC PATCH] tests: replace rem = sleep(time) with g_timer
On Fri, 11 Jan 2019 16:41:41 +0100
Paolo Bonzini wrote:

> On 11/01/19 16:28, Alex Bennée wrote:
> >> Why not g_usleep? It already does a while loop around nanosleep (which
> >> returns the remaining time in the wait, like select but unlike sleep and
> >> poll).
> > Yeah I'm testing that now. However I have managed to trigger:
> >
> > ERROR:tests/test-qht-par.c:20:test_qht: assertion failed (rc == 0):
> > (35584 == 0)
>
> I think that's a good old SIGSEGV (0x8B00).
>

Hmmm... system() returns a "wait status" that can be examined using the
macros described in waitpid(2), and we have:

/* If WIFEXITED(STATUS), the low-order 8 bits of the status. */
#define __WEXITSTATUS(status) (((status) & 0xff00) >> 8)

So this rather looks like a 139 exit status to me... Not sure how
this can happen though.

> Paolo
>
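To make the arithmetic above concrete, here is a minimal sketch that decodes a wait status with the waitpid(2) macros. The three helper names are hypothetical and not part of the QEMU tests. The 128 + N mapping is the usual shell convention for a command killed by signal N; it is assumed to apply here because system() runs the command through sh, which is one plausible way to reconcile the SIGSEGV reading with the 139 exit status.

```c
#include <sys/wait.h>

/* Exit code of a normally terminated child, or -1 if it did not exit
 * normally. */
static int exit_code_from_status(int status)
{
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Signal that terminated the child directly, or -1 if the child was not
 * killed by a signal. */
static int term_signal_from_status(int status)
{
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}

/* Assumed shell convention: an exit code of 128 + N means the command
 * run by the shell died from signal N. Returns 0 for any other code. */
static int signal_from_shell_exit_code(int code)
{
    return code > 128 ? code - 128 : 0;
}
```

For rc = 35584 (0x8B00), exit_code_from_status() gives 139 and term_signal_from_status() gives -1, so the process that system() waited for exited normally with status 139. 139 - 128 = 11, which is SIGSEGV on Linux.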
Re: [Qemu-devel] [RFC PATCH] tests: replace rem = sleep(time) with g_timer
Paolo Bonzini writes:

> On 11/01/19 16:28, Alex Bennée wrote:
>>> Why not g_usleep? It already does a while loop around nanosleep (which
>>> returns the remaining time in the wait, like select but unlike sleep and
>>> poll).
>> Yeah I'm testing that now. However I have managed to trigger:
>>
>> ERROR:tests/test-qht-par.c:20:test_qht: assertion failed (rc == 0): (35584
>> == 0)
>
> I think that's a good old SIGSEGV (0x8B00).

According to the PC in the logs:

  Line 98 of "mcount.c" starts at address 0x76e15145
  <__mcount_internal+69> and ends at 0x76e15148 <__mcount_internal+72>.

>
> Paolo

--
Alex Bennée
Re: [Qemu-devel] [RFC PATCH] tests: replace rem = sleep(time) with g_timer
Paolo Bonzini writes:

> On 11/01/19 16:28, Alex Bennée wrote:
>>> Why not g_usleep? It already does a while loop around nanosleep (which
>>> returns the remaining time in the wait, like select but unlike sleep and
>>> poll).
>> Yeah I'm testing that now. However I have managed to trigger:
>>
>> ERROR:tests/test-qht-par.c:20:test_qht: assertion failed (rc == 0): (35584
>> == 0)
>
> I think that's a good old SIGSEGV (0x8B00).

Hmmm, but I haven't been able to trigger it running it directly:

  retry.py -n 30 -c -- ./tests/qht-bench 1>/dev/null 2>&1 -R -S0.1 -D1 -N1 -n 4 -u 20 -d 1

Could this be some sort of weird interaction caused by using system()?

--
Alex Bennée
Re: [Qemu-devel] [RFC PATCH] tests: replace rem = sleep(time) with g_timer
On 11/01/19 16:28, Alex Bennée wrote:
>> Why not g_usleep? It already does a while loop around nanosleep (which
>> returns the remaining time in the wait, like select but unlike sleep and
>> poll).
> Yeah I'm testing that now. However I have managed to trigger:
>
> ERROR:tests/test-qht-par.c:20:test_qht: assertion failed (rc == 0): (35584
> == 0)

I think that's a good old SIGSEGV (0x8B00).

Paolo
Re: [Qemu-devel] [RFC PATCH] tests: replace rem = sleep(time) with g_timer
Paolo Bonzini writes:

> On 11/01/19 15:38, Alex Bennée wrote:
>> Relying on sleep to always return having slept isn't safe as a signal
>> may have occurred. If signals are constantly incoming the program will
>> never reach its termination condition. This is believed to be the
>> mechanism causing timeouts for qht-test in Travis.
>>
>> Instead we use a g_timer to determine if the duration of the test has
>> passed and sleep for a second at a time. This may bias benchmark
>> results for short runs.
>
> Why not g_usleep? It already does a while loop around nanosleep (which
> returns the remaining time in the wait, like select but unlike sleep and
> poll).

Yeah I'm testing that now. However I have managed to trigger:

  ERROR:tests/test-qht-par.c:20:test_qht: assertion failed (rc == 0): (35584 == 0)

but I'm not sure if this is some other side-effect of the
test-qht-par/qht-bench invocation dance.

--
Alex Bennée
Re: [Qemu-devel] [RFC PATCH] tests: replace rem = sleep(time) with g_timer
On 11/01/19 15:38, Alex Bennée wrote:
> Relying on sleep to always return having slept isn't safe as a signal
> may have occurred. If signals are constantly incoming the program will
> never reach its termination condition. This is believed to be the
> mechanism causing timeouts for qht-test in Travis.
>
> Instead we use a g_timer to determine if the duration of the test has
> passed and sleep for a second at a time. This may bias benchmark
> results for short runs.

Why not g_usleep? It already does a while loop around nanosleep (which
returns the remaining time in the wait, like select but unlike sleep and
poll).

Thanks,

Paolo