Re: [Valgrind-users] Valgrind apparently looping after long test run

Philippe Waroquiers Tue, 29 Oct 2013 13:52:06 -0700

On Tue, 2013-10-29 at 10:24 +0900, Tim Burress wrote:
> I'm using Valgrind 3.8.1 to run fuzzing tests on control software for an
> embedded system. The tests are run on
> 
> Linux pod 3.11.4-201.fc19.x86_64 #1 SMP Thu Oct 10 14:11:18 UTC 2013
> x86_64 x86_64 x86_64 GNU/Linux
> 
> Valgrind itself is the standard Fedora 19 build with the following
> arguments:
> 
> valgrind      -v --leak-check=full
>               --leak-resolution=high
>               --show-possibly-lost=yes
>               --track-fds=yes
>               --num-callers=24
>               --vgdb=no
> 
> Normally all this works fine, but last night I did an overnight run on a
> new dataset and after about eight hours valgrind started looping. I
> wasn't sure what to do about that... there were no diagnostics either
> from my software or from valgrind. When I attached to the program with
> strace I got this endless series:
> 
> clock_gettime(CLOCK_MONOTONIC, {124051, 458677331}) = 0
> rt_sigprocmask(SIG_SETMASK, ~[], ~[ILL TRAP BUS FPE KILL SEGV STOP], 8) = 0
> rt_sigtimedwait([HUP INT ILL BUS FPE KILL SEGV TERM STOP WINCH SYS RTMIN
> RT_1], 0x4028a5e30, {0, 0}, 8) = -1 EAGAIN (Resource temporarily
> unavailable)
> rt_sigprocmask(SIG_SETMASK, ~[ILL TRAP BUS FPE KILL SEGV STOP], NULL, 8) = 0
> gettid()                                = 30464
> gettid()                                = 30464
> write(1029, "A", 1)                     = 1
> gettid()                                = 30464
> read(1028, "A", 1)                      = 1
> gettid()                                = 30464
> clock_gettime(CLOCK_MONOTONIC, {124051, 469072047}) = 0
> rt_sigprocmask(SIG_SETMASK, ~[], ~[ILL TRAP BUS FPE KILL SEGV STOP], 8) = 0
> rt_sigtimedwait([HUP INT ILL BUS FPE KILL SEGV TERM STOP WINCH SYS RTMIN
> RT_1], 0x4028a5e30, {0, 0}, 8) = -1 EAGAIN (Resource temporarily
> unavailable)
> rt_sigprocmask(SIG_SETMASK, ~[ILL TRAP BUS FPE KILL SEGV STOP], NULL, 8) = 0
> gettid()                                = 30464
> gettid()                                = 30464
> write(1029, "B", 1)                     = 1
> gettid()                                = 30464
> read(1028, "B", 1)                      = 1
> gettid()
> 
> It loops through the English alphabet, one letter at a time, uppercase A
> to uppercase Z, and then repeats. TID 30464 is the PID of the valgrind
> process. File descriptors 1028 and 1029 appear to be the opposite ends
> of a pipe, but I have no idea why the alphabet would be cycling through
> it. The sigtimedwait() call is polling for one of those signals and not
> getting it. We have no such calls explicitly in our code, and no RT
> calls at all, which leads me to think this is valgrind's internal code,
> but of course it could still be measuring something within the main program.
All the above is caused by the internal Valgrind kitchen.
In particular, the pipes are used to implement a lock.
The write is an "unlock" and the read is a "lock".


> 
> I ran gstack on it but got only:
> 
> gstack 30464
> #0  0x0000000038033cb0 in vgMemCheck_helperc_LOADV64le ()
> #1  0x0000000403290318 in ?? ()
> #2  0x00000004028a5f00 in ?? ()
> #3  0x0000000000000000 in ?? ()
> 
> which doesn't tell me a whole lot, and I don't find vgMemCheck_helperc
> anywhere in the code, so I guess it's something synthesized at compile time.
What you observe above is the host stack trace executing the guest code.
All the above indicates Valgrind is (still) executing the application
code.
Valgrind is (normally) compiled with -g. But in any case, if it was not
(or the Valgrind debug info was not installed on your fedora), this will
not help to get an understandable guest stack trace.

Rather, do not use --vgdb=no, rather keep the default (--vgdb=yes)
or use --vgdb=full (if needed).


You can then use gdb+vgdb to debug your application running under
Valgrind and see e.g. what your application is doing.

For more information look at Valgrind user manual section
3.2. Debugging your program using Valgrind gdbserver and GDB
http://www.valgrind.org/docs/manual/manual-core-adv.html#manual-core-adv.gdbserver

Philippe
NB: just for my own info, why are you using --vgdb=no ?
Any problems with it ?


------------------------------------------------------------------------------
Android is increasing in popularity, but the open development platform that
developers love is also attractive to malware creators. Download this white
paper to learn more about secure code signing practices that can help keep
Android apps secure.
http://pubads.g.doubleclick.net/gampad/clk?id=65839951&iu=/4140/ostg.clktrk
_______________________________________________
Valgrind-users mailing list
Valgrind-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/valgrind-users

Re: [Valgrind-users] Valgrind apparently looping after long test run

Reply via email to