Re: [Valgrind-users] memcheck is getting SIGKILLed before leak report is output

2022-08-31 Thread Bresalier, Rob (Nokia - US/Murray Hill)
> Normally, if it is the OOM that kills a process, you should find a trace of 
> this in the system logs.

I looked in every system log I could find; none of them showed any indication 
that the OOM killer had killed it.

> I do not understand what you mean by reducing the nr of callers from 12 to 6.
> What are these callers ? Is that some threads of the process you are running
> under valgrind ?
> 

I mean valgrind's --num-callers core option. The default is 12, and I had not 
specified it. I tried --num-callers=6 to reduce memory consumption. The 
valgrind manual says it "Specifies the maximum number of entries shown in stack 
traces that identify program locations." By reducing it to 6 I was hoping to 
reduce valgrind's memory consumption in case it really was the OOM killer, 
which I now really doubt.

> And just in case: are you using the latest version of Valgrind ?

Yes, I used the latest version of valgrind, as well as many earlier versions.

> You might use "strace" on valgrind to see what is going on at the time
> _exit(0) is called.

I did use 'strace' and dmesg. Neither indicated that it was the OOM killer.

I did happen to save the strace log when the SIGKILL happened. Here is the part 
around the _exit(0):

read(2040, "R", 1) = 1
gettid() = 3332
rt_sigprocmask(SIG_SETMASK, ~[ILL TRAP BUS FPE KILL SEGV STOP SYS], NULL, 8) = 0
rt_sigprocmask(SIG_SETMASK, ~[], ~[ILL TRAP BUS FPE KILL SEGV STOP SYS], 8) = 0
rt_sigprocmask(SIG_SETMASK, ~[ILL TRAP BUS FPE KILL SEGV STOP SYS], NULL, 8) = 0
gettid() = 3332
write(2041, "S", 1) = 1
exit(0) = ?
+++ killed by SIGKILL +++

Don't understand why the strace log shows exit(0) without the underscore; I 
know for a fact that the call in the code is _exit(0), with the underscore.

The strace log doesn't indicate anything special happening around the _exit(0). 
When I removed the _exit(0) call, the SIGKILL went away.

> You might also start valgrind with some debug trace e.g.  -d -d -d -d -v -v 
> -v -v

Was not aware of this and didn't try it. Don't have time to try it now.

Regards,
Rob

___
Valgrind-users mailing list
Valgrind-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/valgrind-users


Re: [Valgrind-users] memcheck is getting SIGKILLed before leak report is output

2022-08-31 Thread Philippe Waroquiers
On Wed, 2022-08-31 at 17:42 +, Bresalier, Rob (Nokia - US/Murray Hill) 
wrote:
> > When running memcheck on a massive monolith embedded executable
> > (237MB stripped, 1.8GiB unstripped), after I stop the executable under
> > valgrind I see the "HEAP SUMMARY" but then valgrind dies before any leak
> > reports are printed. The parent process sees that the return status of
> > memcheck is that it was SIGKILLed (status returned in waitpid call is '9').
> 
> We found that removing a call to _exit(0) made it so that valgrind is no
> longer SIGKILLED.
> 
> Any ideas why removing _exit(0) may get rid of valgrind getting SIGKILLed?
> 
> Previously exit(0) was called, without the leading underscore, but changed it
> to _exit(0) to really make sure no memory was being deallocated. This worked
> well on a different process, so we carried it over to this one, that is why we
> did it.
>
> Even with exit(0) (no underscore), in this process there is not much
> deallocation going on in exit handlers, so have lots of doubts that
> valgrind/memcheck was using too much memory and invoking the OOM killer.
>
> Using strace and dmesg while we had _exit(0) in use didn't show that OOM killer
> was SIGKILLing valgrind.
>
> I also tried reducing number of callers from 12 to 6 when using _exit(0), still
> got the SIGKILL.
>
> Also tried using a system that had an additional 4GByte of memory, and also got
> the SIGKILL there.
>
> So I have many doubts that Valgrind was getting SIGKILLed due to too much
> memory usage.
>
> Don't know why removing _exit(0) got rid of the SIGKILL. Was wondering if
> anyone had any ideas?

Normally, if it is the OOM that kills a process, you should find a trace of
this in the system logs.

I do not understand what you mean by reducing the nr of callers from 12 to 6.
What are these callers ? Is that some threads of the process you are
running under valgrind ?

And just in case: are you using the latest version of Valgrind ?

You might use "strace" on valgrind to see what is going on at the time
_exit(0) is called.
You might also start valgrind with some debug trace, e.g. -d -d -d -d -v -v -v -v

Philippe






Re: [Valgrind-users] memcheck is getting SIGKILLed before leak report is output

2022-08-31 Thread Bresalier, Rob (Nokia - US/Murray Hill)
> When running memcheck on a massive monolith embedded executable
> (237MB stripped, 1.8GiB unstripped), after I stop the executable under
> valgrind I see the "HEAP SUMMARY" but then valgrind dies before any leak
> reports are printed. The parent process sees that the return status of
> memcheck is that it was SIGKILLed (status returned in waitpid call is '9').
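
For reference, the '9' mentioned above is consistent with how Linux reports a
child terminated by signal 9 (SIGKILL) through the waitpid() status. A minimal
sketch of decoding that status with the standard <sys/wait.h> macros, assuming
a POSIX system; describe_wait_status() and run_child() are hypothetical helper
names used only for illustration and are not part of valgrind or of the program
discussed in this thread:

```c
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Format a waitpid() status into buf.
 * Returns the terminating signal number, or 0 if the child was not
 * terminated by a signal. */
int describe_wait_status(int status, char *buf, size_t len)
{
    if (WIFSIGNALED(status)) {
        int sig = WTERMSIG(status);
        snprintf(buf, len, "killed by signal %d", sig);
        return sig;
    }
    if (WIFEXITED(status)) {
        snprintf(buf, len, "exited with code %d", WEXITSTATUS(status));
        return 0;
    }
    snprintf(buf, len, "other status 0x%x", (unsigned)status);
    return 0;
}

/* Hypothetical test helper: fork a child that either SIGKILLs itself or
 * calls _exit(code), and return the raw waitpid() status (-1 on error). */
int run_child(int kill_self, int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (kill_self)
            raise(SIGKILL);   /* never returns: SIGKILL cannot be caught */
        _exit(code);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return status;
}
```

Calling describe_wait_status(run_child(1, 0), buf, sizeof buf) returns SIGKILL,
which is what the parent process here observed for memcheck. The status alone
does not say who sent the signal.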

We found that removing a call to _exit(0) made it so that valgrind is no longer 
SIGKILLED.

Any ideas why removing _exit(0) may get rid of valgrind getting SIGKILLed?

Previously the code called exit(0), without the leading underscore, but we 
changed it to _exit(0) to really make sure no memory was being deallocated at 
exit. This worked well in a different process, so we carried it over to this 
one.
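
The difference between the two calls can be checked in isolation. exit() runs
handlers registered with atexit() and flushes stdio buffers before the process
terminates; _exit() terminates immediately and skips both. The sketch below
assumes a POSIX system; atexit_handler_ran() and on_exit_handler() are
hypothetical names for illustration only and say nothing about what valgrind
itself does at client exit:

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int handler_fd = -1;

/* Registered with atexit() in the child; reports that it ran by writing
 * one byte to the pipe shared with the parent. */
static void on_exit_handler(void)
{
    char c = 'H';
    ssize_t n = write(handler_fd, &c, 1);
    (void)n;
}

/* Fork a child that registers an atexit() handler and then terminates
 * with exit(0) or _exit(0).
 * Returns 1 if the handler ran, 0 if it did not, -1 on error. */
int atexit_handler_ran(int use_underscore_exit)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);
        handler_fd = fds[1];
        atexit(on_exit_handler);
        if (use_underscore_exit)
            _exit(0);          /* skips atexit handlers and stdio flush */
        exit(0);               /* runs atexit handlers, then terminates */
    }

    close(fds[1]);
    char c;
    ssize_t n = read(fds[0], &c, 1);   /* 0 means EOF: nothing was written */
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return n == 1 ? 1 : (n == 0 ? 0 : -1);
}
```

atexit_handler_ran(0) reports that the handler ran, while atexit_handler_ran(1)
reports that it did not, which is the behaviour the switch to _exit(0) was
relying on.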

Even with exit(0) (no underscore), in this process there is not much 
deallocation going on in exit handlers, so I have lots of doubts that 
valgrind/memcheck was using too much memory and invoking the OOM killer.

Using strace and dmesg while we had _exit(0) in use didn't show that the OOM 
killer was SIGKILLing valgrind.

I also tried reducing the number of callers from 12 to 6 when using _exit(0), 
and still got the SIGKILL.

Also tried using a system that had an additional 4GByte of memory, and also got 
the SIGKILL there.

So I have many doubts that Valgrind was getting SIGKILLed due to too much 
memory usage.

Don't know why removing _exit(0) got rid of the SIGKILL. Was wondering if 
anyone had any ideas?

