Re: [Valgrind-users] Wrong function names (or line numbers) in ms_print output

Philippe Waroquiers Tue, 08 Dec 2015 14:27:19 -0800

On Tue, 2015-12-08 at 09:08 -0800, Nikolaus Rath wrote:
> Hello,
> 
> I'm having a problem using massif. When looking at the ms_print output, I'm 
> getting entries like this:
> 
> ->10.77% (11,146,544B) 0xF4AC9C: H5FL_reg_calloc (in 
> /work/nrath/issue_2014_q2d_mem/part_agmg0_4_massif_fixed/LR_model)
> | ->10.58% (10,946,896B) 0x1000D57: H5S_copy (in 
> /work/nrath/issue_2014_q2d_mem/part_agmg0_4_massif_fixed/LR_model)
> | | ->10.57% (10,940,640B) 0xEE0A80: H5A_create (in 
> /work/nrath/issue_2014_q2d_mem/part_agmg0_4_massif_fixed/LR_model)
> | | | ->10.57% (10,940,640B) 0xEDAAAF: H5Acreate2 (in 
> /work/nrath/issue_2014_q2d_mem/part_agmg0_4_massif_fixed/LR_model)
> | | |   ->10.57% (10,940,640B) 0xEC985C: h5acreate_c_ (in 
> /work/nrath/issue_2014_q2d_mem/part_agmg0_4_massif_fixed/LR_model)
> | | |     ->10.57% (10,940,640B) 0xEC3C55: h5a_mp_h5acreate_f_ (in 
> /work/nrath/issue_2014_q2d_mem/part_agmg0_4_massif_fixed/LR_model)
> | | |       ->10.57% (10,940,640B) 0xB99ED5: 
> taehdf5_mp_h5append_data_double_0d_ (taehdf5.f90:1936)
> | | |         ->04.24% (4,387,296B) in 40 places, all below massif's 
> threshold (01.00%)
> 
> 
> However, line 1936 of taehdf5.f90 is actually inside a different subroutine, 
> defined as:
> 
> subroutine h5dump_attr_int(loc_id,f,name)
> [..]    
>     ! next line is 1936 
>     call h5acreate_f(loc_id,name,H5T_NATIVE_INTEGER,space_id,attr_id,hdferr)
> [...]
> end subroutine
> 
> 
> The h5append_data_double_0d subroutine is actually defined much later, 
> starting in line 4120 with:
> 
>   subroutine h5append_data_double_0d(group_id,f,name)
> 
> ..and it does not contain any calls to h5acreate_f. So I think that probably 
> the line number is correct, but the function name is not.
> 
> Does anyone know what might cause this?
What is the platform (Valgrind version, OS, gcc version, which CPU, ...)?



Strange stacktraces can appear when the compiler inlines some calls,
or when the Valgrind unwinder has a bug.

You can check this using gdb+vgdb: put a breakpoint on H5FL_reg_calloc.
When the breakpoint is hit, do
  (gdb) backtrace
  (gdb) monitor v.info scheduler
This lets you compare:
   the backtrace produced by gdb,
   the stacktrace produced by Valgrind (using inline info),
   and the stacktrace produced by massif in its output file.
If the first 2 stacktraces are as expected, but the massif stacktrace
is not, then that is probably because massif does not use inline info
to produce its output.

If the gdb backtrace is ok, but the monitor stacktrace is not,
then the Valgrind unwinder does not properly unwind your code.

Philippe


Some more background info about stacktraces and inline info:
If you run:
    valgrind --read-inline-info=no ./memcheck/tests/inlinfo
then the last stacktrace produced is:
==2277==    at 0x80485C4: fun_noninline_o (inlinfo.c:40)
==2277==    by 0x804867D: fun_noninline_n (inlinfo.c:48)
==2277==    by 0x804880D: main (inlinfo.c:72)
There is indeed a call to fun_noninline_n at line 72.
However, line 48 is inside the function fun_f, while the
stacktrace above shows fun_noninline_n.
When running with the (default) --read-inline-info=yes, the stacktrace
is:
==2302==    at 0x80485C4: fun_noninline_o (inlinfo.c:40)
==2302==    by 0x804867D: fun_f (inlinfo.c:48)
==2302==    by 0x804867D: fun_e (inlinfo.c:54)
==2302==    by 0x804867D: fun_noninline_n (inlinfo.c:60)
==2302==    by 0x804880D: main (inlinfo.c:72)

which makes the first stacktrace understandable: when there is inlining
and inline info is not used, the line number corresponds to the (last)
inlined function, but the function name corresponds to the 'inlining'
function.
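
The mismatch can be modelled in a few lines of C. This is only a toy
sketch, not Valgrind code: struct src_frame, describe_flat and
describe_expanded are hypothetical names, and the table simply copies
the three entries that the second stacktrace shows for address 0x804867D.

```c
#include <stddef.h>
#include <stdio.h>

/* One source-level frame that debug info attaches to a code address. */
struct src_frame {
    const char *fn;  /* function name */
    int line;        /* line number inside that function */
};

/* Frames covering address 0x804867D in the inlinfo example above,
   innermost inlined call first, real (non-inlined) function last. */
const struct src_frame frames_at_ip[] = {
    { "fun_f", 48 },
    { "fun_e", 54 },
    { "fun_noninline_n", 60 },
};

/* Without inline info: a single entry that pairs the name of the real
   function with the line of the innermost inlined call. */
int describe_flat(const struct src_frame *f, size_t n, char *buf, size_t len)
{
    if (n == 0)
        return -1;
    return snprintf(buf, len, "%s (inlinfo.c:%d)", f[n - 1].fn, f[0].line);
}

/* With inline info: entry i (0 = innermost) keeps its own name and line. */
int describe_expanded(const struct src_frame *f, size_t n, size_t i,
                      char *buf, size_t len)
{
    if (i >= n)
        return -1;
    return snprintf(buf, len, "%s (inlinfo.c:%d)", f[i].fn, f[i].line);
}
```

For this table, describe_flat gives "fun_noninline_n (inlinfo.c:48)",
which is the puzzling entry of the first stacktrace, while
describe_expanded gives the three separate entries of the second one.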

Massif does not expand inlined function calls (so it produces stacktraces
as if --read-inline-info=no was given).
See the call to VG_(describe_IP) in ms_main.c: it has a NULL second
argument.
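
To check whether this explains a given ms_print output, a reduced test
case may help. The sketch below is an assumption-laden example, not the
Valgrind inlinfo.c test: all function names are hypothetical, and the
attributes assume GCC or Clang.

```c
#include <stdlib.h>

/* Kept out of line so that it shows up as its own stack frame. */
__attribute__((noinline))
void *leaf_alloc(size_t n)
{
    return malloc(n);
}

/* Forced inline: no separate frame exists for this function, so its
   code (and line numbers) end up inside the function that absorbs it. */
static inline __attribute__((always_inline))
void *inlined_inner(size_t n)
{
    return leaf_alloc(n);
}

/* Second level of forced inlining, calling the first one. */
static inline __attribute__((always_inline))
void *inlined_outer(size_t n)
{
    return inlined_inner(n * 2);
}

/* Real (non-inlined) function that absorbs both inlined bodies. */
__attribute__((noinline))
void *outer_caller(size_t n)
{
    return inlined_outer(n);
}
```

Link this with a main that calls outer_caller() and frees the result,
build with gcc -g, and run it under valgrind --tool=massif. If the
explanation above applies, the ms_print entry for outer_caller would be
expected to carry the line number of the call inside inlined_inner.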




------------------------------------------------------------------------------
_______________________________________________
Valgrind-users mailing list
Valgrind-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/valgrind-users
