On Tue, 2015-12-08 at 09:08 -0800, Nikolaus Rath wrote:
> Hello,
>
> I'm having a problem using massif. When looking at the ms_print output, I'm
> getting entries like this:
>
> ->10.77% (11,146,544B) 0xF4AC9C: H5FL_reg_calloc (in
> /work/nrath/issue_2014_q2d_mem/part_agmg0_4_massif_fixed/LR_model)
> | ->10.58% (10,946,896B) 0x1000D57: H5S_copy (in
> /work/nrath/issue_2014_q2d_mem/part_agmg0_4_massif_fixed/LR_model)
> | | ->10.57% (10,940,640B) 0xEE0A80: H5A_create (in
> /work/nrath/issue_2014_q2d_mem/part_agmg0_4_massif_fixed/LR_model)
> | | | ->10.57% (10,940,640B) 0xEDAAAF: H5Acreate2 (in
> /work/nrath/issue_2014_q2d_mem/part_agmg0_4_massif_fixed/LR_model)
> | | | ->10.57% (10,940,640B) 0xEC985C: h5acreate_c_ (in
> /work/nrath/issue_2014_q2d_mem/part_agmg0_4_massif_fixed/LR_model)
> | | | ->10.57% (10,940,640B) 0xEC3C55: h5a_mp_h5acreate_f_ (in
> /work/nrath/issue_2014_q2d_mem/part_agmg0_4_massif_fixed/LR_model)
> | | | ->10.57% (10,940,640B) 0xB99ED5:
> taehdf5_mp_h5append_data_double_0d_ (taehdf5.f90:1936)
> | | | ->04.24% (4,387,296B) in 40 places, all below massif's
> threshold (01.00%)
>
>
> However, line 1936 of taehdf5.f90 is actually inside a different subroutine,
> defined as:
>
> subroutine h5dump_attr_int(loc_id,f,name)
> [..]
> ! next line is 1936
> call h5acreate_f(loc_id,name,H5T_NATIVE_INTEGER,space_id,attr_id,hdferr)
> [...]
> end subroutine
>
>
> The h5append_data_double_0d subroutine is actually defined much later,
> starting in line 4120 with:
>
> subroutine h5append_data_double_0d(group_id,f,name)
>
> ..and it does not contain any calls to h5acreate_f. So I think that probably
> the line number is correct, but the function name is not.
>
> Does anyone know what might cause this?

What is the platform? (version of valgrind/os/gcc, which cpu, ...)
Strange stacktraces can appear when the compiler inlines some calls, or when
the Valgrind unwinder has a bug.

You can check this using gdb+vgdb: put a breakpoint on H5FL_reg_calloc.
When the breakpoint is hit, do
   (gdb) backtrace
   (gdb) monitor v.info scheduler

This allows you to compare:
   the backtrace produced by gdb
   the stacktrace produced by Valgrind (using inline info)
   and the stacktrace produced by massif in its output file.

If the first 2 stacktraces are as expected, but the massif stacktrace is not,
then that is probably because massif does not use inline info to produce
its output.
If the gdb backtrace is ok, but the monitor stacktrace is not, then it means
the Valgrind unwinder does not properly unwind your code.

Philippe

Some more background info about stacktraces and inline info:
If you run:
   valgrind --read-inline-info=no ./memcheck/tests/inlinfo
then the last stacktrace produced is:
==2277==    at 0x80485C4: fun_noninline_o (inlinfo.c:40)
==2277==    by 0x804867D: fun_noninline_n (inlinfo.c:48)
==2277==    by 0x804880D: main (inlinfo.c:72)

There is indeed a call to fun_noninline_n at line 72.
However, line 48 is inside the function fun_f, while the stacktrace
above shows fun_noninline_n.

When running with the (default) --read-inline-info=yes, the stacktrace is:
==2302==    at 0x80485C4: fun_noninline_o (inlinfo.c:40)
==2302==    by 0x804867D: fun_f (inlinfo.c:48)
==2302==    by 0x804867D: fun_e (inlinfo.c:54)
==2302==    by 0x804867D: fun_noninline_n (inlinfo.c:60)
==2302==    by 0x804880D: main (inlinfo.c:72)
which is then understandable: when there is inlining, the line number
corresponds to the (last) inlined function, but the function name
corresponds to the 'inlining' function, i.e. the function the code was
inlined into.

Massif does not expand inlined function calls (so it produces stacktraces
as if --read-inline-info=no was given).
See the call to VG_(describe_IP) in ms_main.c: it has a NULL second argument.
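To make the inlining effect concrete, here is a minimal sketch of a call chain
with the same shape as the one in the traces above. It is not the actual source
of memcheck/tests/inlinfo.c: the function bodies are made up, only the function
names are borrowed from the stacktrace, and the noinline/always_inline
attributes assume GCC or Clang.

```c
#include <stdlib.h>

/* Hypothetical sketch, NOT the real memcheck/tests/inlinfo.c.
 * The attributes below are GCC/Clang extensions (assumption). */
#define NOINLINE      __attribute__((noinline))
#define ALWAYS_INLINE static inline __attribute__((always_inline))

/* Out-of-line function: it keeps its own stack frame, so an unwinder
 * reports it under its own name. */
NOINLINE int fun_noninline_o(int n)
{
    int *p = malloc(sizeof *p);   /* allocation site a heap profiler would record */
    if (p == NULL)
        return -1;
    *p = n + 1;
    int r = *p;
    free(p);
    return r;
}

/* Forced-inline functions: their code is merged into the caller, so they
 * have no frame of their own. Only the debug info remembers them. */
ALWAYS_INLINE int fun_f(int n)
{
    return fun_noninline_o(n) * 2;   /* the actual call instruction ends up
                                        inside fun_noninline_n */
}

ALWAYS_INLINE int fun_e(int n)
{
    return fun_f(n) + 1;
}

/* Out-of-line function that contains the inlined bodies of fun_e and fun_f. */
NOINLINE int fun_noninline_n(int n)
{
    return fun_e(n);
}
```

Compiled with -g, the call to fun_noninline_o sits on a source line that
belongs to fun_f, but the only real frame around it is fun_noninline_n. A tool
that ignores the inline info would therefore be expected to print the name
fun_noninline_n next to fun_f's line number, which is the same kind of mismatch
as in the massif output.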