1. Describe the environment completely.

Also: Any kind of threading (pthreads, or shm_open, or mmap(,,,MAP_SHARED,,))
must be mentioned explicitly.  Multiple execution contexts which access
the same address space instance are a significant complicating factor.

If threading is involved, then try using "valgrind --tool=drd ..."
or --tool=helgrind, because those tools specifically target detecting
race conditions and other synchronization errors, much like --tool=memcheck
[the default tool when no --tool= is mentioned] targets errors involving
malloc() and free(), uninitialized variables, etc.

4. Walk before attempting to run.
Did you try a simple example?  Write a half-page program with 5 subroutines,
each of which calls the next one, and the last one sends SIGABRT to the process.

When the program is run under valgrind, does the resulting core file give
the correct traceback in gdb?

Specifically: apply valgrind to the small program which causes a deliberate
SIGABRT, and get a core file.  Does gdb give the correct traceback for that
core file?  If not, then you have an ideal test case for filing a bug report
against valgrind, because even the simple core file is bad.  If gdb does give
a correct traceback for the simple core file, then you have to keep looking
for the source of the problem on your larger program.


5. (Learn and) Use the built-in tools where possible.
Run the process interactively, invoking valgrind with "--vgdb-error=0",
and giving the debugger command "(gdb) continue" after establishing
connectivity between vgdb and the process.
See the valgrind manual, section 3.2.9 "vgdb command line options".
When the SIGABRT happens, then vgdb will allow you to use all the ordinary
gdb commands to get a backtrace, go up and down the stack, examine
variables and other memory, run
    (gdb) info proc
    (gdb) shell cat /proc/$PID/maps
to see exactly the layout of process memory, etc.
There are also special commands to access valgrind functionality
interactively, such as checking for memory leaks.


I already explained why I don't want / can't use the interactive gdb. I'm aware 
of the option, I've used it before, but in this case it's not very practical.

The gdb process does not *have* to be run interactively, it just takes more work
and patience to run non-interactively.  Run "valgrind --vgdb-error=0 ..."
and notice the last part of the printed instructions:

         and then give GDB the following command
     ==215935==   target remote | /path/to/libexec/valgrind/../../bin/vgdb --pid=215935
     ==215935== --pid is optional if only one valgrind process is running

So if there is only one valgrind process, then you do not need to know the pid.
Thus you can run gdb with re-directed stdin/stdout/stderr, or perhaps use the -x
command-line option.  This allows a static, pre-scripted list of gdb commands;
it may require a few iterations to get a good debug script.  (Try the commands
using the trivial SIGABRT case!)  Also get the full gdb manual (more than 800
pages) and look at the "thread apply all ..." and "frame apply all ..." commands.

It may be possible to perform some interactive "reconnaissance" to suggest
good things for the script to try.  Using --vgdb-error=0, put a breakpoint
on a likely location for the error (or shortly before the error),
and look around.  In the logged traceback:

  TRAP: FailedAssertion("prev_first_lsn < cur_txn->first_lsn", File: "reorderbuffer.c", Line: 902, PID: 536049)
  (ExceptionalCondition+0x98)[0x8f5cec]
  (+0x57a574)[0x682574]
  (+0x579edc)[0x681edc]
  (ReorderBufferAddNewTupleCids+0x60)[0x6864dc]
  (SnapBuildProcessNewCid+0x94)[0x68b6a4]

any of those named locations, or shortly before them, might be a good spot.
When execution stops at any one of the breakpoints, then look around
and see if you can find clues about "prev_first_lsn < cur_txn->first_lsn"
even though the error has not yet occurred.  Perhaps this will help
identify location(s) that might be closer to the actual error
when it does happen.  This might suggest commands for the non-interactive
gdb debugging script.



_______________________________________________
Valgrind-users mailing list
Valgrind-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/valgrind-users
