On 9/4/22 04:16, John Reiser wrote:
Any ideas what I might be doing wrong? Or how do I load the core file?

Why does use of valgrind cause programmers to forget general debugging technique?

1. Describe the environment completely.
The report does not say which compilers and compiler versions were used,
or if the compiler commands contained any directives about debugging format.
Such information is necessary to help understand what might be happening
with regard to debugging and tracebacks.


Yes, I should have included this information - I don't have access to the machine at the moment, but I'll share the detailed info early next week.

However, it's running the current Raspberry Pi OS 64-bit version, which is based on Debian 11, so it should have the same versions of gcc etc.

2. Get debugging information whenever invoking a compiler.
Traceback lines such as "(+0x57a574)[0x682574]" which lack the name
of a symbol or file, suggest that "-g" debugging info was not requested
for *all* compilations.  Start over ("make clean; rm -rf '*.[oa]'")
then re-compile every source file, making sure to specify "-g"
and no variant of "-O" or "-On", except possibly "-O0".


This is a bit puzzling. I'm always running valgrind tests with "-O0" and possibly with -fno-omit-frame-pointer, as that gives me the most reliable results etc. "-g" should be enabled too (thanks to the postgres specific --enable-debug configure switch).

3. Optimizing for speed comes after achieving correct execution.
If 'inline' is used anywhere, then re-compile with the compile-time argument
"-Dinline=/*empty*/" in order to #define 'inline' as a one-word comment.
If the behavior of the program changes (any difference at all, excepting
only slower execution), then there is a *design error* in the source code.
Fix that first.


If I was optimizing for speed, I wouldn't be running with "-O0". I'm not sure what's causing the missing symbols, but it certainly is not inline functions - we do have a couple of those, but definitely not this high in the stack.

The other thing is that when loading the core file into gdb, the backtrace is entirely different (and bogus) from what was written into the server log (which comes from "backtrace()" - maybe the missing symbol names are due to some limitation in this).

4. Walk before attempting to run.
Did you try a simple example?  Write a half-page program with 5 subroutines, each of which calls the next one, and the last one sends SIGABRT to the process.

I've inspected *thousands* of core files in the last couple years, both as part of development and supporting all kinds of systems. And most of the time it either works just fine or it's clear why it's not working. Except when running under valgrind, in which case I have no idea why it doesn't work (with the same compile options and all that).

Does the .core file when run under valgrind give the correct traceback using gdb?


I'm not sure I understand the question. In my initial post I showed two backtraces - one I extracted from the .core file using gdb, and another one that the application itself (postgres) writes into the server log (after using backtrace() etc.).

The logged backtrace has a couple missing symbols, but seems reasonable otherwise.

The backtrace extracted from the .core file is clearly bogus.


5. (Learn and) Use the built-in tools where possible.
Run the process interactively, invoking valgrind with "--vgdb-error=0",
and giving the debugger command "(gdb) continue" after establishing
connectivity between vgdb and the process.
See the valgrind manual, section 3.2.9 "vgdb command line options".
When the SIGABRT happens, then vgdb will allow you to use all the ordinary
gdb commands to get a backtrace, go up and down the stack, examine
variables and other memory, run
    (gdb) info proc
    (gdb) shell cat /proc/$PID/maps
to see exactly the layout of process memory, etc.
There are also special commands to access valgrind functionality
interactively, such as checking for memory leaks.


I already explained why I don't want / can't use the interactive gdb. I'm aware of the option, I've used it before, but in this case it's not very practical.

regards
Tomas

