Hi, Mark,

Sorry to hear you're still getting a segfault.  It sounds like you made some 
progress with gdb, but the fact that you ended up with a different sort of 
error suggests that you were starting hashpipe in the debugger.  To debug your 
initial segfault problem, you can run hashpipe without the debugger, let it 
segfault and generate a core file, then use gdb and the core file (and 
hashpipe) to examine the state of the program when the segfault occurred.  The 
tricky part is getting the core file to be generated on a segfault.  You 
typically have to increase the core file size limit using "ulimit -c unlimited" 
and (because hashpipe is typically installed with the suid bit set) you have to 
let the kernel know it's OK to dump core files for suid programs using "sudo 
sysctl -w fs.suid_dumpable=1" (or maybe 2 if 1 doesn't quite do it).  You can 
read more about these steps with "help ulimit" (ulimit is a bash builtin) and 
"man 5 proc".

Once you have the core file (typically named "core", though it may have a
numeric extension from the PID of the crashing process) you can debug things
with "gdb /path/to/hashpipe /path/to/core/file".  Note that the core file may
be created with permissions that only let root read it, so you might have to
"sudo chmod a+r core" or similar to get read access to it.  This starts the
debugger in a sort of forensic mode using the core file as a snapshot of the
process and its memory space at the time of the segfault.  You can use "info
threads" to see which threads existed, "thread N" to switch between threads
(N is a thread number as shown by "info threads"), "bt" to see the function
call backtrace of the current thread, and "frame N" to switch to a specific
frame in the function call backtrace.  Once you zero in on which part of your
code was executing when the segfault occurred, you can examine variables to
see exactly what went wrong.  You might find that the "interesting" or
"relevant" variables have been optimized away, so you may want/need to
recompile with a lower optimization level (e.g. -O1 or maybe even -O0?) to
prevent that from happening.
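
A session might look something like this (the thread/frame numbers and the
variable name here are made up for illustration):

    $ gdb /path/to/hashpipe /path/to/core/file
    (gdb) info threads     # list the threads that existed at the crash
    (gdb) thread 3         # switch to thread 3
    (gdb) bt               # backtrace of the current thread
    (gdb) frame 2          # select frame 2 of that backtrace
    (gdb) info locals      # show the local variables in that frame
    (gdb) print pkt_ptr    # inspect a specific variable (name made up)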

Because this happens when you reach the end of your data buffer, I have to 
think it's a pointer arithmetic error of some sort.  If you can't figure out 
the problem from the core file, please create a "minimum working example" 
(well, in this case I guess a minimum non-working example), including a dummy 
packet generator script that creates suitable packets, and I'll see if I can 
recreate the problem.
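
In case it helps, here is a made-up sketch (not your code, and not
hashpipe's) of the kind of circular-buffer bookkeeping bug that produces
exactly that pair of symptoms: an unwrapped offset that runs off the end of
the buffer, and a signed mcnt distance that goes negative and silently skips
the copy:

    /* Hypothetical sketch of a circular-buffer indexing bug, NOT hashpipe
     * code; names and sizes are invented for illustration. */
    #include <stdint.h>
    #include <string.h>

    #define BLOCK_SIZE 4096
    #define NUM_BLOCKS 8

    static uint8_t buf[NUM_BLOCKS * BLOCK_SIZE];

    void process_packet(uint64_t pkt_mcnt, uint64_t block_start_mcnt,
                        const uint8_t *payload, size_t len)
    {
        /* Bug 1: nothing wraps this offset back into the buffer, so once
         * pkt_mcnt - block_start_mcnt reaches NUM_BLOCKS the memcpy()
         * below writes past the end of buf and segfaults. */
        uint64_t offset = (pkt_mcnt - block_start_mcnt) * BLOCK_SIZE;

        /* Bug 2: if the block-start mcnt bookkeeping runs ahead of the
         * packets, the signed distance goes negative and the copy is
         * silently skipped instead of the real error being caught. */
        int64_t dist = (int64_t)(pkt_mcnt - block_start_mcnt);
        if (dist < 0)
            return;

        memcpy(buf + offset, payload, len); /* missing "% sizeof(buf)" wrap */
    }

The fix for something like that is to wrap the offset back into the buffer
(offset % sizeof(buf)) and to sort out whatever lets the block-start mcnt get
ahead of the packets, rather than just skipping the copy.  But of course your
actual bug may be something else entirely.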

HTH,
Dave

> On Nov 30, 2020, at 14:45, Mark Ruzindana <ruziem...@gmail.com> wrote:
> 
> I'm currently using gdb to debug and it either tells me that I have a 
> segmentation fault at the memcpy() in process_packet() or something very 
> strange happens where the starting mcnt of a block greatly exceeds the mcnt 
> corresponding to the packet being processed and there's no segmentation fault 
> because the mcnt distance becomes negative so the memcpy() is skipped. 
> Hopefully that wasn't too hard to track. Very strange problem that only 
> occurs with gdb and not when I run hashpipe without it. Without gdb, I get 
> the same segmentation fault at the end of the circular buffer as mentioned 
> above.
> 
