Thanks for the suggestion, David! I was starting hashpipe in the debugger. I'll use gdb and the core file, and let you know what I find. If I still can't figure out the problem, I will send you a minimum non-working example. I definitely think it's some sort of pointer arithmetic error as well; I just can't see it yet. I really appreciate the help.
Thanks again,
Mark

On Thu, Dec 3, 2020 at 1:30 AM David MacMahon <[email protected]> wrote:

> Hi, Mark,
>
> Sorry to hear you're still getting a segfault. It sounds like you made some progress with gdb, but the fact that you ended up with a different sort of error suggests that you were starting hashpipe in the debugger. To debug your initial segfault problem, you can run hashpipe without the debugger, let it segfault and generate a core file, then use gdb and the core file (and hashpipe) to examine the state of the program when the segfault occurred. The tricky part is getting the core file to be generated on a segfault. You typically have to increase the core file size limit using "ulimit -c unlimited" and (because hashpipe is typically installed with the suid bit set) you have to let the kernel know it's OK to dump core files for suid programs using "sudo sysctl -w fs.suid_dumpable=1" (or maybe 2 if 1 doesn't quite do it). You can read more about these steps with "help ulimit" (ulimit is a bash builtin) and "man 5 proc".
>
> Once you have the core file (typically named "core" but it may have a numeric extension from the PID of the crashing process) you can debug things with "gdb /path/to/hashpipe /path/to/core/file". Note that the core file may be created with permissions that only let root read it, so you might have to "sudo chmod a+r core" or similar to get read access to it. Running gdb this way starts the debugger in a sort of forensic mode, using the core file as a snapshot of the process and its memory space at the time of the segfault. You can use "info threads" to see which threads existed, "thread N" to switch between threads (N is a thread number as shown by "info threads"), "bt" to see the function call backtrace of the current thread, and "frame N" to switch to a specific frame in the function call backtrace.
> Once you zero in on which part of your code was executing when the segfault occurred, you can examine variables to see what exactly caused it. You might find that the "interesting" or "relevant" variables have been optimized away, so you may want/need to recompile with a lower optimization level (e.g. -O1 or maybe even -O0) to prevent that from happening.
>
> Because this happens when you reach the end of your data buffer, I have to think it's a pointer arithmetic error of some sort. If you can't figure out the problem from the core file, please create a "minimum working example" (well, in this case I guess a minimum non-working example), including a dummy packet generator script that creates suitable packets, and I'll see if I can recreate the problem.
>
> HTH,
> Dave
>
> On Nov 30, 2020, at 14:45, Mark Ruzindana <[email protected]> wrote:
>
> > I'm currently using gdb to debug and it either tells me that I have a segmentation fault at the memcpy() in process_packet() or something very strange happens where the starting mcnt of a block greatly exceeds the mcnt corresponding to the packet being processed and there's no segmentation fault because the mcnt distance becomes negative so the memcpy() is skipped. Hopefully that wasn't too hard to track. Very strange problem that only occurs with gdb and not when I run hashpipe without it. Without gdb, I get the same segmentation fault at the end of the circular buffer as mentioned above.
>
> --
> You received this message because you are subscribed to the Google Groups "[email protected]" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
> To view this discussion on the web visit https://groups.google.com/a/lists.berkeley.edu/d/msgid/casper/AC9534AD-390F-44D8-ABFE-8AE76F059957%40berkeley.edu.
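
For reference, the kind of pointer arithmetic error suspected in this thread (a memcpy() that runs past the end of a circular buffer, or an mcnt distance that goes negative) can be guarded against with explicit bounds checks. What follows is only a minimal sketch in C under stated assumptions, not the actual hashpipe or process_packet() code. The constants N_BLOCKS, PKTS_PER_BLOCK and PKT_BYTES and the helpers packet_offset() and copy_packet() are hypothetical names invented for illustration, and the buffer layout (fixed-size packets stored contiguously, one slot per mcnt) is assumed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout, chosen only for illustration. */
#define N_BLOCKS        4   /* blocks in the circular buffer */
#define PKTS_PER_BLOCK  8   /* packets (mcnt values) per block */
#define PKT_BYTES       16  /* payload bytes per packet */
#define BLOCK_BYTES     (PKTS_PER_BLOCK * PKT_BYTES)

/* Byte offset in the buffer for the packet with counter pkt_mcnt, given that
 * block start_block (0 <= start_block < N_BLOCKS) begins at start_mcnt.
 * Returns -1 if the packet is outside the window covered by the buffer. */
static int64_t packet_offset(uint64_t start_mcnt, uint64_t pkt_mcnt,
                             int start_block)
{
    /* Use a signed distance so a late packet shows up as a negative number
     * rather than wrapping around to a huge unsigned value. */
    int64_t dist = (int64_t)(pkt_mcnt - start_mcnt);

    if (dist < 0 || dist >= (int64_t)N_BLOCKS * PKTS_PER_BLOCK) {
        return -1;
    }

    /* Wrap the block index so the last block is followed by block 0. */
    int64_t block = ((int64_t)start_block + dist / PKTS_PER_BLOCK) % N_BLOCKS;
    int64_t slot  = dist % PKTS_PER_BLOCK;

    return block * BLOCK_BYTES + slot * PKT_BYTES;
}

/* Copy one payload into the buffer. Returns 0 on success, or -1 if the
 * packet is out of range or the copy would pass the end of the buffer. */
static int copy_packet(uint8_t *buf, size_t buf_len, uint64_t start_mcnt,
                       int start_block, uint64_t pkt_mcnt,
                       const uint8_t *payload)
{
    int64_t off = packet_offset(start_mcnt, pkt_mcnt, start_block);

    if (off < 0 || (size_t)off + PKT_BYTES > buf_len) {
        return -1;
    }

    memcpy(buf + off, payload, PKT_BYTES);
    return 0;
}
```

With these assumed constants, a packet one block ahead of a window that starts in the last block (start_block = 3) wraps back to offset 0 instead of writing past the end of the buffer, and a packet whose mcnt is behind start_mcnt is rejected rather than silently skipped or copied to a bad address.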

