Thanks for the additional suggestions. I will try those and let you know what happens.
Mark

On Mon, May 25, 2020 at 6:07 PM David MacMahon <dav...@berkeley.edu> wrote:

> A few more suggestions:
>
> 1) Enable core dumps. Usually you have to run "ulimit -c unlimited" and
> for suid executables there's an extra step related to
> /proc/sys/fs/suid_dumpable. See "man 5 core" and "man 5 proc" for
> details. Once you have a core file, you can use gdb to examine the state
> of things when the segfault happened. You might want to recompile your
> plug-in with debugging enabled and fewer optimizations to get the most out
> of this approach: "gdb /path/to/hashpipe /path/to/core". (Gotta love how
> it's still called "core"!) gdb can be a bit cryptic, but it's also very
> powerful.
>
> 2) Another idea, just for diagnostic purposes, is to omit the
> "+ input_databuf_idx(...)" part of the dest_p assignment. That will write
> all payloads to the first part of the data block, so no buffer overflow
> for sure (assuming idx is in range :)). It's just a way to eliminate a
> variable.
>
> 3) Make sure the packet socket blocks are large enough for the packet
> frames. I agree it looks like you're not reading past the end of the
> packet payload size, but maybe the payload itself goes beyond the end of
> the packet socket blocks? The kernel might silently truncate the packets
> in that case.
>
> 4) If you're using tagged VLANs the PKT_UDP_xxx macros won't work right.
> It sounds like that's not happening because you're seeing the expected
> size, but it's worth mentioning for mail archive completeness.
>
> 5) You can use hashpipe_dump_databuf to examine the 159 payloads you were
> able to copy before the segfault to see whether every byte is properly
> positioned and has believable values. You could change memcpy(...) to
> memset(dest_p, 'X', PKT_UDP_SIZE(frame)-16) so you'll know the exact value
> that every byte should have. Instead of 'X' you could use pkt_num+1 (i.e. a
> 1-based packet counter) so you'll know which bytes correspond to which
> packets. Using memset() would also eliminate reading from the packet
> socket blocks (another variable gone).
>
> Happy hunting,
> Dave
>
> On May 25, 2020, at 16:33, Mark Ruzindana <ruziem...@gmail.com> wrote:
>
> Thanks for the suggestions. I neglected to mention that I'm printing out
> PKT_UDP_SIZE() and PKT_UDP_DST() right before the memcpy(). I take into
> account the 8 byte UDP header, and the size and port are correct. When
> performing the memcpy(), I am taking into account that PKT_UDP_DATA()
> returns a pointer to the payload and excludes the UDP header. However, I
> also have an 8 byte packet header within that payload (this gives me the
> mcnt, f-engine, and x-engine indices) and I exclude it when performing the
> memcpy(). This is what it looks like:
>
> // This macro index shifts every mcnt and f-engine index
> uint8_t * dest_p = db->block[idx].data + input_databuf_idx(m, f, 0,0,0);
> // Ignore packet header
> const uint8_t * payload = (uint8_t *)(PKT_UDP_DATA(frame)+8);
>
> fprintf(...); // prints PKT_UDP_SIZE() and PKT_UDP_DST()
> // Ignore both UDP (8 bytes) and packet header (8 bytes)
> memcpy(dest_p, payload, PKT_UDP_SIZE(frame) - 16);
>
> I will look into the other possible issues that you suggested, but as far
> as I can tell, it doesn't seem like there should be a segfault given what
> I'm doing before that memcpy(). I will let you know what else I find.
>
> Thanks again, I really appreciate the help.
>
> Mark
>
> On Mon, May 25, 2020 at 4:30 PM David MacMahon <dav...@berkeley.edu>
> wrote:
>
>> Hi, Mark,
>>
>> Sounds like progress!
>>
>> On May 25, 2020, at 13:56, Mark Ruzindana <ruziem...@gmail.com> wrote:
>>
>> I have been able to capture data with the first round of frames of the
>> circular buffer, i.e. if I have 160 frames, I am able to capture packets
>> of frames 0 to 159, at which point, right at the memcpy() in the
>> process_packet() function of the net thread, I get a segmentation fault.
>>
>>
>> The fact that you get the segfault right at the memcpy of the final
>> frame of the ring buffer suggests that there is a problem with the
>> parameters passed to memcpy. Most likely src+length-1 exceeds the end of
>> the frame so you get a segfault when memcpy tries to read from beyond the
>> allocated memory. This would explain why it segfaults on the final frame
>> and not the previous frames because reading beyond a previous frame still
>> reads from "legal" (though incorrect) memory locations. It's also
>> possible that the segfault happens due to a bad address on the
>> destination side of the memcpy(), but unless the destination buffer is
>> also 160 frames in size that seems less likely.
>>
>> The release_frame function is not likely to be a culprit here unless the
>> pointer you are passing it differs from the pointer that the pktsock_recv
>> function returned.
>>
>> For debugging, I suggest logging dst, src, len before calling memcpy.
>> Normally you wouldn't generate a log message for every packet because that
>> would ruin your throughput, but since you know it's going to crash after
>> the first 160 packets there's not much throughput to ruin. :)
>>
>> One thing to remember is that PKT_UDP_DATA() evaluates to a pointer to
>> the UDP payload of the packet, but PKT_UDP_SIZE() evaluates to the total
>> UDP size (i.e. 8 bytes for the UDP header plus the length of the UDP
>> payload). Passing PKT_UDP_SIZE() as "len" to memcpy without subtracting 8
>> for the header bytes is not correct and could potentially cause this
>> problem.
>>
>> HTH,
>> Dave
>>
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "casper@lists.berkeley.edu" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to casper+unsubscr...@lists.berkeley.edu.
>> To view this discussion on the web visit
>> https://groups.google.com/a/lists.berkeley.edu/d/msgid/casper/297C1709-AE9C-488D-9110-FD0832BF5951%40berkeley.edu
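
A minimal sketch of the diagnostics suggested in this thread, written as plain C with no hashpipe dependency: derive the copy length from the total UDP size, log dst, src, and len, and refuse the copy if it would run past the end of the destination block. The helper name checked_payload_copy, its parameters, and the two header-size constants are hypothetical and are not part of hashpipe; the 8-byte sizes are the ones quoted in the messages above. Setting fill_only writes a 1-based packet counter with memset() instead of reading from the packet, as in suggestion 5.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumed sizes, taken from the thread: the total UDP size includes the
 * 8-byte UDP header, and the payload starts with an 8-byte packet header. */
#define UDP_HDR_BYTES 8
#define APP_HDR_BYTES 8

/* Hypothetical helper (not a hashpipe function).
 * block, block_size: destination data block and its size in bytes
 * offset:            byte offset into the block for this packet
 * udp_payload:       pointer to the UDP payload (after the UDP header)
 * udp_size:          total UDP size, i.e. UDP header plus payload
 * pkt_num:           0-based packet counter, used for logging and fill
 * fill_only:         nonzero to memset() a marker instead of memcpy()
 * Returns 0 on success, -1 if the copy would be out of range. */
static int checked_payload_copy(uint8_t *block, size_t block_size,
                                size_t offset,
                                const uint8_t *udp_payload, size_t udp_size,
                                unsigned pkt_num, int fill_only)
{
    if (udp_size < UDP_HDR_BYTES + APP_HDR_BYTES) {
        fprintf(stderr, "pkt %u: udp_size %zu too small\n", pkt_num, udp_size);
        return -1;
    }
    size_t len = udp_size - UDP_HDR_BYTES - APP_HDR_BYTES;
    const uint8_t *src = udp_payload + APP_HDR_BYTES;

    /* Per-packet logging is only sensible while debugging. */
    fprintf(stderr, "pkt %u: dst=%p src=%p len=%zu offset=%zu block_size=%zu\n",
            pkt_num, (void *)(block + offset), (const void *)src,
            len, offset, block_size);

    /* Written to avoid overflow in offset + len. */
    if (offset > block_size || len > block_size - offset) {
        fprintf(stderr, "pkt %u: copy would overrun destination\n", pkt_num);
        return -1;
    }

    if (fill_only) {
        /* 1-based packet counter, truncated to one byte. */
        memset(block + offset, (int)((pkt_num + 1) & 0xff), len);
    } else {
        memcpy(block + offset, src, len);
    }
    return 0;
}
```

In the net thread this could presumably be called with db->block[idx].data, the size of that data block, input_databuf_idx(m, f, 0,0,0), PKT_UDP_DATA(frame), and PKT_UDP_SIZE(frame), assuming those expressions have the meanings described above. It only checks the destination side; it does not verify that the source lies within the packet socket block.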