On Sun, 2024-10-20 at 10:27 -0700, Guy Harris wrote:
> On Oct 20, 2024, at 2:57 AM, Garri Djavadyan <g.djavad...@gmail.com> wrote:
> 
> > > > I have to use a very big buffer with a very slow storage, much
> > > > slower than the rate of incoming packets received by the
> > > > filter, and it is preferred not to lose a single packet after
> > > > initiating termination of the process.
> > > 
> > > What do you mean by "with a very slow storage"? You can set the
> > > size with -B, but that just tells the capture mechanism in the
> > > kernel how big a buffer to allocate. It's not as if it tells it
> > > to be stored in some slower form of memory.
> > 
> > Let me show an example. To demonstrate the issue, I am generating
> > a 2MB/s stream of dummy packets:
> > 
> > [src]# pv -L 2M /dev/zero | dd bs=1472 > /dev/udp/192.168.0.1/12345
> > 
> > and dumping them to storage with a cgroup-v2-restricted write
> > speed of 1MB/s:
> > 
> > [dst]# lsblk /dev/loop0
> > NAME  MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
> > loop0   7:0    0 3.9G  0 loop /mnt/test
> > 
> > [dst]# cat /sys/fs/cgroup/test/io.max
> > 7:0 rbps=max wbps=1024000 riops=max wiops=max
> > 
> > To temporarily avoid kernel-level drops,
> 
> Emphasis on *temporarily* - 2MB/s worth of packet data can only be
> saved in its entirety if you have 2MB/s or greater write speed.

That is right. However, it also depends on how long one needs to
mediate mismatched rates using a large input buffer. For example, with
a 2GB input buffer and a 1MB/s rate difference, one could safely keep
filling the buffer for more than half an hour (2GB at 1MB/s is roughly
2000 seconds, or about 33 minutes). Safe buffer draining would help a
lot in such situations.

> > it is clearly seen that the input buffer is being filled at a
> > 1MB/s rate (the difference between the generated traffic rate
> > (2MB/s) and the writing speed of the storage (1MB/s)):
> > 
> > tcpdump: 0 packets captured, 0 packets received by filter,
> > 0 packets dropped by kernel
> > tcpdump: 218 packets captured, 715 packets received by filter,
> > 0 packets dropped by kernel
> 
> On all platforms, "packets captured" means "packets read from
> libpcap and written to the capture file".
> 
> On Linux, "packets received by filter" means "packets that passed
> the filter" (rather than "packets that were run through the filter,
> whether or not they passed the filter", which is what it means on
> *BSD/macOS/Solaris 11/AIX; unfortunately, you can't get the latter
> value from Linux and can't get the former value from BSD, so that
> value *can't* be made to mean the same thing on all platforms). It
> includes packets that passed the filter but could not be added to
> the buffer because the buffer was full.
> 
> On Linux, "packets dropped by kernel" means "packets that passed the
> filter but could not be added to the buffer because the buffer was
> full".
> 
> (The pcap_stats man page has an entire paragraph devoted to pointing
> out that the meaning of the statistics differs between platforms.)
> 
> I.e., when tcpdump exits, the difference, on Linux, between "packets
> received by filter" and "packets captured" is, indeed, "packets
> dropped because tcpdump exited without draining the packet buffer".
> (On *BSD/macOS/Solaris 11/AIX, the latter value cannot be
> determined, as per the above.)
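Just to connect that back to the API: if I read pcap_stats(3PCAP)
correctly, those two tcpdump counters come from the ps_recv and
ps_drop members of struct pcap_stat. A minimal sketch of reading them
(the function name and error handling are mine, not tcpdump's code):

    #include <stdio.h>
    #include <pcap.h>

    /* Print, for an open capture handle, the counters tcpdump
     * reports on Linux: ps_recv = packets that passed the filter
     * (even if later dropped for lack of buffer space), ps_drop =
     * packets that passed the filter but could not be added to the
     * buffer because it was full. */
    static int print_capture_stats(pcap_t *handle)
    {
        struct pcap_stat st;

        if (pcap_stats(handle, &st) == -1) {
            fprintf(stderr, "pcap_stats: %s\n", pcap_geterr(handle));
            return -1;
        }
        printf("%u packets received by filter, %u dropped by kernel\n",
               st.ps_recv, st.ps_drop);
        return 0;
    }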
> > > > There are a few options to overcome the problem. For example,
> > > > by dumping packets to the memory storage first (e.g. /dev/shm)
> > > 
> > > Presumably meaning you specified "-w /dev/shm" or something such
> > > as that?
> > > 
> > > If so, how does that make a difference?
> > 
> > I mean I can first dump packets to the lightning-fast RAM storage
> > and, after being done with the capturing part, copy the dump to
> > the slow storage.
> 
> I.e., it means that, when you signal tcpdump to exit, it's not as
> far behind the capture mechanism with regards to writing to the
> capture file, because it's stalling less waiting for write() calls
> to finish (if the write rate limitation you mention limits the rate
> at which write() calls can push data to the file descriptor), so the
> "packets captured" count is larger.

Exactly.

> > I see. Thank you so much for the explanation.
> > 
> > Do you think this case can justify feature requests both for
> > libpcap and tcpdump on github?
> 
> Yes, as it means that tcpdump (and, potentially, other programs such
> as Wireshark) can write out *all* packets received before being told
> to stop capturing.
> 
> The implementations for various platforms would probably have to 1)
> set a "drop all packets" filter on the capture device, 2) possibly
> put the capture device in non-blocking mode (as there's no point in
> blocking, as no more packets will be seen), and 3) cause the packet
> processing loop in libpcap to quit as soon as it finds that there
> are no more packets available to read.
> 
> For programs using pcap_loop(), that should be transparent; for
> programs using pcap_dispatch(), they would have to treat a return
> value of 0, if they've put the capture device in "draining mode", as
> meaning "done" rather than "the packet buffer timeout expired and no
> packets were provided, keep capturing".
> 
> tcpdump uses pcap_loop(), so it'd only have to be changed to use the
> new "stop capturing" API.
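To make sure I understand the proposed mechanics, here is a rough
sketch of how an application might approximate that draining today
with existing libpcap calls. The function name is mine, and this only
illustrates steps 1)-3) above, not the future API; the
platform-specific buffering details would be the real
implementation's job:

    #include <pcap.h>

    /* Sketch of "draining mode":
     * 1) install a reject-all filter (a single BPF "return 0"
     *    instruction accepts no packets), so nothing new is queued;
     * 2) switch to non-blocking mode, as nothing new will arrive;
     * 3) dispatch until the already-buffered packets run out. */
    static struct bpf_insn reject_insn = BPF_STMT(BPF_RET | BPF_K, 0);
    static struct bpf_program reject_all = { 1, &reject_insn };

    static int drain_capture(pcap_t *h, pcap_handler cb, u_char *user)
    {
        char errbuf[PCAP_ERRBUF_SIZE];
        int n;

        if (pcap_setfilter(h, &reject_all) == -1)
            return -1;
        if (pcap_setnonblock(h, 1, errbuf) == -1)
            return -1;

        /* In non-blocking mode, pcap_dispatch() returns 0 when no
         * packets are available; with the reject-all filter in
         * place, that now means "buffer drained" rather than "keep
         * waiting". */
        while ((n = pcap_dispatch(h, -1, cb, user)) > 0)
            continue;
        return n;   /* 0 = drained, negative = error */
    }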
Thank you for sharing your thoughts on this. It is good to know that
it is feasible to implement. I will open a feature request for
libpcap for now.

Guy, thank you so much for all your comments. It is much appreciated.

Regards,
Garri