On Sep 20, 2019, at 2:02 PM, Michael Richardson <m...@sandelman.ca> wrote:

> Guy Harris <ghar...@sonic.net> wrote:
>> Currently, Wireshark's pcapng reading code imposes a block size limit
>> for all blocks:
>
>> enough that * the resulting block size would be less than the previous
>> 16 MiB limit. */ #define MAX_BLOCK_SIZE (MIN_EPB_SIZE +
>> WTAP_MAX_PACKET_SIZE_DBUS + 131072)
>
>> WTAP_MAX_PACKET_SIZE_DBUS is 16 MiB.
>
> So, MIN_EPB_SIZE (28?) + 16MiB + 128KiB.
> I think that this is a fine maximum for quite a number of block types.
> I propose to introduce sane maximums for each block type, on a block type
> basis.

Your recent checkin has an SHB maximum size of 1 MiB. An SHB is 24 bytes of fixed data plus options, so that allows almost 1 MiB of options.

The size of an EPB is 28 + packet data size (padded to a multiple of 4 bytes) plus options, so your Wireshark-derived maximum size for an EPB is pretty much based on a maximum of 128 KiB of options.

Is there a reason to have different maximum-bytes-of-options values for different blocks? If not, I'm OK with a maximum of either 128 KiB or 1 MiB (or other reasonable values) for the maximum number of option bytes.

The maximum size of an option is 4 plus 65536 (maximum option value size, rounded up to a multiple of 4), so 128 KiB is slightly under 2 maximum-sized options. 1 MiB wouldn't be enough to store all of *War and Peace* in a sequence of comment options (storing it in an English translation; storing it in the original Russian would be worse, as that's two bytes per letter in UTF-8), but *The Great Gatsby* would fit. :-)

> I can live with 16GiB as the *maximum* that we will allocate.
> I'd like to put this in the draft: every block should have a *reasonable*
> maximum. I plan to work on a mmap() based reading API,

Note that memory-mapping means that, on a read error, the program will probably die with a signal (UN*X) or exception (Windows).
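
If a reader does go the mmap() route and wants to report the error rather than die, one commonly used trick on UN*X is to catch SIGBUS around accesses to the mapping and jump back out. A rough sketch, assuming a single-threaded reader; `guarded_copy()` and `demo_truncated_mapping()` are made-up names for illustration, not anything from libpcap or Wireshark:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Jump target for the handler; file scope, so not thread-safe. */
static sigjmp_buf fault_env;

static void on_sigbus(int sig)
{
    (void)sig;
    siglongjmp(fault_env, 1);   /* unwind back into guarded_copy() */
}

/*
 * Copy len bytes out of a mapped region.
 * Returns 0 on success, -1 if the copy faulted with SIGBUS
 * (or if the handler could not be installed).
 */
int guarded_copy(void *dst, const void *mapped, size_t len)
{
    struct sigaction sa, old;
    int rc;

    assert(dst != NULL && mapped != NULL);
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigbus;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGBUS, &sa, &old) != 0)
        return -1;

    if (sigsetjmp(fault_env, 1) == 0) {
        memcpy(dst, mapped, len);
        rc = 0;
    } else {
        rc = -1;                /* arrived here via siglongjmp() */
    }
    sigaction(SIGBUS, &old, NULL);  /* put the previous handler back */
    return rc;
}

/*
 * Demo: map one page of a temporary file, truncate the file underneath
 * the mapping, then try to read from it.  Returns whatever
 * guarded_copy() returned, or 1 if the setup itself failed.
 */
int demo_truncated_mapping(void)
{
    char path[] = "/tmp/guardXXXXXX";
    char buf[16];
    long page = sysconf(_SC_PAGESIZE);
    int fd = mkstemp(path);
    int rc = 1;

    if (fd < 0)
        return 1;
    unlink(path);
    if (ftruncate(fd, page) == 0) {
        void *p = mmap(NULL, (size_t)page, PROT_READ, MAP_SHARED, fd, 0);
        if (p != MAP_FAILED) {
            if (ftruncate(fd, 0) == 0)  /* mapped page is now past EOF */
                rc = guarded_copy(buf, p, sizeof buf);
            munmap(p, (size_t)page);
        }
    }
    close(fd);
    return rc;
}
```

Truncating the file is just a convenient way to provoke the fault; an I/O error on a page-in should arrive the same way. This only papers over the problem for code that funnels every access to the mapping through one guarded routine, which is a fair amount of discipline to ask of a reading API.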

Disks are pretty reliable, so you probably won't get many EIOs from the disk (I *did* get them at Sun when some SMD disk was failing, but that was the mid-to-late 1980's). However:

    if the drive is removable, the user unplugging the drive could cause an error;

    if the "drive" is a share mounted from a file server, unless it's an uninterruptible NFS hard mount, either ^Cing a hard mount or getting a timeout on other mounts could cause an error.

At Apple, at least some software only used mmap() for files on a local, non-removable drive. fstatfs() might be able to tell you whether the file is on a local drive (the MNT_LOCAL flag on at least some BSD-flavored OSes; checking the file system type field against known non-local file systems on Linux, although the latter is less robust). I don't remember offhand how you distinguish volumes on removable vs. non-removable media.

> and I that shouldn't
> have a problem with block size on 64-bit systems. But maybe on 32-bit
> systems, it should use mmap() in some more creative way.

Map in a region of the file and, if you need something outside that region, unmap the old region and map the new region.

> I'm not sure here. Are there any good libraries to outsource this problem?

I don't know of any offhand.

> I'd like to do an AIO (libuio)

libuio or libaio:

    https://pagure.io/libaio

? Using POSIX aio_ routines would allow it to work on at least some other UN*Xes as well. I guess the Windows equivalent is $QIOW^Woverlapped I/O.
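
For the map-a-region, unmap, map-the-next-region idea, here is a rough sketch in C using POSIX mmap(). The `map_window` structure and the `window_*` functions are hypothetical names invented for illustration, not an existing library, and the sketch assumes a window at least as large as the biggest block the reader is willing to accept:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

struct map_window {
    int      fd;         /* open capture file */
    off_t    file_size;  /* total size of the file */
    off_t    base;       /* file offset of the mapped region (page-aligned) */
    size_t   len;        /* length of the mapped region, 0 if none */
    size_t   window;     /* region size to map, a multiple of the page size */
    uint8_t *addr;       /* start of the mapped region */
};

/* Open path read-only and set up an empty window.  0 on success, -1 on error. */
int window_open(struct map_window *w, const char *path, size_t window)
{
    struct stat st;
    size_t page = (size_t)sysconf(_SC_PAGESIZE);

    assert(w != NULL && path != NULL);
    w->fd = open(path, O_RDONLY);
    if (w->fd < 0)
        return -1;
    if (fstat(w->fd, &st) != 0) {
        close(w->fd);
        return -1;
    }
    w->file_size = st.st_size;
    w->base = 0;
    w->len = 0;
    w->addr = NULL;
    if (window < page)
        window = page;
    w->window = (window + page - 1) / page * page;  /* round up to pages */
    return 0;
}

/*
 * Return a pointer to len bytes at file offset off, remapping if the
 * request falls outside the currently mapped region.  Returns NULL on
 * error, if the request runs past EOF, or if it cannot fit in one window.
 */
const uint8_t *window_get(struct map_window *w, off_t off, size_t len)
{
    off_t page = (off_t)sysconf(_SC_PAGESIZE);
    off_t new_base;
    size_t lead, map_len;
    void *p;

    if (off < 0 || len == 0 || off >= w->file_size ||
        len > w->window || (off_t)len > w->file_size - off)
        return NULL;

    /* Fast path: the request is inside the current region. */
    if (w->len != 0 && off >= w->base &&
        (size_t)(off - w->base) + len <= w->len)
        return w->addr + (off - w->base);

    new_base = off - off % page;        /* mmap offsets must be page-aligned */
    lead = (size_t)(off - new_base);
    if (lead > w->window - len)
        return NULL;                    /* would straddle the window's end */
    map_len = w->window;
    if ((off_t)map_len > w->file_size - new_base)
        map_len = (size_t)(w->file_size - new_base);

    if (w->len != 0)
        munmap(w->addr, w->len);        /* unmap the old region */
    w->len = 0;
    p = mmap(NULL, map_len, PROT_READ, MAP_PRIVATE, w->fd, new_base);
    if (p == MAP_FAILED)
        return NULL;
    w->addr = p;
    w->base = new_base;
    w->len = map_len;
    return w->addr + lead;
}

void window_close(struct map_window *w)
{
    if (w->len != 0)
        munmap(w->addr, w->len);
    w->len = 0;
    w->addr = NULL;
    close(w->fd);
}
```

This is where a per-block maximum size helps: if the window is at least the maximum block size plus one page, any legal block fits in a single mapping, and a 32-bit reader never needs more than that much address space at a time. On 32-bit systems the code would presumably also need large-file support (e.g. building with `_FILE_OFFSET_BITS=64` on Linux) for `off_t` to cover big captures.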