On Sep 20, 2019, at 2:02 PM, Michael Richardson <m...@sandelman.ca> wrote:

> Guy Harris <ghar...@sonic.net> wrote:
>> Currently, Wireshark's pcapng reading code imposes a block size limit
>> for all blocks:
> 
>>  * enough that
>>  * the resulting block size would be less than the previous 16 MiB limit.
>>  */
>> #define MAX_BLOCK_SIZE (MIN_EPB_SIZE + WTAP_MAX_PACKET_SIZE_DBUS + 131072)
> 
>> WTAP_MAX_PACKET_SIZE_DBUS is 16 MiB.
> 
> So, MIN_EPB_SIZE (28?) + 16MiB + 128KiB.
> I think that this is a fine maximum for quite a number of block types.
> I propose to introduce sane maximums for each block type, on a block type 
> basis.

Your recent checkin has an SHB maximum size of 1 MiB.

An SHB is 24 bytes of fixed data plus options, so that allows almost 1 MiB of 
options.

The size of an EPB is 28 + packet data size (padded to a multiple of 4 bytes) 
plus options, so your Wireshark-derived maximum size for an EPB is pretty much 
based on a maximum 128 KiB of options.

Is there a reason to have different maximum-bytes-of-options values for 
different blocks?  If not, I'm OK with a maximum of either 128 KiB or 1 MiB (or 
other reasonable values) for the maximum number of option bytes.  The maximum 
size of an option is 4 plus 65536 (maximum option value size, rounded up to a 
multiple of 4), so 128 KiB is slightly under 2 maximum-sized options.  1 MiB 
wouldn't be enough to store all of *War and Peace* in a sequence of comment 
options (storing it in an English translation; storing it in the initial 
Russian would be worse, as that's two bytes per letter in UTF-8), but *The 
Great Gatsby* would fit. :-)
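
To make the arithmetic above concrete, here is a minimal sketch in C.  The 
function names are made up for illustration and come from neither libpcap nor 
Wireshark; the only assumptions are the ones stated above: a 4-byte option 
header, a 16-bit option length, and values padded to a multiple of 4 bytes.

```c
#include <stdint.h>

/* Round n up to the next multiple of 4 (option values are padded this way). */
static uint32_t pad4(uint32_t n)
{
    return (n + 3u) & ~UINT32_C(3);
}

/* Bytes one option occupies: 2-byte code + 2-byte length + padded value. */
uint32_t option_size(uint16_t value_len)
{
    return 4u + pad4(value_len);
}

/* How many maximum-sized options fit in a given budget of option bytes. */
uint32_t max_sized_options_in(uint32_t budget)
{
    return budget / option_size(UINT16_MAX);
}
```

With these definitions, option_size(65535) is 65540, so a 128 KiB budget holds 
one maximum-sized option (two would need 131080 bytes, 8 more than 131072), 
and a 1 MiB budget holds 15.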

> I can live with 16GiB as the *maximum* that we will allocate.
> I'd like to put this in the draft: every block should have a *reasonable*
> maximum.  I plan to work on a mmap() based reading API,

Note that memory-mapping means that, on a read error, the program will probably 
die with a signal (UN*X) or exception (Windows).  Disks are pretty reliable, so 
you probably won't get many EIOs from the disk (I *did* get them at Sun when 
some SMD disk was failing, but that was the mid-to-late 1980s).  However:

        if the drive is removable, the user unplugging the drive could cause
        an error;

        if the "drive" is a share mounted from a file server, unless it's an
        uninterruptible NFS hard mount, either ^Cing a hard mount or getting
        a timeout on other mounts could cause an error.

At Apple, at least some software only used mmap() for files on a local, 
non-removable drive.  fstatfs() might be able to tell you whether the file is 
on a local drive (the MNT_LOCAL flag on at least some BSD-flavored OSes; 
checking the file system type field against known non-local file systems on 
Linux, although the latter is less robust).  I don't remember offhand how you 
distinguish volumes on removable vs. non-removable media.
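
A rough sketch of such a check, assuming fstatfs() is available.  The function 
names are hypothetical.  Where MNT_LOCAL is defined, it assumes struct statfs 
has an f_flags member, as on macOS and FreeBSD.  On Linux it compares f_type 
against a few magic numbers from the statfs(2) man page; that list is 
deliberately incomplete (FUSE-based network file systems are not detected, for 
one), which is why that branch is the less robust one.

```c
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>
#if defined(__linux__)
#include <sys/vfs.h>    /* fstatfs() on Linux */
#else
#include <sys/param.h>
#include <sys/mount.h>  /* fstatfs() and MNT_LOCAL on BSD-flavored OSes */
#endif

/* Returns 1 if fd looks local, 0 if it looks remote, -1 if fstatfs() fails. */
int fd_is_on_local_fs(int fd)
{
    struct statfs sfs;

    if (fstatfs(fd, &sfs) != 0)
        return -1;
#if defined(MNT_LOCAL)
    /* Assumes an f_flags member, as on macOS and FreeBSD. */
    return (sfs.f_flags & MNT_LOCAL) ? 1 : 0;
#else
    /* Known non-local file system types; anything else is assumed local. */
    switch ((uint32_t)sfs.f_type) {
    case 0x6969u:       /* NFS */
    case 0x517Bu:       /* SMB */
    case 0xFF534D42u:   /* CIFS */
        return 0;
    default:
        return 1;
    }
#endif
}

/* Convenience wrapper: same result codes, but takes a pathname. */
int path_is_on_local_fs(const char *path)
{
    int fd = open(path, O_RDONLY);
    int result;

    if (fd < 0)
        return -1;
    result = fd_is_on_local_fs(fd);
    close(fd);
    return result;
}
```

A reader would call this before deciding to mmap() and fall back to read() 
when the result is not 1.  It says nothing about removable media.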

> and that shouldn't
> have a problem with block size on 64-bit systems.  But maybe on 32-bit
> systems, it should use mmap() in some more creative way.

Map in a region of the file and, if you need something outside that region, 
unmap the old region and map the new region.
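
A minimal sketch of that sliding-window approach, assuming POSIX mmap().  The 
struct and function names are hypothetical, not an existing libpcap API, and 
the read-error caveat above still applies: an I/O error on a mapped page 
arrives as a signal, not as a return value.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper type: one read-only window onto a file. */
struct file_window {
    int    fd;
    off_t  file_size;
    size_t window;  /* preferred mapping size, in bytes */
    void  *base;    /* current mapping, or NULL if none */
    off_t  start;   /* file offset of base (page-aligned) */
    size_t len;     /* length of the current mapping */
};

int window_open(struct file_window *w, const char *path, size_t window)
{
    struct stat st;

    w->base = NULL;
    w->start = 0;
    w->len = 0;
    w->window = window;
    w->fd = open(path, O_RDONLY);
    if (w->fd < 0)
        return -1;
    if (fstat(w->fd, &st) != 0) {
        close(w->fd);
        w->fd = -1;
        return -1;
    }
    w->file_size = st.st_size;
    return 0;
}

/* Return a pointer to `need` bytes at file offset `off`, or NULL on failure.
 * The pointer is only valid until the next call, which may remap. */
const uint8_t *window_get(struct file_window *w, off_t off, size_t need)
{
    off_t page = (off_t)sysconf(_SC_PAGESIZE);
    off_t start;
    size_t len;
    void *p;

    if (page <= 0 || off < 0 || need == 0 || off > w->file_size ||
        (uintmax_t)need > (uintmax_t)(w->file_size - off) ||
        need > SIZE_MAX - (size_t)page)
        return NULL;

    /* Fast path: the request lies inside the current mapping. */
    if (w->base != NULL && off >= w->start &&
        (uintmax_t)(off - w->start) + need <= w->len)
        return (const uint8_t *)w->base + (off - w->start);

    /* Otherwise unmap the old region and map a new one. */
    if (w->base != NULL) {
        munmap(w->base, w->len);
        w->base = NULL;
    }
    start = off - (off % page);             /* mmap offsets must be page-aligned */
    len = (size_t)(off - start) + need;
    if (len < w->window)
        len = w->window;
    if ((uintmax_t)len > (uintmax_t)(w->file_size - start))
        len = (size_t)(w->file_size - start);   /* don't map past end of file */
    p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, w->fd, start);
    if (p == MAP_FAILED)
        return NULL;
    w->base = p;
    w->start = start;
    w->len = len;
    return (const uint8_t *)p + (off - start);
}

void window_close(struct file_window *w)
{
    if (w->base != NULL)
        munmap(w->base, w->len);
    if (w->fd >= 0)
        close(w->fd);
    w->base = NULL;
    w->fd = -1;
}
```

On a 32-bit system the window size would be chosen to leave enough address 
space for everything else; a block larger than the window still gets mapped 
whole, because the mapping is grown to cover the requested range.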

> I'm not sure here. Are there any good libraries to outsource this problem?

I don't know of any offhand.

> I'd like to do an AIO (libuio)

libuio or libaio:

        https://pagure.io/libaio

?

Using POSIX aio_ routines would allow it to work on at least some other UN*Xes 
as well.  I guess the Windows equivalent is $QIOW^Woverlapped I/O.
_______________________________________________
tcpdump-workers mailing list
tcpdump-workers@lists.tcpdump.org
https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
