Hi Darren.

> -----Original Message-----
> From: Darren Reed [mailto:[EMAIL PROTECTED]
> Sent: Monday, 9 August 2004, 10:57
> To: Fulvio Risso
> Cc: [EMAIL PROTECTED]
> Subject: Re: [tcpdump-workers] advice for heavy traffic capturing
>
>
> >   http://netgroup.polito.it/fulvio.risso/pubs/iscc01-wpcap.pdf
>
> When was it published?  There is no date...

Fulvio Risso, Loris Degioanni, An Architecture for High Performance Network
Analysis, Proceedings of the 6th IEEE Symposium on Computers and
Communications (ISCC 2001), pg. 686-693, Hammamet, Tunisia, July 2001.


> Winpcap appears, by design, to be the same as BPF.  If you reduced the
> number of buffers in the ring used with NPF to 2 buffers, I suspect it
> would be the same as BPF ?

No, these are two different architectural choices.
The ring is not divided into fixed-size buffers; it simply holds packets
back to back, so each packet occupies exactly its own size.
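
Just to give the idea, here is a minimal sketch of such a ring (invented
names, not the actual NPF code): each record consumes exactly a small
header plus the captured bytes, instead of a fixed-size slot.

    /* Hypothetical sketch of an NPF-style ring: records are packed back
     * to back, so each packet occupies exactly header + caplen bytes.
     * Names (ring_t, ring_put) are illustrative, not WinPcap's API. */
    #include <stddef.h>
    #include <stdint.h>

    struct rec_hdr {
        uint32_t caplen;            /* bytes of packet data that follow */
    };

    typedef struct {
        uint8_t *base;              /* start of ring storage */
        size_t   size;              /* total ring size in bytes */
        size_t   head;              /* producer offset */
        size_t   tail;              /* consumer offset */
    } ring_t;

    /* Append one packet; returns 0 on success, -1 if the ring is full. */
    static int ring_put(ring_t *r, const uint8_t *pkt, uint32_t caplen)
    {
        size_t need = sizeof(struct rec_hdr) + caplen;
        size_t used = (r->head - r->tail + r->size) % r->size;

        if (r->size - used <= need)
            return -1;              /* consumer too slow: drop packet */

        struct rec_hdr h = { caplen };
        /* Copy header, then payload, wrapping at the end of the ring. */
        for (size_t i = 0; i < sizeof h; i++)
            r->base[(r->head + i) % r->size] = ((const uint8_t *)&h)[i];
        for (size_t i = 0; i < caplen; i++)
            r->base[(r->head + sizeof h + i) % r->size] = pkt[i];

        r->head = (r->head + need) % r->size;
        return 0;
    }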


> And because there is no date, I can say that references to the buffer
> size being 32Kbytes in recent BSD kernels is wrong.  Recent BSD kernels

In 2001 the buffer was 32KB.


> use 1MB or 2MB buffers, by default.  Although it then contradicts itself
> later by saying there are larger buffers but that pcap tunes it down to
> 32K....(page 2 vs page 3.)

No, it does not contradict itself.
At that time, there was a sysctl option that allowed the buffer size to be
increased from the command line.
However, the libpcap code included a call that reset the buffer size to
32KB.
So, even if the user managed to get a bigger buffer through the command
line, it was impossible to use it without modifying the libpcap code.

This was true in 2001; I don't know whether it still is.
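
For reference, the BSD mechanics look like this (a sketch, not libpcap
code; the sysctl name varies across BSDs, e.g. debug.bpf_maxbufsize on
FreeBSD, and "em0" is just an example interface): the buffer size must be
requested with BIOCSBLEN before the descriptor is bound to an interface,
so if the library hardwires the size, raising the kernel limit alone does
not help.

    /* Sketch: asking for a larger BPF buffer before binding the device. */
    #include <sys/types.h>
    #include <sys/ioctl.h>
    #include <net/if.h>
    #include <net/bpf.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        int fd = open("/dev/bpf0", O_RDWR);
        if (fd < 0) { perror("open"); return 1; }

        u_int buflen = 1 << 20;                 /* request 1MB */
        if (ioctl(fd, BIOCSBLEN, &buflen) < 0)  /* kernel may clamp it */
            perror("BIOCSBLEN");

        struct ifreq ifr;
        memset(&ifr, 0, sizeof ifr);
        strncpy(ifr.ifr_name, "em0", sizeof ifr.ifr_name - 1);
        if (ioctl(fd, BIOCSETIF, &ifr) < 0) { perror("BIOCSETIF"); return 1; }

        if (ioctl(fd, BIOCGBLEN, &buflen) == 0) /* what we actually got */
            printf("BPF buffer size: %u bytes\n", buflen);
        return 0;
    }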


>
> > Hardware counts, but... we have been really careful to optimize
> the whole
> > path from the NIC card to the application.
> > See another article on this topic (it covers only Win32):
> >
> >    L. Degioanni, M. Baldi, F. Risso, G. Varenni
> >    Profiling and Optimization of Software-based Network Analysis
> > Applications
> >    http://netgroup.polito.it/fulvio.risso/pubs/sbac03-winpcap.pdf
>
> No date on the paper, here, either.

Gianluca Varenni, Mario Baldi, Loris Degioanni, Fulvio Risso, Optimizing
Packet Capture on Symmetric Multiprocessing Machines, Proceedings of the
15th Symposium on Computer Architecture and High Performance Computing
(SBAC-PAD03), pg. 108-115, Sao Paulo, Brazil, November 2003.



> > Particularly, Figure 9 shows how much work has been done to reduce the
> > processing overhead.
>
> Interestingly, there are a few large areas for improvement: timestamp
> (1800 -> 270), Tap processing (830->560) and filtering (585 -> 109).

... and the NIC driver and operating system overheads which, as you can
see, account for roughly 50% of the total.
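
To give an idea of the kind of optimization involved in the timestamp
figure: a timestamp can be taken by reading the CPU cycle counter (one
instruction) instead of calling the OS time routine on every packet. A
rough x86 sketch (an illustration only, not the actual driver code;
cpu_mhz is a made-up calibration constant that a real driver would
measure at load time):

    /* Cheap per-packet timestamps: read the CPU time-stamp counter
     * instead of calling into the OS for each packet. */
    #include <stdint.h>

    static inline uint64_t rdtsc(void)
    {
        uint32_t lo, hi;
        __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
        return ((uint64_t)hi << 32) | lo;
    }

    /* Convert a TSC delta to microseconds, given the CPU clock in MHz. */
    static inline uint64_t tsc_to_usec(uint64_t delta, uint64_t cpu_mhz)
    {
        return delta / cpu_mhz;
    }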


> > And yes, NIC drivers and OS overheads are very important... but
> these are
> > the components that cannot be changed by normal users.
>
> I think that's what you're seeing with the 3Com GigE NIC for 100BT
> receiving.  Do you know what size the buffers on the card are ?

A few KB; less than 10KB, if I remember correctly.


> The Intel 100 ProS have 128K for receive, as I recall, the same as
> the 1000MX card.  There wasn't much between these two, that I was able
> to observe, except that the 100ProS was slightly better.

The amount of memory on the NIC is not very significant.
I cannot give you numbers right now, but it is not the parameter that
makes the difference.


> My biggest problem here is that you've expended effort to tune and make
> NPF fast (which is fine) and compare it with existing BPF, almost to say
> that BPF is bad.  I suppose this is what researchers do, but I think it
> is unfair on BPF.  IMHO, you should have tested with the same buffer size
> for both, even if it meant hacking on libpcap.

No ;-)
The 2001 paper compared Win32 and BSD with the same buffer size: we
modified the libpcap code to use a different buffer size, and then we ran
the tests.

I would add some points to this discussion, quoting from the conclusions of
the second paper (which, incidentally, focuses entirely on NPF without even
mentioning BPF):

=========================================
A valuable result of this study is the quantitative conclusion that,
contrary to common belief, filtering and buffering are not the most critical
factors in determining packet capture performance. Optimization of these two
components, which have received most of the attention so far, is shown to
bring little improvement to the overall packet capture cost, particularly in
the case of short packets (or when small snapshot lengths are needed). The
profiling done on a real system shows that the most important bottlenecks
lie in hidden places, like the device driver, the interaction between
application and OS, and the interaction between OS and hardware.
=========================================

We didn't want to say "NPF is good, BPF is bad".
What we said is: be careful, accurate tuning of the code is more important
than exotic techniques such as improving filtering (e.g., with a
just-in-time compiler) and so on.

And, I would like to say, you need a global picture of where the
bottlenecks are before doing optimizations.
For instance, we're now working to decrease the ~50% of the time each
packet spends in the operating system.


> In the NetBSD emails, I think I ponder making changes to the buffering
> so that it is more ring-buffer like (similar to what exists within NPF
> if I understand the diagram right.)

Eh, what you're saying is good, but... the double buffering in BPF has an
advantage: it is much simpler, and if you're not worried about memory
occupancy, it is a very good choice.
We didn't realize that in 2001; now we see the choice between a double
buffer and a ring buffer as much less black and white...
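
For contrast with the ring sketch above, BPF-style double buffering is
roughly this simple (a simplified illustration; the real code also handles
timeouts, wakeups and locking):

    /* BPF-style double buffering: the tap fills the "store" buffer; when
     * it is full it is swapped with the "hold" buffer, which the reader
     * drains in one read(). Both buffers have the same fixed size. */
    #include <stddef.h>
    #include <stdint.h>

    struct bpf_d_sketch {
        uint8_t *store;     /* being filled by the capture path */
        uint8_t *hold;      /* being read by the application */
        size_t   bufsize;   /* fixed size of each buffer */
        size_t   slen;      /* bytes used in store */
        size_t   hlen;      /* bytes used in hold (0 = hold is free) */
    };

    /* Called when the store buffer may not fit `need` more bytes. */
    static int rotate_buffers(struct bpf_d_sketch *d, size_t need)
    {
        if (d->slen + need <= d->bufsize)
            return 0;                   /* still room, nothing to do */
        if (d->hlen != 0)
            return -1;                  /* reader too slow: drop packet */

        uint8_t *t = d->hold;           /* swap the two buffers */
        d->hold  = d->store;
        d->hlen  = d->slen;
        d->store = t;
        d->slen  = 0;
        return 0;
    }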


> Is the JIT code easily ported to other platforms ?

Yes, as long as the platform is Intel ;-)
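
The reason is simple: the JIT emits raw IA-32 machine code, so porting it
means writing a new code generator for each CPU. Here is the bare
mechanism on a POSIX system (a toy sketch, not the WinPcap kernel code;
the emitted function just returns a constant, whereas a real BPF JIT
emits one such sequence per filter instruction):

    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>

    int main(void)
    {
        /* x86: mov eax, 42 ; ret -- "accept 42 bytes of this packet" */
        static const unsigned char code[] = {
            0xb8, 0x2a, 0x00, 0x00, 0x00,   /* mov eax, 0x2a */
            0xc3                            /* ret */
        };

        /* Note: some hardened systems forbid writable+executable maps;
         * a real JIT maps writable, then reprotects to executable. */
        void *mem = mmap(NULL, sizeof code,
                         PROT_READ | PROT_WRITE | PROT_EXEC,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (mem == MAP_FAILED) { perror("mmap"); return 1; }

        memcpy(mem, code, sizeof code);

        int (*filter)(void) = (int (*)(void))mem;
        printf("JIT-compiled filter returned %d\n", filter());

        munmap(mem, sizeof code);
        return 0;
    }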

        fulvio
