RE: [Ntop] Totals for one ETH???

Burton M. Strauss III Tue, 23 Apr 2002 14:07:45 -0700

Short Answer: Faster processor - we did talk about a Pentium 166MMX being
insufficient, remember :-(


Long Answer:  There are two places packets drop "in" ntop.

1. Dropped in the Kernel
========================

First off, this is NOT controlled by ntop.  It's inside the network card
and, perhaps, libpcap.  All ntop does is use pcap_stats:

int pcap_stats() returns 0 and fills in a pcap_stat struct. The values
represent packet statistics from the start of the run to the time of the
call. If there is an error or the underlying packet capture doesn't support
packet statistics, -1 is returned and the error text can be obtained with
pcap_perror() or pcap_geterr().

to get the value, which it then reports through this code in report.c:

      droppedByKernel=0;

      for(i=0; i<myGlobals.numDevices; i++)
        if(myGlobals.device[i].pcapPtr
           && (!myGlobals.device[i].virtualDevice)) {
          if(pcap_stats(myGlobals.device[i].pcapPtr, &pcapStats) >= 0) {
            droppedByKernel += pcapStats.ps_drop;
          }
        }
...
      if(snprintf(buf, sizeof(buf),
                  "<TR %s><TH "TH_BG" align=left>Dropped&nbsp;by&nbsp;the&nbsp;kernel</th>"
                  "<TD "TD_BG" COLSPAN=2 align=right>%s</td></TR>\n",
                  getRowColor(), formatPkts(droppedByKernel)) < 0)
        traceEvent(TRACE_ERROR, "Buffer overflow!");
      sendString(buf);

While there are LOTS of possible causes, if the kernel is *routinely*
dropping packets, it's almost certainly an interrupt/processor
speed/buffering issue.
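
Since the pcap_stats counters are cumulative from the start of the run, one
way to tell "routinely" from "occasionally" is to sample ps_drop a few times
and see whether it keeps moving.  A minimal sketch, not ntop code: the
function name drops_are_sustained and the idea of keeping the samples in a
plain array are my assumptions, and a real program would fill the array from
successive pcap_stats() calls.

```c
#include <stddef.h>

/* Hypothetical helper (not part of ntop or libpcap).
 * samples[] holds successive readings of the cumulative ps_drop counter.
 * Returns 1 if the counter changed in every interval between consecutive
 * samples, 0 otherwise (including when there are fewer than two samples). */
int drops_are_sustained(const unsigned int *samples, size_t n)
{
    size_t i;

    if (n < 2)
        return 0;                      /* nothing to compare yet */

    for (i = 1; i < n; i++) {
        if (samples[i] == samples[i - 1])
            return 0;                  /* a quiet interval: not continuous */
    }
    return 1;
}
```

A series like 0, 5, 5, 5 is a one-off burst and returns 0; a series like
0, 5, 9, 20 grows in every interval and returns 1.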

Each time an interrupt occurs, the kernel processes it by moving the data
from the NIC to a kernel buffer, then re-enables interrupts.  Actually, in
Linux (and other OSes), the kernel interrupt handler is broken in half -
called in Linux the top and bottom halves.  The top half runs with
processor interrupts off to grab the data and buffer it, and does only
minimal processing as fast as possible so interrupts can be turned back on.
Then the bottom half is scheduled and runs in kernel context (high
priority), but it is less time critical.

I think the call to pcap_stats reads the NIC's counters and compares them to
the Kernel.  So a temporary value > 0 just means that there were - AT THE
INSTANT OF THE pcap_stats call - packets queued in the NIC or Kernel
buffer(s).  This isn't a problem - some NICs with larger buffers even boost
performance by internally queuing a number of packets before interrupting
the OS.

However, if the kernel can't service the interrupt in time - because there
isn't enough memory and/or the processor isn't fast enough to respond to the
interrupt - you do have a problem.  The small buffer in the NIC will
overflow and packets are dropped (this is the ONE place where a better NIC,
with a larger buffer, MIGHT help).

Check ifconfig for this:

eth0      Link encap:Ethernet  HWaddr 00:D0:09:77:85:B9
          inet addr:192.168.42.6  Bcast:192.168.42.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:3892397 errors:30 dropped:0 overruns:0 frame:33
                                       ^^^^^^^^^
          TX packets:473009 errors:0 dropped:0 overruns:0 carrier:0
                                     ^^^^^^^^^
          collisions:1073 txqueuelen:100
          RX bytes:2606704447 (2485.9 Mb)  TX bytes:70474880 (67.2 Mb)
          Interrupt:11 Base address:0xc000

Now, if it's an occasional burst, losing 1 or 2 packets won't kill you.
TCP/IP recovers.  And ntop's statistics aren't life-critical.  If, however,
it's continuous and the count keeps growing, i.e. the NIC/CPU combo can't
keep up with the AVERAGE network flow, you're toast.  ANSWER: Upgrade the
processor or NIC.
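
If you want to watch those counters from a script rather than by eye, the RX
line can be picked apart with sscanf.  A rough sketch, assuming the older
ifconfig layout shown above (newer versions print a different format); the
struct rx_counters and parse_rx_line names are made up for illustration.

```c
#include <stdio.h>

/* Receive-side counters from one ifconfig "RX packets:" line. */
struct rx_counters {
    unsigned long packets, errors, dropped, overruns, frame;
};

/* Hypothetical helper: parses a line in the layout shown above.
 * Returns 1 if all five counters were read, 0 otherwise. */
int parse_rx_line(const char *line, struct rx_counters *out)
{
    /* Whitespace in the format matches any run of blanks in the input,
     * so the leading indentation of the ifconfig line is skipped. */
    int n = sscanf(line,
                   " RX packets:%lu errors:%lu dropped:%lu overruns:%lu frame:%lu",
                   &out->packets, &out->errors, &out->dropped,
                   &out->overruns, &out->frame);
    return n == 5;
}
```

Run it on two readings taken some time apart and compare errors, dropped and
overruns; it's the growth between readings that matters, not the absolute
numbers.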

2. Dropped in ntop
==================

ntop runs multiple threads ("NPS - Network Packet Sniffer (main thread)"),
one to handle each incoming device.  These operate much like the
time-critical half of the interrupt handler - they accept the packet and
queue it to another thread ("NPA - Network Packet Analyzer (main thread)")
for the analysis.

Ultimately, we're interested in the counter droppedPkts, which is
incremented in only one place, pbuf.c at line 1398, where NPS is trying to
queue a packet for NPA:


  if(myGlobals.packetQueueLen >= PACKET_QUEUE_LENGTH) {
...
    myGlobals.device[getActualInterface(deviceId)].droppedPkts++;
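
The pattern behind that check is a fixed-size queue that counts what it has
to turn away.  Here is a stripped-down sketch of the idea, NOT the actual
pbuf.c code: the names pkt_queue, pq_enqueue and pq_dequeue are invented,
the payload is a plain int, the length is tiny so the behaviour is easy to
see, and all the locking between the two threads is left out.

```c
#include <stddef.h>

#define QUEUE_LEN 4   /* tiny stand-in for PACKET_QUEUE_LENGTH */

struct pkt_queue {
    int           slots[QUEUE_LEN]; /* stand-in payload, not PacketInformation */
    size_t        head;             /* index of the oldest queued entry */
    size_t        len;              /* entries currently queued */
    unsigned long dropped;          /* packets rejected because the queue was full */
};

/* Producer side (the sniffer role): returns 1 if queued, 0 if dropped. */
int pq_enqueue(struct pkt_queue *q, int pkt)
{
    if (q->len >= QUEUE_LEN) {
        q->dropped++;               /* same idea as droppedPkts++ */
        return 0;
    }
    q->slots[(q->head + q->len) % QUEUE_LEN] = pkt;
    q->len++;
    return 1;
}

/* Consumer side (the analyzer role): returns 1 and stores the oldest
 * packet in *pkt, or returns 0 if the queue is empty. */
int pq_dequeue(struct pkt_queue *q, int *pkt)
{
    if (q->len == 0)
        return 0;
    *pkt = q->slots[q->head];
    q->head = (q->head + 1) % QUEUE_LEN;
    q->len--;
    return 1;
}
```

Push six packets into an empty queue of four and two are counted as dropped;
nothing more is dropped once the consumer frees a slot.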

Now, you can increase PACKET_QUEUE_LENGTH

             ntop.h:   693   #define PACKET_QUEUE_LENGTH     2048

but if you're routinely dropping packets here because ntop can't keep up
with the flow, increasing the queue length will ONLY help if it's an
occasional huge peak with times where the network is quiet enough to allow
you to work off the queue.  And each packet buffer in use takes up memory.
This is in myGlobals.packetQueue, which is an array of

typedef struct packetInformation {
  unsigned short deviceId;
  struct pcap_pkthdr h;
  u_char p[2*DEFAULT_SNAPLEN+1];   /* 2*384+1 = 769 bytes */
} PacketInformation;

(So it's about 1.5MB for a queue of 2048 entries.)
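
If you want to check that figure on your own box, sizeof will do the
arithmetic, padding included.  This is a sketch with stand-in types so it
compiles without the pcap and ntop headers: pkthdr_model and
packet_info_model are my own names, and I'm assuming struct pcap_pkthdr is a
struct timeval followed by two 32-bit lengths.

```c
#include <stddef.h>
#include <sys/time.h>

#define SNAPLEN   384    /* DEFAULT_SNAPLEN, per the annotation above */
#define QUEUE_LEN 2048   /* PACKET_QUEUE_LENGTH */

/* Stand-in for struct pcap_pkthdr: timestamp, captured length, wire length. */
struct pkthdr_model {
    struct timeval ts;
    unsigned int   caplen;
    unsigned int   len;
};

/* Same member order as the PacketInformation struct quoted above. */
struct packet_info_model {
    unsigned short      deviceId;
    struct pkthdr_model h;
    unsigned char       p[2 * SNAPLEN + 1];
};

/* Bytes needed for the whole queue, including compiler padding. */
size_t queue_bytes(void)
{
    return (size_t)QUEUE_LEN * sizeof(struct packet_info_model);
}
```

The payload alone is 2048 * 769 = 1,574,912 bytes; the header and padding
add a little per entry, and the exact total depends on the platform's
struct timeval size and alignment.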

The instantaneous and maximum value for the queue is reported in the
configuration report:

# Queued Pkts to Process 0
# Max Queued Pkts 0
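
To see why a bigger queue only rescues you from bursts, here is a toy model
(my own, not anything in ntop): each tick some packets arrive, anything over
the queue capacity is dropped, and the analyzer then works off up to a fixed
number.  The names sim_result and simulate, and the whole tick-based model,
are assumptions made purely for illustration.

```c
#include <stddef.h>

struct sim_result {
    unsigned long max_queued;  /* high-water mark, like "Max Queued Pkts" */
    unsigned long dropped;     /* packets that found the queue full */
};

/* Toy model: in each tick arrivals[t] packets arrive, overflow beyond
 * `capacity` is dropped, then up to `service` packets are analysed. */
struct sim_result simulate(const unsigned int *arrivals, size_t ticks,
                           unsigned long capacity, unsigned long service)
{
    struct sim_result r = {0, 0};
    unsigned long queued = 0;
    size_t t;

    for (t = 0; t < ticks; t++) {
        queued += arrivals[t];
        if (queued > capacity) {
            r.dropped += queued - capacity;
            queued = capacity;
        }
        if (queued > r.max_queued)
            r.max_queued = queued;
        queued -= (queued < service) ? queued : service;
    }
    return r;
}
```

With bursts of 10 packets every fourth tick and 3 analysed per tick, a queue
of 4 drops packets while a queue of 16 drops none.  With a steady 5 per tick
against the same 3 per tick, the queue of 16 just fills up a few ticks later
and then drops as well.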

But, again, the best answer for dropped packets is a faster processor...
(Upgrading to an expensive faster NIC with a larger buffer is rarely the
best "bang for the buck" - if you were running a server, then a server class
NIC is a good idea, but for workstations or network monitors, etc. you just
need something that can keep up with the network flow.)

-----Burton

PS: All line #s are from the cvs version as of yesterday, this should be the
version in the 13Apr2002 snapshot...

(Resent - this was held in the mailing list message queue then rejected)

-----Original Message-----
From: Boniforti Flavio [mailto:[EMAIL PROTECTED]]
Sent: Saturday, April 13, 2002 3:26 AM
To: 'Burton M. Strauss III'; [EMAIL PROTECTED]
Subject: R: [Ntop] Totals for one ETH???


> Couple things:
>
> 1. Packets
>
> Total 5,502
> Dropped by the kernel 0   <-- may be non-zero
> Dropped by ntop 0
> Unicast 99.3% 5,464
> Broadcast 0.7% 38
>
> If the kernel is dropping packets, it can be because they are queued for
> analysis or you have a filter in place (-B option).  If neither of those is
> true, and the # is growing over time, your machine isn't keeping up with
> traffic.  Then all bets are off about totals...

OK, so you're telling me that's normal to have some kernel-dropped packets,
as long as they do not grow over time, right?

Well, unfortunately that's exactly my case!!! What do you think I should
improve on my machine to get this value down?

>
>
> 2. Traffic
>
> Total 2.5 MB [5,502 Pkts]
> IP Traffic 2.5 MB [5,422 Pkts]
> Fragmented IP Traffic   [0.0%]
> Non IP Traffic 4.4 KB
>
> Remember, these are rounded values...
>

How much are they rounded?

>



_______________________________________________
Ntop mailing list
[EMAIL PROTECTED]
http://listmanager.unipi.it/mailman/listinfo/ntop
