Thanks Alfredo, answers are below. I had to truncate a bit because the list
limits the message size.


>Hi Cliff
>please see inline
>
...
>You can avoid this by allocating additional buffers per-thread and using a 
>buffer swap when receiving a packet. This way you can keep aside up to K 
>packets, where K is the number of additional buffers allocated in the 
>per-thread pool.
>In order to allocate these additional buffers please have a look at 
>dna_cluster_low_level_settings().
>To get a buffer from the per-thread pool:
>pkt_handle = pfring_alloc_pkt_buff(ring[thread_id])
>To swap a received packet with another buffer:
>ret = pfring_recv_pkt_buff(ring[thread_id], pkt_handle, &hdr, wait_for_packet)
>

This is very useful. I didn't know the pool was per-thread, so this should
make a fairly significant difference.
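
To make sure I understand the bookkeeping, here is a toy sketch (Python, not
the real C API; the names just mirror pfring_alloc_pkt_buff /
pfring_recv_pkt_buff / pfring_release_pkt_buff, and the pool model is my
assumption) of why K extra buffers let a thread hold aside up to K packets:

```python
# Toy model of the per-thread buffer-swap scheme: each recv swaps a free
# buffer into the ring and hands back a buffer holding a received packet.
# With K spare buffers, a thread can set aside up to K packets before it
# must release one. This is an illustration only, not PF_RING itself.

import itertools

class PerThreadPool:
    def __init__(self, k_extra_buffers):
        # K spare buffers available for swapping (cf.
        # dna_cluster_low_level_settings() for the real sizing knob)
        self.free = [f"spare-{i}" for i in range(k_extra_buffers)]
        self._pkt_ids = itertools.count()

    def alloc_pkt_buff(self):
        """Take a spare buffer from the pool (None when exhausted)."""
        return self.free.pop() if self.free else None

    def recv_pkt_buff(self, handle):
        """Swap: the ring keeps `handle` for future DMA and hands us a
        buffer containing the next received packet (simulated here)."""
        assert handle is not None, "need a free buffer to swap in"
        return f"pkt-{next(self._pkt_ids)}"

    def release_pkt_buff(self, buff):
        """Return a buffer we no longer need to the pool."""
        self.free.append(buff)

def packets_we_can_hold_aside(k):
    """How many packets a thread can keep without releasing any."""
    pool = PerThreadPool(k)
    held = []
    while True:
        handle = pool.alloc_pkt_buff()
        if handle is None:  # pool exhausted: must release one first
            return held
        held.append(pool.recv_pkt_buff(handle))
```

So with K spares the receive loop never has to copy; it just swaps and
releases a buffer once it is done with the packet.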


>> For some reason, I am dropping what appears to be an increasing number of 
>> packets, depending on which thread it is. Usually the lower-numbered threads 
>> drop about 10%, while the higher-number ones drop around 90%.
>
>Please pay attention also to logical/physical cores when playing with core 
>affinity. Can I see the output of
>cat /proc/cpuinfo | grep "processor\|model name\|physical id"
>and the affinity you are using?

Cores 0-5 are physical, while 6-11 are hyperthreaded:

processor       : 0
model name      : Intel(R) Xeon(R) CPU           L5638  @ 2.00GHz
physical id     : 0
core id         : 0
processor       : 1
model name      : Intel(R) Xeon(R) CPU           L5638  @ 2.00GHz
physical id     : 0
core id         : 1
processor       : 2
model name      : Intel(R) Xeon(R) CPU           L5638  @ 2.00GHz
physical id     : 0
core id         : 2
processor       : 3
model name      : Intel(R) Xeon(R) CPU           L5638  @ 2.00GHz
physical id     : 0
core id         : 8
processor       : 4
model name      : Intel(R) Xeon(R) CPU           L5638  @ 2.00GHz
physical id     : 0
core id         : 9
processor       : 5
model name      : Intel(R) Xeon(R) CPU           L5638  @ 2.00GHz
physical id     : 0
core id         : 10
processor       : 6
model name      : Intel(R) Xeon(R) CPU           L5638  @ 2.00GHz
physical id     : 0
core id         : 0
processor       : 7
model name      : Intel(R) Xeon(R) CPU           L5638  @ 2.00GHz
physical id     : 0
core id         : 1
processor       : 8
model name      : Intel(R) Xeon(R) CPU           L5638  @ 2.00GHz
physical id     : 0
core id         : 2
processor       : 9
model name      : Intel(R) Xeon(R) CPU           L5638  @ 2.00GHz
physical id     : 0
core id         : 8
processor       : 10
model name      : Intel(R) Xeon(R) CPU           L5638  @ 2.00GHz
physical id     : 0
core id         : 9
processor       : 11
model name      : Intel(R) Xeon(R) CPU           L5638  @ 2.00GHz
physical id     : 0
core id         : 10

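For reference, a quick sketch of how I pair logical processors with core ids
from that output (siblings share a (physical id, core id) pair; the sample
below is a hand-made excerpt in /proc/cpuinfo format, not the full listing):

```python
# Group logical processors by (physical id, core id); any group with more
# than one entry is a set of hyperthread siblings. On a live machine, read
# /proc/cpuinfo instead of the embedded sample.

from collections import defaultdict

def ht_siblings(cpuinfo_text):
    """Map (physical id, core id) -> list of logical processor numbers."""
    groups = defaultdict(list)
    proc = phys = None
    for line in cpuinfo_text.splitlines():
        if ":" not in line:
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "processor":
            proc = int(value)
        elif key == "physical id":
            phys = int(value)
        elif key == "core id":
            groups[(phys, int(value))].append(proc)
    return dict(groups)

sample = """\
processor : 0
physical id : 0
core id : 0
processor : 6
physical id : 0
core id : 0
processor : 1
physical id : 0
core id : 1
"""
# ht_siblings(sample) -> {(0, 0): [0, 6], (0, 1): [1]}
```

On the box above this puts processors 0 and 6 on the same physical core,
1 and 7 on the next, and so on.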
My threads are as follows:


Core 0: DNA Cluster Master

Cores 1-9: Receiver Threads


I can move a thread off core 6, which is the hyperthread sibling of core 0
(they share core id 0), if you think that would help. Here is the output of
pfdnacluster_multithread doing something similar:


root@bond0:/root> /tmp/pfdnacluster_multithread -i dna0 -c 1 -n 8
Capturing from dna0
Using PF_RING v.5.5.2
Hashing packets per-IP Address
The DNA cluster [id: 1][num consumer threads: 8] is running...
Opening cluster dnacluster:1@0
Consumer thread #0 is running...
Opening cluster dnacluster:1@1
Set thread 0 on core 2/12
Consumer thread #1 is running...
Opening cluster dnacluster:1@2
Set thread 1 on core 3/12
Consumer thread #2 is running...
Opening cluster dnacluster:1@3
Set thread 2 on core 4/12
Consumer thread #3 is running...
Opening cluster dnacluster:1@4
Set thread 3 on core 5/12
Consumer thread #4 is running...
Opening cluster dnacluster:1@5
Set thread 4 on core 6/12
Consumer thread #5 is running...
Opening cluster dnacluster:1@6
Set thread 5 on core 7/12
Consumer thread #6 is running...
Opening cluster dnacluster:1@7
Set thread 6 on core 8/12
Consumer thread #7 is running...
Set thread 7 on core 9/12
=========================
Thread 0
Absolute Stats: [52876 pkts rcvd][34172500 bytes rcvd]
                [52876 total pkts][0 pkts dropped (0.0 %)]
                [52'874.94 pkt/sec][273.37 Mbit/sec]
=========================
Thread 1
Absolute Stats: [104995 pkts rcvd][36437358 bytes rcvd]
                [104995 total pkts][0 pkts dropped (0.0 %)]
                [104'992.90 pkt/sec][291.49 Mbit/sec]
=========================
Thread 2
Absolute Stats: [50422 pkts rcvd][27233447 bytes rcvd]
                [50422 total pkts][0 pkts dropped (0.0 %)]
                [50'420.99 pkt/sec][217.86 Mbit/sec]
=========================
Thread 3
Absolute Stats: [55373 pkts rcvd][23669520 bytes rcvd]
                [55373 total pkts][0 pkts dropped (0.0 %)]
                [55'371.89 pkt/sec][189.35 Mbit/sec]
=========================
Thread 4
Absolute Stats: [54588 pkts rcvd][32153687 bytes rcvd]
                [54588 total pkts][0 pkts dropped (0.0 %)]
                [54'586.90 pkt/sec][257.22 Mbit/sec]
=========================
Thread 5
Absolute Stats: [2503 pkts rcvd][13166211 bytes rcvd]
                [2503 total pkts][0 pkts dropped (0.0 %)]
                [2'502.94 pkt/sec][105.33 Mbit/sec]
=========================
Thread 6
Absolute Stats: [54631 pkts rcvd][34089918 bytes rcvd]
                [54631 total pkts][0 pkts dropped (0.0 %)]
                [54'629.90 pkt/sec][272.71 Mbit/sec]
=========================
Thread 7
Absolute Stats: [54764 pkts rcvd][29179172 bytes rcvd]
                [54764 total pkts][0 pkts dropped (0.0 %)]
                [54'762.90 pkt/sec][233.43 Mbit/sec]
=========================


Thanks.



On Fri, May 17, 2013 at 8:30 PM, Cliff Burdick <[email protected]> wrote:

> I have an application configured with the DNA cluster running on core 0,
> with 8 threads running on cores 1-8 on a Xeon processor. I'm using a custom
> hash function which just picks off the last octet of the source IP, and
> sends it to threads 1-8. I'm loading the DNA driver using the following:
>
> insmod ixgbe.ko MQ=0,0  mtu=9000
>
> When I run pfdnacluster_multithread I can start 8 threads without any
> dropping of packets. My understanding is that to use zero-copy mode, I can
> only have a single thread operating on the packets at a time since the
> buffer is automatically freed when another pfring_recv call is made.
> Because of this, each of my slave threads makes a copy of the data before
> immediately returning to call pfring_recv again. For some reason, I am
> dropping what appears to be an increasing number of packets, depending on
> which thread it is. Usually the lower-numbered threads drop about 10%,
> while the higher-numbered ones drop around 90%. I'm receiving about 230Kpps
> (1.3Gbps) evenly distributed between the threads, and my understanding was
> that DNA mode would handle this. My code for the receiver is identical to
> the multithread example (8192 buffers for rx/tx, receive only, wait_mode
> =0).
>
> My slave thread makes the call using the following:
> pfring_recv_parsed(m_ring, &packet, 0, &header, 1, 0, 1, 0);
>
> Also, what is the preferred way of dropping packets inside the hash
> function when I don't want a packet routed to any of my threads: return
> DNA_CLUSTER_FAIL, or send it to a queue that is not being processed?
>
> Any help is appreciated. Thanks.
>
_______________________________________________
Ntop-misc mailing list
[email protected]
http://listgateway.unipi.it/mailman/listinfo/ntop-misc
