I managed to get to the bottom of why Suricata wasn't working with hugepages in PF_RING libzero or ZC: I was running Suricata as a non-root user, which meant it couldn't mmap the hugepages (presumably it drops privileges too soon). Running as root solves the problem (though that isn't ideal!).
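In case it's useful to anyone else hitting this, the workaround I'm going to try first (untested, so very much a sketch) is to keep Suricata non-root but make the hugetlbfs mount itself accessible to the run-as user. It assumes the user is called "suricata" and that the cluster master really does back its buffers with files under the directory given to -u:

# reserve the pages as before
echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages

# remount hugetlbfs so the unprivileged user can mmap files in it
# (uid/gid/mode are standard hugetlbfs mount options)
umount /mnt/huge 2>/dev/null
mount -t hugetlbfs -o uid=$(id -u suricata),gid=$(id -g suricata),mode=0770 none /mnt/huge

# start the master as root exactly as before
pfdnacluster_master -i dna0 -c 1 -n 15,1 -r 15 -m 4 -u /mnt/huge -d

Whether that's enough will depend on the ownership and mode of whatever the master actually creates under /mnt/huge, so treat it as a starting point rather than a fix.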
I'll have a look at the Suricata code to see whether it's something very easy to fix; otherwise I'll raise an issue with the developers. ARGUS is also running as a non-root user, but I guess it didn't suffer from this because it uses libpcap and/or drops privileges later.

Best Wishes,
Chris

On 29/10/14 20:35, Chris Wakelin wrote:
> Hi Alfredo,
>
> Did you manage to test Suricata with libzero+hugepages or ZC?
>
> I've just had another go after a clean reboot (now on fully-patched
> Ubuntu 12.04.5 64-bit, kernel 3.2.0-70, PF_RING 6.0.2), followed by
> reserving 1024 2048-KB pages :-
>
> insmod ixgbe.ko RSS=1,1 mtu=1522 adapters_to_enable=xx:xx:xx:xx:xx:xx num_rx_slots=32768 num_tx_slots=0 numa_cpu_affinity=1,1
> ifconfig dna0 up
>
> echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> cat /proc/meminfo | grep Huge
> mount -t hugetlbfs none /mnt/huge
>
> pfdnacluster_master -i dna0 -c 1 -n 15,1 -r 15 -m 4 -u /mnt/huge -d
>
> I connected ARGUS to dnacl:1@15 and it worked fine.
>
> However, any attempt to start Suricata fails with things like:
>
>> [14115] 29/10/2014 -- 19:59:25 - (runmode-pfring.c:287) <Info> (ParsePfringConfig) -- DNA interface detected, not setting cluster-id for PF_RING (iface dnacl:1@0)
>> [14115] 29/10/2014 -- 19:59:25 - (runmode-pfring.c:335) <Info> (ParsePfringConfig) -- DNA interface detected, not setting cluster type for PF_RING (iface dnacl:1@0)
>> [14115] 29/10/2014 -- 19:59:25 - (util-runmodes.c:559) <Info> (RunModeSetLiveCaptureWorkersForDevice) -- Going to use 1 thread(s)
>> [14116] 29/10/2014 -- 19:59:25 - (util-affinity.c:320) <Info> (AffinityGetNextCPU) -- Setting affinity on CPU 0
>> [14116] 29/10/2014 -- 19:59:25 - (tm-threads.c:1439) <Info> (TmThreadSetupOptions) -- Setting prio -2 for "RxPFdnacl:1@01" Module to cpu/core 0, thread id 14116
>> [14116] 29/10/2014 -- 19:59:25 - (tm-threads.c:1350) <Error> (TmThreadSetPrio) -- [ERRCODE: SC_ERR_THREAD_NICE_PRIO(47)] - Error setting nice value for thread RxPFdnacl:1@01: Operation not permitted
>> [14116] 29/10/2014 -- 19:59:25 - (tmqh-packetpool.c:291) <Info> (PacketPoolInit) -- preallocated 512 packets. Total memory 1790976
>> [14116] 29/10/2014 -- 19:59:25 - (source-pfring.c:446) <Error> (ReceivePfringThreadInit) -- [ERRCODE: SC_ERR_PF_RING_OPEN(34)] - Failed to open dnacl:1@0: pfring_open error. Check if dnacl:1@0 exists and pf_ring module is loaded.
>> [14115] 29/10/2014 -- 19:59:25 - (runmode-pfring.c:287) <Info> (ParsePfringConfig) -- DNA interface detected, not setting cluster-id for PF_RING (iface dnacl:1@1)
>> [14115] 29/10/2014 -- 19:59:25 - (runmode-pfring.c:335) <Info> (ParsePfringConfig) -- DNA interface detected, not setting cluster type for PF_RING (iface dnacl:1@1)
>> [14115] 29/10/2014 -- 19:59:25 - (util-runmodes.c:559) <Info> (RunModeSetLiveCaptureWorkersForDevice) -- Going to use 1 thread(s)
>> [14117] 29/10/2014 -- 19:59:25 - (util-affinity.c:320) <Info> (AffinityGetNextCPU) -- Setting affinity on CPU 1
>> [14117] 29/10/2014 -- 19:59:25 - (tm-threads.c:1439) <Info> (TmThreadSetupOptions) -- Setting prio -2 for "RxPFdnacl:1@11" Module to cpu/core 1, thread id 14117
>> [14117] 29/10/2014 -- 19:59:25 - (tm-threads.c:1350) <Error> (TmThreadSetPrio) -- [ERRCODE: SC_ERR_THREAD_NICE_PRIO(47)] - Error setting nice value for thread RxPFdnacl:1@11: Operation not permitted
>> [14117] 29/10/2014 -- 19:59:25 - (tmqh-packetpool.c:291) <Info> (PacketPoolInit) -- preallocated 512 packets. Total memory 1790976
>> [14117] 29/10/2014 -- 19:59:25 - (source-pfring.c:446) <Error> (ReceivePfringThreadInit) -- [ERRCODE: SC_ERR_PF_RING_OPEN(34)] - Failed to open dnacl:1@1: pfring_open error. Check if dnacl:1@1 exists and pf_ring module is loaded.
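With hindsight, a quick check at this point is whether this is simply a permissions problem for the non-root Suricata user rather than anything PF_RING-specific. A sketch, assuming the run-as user is "suricata" and that the master's mappings live under the /mnt/huge directory passed to -u:

# what did pfdnacluster_master create, and who owns it?
ls -l /mnt/huge
grep Huge /proc/meminfo

# can the unprivileged user even see/open it?
sudo -u suricata ls -l /mnt/huge

Root-owned, root-only files there would be consistent with pfring_open failing only for the non-root Suricata workers.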
>
> My Suricata config looks like (I know the cluster settings are ignored):-
>
> pfring:
>   - interface: dnacl:1@0
>     threads: 1
>     cluster-id: 99
>     cluster-type: cluster_flow
>   - interface: dnacl:1@1
>     threads: 1
>     cluster-id: 99
>     cluster-type: cluster_flow
>
>   ...
>
>   - interface: dnacl:1@14
>     threads: 1
>     cluster-id: 99
>     cluster-type: cluster_flow
>
> If I start pfdnacluster_master without "-u /mnt/huge", then Suricata works
> fine (well, it drops some packets; when it's doing that, the CPU cores are
> usually nowhere near maxed out, which is why I want to get this working :-) )
>
> Everything I could think of trying with pfcount or pfdump works fine with
> the huge pages, and as far as I can see pfring_open() is called in much the
> same way as in Suricata.
>
> e.g.:
> pfcount -i dnacl:1@14 -m -l 1522 -g 14
>
> The relevant bit of Suricata (git master of two days ago), src/source-pfring.c:
>
>> opflag = PF_RING_REENTRANT | PF_RING_PROMISC;
>>
>> /* if suri uses VLAN and if we have a recent kernel, we need
>>  * to use parsed_pkt to get VLAN info */
>> if ((! ptv->vlan_disabled) && SCKernelVersionIsAtLeast(3, 0)) {
>>     opflag |= PF_RING_LONG_HEADER;
>> }
>>
>> if (ptv->checksum_mode == CHECKSUM_VALIDATION_RXONLY) {
>>     if (strncmp(ptv->interface, "dna", 3) == 0) {
>>         SCLogWarning(SC_ERR_INVALID_VALUE,
>>                 "Can't use rxonly checksum-checks on DNA interface,"
>>                 " resetting to auto");
>>         ptv->checksum_mode = CHECKSUM_VALIDATION_AUTO;
>>     } else {
>>         opflag |= PF_RING_LONG_HEADER;
>>     }
>> }
>>
>> ptv->pd = pfring_open(ptv->interface, (uint32_t)default_packet_size, opflag);
>>
>> if (ptv->pd == NULL) {
>>     SCLogError(SC_ERR_PF_RING_OPEN,"Failed to open %s: pfring_open error."
>>             " Check if %s exists and pf_ring module is loaded.",
>>             ptv->interface,
>>             ptv->interface);
>>     pfconf->DerefFunc(pfconf);
>>     return TM_ECODE_FAILED;
>> } else {
>
> I have checksums disabled and VLANs enabled at the moment (though I had the
> same problem with VLANs disabled). The default packet size is 1522 (we have
> VLANs).
>
> P.S. I tried running pfdnacluster_master with just "-n 7,1" and Suricata
> using just the cores on that NUMA node, and it seems I do need more cores
> than that!
>
> P.P.S. Another question I forgot to ask - do you recommend disabling
> hyperthreading (I have)?
>
> Best Wishes,
> Chris
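As an aside on the P.S./P.P.S. above: the way I've been checking which cores share the NIC's NUMA node (and whether any hyperthread siblings are still enabled) is just the standard sysfs/numactl/lscpu route, nothing PF_RING-specific. The interface name below is whatever the DNA driver exposes here (dna0), so adjust to taste:

# NUMA node of the NIC's PCI device (-1 means unknown / single node)
cat /sys/class/net/dna0/device/numa_node

# CPUs local to that device
cat /sys/class/net/dna0/device/local_cpulist

# overall node/core layout, and hyperthread siblings if any
numactl --hardware
lscpu --extended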
>
> On 22/10/14 23:48, Alfredo Cardigliano wrote:
>> Hi Chris,
>> please read below
>>
>>> On 22 Oct 2014, at 21:43, Chris Wakelin <c.d.wake...@reading.ac.uk> wrote:
>>>
>>> Hi,
>>>
>>> Our Suricata instance running on PF_RING with libzero has been dropping
>>> packets recently (at ~2Gb/s load), but the CPU cores are not maxed out
>>> in general. So I've been looking again at more recent PF_RING options :-)
>>>
>>> The setup is a Dell R620 with 64GB RAM (OK, I should add more), two CPUs
>>> with 8 cores each (hyperthreading turned off), and an ixgbe Intel 10Gb
>>> dual-port card of which I'm using just one port. I'm using PF_RING 6.0.2
>>> at the moment.
>>>
>>> I must admit I'm a bit confused!
>>>
>>> I load the DNA ixgbe with
>>>
>>> insmod ixgbe.ko RSS=1,1 mtu=1522 adapters_to_enable=xx:xx:xx:xx:xx:xx
>>> (the port I'm using)
>>>
>>> then
>>>
>>> pfdnacluster_master -i dna0 -c 1 -n 15,1 -r 15 -d
>>>
>>> Suricata then runs (in "workers" runmode) using dnacl:1@0 ... 1@14 and
>>> we run ARGUS (using libpcap) on dnacl:1@15
>>>
>>> So questions :-
>>>
>>> 1) How does CPU affinity work in libzero (or ZC)? There are no IRQs to fix ...
>>> Does it bind dnacl:1@0 to core 0, dnacl:1@1 to core 1, etc.?
>>
>> IRQs are not used; you can set core affinity for ring memory allocation
>> using numa_cpu_affinity:
>>
>> insmod ixgbe.ko RSS=1,1 mtu=1522 num_rx_slots=32768 adapters_to_enable=xx:xx:xx:xx:xx:xx numa_cpu_affinity=0,0
>>
>>> What should the RX thread (pfdnacluster_master -r) be bound to?
>>
>> You should bind the master to one of the cores of the CPU where the NIC is
>> connected (same core as numa_cpu_affinity).
>>
>>> 2) After reading
>>> http://www.ntop.org/pf_ring/not-all-servers-are-alike-with-pf_ring-zcdna-part-3/
>>> I'm wondering whether I would be better off running just 8 queues (or 7 and
>>> 1 for ARGUS) and forcing them somehow onto the NUMA node the ixgbe card is
>>> attached to?
>>
>> This is recommended if 8 cores are enough for packet processing; otherwise
>> it might be worth crossing the QPI bus. You should run some tests.
>>
>>> (If yes, how do I bind libzero to cores 0,2,4,6,8,10,12,14 or whatever
>>> numactl says is on the same node as the NIC?)
>>
>> -r for the master; check Suricata and ARGUS for affinity options.
>>
>>> 3) Hugepages work, in that I can allocate 1024 2048K ones as suggested in
>>> README.hugepages and then run pfdnacluster_master with the "-u /mnt/huge"
>>> option, and then pfcount, tcpdump etc. work. However, Suricata always
>>> crashes out.
>>
>> I will run some tests asap.
>>
>>> Similarly, if I start pfdnacluster_master without huge pages, then
>>> Suricata, then stop and restart pfdnacluster_master with huge pages while
>>> Suricata is still running, the latter fails (but is fine restarting
>>> without huge pages).
>>
>> Expected: you should not change the configuration while running.
>>
>>> If I start the ZC version of ixgbe (which needs huge pages of course) and use
>>>
>>> zbalance_ipc -i zc:eth4 -c 1 -n 15,1 -m 1
>>>
>>> (with Suricata talking to zc:1@0 .. zc:1@14), then Suricata also fails in
>>> a similar way (errors like "[ERRCODE: SC_ERR_PF_RING_OPEN(34)] - Failed
>>> to open zc:1@0: pfring_open error. Check if zc:1@0 exists"), though
>>> pfcount and tcpdump are fine.
>>
>> I will also test this configuration.
>>
>>> Is it worth going for 1GB pages (which are available), and how many would
>>> I need?
>>
>> 1GB pages should be supported, but this hasn't been tested.
>>
>>> 4) Is it worth increasing the number of slots in each queue
>>> (pfdnacluster_master -q) or num_rx_slots (when loading ixgbe)?
>>
>> This can help with handling spikes.
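For what it's worth, on question 4 the knob I plan to try first is the per-queue slot count on the master (-q), since num_rx_slots is already at 32768 in my insmod line above. The 16384 below is only an example value I intend to experiment with, not a recommendation (check the tool's usage output for the default and limits):

# same master invocation as above, with a larger per-queue slot count
pfdnacluster_master -i dna0 -c 1 -n 15,1 -r 15 -m 4 -u /mnt/huge -q 16384 -d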
>>
>>> (We've replaced our border switches with ones our Network Manager is
>>> confident won't crash if somehow PF_RING *sends* packets to the mirrored
>>> port - that crashed one of the old switches - so I'm allowed to reload
>>> PF_RING + NIC drivers without going through Change Management and
>>> "at-risk" periods now :-) )
>>
>> :-)
>>
>>> Best Wishes,
>>> Chris
>>
>> BR
>> Alfredo
>

--
--+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+-
Christopher Wakelin, c.d.wake...@reading.ac.uk
IT Services Centre, The University of Reading, Tel: +44 (0)118 378 2908
Whiteknights, Reading, RG6 6AF, UK           Fax: +44 (0)118 975 3094
_______________________________________________
Ntop-misc mailing list
Ntop-misc@listgateway.unipi.it
http://listgateway.unipi.it/mailman/listinfo/ntop-misc