If you need DPDK-based stateful and also HTTP tests, take a look at https://github.com/Juniper/warp17
Nitzan

On Tue, Aug 8, 2017 at 12:34 AM, James Bensley <jwbens...@gmail.com> wrote:
> On 7 August 2017 at 11:40, Raymond Burkholder <r...@oneunified.net> wrote:
> >
> > On some platforms, like Linux, you need to check ‘ethtool -S’ to see if
> > the operating system is dropping packets (on tx or rx), which may require
> > some performance tuning of the network interfaces.
>
> Yeah, ethtool -C is important for setting the RX IRQ coalescing (NET_RX)
> as low as you can.
>
> Without using one of the third-party libraries like Netmap, DPDK, VPP or
> similar to implement kernel-bypass techniques, or a tool that uses them,
> you have to make lots of “tweaks” to get even a fraction of that bandwidth
> or those pps rates. EtherateMT uses Tx and Rx ring buffers (PACKET_MMAP
> with PACKET_TX_RING/PACKET_RX_RING) with AF_PACKET to dump the ring with a
> single syscall and a single context switch; it forcefully increases the OS
> socket send/receive buffer sizes; it uses PACKET_QDISC_BYPASS to bypass
> the Linux queueing discipline sub-system (skipping any QoS configuration,
> basically); it ignores dropped packets using PACKET_LOSS; and it can use
> FANOUT groups to spray traffic over all Tx/Rx queues in the NIC. One can
> also use isolcpus and nohz_full. I have some notes on host tuning I can
> share if anyone is interested, I’d just need to dig them out. However,
> even with all of those, DPDK et al. are still much faster.
>
> > Also, on a Linux platform, the kernel guys use some trace tools, one of
> > which will create one buffer and copy it to the network interface, making
> > a very effective high-bandwidth tester, with some purporting to fill a
> > 10G link. I don’t have the name off the top of my head.
>
> You might be thinking of pktgen (the kernel module, not the DPDK-based
> app!), which I believe can do 10Gbps using 64-byte packets.
> I think (could be wrong here) that over the years it morphed into trafgen
> in the netsniff-ng package: http://netsniff-ng.org/
>
> By loading it into the kernel there is arguably one less copy from the
> user-land process into kernel memory (as is the case with sendto(), for
> example: https://linux.die.net/man/2/sendto), but using ring buffers one
> syscall can be used to send or receive many packets from the user-land
> process into sk_buffs in kernel memory and into DMA space. DPDK uses
> similar ideas, but it has something called the EAL (Environment
> Abstraction Layer) which can provide RSS with minimal effort from the
> user, and it can DMA directly from its ring buffer, removing another
> copy-per-packet compared to Linux’s AF_PACKET module (as well as loads of
> other cool shit).
>
> VPP, which builds on DPDK, recently passed the 1Tbps mark (10x100Gbps
> interfaces with like 1M routes in FIB) using the new Intel Skylake CPU.
> They have achieved a per-packet CPU budget that is stupidly low, like 200
> instructions per packet.
>
> > This being a Cisco list, some Cisco platforms have built-in ttcp
> > performance testers.
>
> I always forget about that, but I've never had a particularly great
> experience with it. It's there on some ISR models; I also used it on the
> ME3x00 switches once, but the throughput was like 20Mbps and I found it
> quite flaky.
>
> I think I'm hijacking this thread a bit with my own rants.
> Sorry about that,
> James.
>
> _______________________________________________
> cisco-nsp mailing list  cisco-nsp@puck.nether.net
> https://puck.nether.net/mailman/listinfo/cisco-nsp
> archive at http://puck.nether.net/pipermail/cisco-nsp/