Hi, I am using DPDK 18.11 on Ubuntu 18.04 with a Mellanox ConnectX-5 100G EN NIC (MLNX_OFED_LINUX-4.5-1.0.1.0-ubuntu18.04-x86_64). The packet generator is TRex 2.49 running on another machine.
I am able to achieve 100G line rate with the l3fwd application (64B frame size) using the parameters suggested in the Mellanox performance report (https://fast.dpdk.org/doc/perf/DPDK_18_11_Mellanox_NIC_performance_report.pdf). However, as soon as I install rte_flow rules to steer packets to different queues and/or use rte_flow's MARK action, the throughput drops to ~41G. I also modified DPDK's flow_filtering example application and see the same reduced throughput of around 41G out of 100G; without rte_flow it reaches 100G.

I did not change any OS/kernel parameters between testing l3fwd and the application that uses rte_flow. I also made sure the application is NUMA-aware and uses 20 cores to handle the 100G traffic. Upon further investigation (using the Mellanox NIC counters), the drop in throughput appears to be caused by mbuf allocation errors.

Is such performance degradation normal when doing hardware acceleration with rte_flow? Has anyone tested throughput performance with rte_flow at 100G? It's surprising to see hardware offloading degrade performance, unless I am doing something wrong.

For reference, I have pasted below a rough sketch of the rule I install and of the counter check I do on the DPDK side.

Thanks, Arvind
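The rule installation looks roughly like this (a minimal sketch modelled on the flow_filtering example; the match pattern, queue index and mark id are illustrative, not my exact values): match any IPv4 packet on ingress, mark it, and steer it to a given Rx queue.

#include <stdio.h>
#include <rte_ethdev.h>
#include <rte_flow.h>

static struct rte_flow *
install_steering_rule(uint16_t port_id, uint16_t queue_index, uint32_t mark_id)
{
	struct rte_flow_attr attr = { .ingress = 1 };

	struct rte_flow_item pattern[] = {
		{ .type = RTE_FLOW_ITEM_TYPE_ETH },
		{ .type = RTE_FLOW_ITEM_TYPE_IPV4 },	/* no spec/mask -> any IPv4 */
		{ .type = RTE_FLOW_ITEM_TYPE_END },
	};

	struct rte_flow_action_mark mark = { .id = mark_id };
	struct rte_flow_action_queue queue = { .index = queue_index };
	struct rte_flow_action actions[] = {
		{ .type = RTE_FLOW_ACTION_TYPE_MARK,  .conf = &mark },
		{ .type = RTE_FLOW_ACTION_TYPE_QUEUE, .conf = &queue },
		{ .type = RTE_FLOW_ACTION_TYPE_END },
	};

	struct rte_flow_error error;
	struct rte_flow *flow = rte_flow_create(port_id, &attr, pattern,
						actions, &error);

	if (flow == NULL)
		printf("rte_flow_create failed: %s\n",
		       error.message ? error.message : "(no error message)");
	return flow;
}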

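This is roughly how I check for mbuf allocation failures from the application side, in addition to the Mellanox NIC counters (rx_nombuf counts Rx mbuf allocation failures reported by the PMD, imissed counts packets dropped by the hardware because no Rx buffers were available):

#include <stdio.h>
#include <inttypes.h>
#include <rte_ethdev.h>

static void
print_drop_counters(uint16_t port_id)
{
	struct rte_eth_stats stats;

	if (rte_eth_stats_get(port_id, &stats) == 0)
		printf("port %" PRIu16 ": rx_nombuf=%" PRIu64 " imissed=%" PRIu64 "\n",
		       port_id, stats.rx_nombuf, stats.imissed);
}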