On 23.08.2019 19:08, William Tu wrote:
> On Wed, Aug 21, 2019 at 2:31 AM Eelco Chaudron <[email protected]> wrote:
>>
>>
>>
>>>>> William, Eelco, which HW NIC you're using? Which kernel driver?
>>>>
>>>> I’m using the below on the latest bpf-next driver:
>>>>
>>>> 01:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit
>>>> SFI/SFP+ Network Connection (rev 01)
>>>> 01:00.1 Ethernet controller: Intel Corporation 82599ES 10-Gigabit
>>>> SFI/SFP+ Network Connection (rev 01)
>>>
>>> Thanks for the information.
>>> I found one suspicious place inside the ixgbe driver that could break
>>> the completion queue ring and prepared a patch:
>>>
>>> https://patchwork.ozlabs.org/patch/1150244/
>>>
>>> It'll be good if you can test it.
>>
>> Hi Ilya, I was doing some testing of my own, and also concluded it was
>> in the driver's completion ring. I noticed after sending 512 packets the
>> driver's TX counters kept increasing, which looks related to your fix.
>>
>> Will try it out, and send results to the upstream mailing list…
>>
>> Thanks,
>>
>> Eelco
>
> Hi,
>
> I'm comparing the performance of netdev-afxdp.c on current master and
> the DPDK's AF_XDP implementation in OVS dpdk-latest branch.
> I'm using ixgbe and doing physical port to physical port forwarding, sending
> 64 byte packets, with OpenFlow rule:
> ovs-ofctl add-flow br0 "in_port=eth2, actions=output:eth3"
>
> In short:
> A. OVS's netdev-afxdp: 6.1Mpps
> B. OVS-DPDK AF_XDP pmd: 8Mpps
> So I've started to think about how to optimize lib/netdev-afxdp.c. Any
> comments are welcome! Below is the analysis:

One major difference is that the DPDK implementation supports
XDP_USE_NEED_WAKEUP, which will be in use if you're building the kernel from
the latest bpf-next tree. This allows it to significantly decrease the number
of syscalls. According to the perf stats below, the OVS implementation, unlike
the DPDK one, wastes ~11% of its time inside the kernel, and this could be
fixed by the need_wakeup feature.

BTW, there are a lot of pmd threads in case A, but only one in case B.
Was the test setup really equal?
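
As a rough check of the ~11% figure, here is a minimal sketch that sums the
samples attributed to [kernel.vmlinux] in the case A perf report quoted below.
The dictionary and the helper name kernel_share are only illustrative, and the
sum covers just the symbols visible in the quoted output, so treat it as an
approximation. The quoted case B report lists no kernel symbols at all.

```python
# Percent of samples per [kernel.vmlinux] symbol, copied from the
# case A (OVS netdev-afxdp) perf report for pmd6.
kernel_samples_a = {
    "do_syscall_64": 4.60,
    "fput_many": 2.43,
    "entry_SYSCALL_64": 2.06,
    "syscall_return_via_sysret": 1.79,
}


def kernel_share(samples):
    """Return the summed percentage of samples spent in kernel symbols."""
    return round(sum(samples.values()), 2)


if __name__ == "__main__":
    print("kernel share, case A: %.2f%%" % kernel_share(kernel_samples_a))
```

The userspace syscall wrapper overhead (for example
__pthread_enable_asynccancel in libpthread) is not included in this sum.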

Best regards, Ilya Maximets.

>
> A. OVS netdev-afxdp Physical to physical 6.1Mpps
> # pstree -p 702
> ovs-vswitchd(702)-+-{ct_clean1}(706)
> |-{handler4}(712)
> |-{ipf_clean2}(707)
> |-{pmd6}(790)
> |-{pmd7}(791)
> |-{pmd8}(792)
> |-{pmd9}(793)
> |-{revalidator5}(713)
> `-{urcu3}(708)
>
> # ovs-appctl dpif-netdev/pmd-stats-show
> pmd thread numa_id 0 core_id 6:
> packets received: 92290351
> packet recirculations: 0
> avg. datapath passes per packet: 1.00
> emc hits: 92290319
> smc hits: 0
> megaflow hits: 31
> avg. subtable lookups per megaflow hit: 1.00
> miss with success upcall: 1
> miss with failed upcall: 0
> avg. packets per output batch: 31.88
> idle cycles: 20835727677 (34.86%) --> pretty high!?
> processing cycles: 38932097052 (65.14%)
> avg cycles per packet: 647.61 (59767824729/92290351)
> avg processing cycles per packet: 421.84 (38932097052/92290351)
>
> # ./perf record -t 790 sleep 10
> 13.80% pmd6 ovs-vswitchd [.] miniflow_extract
> 13.58% pmd6 ovs-vswitchd [.] __netdev_afxdp_batch_send
> 9.64% pmd6 ovs-vswitchd [.] dp_netdev_input__
> 9.07% pmd6 ovs-vswitchd [.] dp_packet_init__
> 8.91% pmd6 ovs-vswitchd [.] netdev_afxdp_rxq_recv
> 7.40% pmd6 ovs-vswitchd [.] miniflow_hash_5tuple
> 5.32% pmd6 libc-2.23.so [.] __memcpy_avx_unaligned
> 4.60% pmd6 [kernel.vmlinux] [k] do_syscall_64
> 3.72% pmd6 ovs-vswitchd [.] dp_packet_use_afxdp --> maybe optimize?
> 2.74% pmd6 libpthread-2.23.so [.] __pthread_enable_asynccancel
> 2.43% pmd6 [kernel.vmlinux] [k] fput_many
> 2.18% pmd6 libc-2.23.so [.] __memcmp_sse4_1
> 2.06% pmd6 [kernel.vmlinux] [k] entry_SYSCALL_64
> 1.79% pmd6 [kernel.vmlinux] [k] syscall_return_via_sysret
> 1.71% pmd6 ovs-vswitchd [.] dp_execute_cb
> 1.03% pmd6 ovs-vswitchd [.] non_atomic_ullong_add
> 0.86% pmd6 ovs-vswitchd [.] dp_netdev_pmd_flush_output_on_port
>
> B. OVS-DPDK afxdp using dpdk-latest 8Mpps
> ovs-vswitchd(19462)-+-{ct_clean3}(19470)
> |-{dpdk_watchdog1}(19468)
> |-{eal-intr-thread}(19466)
> |-{handler16}(19501)
> |-{handler17}(19505)
> |-{handler18}(19506)
> |-{handler19}(19507)
> |-{handler20}(19508)
> |-{handler22}(19502)
> |-{handler24}(19504)
> |-{handler26}(19503)
> |-{ipf_clean4}(19471)
> |-{pmd27}(19536)
> |-{revalidator21}(19509)
> |-{revalidator23}(19511)
> |-{revalidator25}(19510)
> |-{rte_mp_handle}(19467)
> `-{urcu2}(19469)
>
> # ovs-appctl dpif-netdev/pmd-stats-show
> pmd thread numa_id 0 core_id 11:
> packets received: 1813689117
> packet recirculations: 0
> avg. datapath passes per packet: 1.00
> emc hits: 1813689053
> smc hits: 0
> megaflow hits: 63
> avg. subtable lookups per megaflow hit: 1.00
> miss with success upcall: 1
> miss with failed upcall: 0
> avg. packets per output batch: 31.85
> idle cycles: 13848892341 (2.50%)
> processing cycles: 541064826249 (97.50%)
> avg cycles per packet: 305.96 (554913718590/1813689117)
> avg processing cycles per packet: 298.32 (541064826249/1813689117)
>
> # ./perf record -t 19536 sleep 10
> 24.84% pmd27 ovs-vswitchd [.] eth_af_xdp_rx
> 16.27% pmd27 ovs-vswitchd [.] eth_af_xdp_tx
> 13.20% pmd27 ovs-vswitchd [.] dp_netdev_input__
> 12.54% pmd27 ovs-vswitchd [.] pull_umem_cq
> 10.85% pmd27 ovs-vswitchd [.] miniflow_extract
> 5.67% pmd27 ovs-vswitchd [.] miniflow_hash_5tuple
> 3.41% pmd27 libc-2.23.so [.] __memcmp_sse4_1
> 2.14% pmd27 ovs-vswitchd [.] netdev_dpdk_rxq_recv
> 2.13% pmd27 ovs-vswitchd [.] dp_execute_cb
> 1.50% pmd27 ovs-vswitchd [.] non_atomic_ullong_add
> 1.49% pmd27 ovs-vswitchd [.] dp_netdev_pmd_flush_output_on_port
> 1.05% pmd27 ovs-vswitchd [.] netdev_dpdk_filter_packet_len
> 0.79% pmd27 ovs-vswitchd [.] pmd_perf_end_iteration
> 0.74% pmd27 ovs-vswitchd [.] dp_netdev_process_rxq_port
> 0.47% pmd27 ovs-vswitchd [.] memcmp@plt
> 0.42% pmd27 ovs-vswitchd [.] netdev_dpdk_eth_send
>
>
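
For comparing the two pmd-stats-show outputs above, the following sketch
recomputes the derived figures from the raw counters (idle share, cycles per
packet, processing cycles per packet) and the A/B ratio of per-packet
processing cost. The function name pmd_summary is hypothetical, and the
ratio is only meaningful if both setups were really equivalent, which is
still an open question.

```python
def pmd_summary(packets, idle_cycles, processing_cycles):
    """Derive per-packet figures from raw pmd-stats-show counters."""
    total = idle_cycles + processing_cycles
    return {
        "idle_pct": round(100.0 * idle_cycles / total, 2),
        "cycles_per_pkt": round(total / packets, 2),
        "processing_cycles_per_pkt": round(processing_cycles / packets, 2),
    }


# Raw counters copied from the quoted outputs.
case_a = pmd_summary(92290351, 20835727677, 38932097052)      # netdev-afxdp, pmd6
case_b = pmd_summary(1813689117, 13848892341, 541064826249)   # DPDK AF_XDP, pmd27

# How much more expensive a packet is in case A, in processing cycles.
processing_ratio = (case_a["processing_cycles_per_pkt"]
                    / case_b["processing_cycles_per_pkt"])

if __name__ == "__main__":
    print("A:", case_a)
    print("B:", case_b)
    print("A/B processing cycles per packet: %.2f" % processing_ratio)
```

The recomputed values should match the averages printed by pmd-stats-show,
which confirms the reported numbers are plain ratios of the listed counters.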