I think the ARP abnormality is caused by the entire VPP FIB being unable to process any packets.
Over the past few days I have found, using tcpdump on the interfaces on both sides while VPP NAT runs in multi-thread mode, that 50% of the out2in reply packets are lost. The lost packets are the ones handed off to vpp_wk_1 (thread 2), and the "show run" output shows that this thread has no active "nat44-out2in" node. Is this the root cause?

DBGvpp# show run
Thread 0 vpp_main (lcore 0)
Time 403.2, average vectors/node 0.00, last 128 main loops 0.00 per node 0.00
  vector rates in 0.0000e0, out 0.0000e0, drop 0.0000e0, punt 0.0000e0
             Name                 State         Calls    Vectors   Suspends     Clocks   Vectors/Call
acl-plugin-fa-cleaner-process   event wait           0          0          1     3.61e4           0.00
admin-up-down-process           event wait           0          0          1     5.37e4           0.00
api-rx-from-ring                any wait             0          0         24     7.56e5           0.00
bfd-process                     event wait           0          0          1     2.92e4           0.00
cdp-process                     any wait             0          0         55     5.25e5           0.00
dhcp-client-process             any wait             0          0          5     3.69e4           0.00
dns-resolver-process            any wait             0          0          1     1.34e4           0.00
dpdk-ipsec-process              done                 1          0          0     2.13e5           0.00
dpdk-process                    any wait             0          0        133     3.18e6           0.00
fib-walk                        any wait             0          0        199     1.02e4           0.00
flow-report-process             any wait             0          0          1     1.34e4           0.00
flowprobe-timer-process         any wait             0          0          1     2.40e4           0.00
ikev2-manager-process           any wait             0          0        395     7.59e3           0.00
ioam-export-process             any wait             0          0          1     1.99e4           0.00
ip-route-resolver-process       any wait             0          0          5     2.75e4           0.00
ip6-icmp-neighbor-discovery-ev  any wait             0          0        395     7.52e3           0.00
l2fib-mac-age-scanner-process   event wait           0          0          1     9.48e3           0.00
lisp-retry-service              any wait             0          0        199     1.47e4           0.00
lldp-process                    event wait           0          0          1     2.12e7           0.00
memif-process                   event wait           0          0          1     3.80e4           0.00
nat-det-expire-walk             done                 1          0          0     1.23e4           0.00
nat64-expire-walk               any wait             0          0         41     3.54e4           0.00
nat64-expire-worker-walk        interrupt wa        40          0          0     1.47e4           0.00
send-garp-na-process            event wait           0          0          1     4.37e3           0.00
startup-config-process          done                 1          0          1     1.41e9           0.00
udp-ping-process                any wait             0          0          1     2.80e4           0.00
unix-cli-local:0                active               0          0         48     2.12e7           0.00
unix-epoll-input                polling         170368          0          0     5.27e6           0.00
vhost-user-process              any wait             0          0          1     1.42e4           0.00
vhost-user-send-interrupt-proc  any wait             0          0          1     1.18e4           0.00
vpe-link-state-process          event wait           0          0          4     1.08e4           0.00
vpe-oam-process                 any wait             0          0        194     1.03e4           0.00
vxlan-gpe-ioam-export-process   any wait             0          0          1     2.21e4           0.00
wildcard-ip4-arp-publisher-pro  event wait           0          0          1     1.45e4           0.00
---------------
Thread 1 vpp_wk_0 (lcore 1)
Time 403.2, average vectors/node 1.71, last 128 main loops 0.00 per node 0.00
  vector rates in 1.4584e0, out 3.2244e-1, drop 1.1484e0, punt 0.0000e0
             Name                 State         Calls    Vectors   Suspends     Clocks   Vectors/Call
TenGigabitEthernet83/0/0-outpu  active              12         64          0     9.52e2           5.33
TenGigabitEthernet83/0/0-tx     active              12         64          0     1.25e3           5.33
TenGigabitEthernet83/0/1-outpu  active               8         66          0     1.63e3           8.25
TenGigabitEthernet83/0/1-tx     active               8         66          0     2.38e3           8.25
arp-input                       active               2          2          0     7.03e4           1.00
dpdk-input                      polling      509908889        588          0     8.09e8           0.00
error-drop                      active             427        463          0     5.21e3           1.08
ethernet-input                  active             425        431          0     8.32e3           1.01
ip4-glean                       active               2         32          0     5.23e4          16.00
ip4-input-no-checksum           active              15        157          0     1.08e4          10.47
ip4-load-balance                active              12         64          0     1.02e3           5.33
ip4-lookup                      active              19        160          0     2.44e3           8.42
ip4-rewrite                     active              17        128          0     1.64e3           7.53
llc-input                       active             394        396          0     2.89e3           1.01
lldp-input                      active              30         33          0     1.17e4           1.10
nat44-in2out                    active               9         96          0     5.42e3          10.67
nat44-in2out-slowpath           active               8         96          0     3.86e4          12.00
nat44-in2out-worker-handoff     active               5         94          0     6.04e4          18.80
nat44-out2in                    active              12         64          0     3.63e3           5.33
nat44-out2in-worker-handoff     active              11         63          0     5.01e4           5.73
nat64-expire-worker-walk        interrupt wa        40          0          0     9.92e3           0.00
---------------
Thread 2 vpp_wk_1 (lcore 7)
Time 403.2, average vectors/node 12.79, last 128 main loops 0.00 per node 0.00
  vector rates in 4.0429e-1, out 1.5874e-1, drop 7.9370e-2, punt 0.0000e0
             Name                 State         Calls    Vectors   Suspends     Clocks   Vectors/Call
TenGigabitEthernet83/0/1-outpu  active               6         64          0     7.76e2          10.67
TenGigabitEthernet83/0/1-tx     active               6         64          0     9.85e2          10.67
dpdk-input                      polling      504121313        163          0     2.93e9           0.00
error-drop                      active               2         32          0     3.53e3          16.00
ip4-glean                       active               2         32          0     7.68e2          16.00
ip4-input-no-checksum           active              11        163          0     1.21e4          14.82
ip4-lookup                      active               8         96          0     1.54e3          12.00
ip4-rewrite                     active               6         64          0     1.43e3          10.67
nat44-in2out                    active               9         96          0     1.44e4          10.67
nat44-in2out-slowpath           active               8         96          0     5.40e4          12.00
nat44-in2out-worker-handoff     active               4         98          0     5.61e4          24.50
nat44-out2in-worker-handoff     active               7         65          0     4.74e4           9.29
nat64-expire-worker-walk        interrupt wa        40          0          0     1.08e4           0.00
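To narrow this down, a per-worker packet trace should show exactly where the handed-off out2in replies are dropped. A minimal sketch using VPP's built-in trace CLI; the input node name dpdk-input matches the "show run" output above, and the packet count of 100 is an arbitrary choice:

    DBGvpp# trace add dpdk-input 100
    (replay some out2in traffic)
    DBGvpp# show trace
    DBGvpp# clear trace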
From: "Matus Fabian -X (matfabia - PANTHEON TECHNOLOGIES at Cisco)" <matfa...@cisco.com>
Date: Friday, January 5, 2018, 3:23 PM
To: 李洪亮 <lihongli...@360.cn>
Cc: "vpp-dev@lists.fd.io" <vpp-dev@lists.fd.io>
Subject: RE: The performance problem of NAT plugin

I tested ARP with stable/1801 and it works fine:

23:20:10.885539 ARP, Request who-has 3.3.3.5 tell 3.3.3.1, length 28
23:20:10.885769 ARP, Reply 3.3.3.5 is-at 08:00:27:c9:ea:36 (oui Unknown), length 46

Matus

From: 李洪亮 [mailto:lihongli...@360.cn]
Sent: Friday, December 22, 2017 4:36 PM
To: Matus Fabian -X (matfabia - PANTHEON TECHNOLOGIES at Cisco) <matfa...@cisco.com>
Cc: vpp-dev@lists.fd.io
Subject: Re: The performance problem of NAT plugin

I have found that when I use NAT, ARP behaves abnormally. I use 3.3.3.5 as the NAT pool address:

nat44 add address 3.3.3.5

On the target NIC (3.3.3.1), tcpdump captures the ARP request, but there is no reply:

23:21:57.783994 ARP, Request who-has 3.3.3.5 tell 3.3.3.1, length 28
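One possible workaround, assuming the pool address 3.3.3.5 is not covered by any address configured on the outside interface, is to have VPP answer ARP for it via proxy ARP. A minimal sketch; TenGigabitEthernet83/0/1 is assumed here to be the NAT outside interface, and the proxy-ARP CLI should be verified against your build:

    DBGvpp# set ip arp proxy 3.3.3.5 - 3.3.3.5
    DBGvpp# set interface proxy-arp TenGigabitEthernet83/0/1 enable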