Hi

So without BGP (lcp) and with only one static route, performance is basically as it is on "paper": 24 Mpps / 100 Gbit/s without problems.
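For reference, the static-route-only test was roughly along these lines (interface names and addresses are placeholders, not my exact config, assuming the rdma driver):

create interface rdma host-if enp101s0f0 name rdma-0
create interface rdma host-if enp101s0f1 name rdma-1
set interface state rdma-0 up
set interface state rdma-1 up
set interface ip address rdma-0 192.0.2.1/30
set interface ip address rdma-1 198.51.100.1/30
ip route add 0.0.0.0/0 via 198.51.100.2 rdma-1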

And then, with lcp, the problems start, no matter whether bonding is used or not.


Basically, the side where I'm receiving most of the traffic that needs to be TX-ed out the other interface is OK.

The interface with most RX traffic, on VLAN 2906, has an IP address, and when I ping from this IP address to the point-to-point IP of the other side, there are no packet drops.

But on the interface where the traffic RX-ed on VLAN 2906 needs to be TX-ed, on VLAN 514, there are drops towards the point-to-point IP of the other side: from 10 to 20%.

The same happens when I ping/mtr from the RX side to the TX side: there are drops. But there are no drops when I ping from the TX side to the RX side, so in that direction the forwarding goes out through the interface that has most RX and less TX.


So it looks like the interface busy with RX traffic is OK; the problem appears when an interface is mostly TX-ing traffic that was RX-ed on the other interface. But I don't know how to check what is causing it: ethtool -S shows no errors/drops at the interface level for any interface.
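On the VPP side, I assume the counters to cross-check would be something like this (inside vppctl):

show interface              (per-interface rx/tx and drop counters)
show errors                 (per-node error counters)
show hardware-interfaces    (driver and queue level stats)

Is that the right place to look?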




On 12/16/22 10:51 AM, Benoit Ganne (bganne) via lists.fd.io wrote:
Hi,

> So the hardware is:
> Intel 6246R
> 96 GB RAM
> Mellanox ConnectX-5 2x 100 Gbit Ethernet NIC
> And a simple configuration with vpp/frr where all traffic is RX-ed on one
> vlan interface and TX-ed out a second vlan interface - it is normal internet
> traffic - about 20 Gbit/s with 2 Mpps
2 Mpps looks definitely too low: in a similar setup, CSIT measures IPv4 NDR with 
rdma at ~17.6 Mpps with 2 workers on 1 core (2 hyperthreads): 
http://csit.fd.io/trending/#eNrlkk0OwiAQhU-DGzNJwdKuXFh7D0NhtE36QwBN6-mljXHahTt3LoCQb-Y95gUfBocXj-2RyYLlBRN5Y-LGDqd9PB7WguhBtyPwJLmhsFyPUmYKnOkUNDaFLK2Aa8BQz7e4KuUReuNmFXGeVcw9bCSJ2Hoi8t2IGpRDRR3RjVBAv7LZvoeqrk516JsnUmmcgLiOeRDieqsfJrui7yHzcqn4XXj2H8Kzn_BkuesH1y0_UJYvWG6xEg

The output of 'sh err' and 'sh hard' would be useful too.

> Below vpp config:
To start with, I'd recommend doing a simple test removing lcp, vlan & bond to 
see if you can reproduce CSIT performance, and then maybe add bond, and finally lcp 
and vlan. This could help narrow down where performance drops.
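Something like this, one layer at a time on top of the plain rdma interfaces (interface names, VLAN IDs and bond mode are just examples, assuming the rdma driver and the linux-cp plugin):

create bond mode lacp load-balance l34
bond add BondEthernet0 rdma-0
bond add BondEthernet0 rdma-1
<re-test, then add the vlans:>
create sub-interfaces BondEthernet0 2906
create sub-interfaces BondEthernet0 514
<re-test, and finally add lcp:>
lcp create BondEthernet0.2906 host-if be2906
lcp create BondEthernet0.514 host-if be514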

> Below also show run
The vector rate is really low, so it is really surprising there are drops...
Did you capture the show run output while you were dropping packets? Basically, 
when traffic is going through VPP and performance is maxing out, do 'cle run' 
and then 'sh run' to see the instantaneous values rather than averages.
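In other words, while the drops are happening:

cle run
<wait a few seconds...>
sh run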

> Anyone know how to interpret this data? What are the Suspends for
> api-rx-from-ring?
This is a control plane task in charge of processing API messages. VPP uses 
cooperative multitasking within the main thread for control plane tasks; 
Suspends counts the number of times this specific task voluntarily released the 
CPU, yielding to other tasks.

> and how to check what type of error (traffic) is causing the drops:
You can capture dropped traffic:
pcap trace drop
<wait for some traffic to be dropped...>
pcap trace drop off
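If I remember the options right, you can also bound the capture and pick the file name, e.g.:

pcap trace drop max 1000 file drops.pcap

The capture ends up under /tmp and can be opened with wireshark/tshark.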

You can also use VPP packet tracer:
tr add rdma-input 1000
<wait for some traffic to be dropped...>
tr filter include error-drop 1000
sh tr max 1000

Best
ben


