Hi Flavio,
Thanks for your reply.
I have captured the suggested information but do not see anything that
could cause the packet drops.
Can you please take a look at the data below and see if you can find
anything unusual?
The PMDs are running on CPUs 1-4, and CPUs 1-7 are isolated cores.
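For reference, that corresponds to a setup along these lines (a sketch,
assuming the PMD mask covers all isolated cores; the exact values on this
box may differ):

# PMD threads allowed on CPUs 1-7 (only CPUs 1-4 have Rx queues assigned)
ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0xfe
# Kernel command line isolation for CPUs 1-7
isolcpus=1-7 nohz_full=1-7 rcu_nocbs=1-7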
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
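(cstats and cycles below are local shell aliases wrapping
ovs-appctl dpif-netdev/pmd-stats-clear and
ovs-appctl dpif-netdev/pmd-stats-show, so the numbers cover a clean
10-second window.)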
root@bcm958802a8046c:~# cstats ; sleep 10; cycles
pmd thread numa_id 0 core_id 1:
        idle cycles: 99140849 (7.93%)
        processing cycles: 1151423715 (92.07%)
        avg cycles per packet: 116.94 (1250564564/10693918)
        avg processing cycles per packet: 107.67 (1151423715/10693918)
pmd thread numa_id 0 core_id 2:
        idle cycles: 118373662 (9.47%)
        processing cycles: 1132193442 (90.53%)
        avg cycles per packet: 124.39 (1250567104/10053309)
        avg processing cycles per packet: 112.62 (1132193442/10053309)
pmd thread numa_id 0 core_id 3:
        idle cycles: 53805933 (4.30%)
        processing cycles: 1196762002 (95.70%)
        avg cycles per packet: 107.35 (1250567935/11649948)
        avg processing cycles per packet: 102.73 (1196762002/11649948)
pmd thread numa_id 0 core_id 4:
        idle cycles: 189102938 (15.12%)
        processing cycles: 1061463293 (84.88%)
        avg cycles per packet: 143.47 (1250566231/8716828)
        avg processing cycles per packet: 121.77 (1061463293/8716828)
pmd thread numa_id 0 core_id 5:
pmd thread numa_id 0 core_id 6:
pmd thread numa_id 0 core_id 7:
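The scheduler summary below is from perf sched over a one-second window,
captured along these lines (a sketch; the exact invocation I used may have
differed slightly):

# perf sched record -a -- sleep 1
# perf sched timehist -s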
Runtime summary
                   comm  parent  sched-in  run-time   min-run   avg-run   max-run   stddev  migrations
                                  (count)    (msec)    (msec)    (msec)    (msec)        %
---------------------------------------------------------------------------------------------------------------------
         ksoftirqd/0[7]       2         1     0.079     0.079     0.079     0.079     0.00           0
           rcu_sched[8]       2        14     0.067     0.002     0.004     0.009     9.96           0
            rcuos/4[38]       2         6     0.027     0.002     0.004     0.008    20.97           0
            rcuos/5[45]       2         4     0.018     0.004     0.004     0.005     6.63           0
        kworker/0:1[71]       2        12     0.156     0.008     0.013     0.019     6.72           0
          mmcqd/0[1230]       2         3     0.054     0.001     0.018     0.031    47.29           0
     kworker/0:1H[1248]       2         1     0.006     0.006     0.006     0.006     0.00           0
    kworker/u16:2[1547]       2        16     0.045     0.001     0.002     0.012    26.19           0
             ntpd[5282]       1         1     0.063     0.063     0.063     0.063     0.00           0
         watchdog[6988]       1         2     0.089     0.012     0.044     0.076    72.26           0
     ovs-vswitchd[9239]       1         2     0.326     0.152     0.163     0.173     6.45           0
revalidator8[9309/9239]    9239         2     1.260     0.607     0.630     0.652     3.58           0
            perf[27150]   27140         1     0.000     0.000     0.000     0.000     0.00           0

Terminated tasks:
           sleep[27151]   27150         4     1.002     0.015     0.250     0.677    58.22           0
Idle stats:
CPU 0 idle for 999.814 msec ( 99.84%)
*CPU 1 idle entire time window*
*CPU 2 idle entire time window*
*CPU 3 idle entire time window*
*CPU 4 idle entire time window*
CPU 5 idle for 500.326 msec ( 49.96%)
CPU 6 idle entire time window
CPU 7 idle entire time window
Total number of unique tasks: 14
Total number of context switches: 115
Total run time (msec): 3.198
Total scheduling time (msec): 1001.425 (x 8)
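The per-thread CPU usage below is from pidstat, collected roughly like
this (a sketch; the exact flags I used may have differed):

# pidstat -u -t -p $(pidof ovs-vswitchd) 1 1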
02:16:22   UID   TGID    TID    %usr  %system  %guest  %wait    %CPU  CPU  Command
02:16:23     0   9239      -  100.00     0.00    0.00   0.00  100.00    5  ovs-vswitchd
02:16:23     0      -   9239    2.00     0.00    0.00   0.00    2.00    5  |__ovs-vswitchd
02:16:23     0      -   9240    0.00     0.00    0.00   0.00    0.00    0  |__vfio-sync
02:16:23     0      -   9241    0.00     0.00    0.00   0.00    0.00    5  |__eal-intr-thread
02:16:23     0      -   9242    0.00     0.00    0.00   0.00    0.00    5  |__dpdk_watchdog1
02:16:23     0      -   9244    0.00     0.00    0.00   0.00    0.00    5  |__urcu2
02:16:23     0      -   9279    0.00     0.00    0.00   0.00    0.00    5  |__ct_clean3
02:16:23     0      -   9308    0.00     0.00    0.00   0.00    0.00    5  |__handler9
02:16:23     0      -   9309    0.00     0.00    0.00   0.00    0.00    5  |__revalidator8
02:16:23     0      -   9328    0.00     0.00    0.00   0.00    0.00    6  |__pmd13
02:16:23     0      -   9330  100.00     0.00    0.00   0.00  100.00    3  |__pmd12
02:16:23     0      -   9331  100.00     0.00    0.00   0.00  100.00    1  |__pmd11
02:16:23     0      -   9332    0.00     0.00    0.00   0.00    0.00    7  |__pmd10
02:16:23     0      -   9333    0.00     0.00    0.00   0.00    0.00    5  |__pmd16
02:16:23     0      -   9334  100.00     0.00    0.00   0.00  100.00    2  |__pmd15
02:16:23     0      -   9335  100.00     0.00    0.00   0.00  100.00    4  |__pmd14
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Thanks
Vinay
On Tue, Jun 2, 2020 at 12:06 PM Flavio Leitner <[email protected]> wrote:
> On Mon, Jun 01, 2020 at 07:27:09PM -0400, Shahaji Bhosle via dev wrote:
> > Hi Ben/Ilya,
> > Hope you guys are doing well and staying safe. I have been chasing a
> weird
> > problem with small drops and I think that is causing lots of TCP
> > retransmission.
> >
> > Setup details
> > iPerf3(1k-5K Servers) <--DPDK2:OvS+DPDK(VxLAN:BOND)[DPDK0+DPDK1]
> > <====2x25G<==== [DPDK0+DPDK1](VxLAN:BOND)OvS+DPDK:DPDK2 <---iPerf3(Clients)
> >
> > All the drops are ring drops on the bonded functions on the server
> > side. I have 4 CPUs, each with 3 PMD threads; DPDK0, DPDK1 and DPDK2
> > are all running with 4 Rx rings each.
> >
> > What is interesting is that when I give each Rx ring its own CPU, the
> > drops go away. Or if I set other_config:emc-insert-inv-prob=1, the
> > drops go away. But I need to scale up the number of flows, so I am
> > trying to run this with the EMC disabled.
> >
> > I can tell that the rings are not getting serviced for 30-40 usec
> > because of some kind of context switch or interrupts on these cores. I
> > have tried the usual isolation (nohz_full, rcu_nocbs, etc.) and moved
> > all the interrupts away from these cores, but nothing helps. I mean it
> > improves, but the drops still happen.
>
> When you disable the EMC (or reduce its efficiency), the per-packet cost
> increases and the system becomes more sensitive to variations. If you
> share a CPU among multiple queues, you decrease the amount of time
> available to process each queue. In either case, there will be less room
> to tolerate variations.
>
> Well, you might want to use 'perf' to monitor the scheduling events and
> then, based on the stack traces, see what is causing them and try to
> prevent it.
>
> For example:
> # perf record -e sched:sched_switch -a -g sleep 1
>
> For instance, you might see that another NIC used for management has
> IRQs assigned to one isolated CPU. You can move them to another CPU to
> reduce the noise, etc...
>
> Another suggestion is to look at the PMD thread idle statistics, because
> they tell you how much "extra" room you have left. The closer they get
> to 0, the more finely tuned your setup needs to be to avoid drops.
>
> HTH,
> --
> fbl
>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev