Re: [ovs-dev] 10-25 packet drops every few (10-50) seconds TCP (iperf3)

2020-07-06 Thread Yanqin Wei
The 2nd one (DPCLS_OPTIMIZATION_INTERVAL) is another periodic task, for the dpcls subtable ranking.

Re: [ovs-dev] 10-25 packet drops every few (10-50) seconds TCP (iperf3)

2020-07-06 Thread Yanqin Wei
Hi Shahaji,

Yes, some counters are updated every 10 seconds for PMD load balancing and PMD
info collection. I am not aware of a way to disable them from the outside.
You could try modifying the following numbers and observing the packet loss.

/* Time in microseconds of the interval in which rxq processing cycles used
 * in rxq to pmd assignments is measured and stored. */
#define PMD_RXQ_INTERVAL_LEN 10000000LL

/* Time in microseconds between successive optimizations of the dpcls
 * subtable vector */
#define DPCLS_OPTIMIZATION_INTERVAL 1000000LL

Best Regards,
Wei Yanqin
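
To make the pattern concrete, here is a minimal sketch of a busy-poll loop with
time-gated housekeeping, in the spirit of the two intervals above. It is an
illustration only, not the actual OVS dpif-netdev code, and the helper names
are made up:

/* Minimal sketch of a poll loop with time-gated bookkeeping.
 * Illustration only; not the OVS dpif-netdev implementation. */
#include <stdint.h>
#include <time.h>

#define HOUSEKEEPING_INTERVAL_US 10000000LL  /* 10 s, like PMD_RXQ_INTERVAL_LEN */

static uint64_t now_us(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000ULL + ts.tv_nsec / 1000;
}

/* Stand-ins for the real per-iteration work and the periodic bookkeeping. */
static int poll_and_forward_burst(void) { return 0; }
static void update_rxq_cycle_counters(void) { }

void pmd_style_loop(void)
{
    uint64_t next_housekeeping = now_us() + HOUSEKEEPING_INTERVAL_US;

    for (;;) {
        poll_and_forward_burst();

        /* Every HOUSEKEEPING_INTERVAL_US the loop spends extra cycles here.
         * On a core that is already ~100% busy this shows up as a short
         * gap in rx servicing. */
        if (now_us() >= next_housekeeping) {
            update_rxq_cycle_counters();
            next_housekeeping += HOUSEKEEPING_INTERVAL_US;
        }
    }
}

Raising the interval constants makes the housekeeping rarer, but the extra work
still runs on the same core that is polling the rxqs.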

Re: [ovs-dev] 10-25 packet drops every few (10-50) seconds TCP (iperf3)

2020-07-06 Thread Shahaji Bhosle via dev
Thanks Yanqin,
What does this define mean? Some kind of bookkeeping of the packet
processing cycles every 10 seconds? Are you saying to make this interval even
bigger, say 1000 seconds or something? If I want to disable it, what do I do?
Thanks, Shahaji

Re: [ovs-dev] 10-25 packet drops every few (10-50) seconds TCP (iperf3)

2020-07-06 Thread Yanqin Wei
Hi Shahaji,

It seems to be caused by some periodic task. In the PMD thread, PMD auto load
balancing is done periodically.
/* Time in microseconds of the interval in which rxq processing cycles used
 * in rxq to pmd assignments is measured and stored. */
#define PMD_RXQ_INTERVAL_LEN 10000000LL

Would you like to disable it if it is not necessary?

Best Regards,
Wei Yanqin

Re: [ovs-dev] 10-25 packet drops every few (10-50) seconds TCP (iperf3)

2020-07-06 Thread Shahaji Bhosle via dev
Hi Yanqin,
The drops come at random intervals; sometimes I can run for minutes without
drops. The case is very borderline: the CPUs are close to 99% busy with around
1000 flows. We see the drops once every 10-15 seconds, and it is random in
nature. If I use one ring per core the drops go away; if I enable EMC the drops
also go away, etc.
Thanks, Shahaji
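
For what it is worth, the reason enabling EMC helps can be sketched in a few
lines: an exact-match cache in front of the wildcard classifier turns most
lookups into a hash plus a memcmp, cutting the average cycles per packet. This
is only an illustration of the idea; the sizes, key layout, and helpers are
made up and this is not the OVS EMC implementation:

/* Sketch of an exact-match cache (EMC) in front of a slower classifier.
 * Illustration only: sizes, key layout and dpcls_lookup() are hypothetical. */
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define EMC_ENTRIES 8192                      /* power of two, made up */

struct flow_key { uint8_t bytes[64]; };
struct emc_entry { bool valid; struct flow_key key; int action; };

static struct emc_entry emc[EMC_ENTRIES];

static uint32_t hash_key(const struct flow_key *key)
{
    uint32_t h = 2166136261u;                 /* FNV-1a over the key bytes */
    for (size_t i = 0; i < sizeof key->bytes; i++) {
        h = (h ^ key->bytes[i]) * 16777619u;
    }
    return h;
}

static int dpcls_lookup(const struct flow_key *key)
{
    (void)key;
    return 0;                                 /* stand-in for the wildcard lookup */
}

int classify(const struct flow_key *key)
{
    struct emc_entry *e = &emc[hash_key(key) & (EMC_ENTRIES - 1)];

    /* Hit: a hash and a memcmp instead of a subtable walk. */
    if (e->valid && memcmp(&e->key, key, sizeof *key) == 0) {
        return e->action;
    }

    /* Miss: fall back to the classifier and cache the result.  With ~1000
     * flows most packets hit, so average cycles/packet drop and the PMD
     * regains enough headroom to ride out short stalls. */
    e->action = dpcls_lookup(key);
    e->key = *key;
    e->valid = true;
    return e->action;
}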

Re: [ovs-dev] 10-25 packet drops every few (10-50) seconds TCP (iperf3)

2020-07-06 Thread Yanqin Wei
Hi Shahaji,

I have not measured the context switch overhead, but I feel it should be
acceptable, because 10 Mpps throughput with zero packet drops (over 20 s) can
be achieved on some Arm servers. Maybe you could do performance profiling on
your test bench to find the root cause of the performance degradation with
multiple rings.

Best Regards,
Wei Yanqin

Re: [ovs-dev] 10-25 packet drops every few (10-50) seconds TCP (iperf3)

2020-07-02 Thread Shahaji Bhosle via dev
Thanks Yanqin,
I am not seeing any context switches longer than 40 usec in our do-nothing
loop test. But when OvS polls multiple rings (queues) on the same CPU and the
number of packets it batches (MAX_BURST_SIZE) grows, the loops take more time,
and I can see the rings getting filled up. Then it becomes a feedback loop:
the CPUs are running close to 100%, and any disturbance at that point is, I
think, too much.
Do you have any data that you use to monitor OvS? I am doing all the above
experiments without OvS.
Thanks, Shahaji
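
A rough way to look at the feedback loop described above is to compare the time
for one pass over all the rings with the time it takes arriving traffic to fill
one ring; any stall longer than the ring-fill time shows up as drops. A
back-of-the-envelope sketch with hypothetical numbers:

/* Back-of-the-envelope numbers for the multi-ring feedback loop.
 * All constants are hypothetical; adjust to the real setup. */
#include <stdio.h>

int main(void)
{
    const double cpu_hz          = 3.0e9;      /* assumed core clock */
    const double cycles_per_pkt  = 120.0;      /* e.g. from pmd-stats-show */
    const int    queues_per_core = 3;          /* rings polled by one core */
    const int    burst           = 32;         /* MAX_BURST_SIZE-style batch */
    const int    ring_size       = 2048;       /* rx descriptor ring depth */
    const double arrival_pps     = 2.5e6;      /* packets/s offered to this core */

    double pass_sec      = queues_per_core * burst * cycles_per_pkt / cpu_hz;
    double ring_fill_sec = ring_size / arrival_pps;

    printf("one polling pass over all rings : %8.2f us\n", pass_sec * 1e6);
    printf("time for traffic to fill a ring : %8.2f us\n", ring_fill_sec * 1e6);
    printf("stall budget before drops       : %8.1f passes\n",
           ring_fill_sec / pass_sec);

    /* Any interruption of the PMD longer than ring_fill_sec (housekeeping,
     * an IRQ, a context switch) overflows the ring and shows up as a burst
     * of drops. */
    return 0;
}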

Re: [ovs-dev] 10-25 packet drops every few (10-50) seconds TCP (iperf3)

2020-07-02 Thread Yanqin Wei
Hi Shahaji,

IIUC, the 1 Hz time tick cannot be disabled even with full dynticks, right?
But I have no idea why it would cause packet loss, because it should be only a
small overhead when rcu_nocbs is enabled.

Best Regards,
Wei Yanqin

===

From: Shahaji Bhosle  
Sent: Thursday, July 2, 2020 6:11 AM
To: Yanqin Wei 
Cc: Flavio Leitner ; ovs-dev@openvswitch.org; nd 
; Ilya Maximets ; Lee Reed 
; Vinay Gupta ; Alex Barba 

Subject: Re: [ovs-dev] 10-25 packet drops every few (10-50) seconds TCP (iperf3)

Hi Yanqin,
I added the patch you gave me to my script, which runs a do-nothing for loop.
You can see the spikes in the plot below. 976/1000 times we are perfect, but
around every 1 second you can see something going wrong. I don't see anything
wrong in the trace-cmd output.
Thanks, Shahaji

root@bcm958802a8046c:~/vinay_rx/dynticks-testing# ./run_isb_rdtsc 
+ TARGET=2
+ MASK=4
+ NUM_ITER=1000
+ NUM_MS=100
+ N=3750
+ LOGFILE=loop_1000iter_100ms.log
+ tee loop_1000iter_100ms.log
+ trace-cmd record -p function_graph -e all -M 4 -o trace_1000iter_100ms.dat 
taskset -c 2 /home/root/arm_stb_user_loop_isb_rdtsc 1000 3750
  plugin 'function_graph'
Cycles/Second (Hz) = 30
Nano-seconds per cycle = 0.

Using ISB() before rte_rdtsc()
num_iter: 1000
do_nothing_loop for (N)=3750 
Running 1000 iterations of do_nothing_loop for (N)=3750

Average =  100282.193430333 u-secs
Max     =  124777.48867 u-secs
Min     =  10.01767 u-secs
σ       =  1931.352376508 u-secs

Average =  300846580.29 cycles
Max     =  374332466.00 cycles
Min     =  30053.00 cycles
σ       =  5794057.13 cycles

#σ = events
 0 = 976
 1 = 3
 2 = 4
 3 = 3
 4 = 3
 5 = 2
 6 = 2
 7 = 2
 8 = 1
 9 = 1
10 = 1
12 = 2




Re: [ovs-dev] 10-25 packet drops every few (10-50) seconds TCP (iperf3)

2020-07-01 Thread Yanqin Wei
Hi Shahaji,

Adding an isb instruction helps make rdtsc precise by synchronizing the system
counter read from cntvct_el0. There is a patch for this in DPDK:
https://patchwork.dpdk.org/patch/66561/
So it may not be related to the intermittent drops you observed.

Best Regards,
Wei Yanqin
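
For reference, the pattern in that DPDK patch is roughly the following on
aarch64; this is a paraphrase of the idea, not the exact DPDK code:

/* aarch64 counter read with and without the ISB barrier.  Paraphrase of the
 * idea in https://patchwork.dpdk.org/patch/66561/, not the exact DPDK code. */
#include <stdint.h>

static inline uint64_t read_cntvct(void)
{
    uint64_t cnt;
    asm volatile("mrs %0, cntvct_el0" : "=r"(cnt));
    return cnt;
}

static inline uint64_t read_cntvct_precise(void)
{
    uint64_t cnt;
    /* The isb keeps the counter read from being reordered with earlier
     * instructions, which otherwise adds jitter to short measurements. */
    asm volatile("isb; mrs %0, cntvct_el0" : "=r"(cnt));
    return cnt;
}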

> -Original Message-
> From: dev  On Behalf Of Shahaji Bhosle
> via dev
> Sent: Wednesday, July 1, 2020 6:05 AM
> To: Flavio Leitner 
> Cc: ovs-dev@openvswitch.org; Ilya Maximets ;
> Lee Reed ; Vinay Gupta
> ; Alex Barba 
> Subject: Re: [ovs-dev] 10-25 packet drops every few (10-50) seconds TCP 
> (iperf3)
>
> Hi Flavio,
> I still see intermittent drops with rcu_nocbs. So I wrote that do_nothing()
> loop..to avoid all the other distractions to see if Linux is messing with the 
> OVS
> loop just to see what is going on. The interesting thing I see the case *BOLD*
> below where I use an ISB() instruction my STD deviation is well within Both 
> the
> results are basically DO NOTHING FOR 100msec and see what happens to
> time :) Thanks, Shahaji
>
> static inline uint64_t
> rte_get_tsc_cycles(void)
> {
> uint64_t tsc;
> #ifdef USE_ISB
> asm volatile("isb; mrs %0, pmccntr_el0" : "=r"(tsc));
> #else
> asm volatile("mrs %0, pmccntr_el0" : "=r"(tsc));
> #endif
> return tsc;
> }
> #endif /*RTE_ARM_EAL_RDTSC_USE_PMU*/
>
> ==
> usleep(100);
> for (volatile int i = 0; i < num_iter; i++) {
> volatile uint64_t tsc_start = rte_get_tsc_cycles();
> /* do nothing for ~1 us */
> #ifdef USE_ISB
> for (volatile int j = 0; j < num_us; j++);   <<<<<<<<<<<< THIS IS MESSED
> UP, 100msec do nothing, I am getting 2033 usec STD DEVIATION
> #else
> for (volatile int j = 0; j < num_us; j++);   <<<<<<<<<<<< THIS LOOP HAS
> VERY LOW STD DEVIATION
> rte_isb();
> #endif
> volatile uint64_t tsc_end = rte_get_tsc_cycles();
> cycles[i] = tsc_end - tsc_start;
> }
> usleep(100);
> calc_avg_var_stddev(num_iter, &cycles[0]);
> ===
> *#ifdef USE_ISB*
> root@bcm958802a8046c:~/vinay_rx/dynticks-testing# ./run_isb_rdtsc
> + TARGET=2
> + MASK=4
> + NUM_ITER=1000
> + NUM_MS=100
> + N=3750
> + LOGFILE=loop_1000iter_100ms.log
> + tee loop_1000iter_100ms.log
> + trace-cmd record -p function_graph -e all -M 4 -o
> trace_1000iter_100ms.dat taskset -c 2
> /home/root/arm_stb_user_loop_isb_rdtsc 1000 3750
>   plugin 'function_graph'
> Cycles/Second (Hz) = 30
> Nano-seconds per cycle = 0.
>
> Using ISB() before rte_rdtsc()
> num_iter: 1000
> do_nothing_loop for (N)=3750
> Running 1000 iterations of do_nothing_loop for (N)=3750
>
> Average =  100328.158561667 u-secs
> Max     =  123024.79533 u-secs
> Min     =  10.01767 u-secs
> σ       =  2033.118969489 u-secs
>
> Average =  300984475.69 cycles
> Max     =  369074386.00 cycles
> Min     =  30053.00 cycles
> σ       =  6099356.91 cycles
>
> #σ = events
>  0 = 968
>  1 = 8
>  2 = 5
>  3 = 3
>  4 = 3
>  5 = 3
>  6 = 3
>  8 = 3
> 10 = 3
> 11 = 1
>
> *#ELSE*
> root@bcm958802a8046c:~/vinay_rx/dynticks-testing# ./run_isb_loop
> + TARGET=2
> + MASK=4
> + NUM_ITER=1000
> + NUM_MS=100
> + N=7316912
> + LOGFILE=loop_1000iter_100ms.log
> + tee loop_1000iter_100ms.log
> + trace-cmd record -p function_graph -e all -M 4 -o
> trace_1000iter_100ms.dat taskset -c 2
> /home/root/arm_stb_user_loop_isb_loop
> 1000 7316912
>   plugin 'function_graph'
> Cycles/Second (Hz) = 30
> Nano-seconds per cycle = 0.
>
> NO ISB() before rte_rdtsc()
> num_iter: 1000
> do_nothing_loop for (N)=7316912
> Running 1000 iterations of do_nothing_loop for (N)=7316912
>
> Average =  9.863256333 u-secs
> Max     =  100052.79033 u-secs
> Min     =  7.80733 u-secs
> σ       =  6.497043982 u-secs
>
> Average =  29589.77 cycles
> Max     =  300158371.00 cycles
> Min     =  23422.00 cycles
> σ       =  19491.13 cycles
>
> #σ = events
>  0 = 900
>  2 = 79
>  4 = 17
>  5 = 3
>  8 = 1
>
>
Re: [ovs-dev] 10-25 packet drops every few (10-50) seconds TCP (iperf3)

2020-06-30 Thread Shahaji Bhosle via dev
Hi Flavio,
I still see intermittent drops with rcu_nocbs. So I wrote that do_nothing()
loop, to avoid all the other distractions and to see if Linux is messing with
the OVS loop, just to see what is going on. The interesting thing I see is the
case in *BOLD* below, where I use an ISB() instruction, and what happens to my
standard deviation. Both results are basically DO NOTHING FOR 100 msec and see
what happens to the time :)
Thanks, Shahaji

static inline uint64_t
rte_get_tsc_cycles(void)
{
    uint64_t tsc;
#ifdef USE_ISB
    asm volatile("isb; mrs %0, pmccntr_el0" : "=r"(tsc));
#else
    asm volatile("mrs %0, pmccntr_el0" : "=r"(tsc));
#endif
    return tsc;
}
#endif /*RTE_ARM_EAL_RDTSC_USE_PMU*/

==
usleep(100);
for (volatile int i = 0; i < num_iter; i++) {
volatile uint64_t tsc_start = rte_get_tsc_cycles();
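
The loop above continues by recording tsc_end and accumulating cycles[i], as
quoted in full in the reply above. For anyone who wants to reproduce the
experiment without the DPDK/PMU setup, here is a self-contained sketch of the
same measurement idea using clock_gettime() wall-clock time instead of
pmccntr_el0; the iteration counts and helper names are made up, not Shahaji's
actual program:

/* Self-contained do-nothing timing harness (wall clock instead of pmccntr_el0).
 * Hypothetical stand-in for the test program discussed in this thread. */
#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

#define NUM_ITER 1000
#define SPIN     3750000        /* made-up busy-loop length per iteration */

static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ULL + ts.tv_nsec;
}

int main(void)
{
    static double us[NUM_ITER];
    double sum = 0.0, sumsq = 0.0;

    for (int i = 0; i < NUM_ITER; i++) {
        uint64_t start = now_ns();
        for (volatile long j = 0; j < SPIN; j++) {
            /* do nothing */
        }
        us[i] = (now_ns() - start) / 1000.0;
        sum += us[i];
        sumsq += us[i] * us[i];
    }

    double avg = sum / NUM_ITER;
    double sigma = sqrt(sumsq / NUM_ITER - avg * avg);

    /* Iterations far above sigma are the "something disturbed this core"
     * events being chased in this thread. */
    int outliers = 0;
    for (int i = 0; i < NUM_ITER; i++) {
        if (us[i] > avg + 3 * sigma) {
            outliers++;
        }
    }
    printf("avg %.3f us, sigma %.3f us, >3-sigma outliers %d/%d\n",
           avg, sigma, outliers, NUM_ITER);
    return 0;
}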

Re: [ovs-dev] 10-25 packet drops every few (10-50) seconds TCP (iperf3)

2020-06-30 Thread Flavio Leitner



Hi Shahaji,

Did it help with the rcu_nocbs? 

fbl

Re: [ovs-dev] 10-25 packet drops every few (10-50) seconds TCP (iperf3)

2020-06-30 Thread Shahaji Bhosle via dev
Thanks Flavio,
Are there any special requirements for RCU on ARM vs x86?

I am following what the above document says... Do you think I need to
do something more than the below?
Thanks again, and I appreciate the help. Shahaji

1. Isolate the CPU cores
isolcpus=1,2,3,4,5,6,7 nohz_full=1-7 rcu_nocbs=1-7
2. Setting CONFIG_NO_HZ_FULL=y
root@bcm958802a8046c:~/vinay_rx/dynticks-testing# zcat /proc/config.gz
|grep HZ
CONFIG_NO_HZ_COMMON=y
# CONFIG_HZ_PERIODIC is not set
# CONFIG_NO_HZ_IDLE is not set
CONFIG_NO_HZ_FULL=y
# CONFIG_NO_HZ_FULL_ALL is not set
# CONFIG_NO_HZ is not set
# CONFIG_HZ_100 is not set
CONFIG_HZ_250=y
# CONFIG_HZ_300 is not set
# CONFIG_HZ_1000 is not set
CONFIG_HZ=250
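
A quick sanity check that is easy to run on top of this configuration is to pin
a thread to one of the isolated cores, spin for a while, and compare the
involuntary context-switch counts before and after. This is a generic sketch,
not something from this thread; it assumes a Linux target and that core 2 is
one of the isolated CPUs:

/* Quick check for disturbances on an isolated core: pin to the core, spin,
 * and compare involuntary context-switch counts.  Core number is an example. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <sys/resource.h>
#include <time.h>

int main(void)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(2, &set);                       /* assumed isolated core */
    if (sched_setaffinity(0, sizeof set, &set) != 0) {
        perror("sched_setaffinity");
        return 1;
    }

    struct rusage before, after;
    getrusage(RUSAGE_SELF, &before);

    /* Busy-spin for ~10 seconds of wall-clock time. */
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    time_t end = ts.tv_sec + 10;
    do {
        clock_gettime(CLOCK_MONOTONIC, &ts);
    } while (ts.tv_sec < end);

    getrusage(RUSAGE_SELF, &after);
    printf("involuntary context switches during spin: %ld\n",
           after.ru_nivcsw - before.ru_nivcsw);
    /* With isolcpus/nohz_full/rcu_nocbs working as intended this should be
     * (close to) zero; anything else points at residual kernel activity. */
    return 0;
}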



Re: [ovs-dev] 10-25 packet drops every few (10-50) seconds TCP (iperf3)

2020-06-30 Thread Flavio Leitner


Right, you might want to review Documentation/timers/no_hz.rst from
the kernel sources and look for the RCU implications section, which
explains how to move RCU callbacks.

fbl

On Tue, Jun 30, 2020 at 12:08:05PM -0400, Shahaji Bhosle wrote:
> On Tue, Jun 30, 2020 at 9:45 AM Flavio Leitner  wrote:
> 
> > On Tue, Jun 02, 2020 at 12:56:51PM -0700, Vinay Gupta wrote:
> > > Hi Flavio,
> > >
> > > Thanks for your reply.
> > > I have captured the suggested information but do not see anything that
> > > could cause the packet drops.
> > > Can you please take a look at the below data and see if you can find
> > > something unusual ?
> > > The PMDs are running on CPU 1,2,3,4 and CPU 1-7 are isolated cores.
> > >
> > >
> > ---
> > > root@bcm958802a8046c:~# cstats ; sleep 10; cycles
> > > pmd thread numa_id 0 core_id 1:
> > >   idle cycles: 99140849 (7.93%)
> > >   processing cycles: 1151423715 (92.07%)
> > >   avg cycles per packet: 116.94 (1250564564/10693918)
> > >   avg processing cycles per packet: 107.67 (1151423715/10693918)
> > > pmd thread numa_id 0 core_id 2:
> > >   idle cycles: 118373662 (9.47%)
> > >   processing cycles: 1132193442 (90.53%)
> > >   avg cycles per packet: 124.39 (1250567104/10053309)
> > >   avg processing cycles per packet: 112.62 (1132193442/10053309)
> > > pmd thread numa_id 0 core_id 3:
> > >   idle cycles: 53805933 (4.30%)
> > >   processing cycles: 1196762002 (95.70%)
> > >   avg cycles per packet: 107.35 (1250567935/11649948)
> > >   avg processing cycles per packet: 102.73 (1196762002/11649948)
> > > pmd thread numa_id 0 core_id 4:
> > >   idle cycles: 189102938 (15.12%)
> > >   processing cycles: 1061463293 (84.88%)
> > >   avg cycles per packet: 143.47 (1250566231/8716828)
> > >   avg processing cycles per packet: 121.77 (1061463293/8716828)
> > > pmd thread numa_id 0 core_id 5:
> > > pmd thread numa_id 0 core_id 6:
> > > pmd thread numa_id 0 core_id 7:
> >
> >
> > Core_id 3 is highly loaded, so it's more likely to show the drop
> > issue when some other event happens.
> >
> > I think you need to run perf as I recommended before and see if
> > there are context switches happening and why they are happening.
> >
> > If a context switch happens, it's either because the core is not
> > well isolated or some other thing is going on. It will help to
> > understand why the queue wasn't serviced for a certain amount of
> > time.
> >
> > The issue is that running perf might introduce some load, so you
> > will need to adjust the traffic rate accordingly.

Re: [ovs-dev] 10-25 packet drops every few (10-50) seconds TCP (iperf3)

2020-06-30 Thread Shahaji Bhosle via dev
Hi Flavio,
I wrote a small program with a do-nothing for loop and I measure the
timestamps across that loop. About 3% of the time, around the 1 second
mark when the arch_timer fires, the timestamps are off by 25% from the
expected value. I ran trace-cmd to see what is going on and see the
output below. It looks like some issue with *gic_handle_irq*(); I am not
seeing this behaviour on an x86 host, so it seems specific to ARM v8.
Thanks, Shahaji

  %21.77  (14181) arm_stb_user_lo  rcu_dyntick #922
 |
 --- *rcu_dyntick*
|
|--%46.85-- gic_handle_irq  # 432
|
|--%23.32-- context_tracking_user_exit  # 215
|
|--%22.34-- context_tracking_user_enter  # 206
|
|--%2.60-- SyS_execve  # 24
|
|--%1.30-- do_page_fault  # 12
|
|--%0.65-- SyS_write  # 6
|
|--%0.65-- schedule  # 6
|
|--%0.65-- SyS_nanosleep  # 6
|
|--%0.65-- syscall_trace_enter  # 6
|
|--%0.65-- SyS_faccessat  # 6

  %5.01  (14181) arm_stb_user_lo  rcu_utilization #212
 |
 --- *rcu_utilization*
|
|--%96.23-- gic_handle_irq  # 204
|
|--%1.89-- SyS_nanosleep  # 4
|
|--%0.94-- SyS_exit_group  # 2
|
|--%0.94-- do_notify_resume  # 2

  %4.86  (14181) arm_stb_user_lo  user_exit #206
 |
 --- *user_exit*
  context_tracking_user_exit

  %4.86  (14181) arm_stb_user_lo context_tracking_user_exit #206
 |
 --- context_tracking_user_exit

  %4.86  (14181) arm_stb_user_lo  context_tracking_user_enter #206
 |
 --- context_tracking_user_enter

  %4.86  (14181) arm_stb_user_lo user_enter #206
 |
 --- *user_enter*
  context_tracking_user_enter

  %2.95  (14181) arm_stb_user_lo gic_handle_irq #125
 |
 --- gic_handle_irq
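
For reference, a rough sketch of how a capture like the one above could be
taken with trace-cmd (the event list and <pid> are placeholders; the
percentage breakdown above presumably comes from post-processing the
resulting trace.dat):

    # trace-cmd record -e irq -e sched -P <pid> sleep 10
    # trace-cmd report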


On Tue, Jun 30, 2020 at 9:45 AM Flavio Leitner  wrote:

> On Tue, Jun 02, 2020 at 12:56:51PM -0700, Vinay Gupta wrote:
> > Hi Flavio,
> >
> > Thanks for your reply.
> > I have captured the suggested information but do not see anything that
> > could cause the packet drops.
> > Can you please take a look at the below data and see if you can find
> > something unusual ?
> > The PMDs are running on CPU 1,2,3,4 and CPU 1-7 are isolated cores.
> >
> >
> ---
> > root@bcm958802a8046c:~# cstats ; sleep 10; cycles
> > pmd thread numa_id 0 core_id 1:
> >   idle cycles: 99140849 (7.93%)
> >   processing cycles: 1151423715 (92.07%)
> >   avg cycles per packet: 116.94 (1250564564/10693918)
> >   avg processing cycles per packet: 107.67 (1151423715/10693918)
> > pmd thread numa_id 0 core_id 2:
> >   idle cycles: 118373662 (9.47%)
> >   processing cycles: 1132193442 (90.53%)
> >   avg cycles per packet: 124.39 (1250567104/10053309)
> >   avg processing cycles per packet: 112.62 (1132193442/10053309)
> > pmd thread numa_id 0 core_id 3:
> >   idle cycles: 53805933 (4.30%)
> >   processing cycles: 1196762002 (95.70%)
> >   avg cycles per packet: 107.35 (1250567935/11649948)
> >   avg processing cycles per packet: 102.73 (1196762002/11649948)
> > pmd thread numa_id 0 core_id 4:
> >   idle cycles: 189102938 (15.12%)
> >   processing cycles: 1061463293 (84.88%)
> >   avg cycles per packet: 143.47 (1250566231/8716828)
> >   avg processing cycles per packet: 121.77 (1061463293/8716828)
> > pmd thread numa_id 0 core_id 5:
> > pmd thread numa_id 0 core_id 6:
> > pmd thread numa_id 0 core_id 7:
>
>
> Core_id 3 is highly loaded, so it's more likely to show the drop
> issue when some other event happens.
>
> I think you need to run perf as I recommended before and see if
> there are context switches happening and why they are happening.
>
> If a context switch happens, it's either because the core is not
> well isolated or some other thing is going on. It will help to
> understand why the queue wasn't serviced for a certain amount of
> time.
>
> The issue is that running perf might introduce some load, so you
> will need to adjust the traffic rate accordingly.
>
> HTH,
> fbl
>
>
>
> >
> >
> > *Runtime summary*
> >   comm                        parent  sched-in  run-time  min-run  avg-run  max-run  stddev  migrations
> >                                        (count)    (msec)   (msec)   (msec)   (msec)       %
> > -
> >   ksoftirqd/0[7]                   2         1     0.079  

Re: [ovs-dev] 10-25 packet drops every few (10-50) seconds TCP (iperf3)

2020-06-30 Thread Flavio Leitner
On Tue, Jun 02, 2020 at 12:56:51PM -0700, Vinay Gupta wrote:
> Hi Flavio,
> 
> Thanks for your reply.
> I have captured the suggested information but do not see anything that
> could cause the packet drops.
> Can you please take a look at the below data and see if you can find
> something unusual ?
> The PMDs are running on CPU 1,2,3,4 and CPU 1-7 are isolated cores.
> 
> ---
> root@bcm958802a8046c:~# cstats ; sleep 10; cycles
> pmd thread numa_id 0 core_id 1:
>   idle cycles: 99140849 (7.93%)
>   processing cycles: 1151423715 (92.07%)
>   avg cycles per packet: 116.94 (1250564564/10693918)
>   avg processing cycles per packet: 107.67 (1151423715/10693918)
> pmd thread numa_id 0 core_id 2:
>   idle cycles: 118373662 (9.47%)
>   processing cycles: 1132193442 (90.53%)
>   avg cycles per packet: 124.39 (1250567104/10053309)
>   avg processing cycles per packet: 112.62 (1132193442/10053309)
> pmd thread numa_id 0 core_id 3:
>   idle cycles: 53805933 (4.30%)
>   processing cycles: 1196762002 (95.70%)
>   avg cycles per packet: 107.35 (1250567935/11649948)
>   avg processing cycles per packet: 102.73 (1196762002/11649948)
> pmd thread numa_id 0 core_id 4:
>   idle cycles: 189102938 (15.12%)
>   processing cycles: 1061463293 (84.88%)
>   avg cycles per packet: 143.47 (1250566231/8716828)
>   avg processing cycles per packet: 121.77 (1061463293/8716828)
> pmd thread numa_id 0 core_id 5:
> pmd thread numa_id 0 core_id 6:
> pmd thread numa_id 0 core_id 7:


Core_id 3 is highly loaded, so it's more likely to show the drop
issue when some other event happens.

I think you need to run perf as I recommended before and see if
there are context switches happening and why they are happening.

If a context switch happens, it's either because the core is not
well isolated or some other thing is going on. It will help to
understand why the queue wasn't serviced for a certain amount of
time.

The issue is that running perf might introduce some load, so you
will need to adjust the traffic rate accordingly.
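
One way to confirm the isolation and catch preemptions, sketched with
standard Linux interfaces (the thread list is whatever the running
ovs-vswitchd exposes under /proc):

    # cat /sys/devices/system/cpu/isolated
    # grep ctxt_switches /proc/$(pidof ovs-vswitchd)/task/*/status

A growing nonvoluntary_ctxt_switches count on a PMD thread means something
is still preempting it on its core.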

HTH,
fbl



> 
> 
> *Runtime summary*
>   comm                        parent  sched-in  run-time  min-run  avg-run  max-run  stddev  migrations
>                                        (count)    (msec)   (msec)   (msec)   (msec)       %
> -
>   ksoftirqd/0[7]                   2         1     0.079    0.079    0.079    0.079    0.00           0
>   rcu_sched[8]                     2        14     0.067    0.002    0.004    0.009    9.96           0
>   rcuos/4[38]                      2         6     0.027    0.002    0.004    0.008   20.97           0
>   rcuos/5[45]                      2         4     0.018    0.004    0.004    0.005    6.63           0
>   kworker/0:1[71]                  2        12     0.156    0.008    0.013    0.019    6.72           0
>   mmcqd/0[1230]                    2         3     0.054    0.001    0.018    0.031   47.29           0
>   kworker/0:1H[1248]               2         1     0.006    0.006    0.006    0.006    0.00           0
>   kworker/u16:2[1547]              2        16     0.045    0.001    0.002    0.012   26.19           0
>   ntpd[5282]                       1         1     0.063    0.063    0.063    0.063    0.00           0
>   watchdog[6988]                   1         2     0.089    0.012    0.044    0.076   72.26           0
>   ovs-vswitchd[9239]               1         2     0.326    0.152    0.163    0.173    6.45           0
>   revalidator8[9309/9239]       9239         2     1.260    0.607    0.630    0.652    3.58           0
>   perf[27150]                  27140         1     0.000    0.000    0.000    0.000    0.00           0
> 
> Terminated tasks:
>   sleep[27151]                 27150         4     1.002    0.015    0.250    0.677   58.22           0
> 
> Idle stats:
> CPU  0 idle for    999.814  msec  ( 99.84%)
> CPU  1 idle entire time window
> CPU  2 idle entire time window
> CPU  3 idle entire time window
> CPU  4 idle entire time window
> CPU  5 idle for    500.326  msec  ( 49.96%)
> CPU  6 idle entire time window
> CPU  7 idle entire time window
> 
> Total number of unique tasks: 14
> Total number of context switches: 115
>Total run time (msec):  3.198
> Total scheduling time (msec): 1001.425  (x 8)
> (END)
> 
> 
> 
> 02:16:22   UID  TGID   TID    %usr  %system  %guest  %wait    %CPU  CPU  Command
> 02:16:23     0  9239     -  100.00     0.00    0.00   0.00  100.00    5  ovs-vswitchd
> 02:16:23 

Re: [ovs-dev] 10-25 packet drops every few (10-50) seconds TCP (iperf3)

2020-06-02 Thread Vinay Gupta via dev
Hi Flavio,

Thanks for your reply.
I have captured the suggested information but do not see anything that
could cause the packet drops.
Can you please take a look at the below data and see if you can find
something unusual ?
The PMDs are running on CPU 1,2,3,4 and CPU 1-7 are isolated cores.

---
root@bcm958802a8046c:~# cstats ; sleep 10; cycles
pmd thread numa_id 0 core_id 1:
  idle cycles: 99140849 (7.93%)
  processing cycles: 1151423715 (92.07%)
  avg cycles per packet: 116.94 (1250564564/10693918)
  avg processing cycles per packet: 107.67 (1151423715/10693918)
pmd thread numa_id 0 core_id 2:
  idle cycles: 118373662 (9.47%)
  processing cycles: 1132193442 (90.53%)
  avg cycles per packet: 124.39 (1250567104/10053309)
  avg processing cycles per packet: 112.62 (1132193442/10053309)
pmd thread numa_id 0 core_id 3:
  idle cycles: 53805933 (4.30%)
  processing cycles: 1196762002 (95.70%)
  avg cycles per packet: 107.35 (1250567935/11649948)
  avg processing cycles per packet: 102.73 (1196762002/11649948)
pmd thread numa_id 0 core_id 4:
  idle cycles: 189102938 (15.12%)
  processing cycles: 1061463293 (84.88%)
  avg cycles per packet: 143.47 (1250566231/8716828)
  avg processing cycles per packet: 121.77 (1061463293/8716828)
pmd thread numa_id 0 core_id 5:
pmd thread numa_id 0 core_id 6:
pmd thread numa_id 0 core_id 7:


*Runtime summary*
  comm                        parent  sched-in  run-time  min-run  avg-run  max-run  stddev  migrations
                                       (count)    (msec)   (msec)   (msec)   (msec)       %
-
  ksoftirqd/0[7]                   2         1     0.079    0.079    0.079    0.079    0.00           0
  rcu_sched[8]                     2        14     0.067    0.002    0.004    0.009    9.96           0
  rcuos/4[38]                      2         6     0.027    0.002    0.004    0.008   20.97           0
  rcuos/5[45]                      2         4     0.018    0.004    0.004    0.005    6.63           0
  kworker/0:1[71]                  2        12     0.156    0.008    0.013    0.019    6.72           0
  mmcqd/0[1230]                    2         3     0.054    0.001    0.018    0.031   47.29           0
  kworker/0:1H[1248]               2         1     0.006    0.006    0.006    0.006    0.00           0
  kworker/u16:2[1547]              2        16     0.045    0.001    0.002    0.012   26.19           0
  ntpd[5282]                       1         1     0.063    0.063    0.063    0.063    0.00           0
  watchdog[6988]                   1         2     0.089    0.012    0.044    0.076   72.26           0
  ovs-vswitchd[9239]               1         2     0.326    0.152    0.163    0.173    6.45           0
  revalidator8[9309/9239]       9239         2     1.260    0.607    0.630    0.652    3.58           0
  perf[27150]                  27140         1     0.000    0.000    0.000    0.000    0.00           0

Terminated tasks:
  sleep[27151]                 27150         4     1.002    0.015    0.250    0.677   58.22           0

Idle stats:
CPU  0 idle for    999.814  msec  ( 99.84%)
CPU  1 idle entire time window
CPU  2 idle entire time window
CPU  3 idle entire time window
CPU  4 idle entire time window
CPU  5 idle for    500.326  msec  ( 49.96%)
CPU  6 idle entire time window
CPU  7 idle entire time window

Total number of unique tasks: 14
Total number of context switches: 115
   Total run time (msec):  3.198
Total scheduling time (msec): 1001.425  (x 8)
(END)
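
A summary in this shape can typically be collected with perf's scheduler
tooling; a rough sketch (subcommand and option names vary a bit between
perf versions):

    # perf sched record -- sleep 1
    # perf sched timehist --summary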



02:16:22   UID  TGID   TID    %usr  %system  %guest  %wait    %CPU  CPU  Command
02:16:23     0  9239     -  100.00     0.00    0.00   0.00  100.00    5  ovs-vswitchd
02:16:23     0     -  9239    2.00     0.00    0.00   0.00    2.00    5  |__ovs-vswitchd
02:16:23     0     -  9240    0.00     0.00    0.00   0.00    0.00    0  |__vfio-sync
02:16:23     0     -  9241    0.00     0.00    0.00   0.00    0.00    5  |__eal-intr-thread
02:16:23     0     -  9242    0.00     0.00    0.00   0.00    0.00    5  |__dpdk_watchdog1
02:16:23     0     -  9244    0.00     0.00    0.00   0.00    0.00    5  |__urcu2
02:16:23     0     -  9279    0.00     0.00    0.00   0.00    0.00    5  |__ct_clean3
02:16:23     0     -  9308    0.00     0.00    0.00   0.00    0.00    5  |__handler9
02:16:23     0     -  9309    0.00     0.00    0.00   0.00    0.00    5  |__revalidator8
02:16:23     0     -  9328    0.00   

Re: [ovs-dev] 10-25 packet drops every few (10-50) seconds TCP (iperf3)

2020-06-02 Thread Flavio Leitner
On Mon, Jun 01, 2020 at 07:27:09PM -0400, Shahaji Bhosle via dev wrote:
> Hi Ben/Ilya,
> Hope you guys are doing well and staying safe. I have been chasing a weird
> problem with small packet drops that I think is causing lots of TCP
> retransmissions.
> 
> Setup details:
> iPerf3 (1k-5K servers) <--- DPDK2 : OvS+DPDK (VxLAN:BOND) [DPDK0+DPDK1]
>   <--- 2x25G --->
> [DPDK0+DPDK1] (VxLAN:BOND) OvS+DPDK : DPDK2 <--- iPerf3 (clients)
> 
> All the drops are ring drops on the bonded functions on the server side. I
> have 4 CPUs, each with 3 PMD threads; DPDK0, DPDK1 and DPDK2 are all running
> with 4 Rx rings each.
> 
> What is interesting is that when I give each Rx ring its own CPU the drops go
> away, or if I set other_config:emc-insert-inv-prob=1 the drops go away (a
> sketch of pinning rings to cores follows below the quoted text). But I need
> to scale up the number of flows, so I am trying to run this with the EMC
> disabled.
> 
> I can tell that the rings are not getting serviced for 30-40 usec because of
> some kind of context switch or interrupt on these cores. I have tried the
> usual isolation (nohz_full, rcu_nocbs, etc.) and moved all the interrupts
> away from these cores, but nothing helps. I mean it improves, but the drops
> still happen.
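
A sketch of pinning each Rx ring to its own PMD core, assuming four Rx
queues per port and PMDs on cores 1-4 (the port name and the queue:core
mapping are placeholders):

    # ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x1e
    # ovs-vsctl set Interface dpdk0 other_config:pmd-rxq-affinity="0:1,1:2,2:3,3:4"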

When you disable the EMC (or reduce its efficiency) the per-packet cost
increases, so the datapath becomes more sensitive to variations. If you
share a CPU with multiple queues, you decrease the amount of time available
to process each queue. In either case, there is less room to tolerate
variations.
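
For reference, a sketch of the EMC insertion-probability knob mentioned
above (assuming a recent OVS; the default inverse probability is 100):

    # ovs-vsctl set Open_vSwitch . other_config:emc-insert-inv-prob=1   # insert every flow
    # ovs-vsctl set Open_vSwitch . other_config:emc-insert-inv-prob=0   # disable EMC insertion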

Well, you might want to use 'perf' to monitor the scheduling events
and then, based on the stack traces, see what is causing them and try to
prevent it.

For example:
# perf record -e sched:sched_switch -a -g sleep 1

For instance, you might see that another NIC used for management has
IRQs assigned to one isolated CPU. You can move it to another CPU to
reduce the noise, etc...
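
A sketch of checking and moving such an IRQ (the NIC name and IRQ number
are placeholders):

    # grep <mgmt-nic> /proc/interrupts
    # cat /proc/irq/<irq>/smp_affinity_list
    # echo 0 > /proc/irq/<irq>/smp_affinity_list   # keep it on housekeeping CPU 0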

Another suggestion is to look at the PMD thread idle statistics, because
they tell you how much "extra" room you have left. As the idle time
approaches 0, the more finely tuned your setup needs to be to avoid drops.
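
A sketch of sampling those statistics over a fixed window with the
standard appctl commands (the 'cstats'/'cycles' aliases used earlier in
the thread presumably wrap something similar):

    # ovs-appctl dpif-netdev/pmd-stats-clear
    # sleep 10
    # ovs-appctl dpif-netdev/pmd-stats-show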

HTH,
-- 
fbl
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev