Re: [ovs-dev] [PATCH] RFC for support of PMD Auto load balancing

2018-10-28 Thread Nitin Katiyar
Thanks, Kevin, for reviewing it. I will look into your comments and send a new
version for review.

I would like to clarify that the load is sampled every 10 seconds, and a dry
run is triggered only if the criteria for triggering it (i.e. load threshold
and/or drops) are met for 6 consecutive iterations. This is why
rxq->overloading_pmd is required.
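
To make the intent concrete, the trigger logic is roughly as follows (a
minimal sketch with hypothetical names, not the patch's actual helpers):

/* Arm the dry run only after the overload criteria hold for 6
 * consecutive 10-second samples; any miss resets the streak. */
#define PMD_INTERVAL_TRIGGER 6

static bool
pmd_overload_check(unsigned *consec_hits, float load, float load_thresh,
                   unsigned long long drops)
{
    if (load >= load_thresh && drops > 0) {
        if (++*consec_hits >= PMD_INTERVAL_TRIGGER) {
            *consec_hits = 0;
            return true;     /* Request a dry run of the assignment. */
        }
    } else {
        *consec_hits = 0;
    }
    return false;
}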

Regards,
Nitin

-----Original Message-----
From: Kevin Traynor [mailto:ktray...@redhat.com] 
Sent: Friday, October 26, 2018 3:53 PM
To: Nitin Katiyar ; ovs-dev@openvswitch.org
Subject: Re: [ovs-dev] [PATCH] RFC for support of PMD Auto load balancing

Hi Nitin,

Thanks for your work on this and sharing the RFC. Initial comments below.

On 10/11/2018 08:59 PM, Nitin Katiyar wrote:
> Port rx queues that have not been statically assigned to PMDs are
> currently assigned based on periodically sampled load measurements.
> The assignment is performed at specific instances: port addition,
> port deletion, upon reassignment request via CLI, etc.
> 
> Over time this can cause uneven load among the PMDs due to changes in
> traffic patterns, resulting in lower overall throughput.
> 
> This patch adds support for automatic load balancing of PMDs based on
> the measured load of RX queues. Each PMD measures the processing load
> for each of its associated queues every 10 seconds. If the aggregated
> PMD load exceeds a configured threshold for 6 consecutive intervals,
> and if there are receive packet drops at the NIC, the PMD considers
> itself overloaded.
> 
> If any PMD considers itself overloaded, a dry run of the PMD
> assignment algorithm is performed by the OVS main thread. The dry run
> does NOT change the existing queue-to-PMD assignments.
> 
> If the resulting mapping from the dry run indicates an improved
> distribution of the load, then the actual reassignment is performed.
> Automatic rebalancing is disabled by default and has to be enabled via
> a configuration option. Load thresholds, the improvement factor, etc.
> are also configurable.
> 
> The following example commands can be used to set the auto-lb parameters:
> ovs-vsctl set open_vswitch . other_config:pmd-auto-lb="true"
> ovs-vsctl set open_vswitch . other_config:pmd-auto-lb-thresh="80"
> ovs-vsctl set open_vswitch . other_config:pmd-auto-lb-min-improvement="5"
> ovs-vsctl set open_vswitch . other_config:pmd-auto-lb-drop-check="true"
> 

As you mentioned in a follow-up, this will never be perfect, so there needs to
be a way for the user to limit how often it happens even if the criteria above
are met. Something like allowing the user to set a max number of rebalances
for some time period, e.g. pmd-auto-lb-max-num and pmd-auto-lb-time.

> Co-authored-by: Rohith Basavaraja 
> 
> Signed-off-by: Nitin Katiyar 
> Signed-off-by: Rohith Basavaraja 
> ---
>  lib/dpif-netdev.c | 589 +++---
>  1 file changed, 561 insertions(+), 28 deletions(-)
> 
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index e322f55..28593cc 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -80,6 +80,26 @@
>  
>  VLOG_DEFINE_THIS_MODULE(dpif_netdev);
>  
> +/* Auto Load Balancing Defaults */
> +#define ACCEPT_IMPROVE_DEFAULT   (25)
> +#define PMD_LOAD_THRE_DEFAULT    (99)
> +#define PMD_AUTO_LB_DISABLE  false
> +#define SKIP_DROP_CHECK_DEFAULT  false
> +
> +//TODO: Should we make it configurable??
> +#define PMD_MIN_NUM_DROPS    (1)
> +#define PMD_MIN_NUM_QFILLS   (1)

It seems like a very small default. This would indicate that one dropped
packet or one vhost queue-full event is considered enough to do a dry run.

> +#define PMD_REBALANCE_POLL_TIMER_INTERVAL 6
> +
> +extern uint32_t log_q_thr;
> +
> +static bool pmd_auto_lb = PMD_AUTO_LB_DISABLE;
> +static bool auto_lb_skip_drop_check = SKIP_DROP_CHECK_DEFAULT;
> +static float auto_lb_pmd_load_ther = PMD_LOAD_THRE_DEFAULT;
> +static unsigned int auto_lb_accept_improve = ACCEPT_IMPROVE_DEFAULT;
> +static long long int pmd_rebalance_poll_timer = 0;
> +

I think these can be in 'struct dp_netdev' like the other config items
instead of globals.

> +
>  #define FLOW_DUMP_MAX_BATCH 50
>  /* Use per thread recirc_depth to prevent recirculation loop. */
>  #define MAX_RECIRC_DEPTH 6
> @@ -393,6 +413,8 @@ enum rxq_cycles_counter_type {
> interval. */
>  RXQ_CYCLES_PROC_HIST,   /* Total cycles of all intervals that are used
> during rxq to pmd assignment. */
> +RXQ_CYCLES_IDLE_CURR,   /* Cycles spent in idling. */
> +RXQ_CYCLES_IDLE_HIST,   /* Total cycles of all idle intervals. */

I'm not sure if it's really needed to measure this, or you can just use
'pmd->intrvl_cycles - sum(rxq intrvl's on that pmd)' like in
pmd_info_show_rxq().

Re: [ovs-dev] [PATCH] RFC for support of PMD Auto load balancing

2018-10-26 Thread Kevin Traynor
Hi Nitin,

Thanks for your work on this and sharing the RFC. Initial comments below.

On 10/11/2018 08:59 PM, Nitin Katiyar wrote:
> Port rx queues that have not been statically assigned to PMDs are currently
> assigned based on periodically sampled load measurements.
> The assignment is performed at specific instances: port addition, port
> deletion, upon reassignment request via CLI, etc.
> 
> Over time this can cause uneven load among the PMDs due to changes in
> traffic patterns, resulting in lower overall throughput.
> 
> This patch adds support for automatic load balancing of PMDs based on
> the measured load of RX queues. Each PMD measures the processing load for
> each of its associated queues every 10 seconds. If the aggregated PMD load
> exceeds a configured threshold for 6 consecutive intervals, and if there
> are receive packet drops at the NIC, the PMD considers itself overloaded.
> 
> If any PMD considers itself overloaded, a dry run of the PMD
> assignment algorithm is performed by the OVS main thread. The dry run
> does NOT change the existing queue-to-PMD assignments.
> 
> If the resulting mapping from the dry run indicates an improved
> distribution of the load, then the actual reassignment is performed.
> Automatic rebalancing is disabled by default and has to be enabled via
> a configuration option. Load thresholds, the improvement factor, etc. are
> also configurable.
> 
> The following example commands can be used to set the auto-lb parameters:
> ovs-vsctl set open_vswitch . other_config:pmd-auto-lb="true"
> ovs-vsctl set open_vswitch . other_config:pmd-auto-lb-thresh="80"
> ovs-vsctl set open_vswitch . other_config:pmd-auto-lb-min-improvement="5"
> ovs-vsctl set open_vswitch . other_config:pmd-auto-lb-drop-check="true"
> 

As you mentioned in a follow-up, this will never be perfect, so there
needs to be a way for the user to limit how often it happens even if the
criteria above are met. Something like allowing the user to set a max
number of rebalances for some time period, e.g. pmd-auto-lb-max-num and
pmd-auto-lb-time.
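
For illustration, such a cap could look like the below (sketch only;
'pmd-auto-lb-max-num' and 'pmd-auto-lb-time' are suggested names, not
existing options):

/* Allow at most 'max_num' rebalances per 'lb_time_ms' window, even
 * when the overload criteria are met. */
static bool
rebalance_allowed(long long now_ms, long long *window_start_ms,
                  unsigned *count, unsigned max_num, long long lb_time_ms)
{
    if (now_ms - *window_start_ms >= lb_time_ms) {
        *window_start_ms = now_ms;   /* New window, reset the budget. */
        *count = 0;
    }
    if (*count < max_num) {
        (*count)++;
        return true;
    }
    return false;                    /* Budget exhausted for this window. */
}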

> Co-authored-by: Rohith Basavaraja 
> 
> Signed-off-by: Nitin Katiyar 
> Signed-off-by: Rohith Basavaraja 
> ---
>  lib/dpif-netdev.c | 589 +++---
>  1 file changed, 561 insertions(+), 28 deletions(-)
> 
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index e322f55..28593cc 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -80,6 +80,26 @@
>  
>  VLOG_DEFINE_THIS_MODULE(dpif_netdev);
>  
> +/* Auto Load Balancing Defaults */
> +#define ACCEPT_IMPROVE_DEFAULT   (25)
> +#define PMD_LOAD_THRE_DEFAULT    (99)
> +#define PMD_AUTO_LB_DISABLE  false
> +#define SKIP_DROP_CHECK_DEFAULT  false
> +
> +//TODO: Should we make it configurable??
> +#define PMD_MIN_NUM_DROPS    (1)
> +#define PMD_MIN_NUM_QFILLS   (1)

It seems like a very small default. This would indicate that one dropped
packet or one vhost queue-full event is considered enough to do a dry run.
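
If the TODO above is resolved by making these configurable, it would just
be another other_config read, e.g. (sketch; 'pmd-auto-lb-min-drops' is a
hypothetical key):

    int min_drops = smap_get_int(other_config, "pmd-auto-lb-min-drops",
                                 PMD_MIN_NUM_DROPS);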

> +#define PMD_REBALANCE_POLL_TIMER_INTERVAL 6
> +
> +extern uint32_t log_q_thr;
> +
> +static bool pmd_auto_lb = PMD_AUTO_LB_DISABLE;
> +static bool auto_lb_skip_drop_check = SKIP_DROP_CHECK_DEFAULT;
> +static float auto_lb_pmd_load_ther = PMD_LOAD_THRE_DEFAULT;
> +static unsigned int auto_lb_accept_improve = ACCEPT_IMPROVE_DEFAULT;
> +static long long int pmd_rebalance_poll_timer = 0;
> +

I think these can be in 'struct dp_netdev' like the other config items
instead of globals.
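
For example, the knobs could be grouped and embedded in 'struct dp_netdev'
along these lines (field names illustrative):

struct pmd_auto_lb_config {
    bool enabled;                  /* other_config:pmd-auto-lb. */
    bool skip_drop_check;          /* other_config:pmd-auto-lb-drop-check. */
    float load_thresh;             /* other_config:pmd-auto-lb-thresh. */
    unsigned int min_improvement;  /* other_config:pmd-auto-lb-min-improvement. */
};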

> +
>  #define FLOW_DUMP_MAX_BATCH 50
>  /* Use per thread recirc_depth to prevent recirculation loop. */
>  #define MAX_RECIRC_DEPTH 6
> @@ -393,6 +413,8 @@ enum rxq_cycles_counter_type {
> interval. */
>  RXQ_CYCLES_PROC_HIST,   /* Total cycles of all intervals that are used
> during rxq to pmd assignment. */
> +RXQ_CYCLES_IDLE_CURR,   /* Cycles spent in idling. */
> +RXQ_CYCLES_IDLE_HIST,   /* Total cycles of all idle intervals. */

I'm not sure if it's really needed to measure this, or you can just use
'pmd->intrvl_cycles - sum(rxq intrvl's on that pmd)' like in
pmd_info_show_rxq(). It would be worth trying that to see if there's any
noticeable difference.
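
A sketch of that alternative, reusing the totals that are already tracked
('pmd_total_interval_cycles()' and 'rxq_interval_cycles()' are stand-ins
for the existing accessors, not real function names):

static unsigned long long
pmd_idle_cycles(const struct dp_netdev_pmd_thread *pmd)
{
    unsigned long long total = pmd_total_interval_cycles(pmd);
    unsigned long long busy = 0;
    struct rxq_poll *poll;

    HMAP_FOR_EACH (poll, node, &pmd->poll_list) {
        busy += rxq_interval_cycles(poll->rxq);
    }
    return total >= busy ? total - busy : 0;
}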

>  RXQ_N_CYCLES
>  };
>  
> @@ -429,6 +451,14 @@ static struct ovsthread_once offload_thread_once
>  
>  #define XPS_TIMEOUT 50LL/* In microseconds. */
>  
> +typedef struct {
> +unsigned long long prev_drops;
> +} q_drops;

Doesn't seem like the struct is needed here.
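
i.e. the counter could simply live in 'struct dp_netdev_rxq' directly
(sketch):

    unsigned long long prev_drops;   /* Drops seen at the last sample. */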

> +typedef struct {
> +unsigned int num_vhost_qfill;
> +unsigned int prev_num_vhost_qfill;
> +} vhost_qfill;
> +
>  /* Contained by struct dp_netdev_port's 'rxqs' member.  */
>  struct dp_netdev_rxq {
>  struct dp_netdev_port *port;
> @@ -439,6 +469,10 @@ struct dp_netdev_rxq {
>particular core. */
>  unsigned intrvl_idx;   /* Write index for 'cycles_intrvl'. */
>  

Re: [ovs-dev] [PATCH] RFC for support of PMD Auto load balancing

2018-10-22 Thread Nitin Katiyar
Hi,
Gentle reminder for review.

Regards,
Nitin

-----Original Message-----
From: Nitin Katiyar 
Sent: Friday, October 12, 2018 10:49 AM
To: ovs-dev@openvswitch.org
Cc: Rohith Basavaraja 
Subject: RE: [PATCH] RFC for support of PMD Auto load balancing

Hi,
I forgot to mention that this patch does not handle frequent rescheduling of
rx queues due to auto load balancing. That is something we had identified, and
changes need to be made to dampen frequent rescheduling of rx queues across
PMDs.

Regards,
Nitin

-----Original Message-----
From: Nitin Katiyar
Sent: Friday, October 12, 2018 1:30 AM
To: ovs-dev@openvswitch.org
Cc: Nitin Katiyar ; Rohith Basavaraja 

Subject: [PATCH] RFC for support of PMD Auto load balancing

Port rx queues that have not been statically assigned to PMDs are currently
assigned based on periodically sampled load measurements.
The assignment is performed at specific instances: port addition, port
deletion, upon reassignment request via CLI, etc.

Over time this can cause uneven load among the PMDs due to changes in traffic
patterns, resulting in lower overall throughput.

This patch adds support for automatic load balancing of PMDs based on the
measured load of RX queues. Each PMD measures the processing load for each of
its associated queues every 10 seconds. If the aggregated PMD load exceeds a
configured threshold for 6 consecutive intervals, and if there are receive
packet drops at the NIC, the PMD considers itself overloaded.

If any PMD considers itself overloaded, a dry run of the PMD assignment
algorithm is performed by the OVS main thread. The dry run does NOT change the
existing queue-to-PMD assignments.

If the resulting mapping from the dry run indicates an improved distribution
of the load, then the actual reassignment is performed. Automatic rebalancing
is disabled by default and has to be enabled via a configuration option. Load
thresholds, the improvement factor, etc. are also configurable.

The following example commands can be used to set the auto-lb parameters:
ovs-vsctl set open_vswitch . other_config:pmd-auto-lb="true"
ovs-vsctl set open_vswitch . other_config:pmd-auto-lb-thresh="80"
ovs-vsctl set open_vswitch . other_config:pmd-auto-lb-min-improvement="5"
ovs-vsctl set open_vswitch . other_config:pmd-auto-lb-drop-check="true"

Co-authored-by: Rohith Basavaraja 

Signed-off-by: Nitin Katiyar 
Signed-off-by: Rohith Basavaraja 
---
 lib/dpif-netdev.c | 589 +++---
 1 file changed, 561 insertions(+), 28 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index e322f55..28593cc 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -80,6 +80,26 @@
 
 VLOG_DEFINE_THIS_MODULE(dpif_netdev);
 
+/* Auto Load Balancing Defaults */
+#define ACCEPT_IMPROVE_DEFAULT   (25)
+#define PMD_LOAD_THRE_DEFAULT    (99)
+#define PMD_AUTO_LB_DISABLE  false
+#define SKIP_DROP_CHECK_DEFAULT  false
+
+//TODO: Should we make it configurable??
+#define PMD_MIN_NUM_DROPS    (1)
+#define PMD_MIN_NUM_QFILLS   (1)
+#define PMD_REBALANCE_POLL_TIMER_INTERVAL 6
+
+extern uint32_t log_q_thr;
+
+static bool pmd_auto_lb = PMD_AUTO_LB_DISABLE;
+static bool auto_lb_skip_drop_check = SKIP_DROP_CHECK_DEFAULT;
+static float auto_lb_pmd_load_ther = PMD_LOAD_THRE_DEFAULT;
+static unsigned int auto_lb_accept_improve = ACCEPT_IMPROVE_DEFAULT;
+static long long int pmd_rebalance_poll_timer = 0;
+
+
 #define FLOW_DUMP_MAX_BATCH 50
 /* Use per thread recirc_depth to prevent recirculation loop. */
 #define MAX_RECIRC_DEPTH 6
@@ -393,6 +413,8 @@ enum rxq_cycles_counter_type {
interval. */
 RXQ_CYCLES_PROC_HIST,   /* Total cycles of all intervals that are used
during rxq to pmd assignment. */
+RXQ_CYCLES_IDLE_CURR,   /* Cycles spent in idling. */
+RXQ_CYCLES_IDLE_HIST,   /* Total cycles of all idle intervals. */
 RXQ_N_CYCLES
 };
 
@@ -429,6 +451,14 @@ static struct ovsthread_once offload_thread_once
 
 #define XPS_TIMEOUT 50LL/* In microseconds. */
 
+typedef struct {
+unsigned long long prev_drops;
+} q_drops;
+typedef struct {
+unsigned int num_vhost_qfill;
+unsigned int prev_num_vhost_qfill;
+} vhost_qfill;
+
 /* Contained by struct dp_netdev_port's 'rxqs' member.  */
 struct dp_netdev_rxq {
 struct dp_netdev_port *port;
@@ -439,6 +469,10 @@ struct dp_netdev_rxq {
   particular core. */
 unsigned intrvl_idx;   /* Write index for 'cycles_intrvl'. */
 struct dp_netdev_pmd_thread *pmd;  /* pmd thread that polls this queue. */
+struct dp_netdev_pmd_thread *dry_run_pmd;
+   /* During auto lb trigger, pmd thread
+  associated with this q during dry
+  run. */
 bool is_vhost; /* Is rxq of a vhost port. */

Re: [ovs-dev] [PATCH] RFC for support of PMD Auto load balancing

2018-10-11 Thread Nitin Katiyar
Hi,
I forgot to mention that this patch does not handle frequent rescheduling of
rx queues due to auto load balancing. That is something we had identified, and
changes need to be made to dampen frequent rescheduling of rx queues across
PMDs.

Regards,
Nitin

-----Original Message-----
From: Nitin Katiyar 
Sent: Friday, October 12, 2018 1:30 AM
To: ovs-dev@openvswitch.org
Cc: Nitin Katiyar ; Rohith Basavaraja 

Subject: [PATCH] RFC for support of PMD Auto load balancing

Port rx queues that have not been statically assigned to PMDs are currently
assigned based on periodically sampled load measurements.
The assignment is performed at specific instances: port addition, port
deletion, upon reassignment request via CLI, etc.

Over time this can cause uneven load among the PMDs due to changes in traffic
patterns, resulting in lower overall throughput.

This patch adds support for automatic load balancing of PMDs based on the
measured load of RX queues. Each PMD measures the processing load for each of
its associated queues every 10 seconds. If the aggregated PMD load exceeds a
configured threshold for 6 consecutive intervals, and if there are receive
packet drops at the NIC, the PMD considers itself overloaded.

If any PMD considers itself overloaded, a dry run of the PMD assignment
algorithm is performed by the OVS main thread. The dry run does NOT change the
existing queue-to-PMD assignments.

If the resulting mapping from the dry run indicates an improved distribution
of the load, then the actual reassignment is performed. Automatic rebalancing
is disabled by default and has to be enabled via a configuration option. Load
thresholds, the improvement factor, etc. are also configurable.

The following example commands can be used to set the auto-lb parameters:
ovs-vsctl set open_vswitch . other_config:pmd-auto-lb="true"
ovs-vsctl set open_vswitch . other_config:pmd-auto-lb-thresh="80"
ovs-vsctl set open_vswitch . other_config:pmd-auto-lb-min-improvement="5"
ovs-vsctl set open_vswitch . other_config:pmd-auto-lb-drop-check="true"

Co-authored-by: Rohith Basavaraja 

Signed-off-by: Nitin Katiyar 
Signed-off-by: Rohith Basavaraja 
---
 lib/dpif-netdev.c | 589 +++---
 1 file changed, 561 insertions(+), 28 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index e322f55..28593cc 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -80,6 +80,26 @@
 
 VLOG_DEFINE_THIS_MODULE(dpif_netdev);
 
+/* Auto Load Balancing Defaults */
+#define ACCEPT_IMPROVE_DEFAULT   (25)
+#define PMD_LOAD_THRE_DEFAULT    (99)
+#define PMD_AUTO_LB_DISABLE  false
+#define SKIP_DROP_CHECK_DEFAULT  false
+
+//TODO: Should we make it configurable??
+#define PMD_MIN_NUM_DROPS    (1)
+#define PMD_MIN_NUM_QFILLS   (1)
+#define PMD_REBALANCE_POLL_TIMER_INTERVAL 6
+
+extern uint32_t log_q_thr;
+
+static bool pmd_auto_lb = PMD_AUTO_LB_DISABLE;
+static bool auto_lb_skip_drop_check = SKIP_DROP_CHECK_DEFAULT;
+static float auto_lb_pmd_load_ther = PMD_LOAD_THRE_DEFAULT;
+static unsigned int auto_lb_accept_improve = ACCEPT_IMPROVE_DEFAULT;
+static long long int pmd_rebalance_poll_timer = 0;
+
+
 #define FLOW_DUMP_MAX_BATCH 50
 /* Use per thread recirc_depth to prevent recirculation loop. */
 #define MAX_RECIRC_DEPTH 6
@@ -393,6 +413,8 @@ enum rxq_cycles_counter_type {
interval. */
 RXQ_CYCLES_PROC_HIST,   /* Total cycles of all intervals that are used
during rxq to pmd assignment. */
+RXQ_CYCLES_IDLE_CURR,   /* Cycles spent in idling. */
+RXQ_CYCLES_IDLE_HIST,   /* Total cycles of all idle intervals. */
 RXQ_N_CYCLES
 };
 
@@ -429,6 +451,14 @@ static struct ovsthread_once offload_thread_once
 
 #define XPS_TIMEOUT 50LL/* In microseconds. */
 
+typedef struct {
+unsigned long long prev_drops;
+} q_drops;
+typedef struct {
+unsigned int num_vhost_qfill;
+unsigned int prev_num_vhost_qfill;
+} vhost_qfill;
+
 /* Contained by struct dp_netdev_port's 'rxqs' member.  */
 struct dp_netdev_rxq {
 struct dp_netdev_port *port;
@@ -439,6 +469,10 @@ struct dp_netdev_rxq {
   particular core. */
 unsigned intrvl_idx;   /* Write index for 'cycles_intrvl'. */
 struct dp_netdev_pmd_thread *pmd;  /* pmd thread that polls this queue. */
+struct dp_netdev_pmd_thread *dry_run_pmd;
+   /* During auto lb trigger, pmd thread
+  associated with this q during dry
+  run. */
 bool is_vhost; /* Is rxq of a vhost port. */
 
 /* Counters of cycles spent successfully polling and processing pkts. */ 
@@ -446,6 +480,16 @@ struct dp_netdev_rxq {
 /* We store PMD_RXQ_INTERVAL_MAX intervals of data for an rxq and then
sum them to yield the cycles used for 

[ovs-dev] [PATCH] RFC for support of PMD Auto load balancing

2018-10-11 Thread Nitin Katiyar
Port rx queues that have not been statically assigned to PMDs are currently
assigned based on periodically sampled load measurements.
The assignment is performed at specific instances: port addition, port
deletion, upon reassignment request via CLI, etc.

Over time this can cause uneven load among the PMDs due to changes in traffic
patterns, resulting in lower overall throughput.

This patch adds support for automatic load balancing of PMDs based on the
measured load of RX queues. Each PMD measures the processing load for each of
its associated queues every 10 seconds. If the aggregated PMD load exceeds a
configured threshold for 6 consecutive intervals, and if there are receive
packet drops at the NIC, the PMD considers itself overloaded.

If any PMD considers itself overloaded, a dry run of the PMD assignment
algorithm is performed by the OVS main thread. The dry run does NOT change
the existing queue-to-PMD assignments.

If the resulting mapping from the dry run indicates an improved distribution
of the load, then the actual reassignment is performed. Automatic rebalancing
is disabled by default and has to be enabled via a configuration option. Load
thresholds, the improvement factor, etc. are also configurable.
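
For intuition, the acceptance test can be pictured roughly as follows (a
sketch only; the patch's actual improvement metric may differ):

/* Accept the dry-run mapping only when the estimated load variance
 * across PMDs improves by at least the configured percentage. */
static bool
accept_dry_run(float cur_variance, float new_variance,
               unsigned int min_improvement_pct)
{
    if (new_variance >= cur_variance) {
        return false;                 /* No improvement at all. */
    }
    return (cur_variance - new_variance) * 100 / cur_variance
               >= min_improvement_pct;
}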

The following example commands can be used to set the auto-lb parameters:
ovs-vsctl set open_vswitch . other_config:pmd-auto-lb="true"
ovs-vsctl set open_vswitch . other_config:pmd-auto-lb-thresh="80"
ovs-vsctl set open_vswitch . other_config:pmd-auto-lb-min-improvement="5"
ovs-vsctl set open_vswitch . other_config:pmd-auto-lb-drop-check="true"

Co-authored-by: Rohith Basavaraja 

Signed-off-by: Nitin Katiyar 
Signed-off-by: Rohith Basavaraja 
---
 lib/dpif-netdev.c | 589 +++---
 1 file changed, 561 insertions(+), 28 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index e322f55..28593cc 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -80,6 +80,26 @@
 
 VLOG_DEFINE_THIS_MODULE(dpif_netdev);
 
+/* Auto Load Balancing Defaults */
+#define ACCEPT_IMPROVE_DEFAULT   (25)
+#define PMD_LOAD_THRE_DEFAULT    (99)
+#define PMD_AUTO_LB_DISABLE  false
+#define SKIP_DROP_CHECK_DEFAULT  false
+
+//TODO: Should we make it configurable??
+#define PMD_MIN_NUM_DROPS    (1)
+#define PMD_MIN_NUM_QFILLS   (1)
+#define PMD_REBALANCE_POLL_TIMER_INTERVAL 6
+
+extern uint32_t log_q_thr;
+
+static bool pmd_auto_lb = PMD_AUTO_LB_DISABLE;
+static bool auto_lb_skip_drop_check = SKIP_DROP_CHECK_DEFAULT;
+static float auto_lb_pmd_load_ther = PMD_LOAD_THRE_DEFAULT;
+static unsigned int auto_lb_accept_improve = ACCEPT_IMPROVE_DEFAULT;
+static long long int pmd_rebalance_poll_timer = 0;
+
+
 #define FLOW_DUMP_MAX_BATCH 50
 /* Use per thread recirc_depth to prevent recirculation loop. */
 #define MAX_RECIRC_DEPTH 6
@@ -393,6 +413,8 @@ enum rxq_cycles_counter_type {
interval. */
 RXQ_CYCLES_PROC_HIST,   /* Total cycles of all intervals that are used
during rxq to pmd assignment. */
+RXQ_CYCLES_IDLE_CURR,   /* Cycles spent in idling. */
+RXQ_CYCLES_IDLE_HIST,   /* Total cycles of all idle intervals. */
 RXQ_N_CYCLES
 };
 
@@ -429,6 +451,14 @@ static struct ovsthread_once offload_thread_once
 
 #define XPS_TIMEOUT 50LL/* In microseconds. */
 
+typedef struct {
+unsigned long long prev_drops;
+} q_drops;
+typedef struct {
+unsigned int num_vhost_qfill;
+unsigned int prev_num_vhost_qfill;
+} vhost_qfill;
+
 /* Contained by struct dp_netdev_port's 'rxqs' member.  */
 struct dp_netdev_rxq {
 struct dp_netdev_port *port;
@@ -439,6 +469,10 @@ struct dp_netdev_rxq {
   particular core. */
 unsigned intrvl_idx;   /* Write index for 'cycles_intrvl'. */
 struct dp_netdev_pmd_thread *pmd;  /* pmd thread that polls this queue. */
+struct dp_netdev_pmd_thread *dry_run_pmd;
+   /* During auto lb trigger, pmd thread
+  associated with this q during dry
+  run. */
 bool is_vhost; /* Is rxq of a vhost port. */
 
 /* Counters of cycles spent successfully polling and processing pkts. */
@@ -446,6 +480,16 @@ struct dp_netdev_rxq {
 /* We store PMD_RXQ_INTERVAL_MAX intervals of data for an rxq and then
sum them to yield the cycles used for an rxq. */
 atomic_ullong cycles_intrvl[PMD_RXQ_INTERVAL_MAX];
+
+/* The following parameters are used to determine the load on the PMD
+ * for automatic load balancing.
+ */
+atomic_ullong idle_intrvl[PMD_RXQ_INTERVAL_MAX];
+union {
+q_drops rxq_drops;
+vhost_qfill rxq_vhost_qfill;
+} rxq_drops_or_qfill;
+atomic_uint   overloading_pmd;
 };
 
 /* A port in a netdev-based datapath. */
@@ -682,6 +726,12 @@ struct dp_netdev_pmd_thread {
 struct ovs_mutex