On 14/01/2021 19:49, Stokes, Ian wrote:
>> From: Christophe Fontaine <[email protected]>
>>
>> Two important parts of how PMD auto load balance operates are how
>> loaded a core needs to be and how much improvement is estimated
>> before a PMD auto load balance can trigger.
>>
>> Previously they were hardcoded to 95% loaded and 25% variance
>> improvement.
>>
>> These default values may not be suitable for all use cases and
>> we may want to use a more (or less) aggressive rebalance, either
>> on the pmd load threshold or on the minimum variance improvement
>> threshold.
>>
>> The defaults are not changed, but "pmd-auto-lb-pmd-load" and
>> "pmd-auto-lb-improvement" parameters are added to override the
>> defaults.
>>
>> $ ovs-vsctl set open_vswitch . other_config:pmd-auto-lb-pmd-load="70"
>> $ ovs-vsctl set open_vswitch . other_config:pmd-auto-lb-improvement="20"
>>
> 
> Thanks for the patch Kevin,
> 
> Have tested over the last few days and it looks good to me. The only note I 
> would have is
> in addition to updating the vswitch.xml, I would think it was worth noting 
> the two new parameters
> in the vswitch documentation also (possibly giving an example of their usage).
> 
> https://docs.openvswitch.org/en/latest/topics/dpdk/pmd/#automatic-assignment-of-port-rx-queue-to-pmd-threads-experimental
> 
> Right now the documentation just refers to the default 95% threshold and 25% 
> improvement.
> 

Thanks for testing and reviewing. Yes, this part of the doc should be
updated at least.

> So even an additional note to specify how the threshold and improvements can 
> be changed would suffice I would think.
> 
> This doesn't have to block merge, but would appreciate a follow up patch to 
> improve the docs soon.
> 
> Minor comment for discussion below.
> 
>> Signed-off-by: Christophe Fontaine <[email protected]>
>> Co-Authored-by: Kevin Traynor <[email protected]>
>> Signed-off-by: Kevin Traynor <[email protected]>
>> Acked-by: David Marchand <[email protected]>
>> ---
>>  NEWS                 |  1 +
>>  lib/dpif-netdev.c    | 54 +++++++++++++++++++++++++++++++++++++-----
>> --
>>  vswitchd/vswitch.xml | 29 ++++++++++++++++++++++--
>>  3 files changed, 74 insertions(+), 10 deletions(-)
>>
>> diff --git a/NEWS b/NEWS
>> index 7e291a180..7994e434d 100644
>> --- a/NEWS
>> +++ b/NEWS
>> @@ -17,4 +17,5 @@ Post-v2.14.0
>>       * New 'options:dpdk-vf-mac' field for DPDK interface of VF ports,
>>         that allows configuring the MAC address of a VF representor.
>> +     * Add parameters to configure PMD auto load balance behaviour.
>>     - The environment variable OVS_UNBOUND_CONF, if set, is now used
>>       as the DNS resolver's (unbound) configuration file.
>> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
>> index d19d5bbff..bf2112815 100644
>> --- a/lib/dpif-netdev.c
>> +++ b/lib/dpif-netdev.c
>> @@ -86,7 +86,7 @@ VLOG_DEFINE_THIS_MODULE(dpif_netdev);
>>
>>  /* Auto Load Balancing Defaults */
>> -#define ALB_ACCEPTABLE_IMPROVEMENT       25
>> -#define ALB_PMD_LOAD_THRESHOLD           95
>> -#define ALB_PMD_REBALANCE_POLL_INTERVAL  1 /* 1 Min */
>> +#define ALB_IMPROVEMENT_THRESHOLD    25
>> +#define ALB_LOAD_THRESHOLD           95
>> +#define ALB_REBALANCE_INTERVAL       1 /* 1 Min */
>>  #define MIN_TO_MSEC                  60000
>>
>> @@ -301,4 +301,6 @@ struct pmd_auto_lb {
>>      uint64_t rebalance_intvl;
>>      uint64_t rebalance_poll_timer;
>> +    uint8_t rebalance_improve_thresh;
>> +    atomic_uint8_t rebalance_load_thresh;
>>  };
>>
>> @@ -4205,4 +4207,5 @@ set_pmd_auto_lb(struct dp_netdev *dp)
>>      struct dp_netdev_pmd_thread *pmd;
>>      struct pmd_auto_lb *pmd_alb = &dp->pmd_alb;
>> +    uint8_t rebalance_load_thresh;
>>
>>      bool enable_alb = false;
>> @@ -4234,7 +4237,14 @@ set_pmd_auto_lb(struct dp_netdev *dp)
>>          pmd_alb->is_enabled = enable_alb;
>>          if (pmd_alb->is_enabled) {
>> +            atomic_read_relaxed(&pmd_alb->rebalance_load_thresh,
>> +                                &rebalance_load_thresh);
>>              VLOG_INFO("PMD auto load balance is enabled "
>> -                      "interval %"PRIu64" mins",
>> -                       pmd_alb->rebalance_intvl / MIN_TO_MSEC);
>> +                      "interval %"PRIu64" mins, "
>> +                      "pmd load threshold %"PRIu8"%%, "
>> +                      "improvement threshold %"PRIu8"%%",
>> +                       pmd_alb->rebalance_intvl / MIN_TO_MSEC,
>> +                       rebalance_load_thresh,
>> +                       pmd_alb->rebalance_improve_thresh);
>> +
>>          } else {
>>              pmd_alb->rebalance_poll_timer = 0;
>> @@ -4260,4 +4270,6 @@ dpif_netdev_set_config(struct dpif *dpif, const
>> struct smap *other_config)
>>      uint32_t tx_flush_interval, cur_tx_flush_interval;
>>      uint64_t rebalance_intvl;
>> +    uint8_t rebalance_load, cur_rebalance_load;
>> +    uint8_t rebalance_improve;
>>
>>      tx_flush_interval = smap_get_int(other_config, "tx-flush-interval",
>> @@ -4337,5 +4349,5 @@ dpif_netdev_set_config(struct dpif *dpif, const
>> struct smap *other_config)
>>
>>      rebalance_intvl = smap_get_int(other_config, "pmd-auto-lb-rebal-
>> interval",
>> -                              ALB_PMD_REBALANCE_POLL_INTERVAL);
>> +                                   ALB_REBALANCE_INTERVAL);
>>
>>      /* Input is in min, convert it to msec. */
>> @@ -4349,4 +4361,27 @@ dpif_netdev_set_config(struct dpif *dpif, const
>> struct smap *other_config)
>>      }
>>
>> +    rebalance_improve = smap_get_int(other_config, "pmd-auto-lb-
>> improvement",
>> +                                     ALB_IMPROVEMENT_THRESHOLD);
>> +    if (rebalance_improve > 100) {
>> +        rebalance_improve = ALB_IMPROVEMENT_THRESHOLD;
>> +    }
>> +    if (rebalance_improve != pmd_alb->rebalance_improve_thresh) {
>> +        pmd_alb->rebalance_improve_thresh = rebalance_improve;
>> +        VLOG_INFO("PMD auto load balance improvement threshold set to "
>> +                  "%"PRIu8"%%\n", rebalance_improve);
>> +    }
>> +
>> +    rebalance_load = smap_get_int(other_config, "pmd-auto-lb-pmd-load",
>> +                             ALB_LOAD_THRESHOLD);
>> +    if (rebalance_load > 100) {
>> +        rebalance_load = ALB_LOAD_THRESHOLD;
>> +    }
>> +    atomic_read_relaxed(&pmd_alb->rebalance_load_thresh,
>> &cur_rebalance_load);
>> +    if (rebalance_load != cur_rebalance_load) {
>> +        atomic_store_relaxed(&pmd_alb->rebalance_load_thresh,
>> +                             rebalance_load);
>> +        VLOG_INFO("PMD auto load balance pmd load threshold set to "
>> +                "%"PRIu8"%%\n", rebalance_load);
>> +    }
>>      set_pmd_auto_lb(dp);
>>      return 0;
>> @@ -5677,5 +5712,5 @@ pmd_rebalance_dry_run(struct dp_netdev *dp)
>>                  ((curr_variance - new_variance) * 100) / curr_variance;
>>          }
>> -        if (improvement < ALB_ACCEPTABLE_IMPROVEMENT) {
>> +        if (improvement < dp->pmd_alb.rebalance_improve_thresh) {
>>              ret = false;
>>          }
>> @@ -8712,4 +8747,5 @@ dp_netdev_pmd_try_optimize(struct
>> dp_netdev_pmd_thread *pmd,
>>      if (pmd->ctx.now > pmd->rxq_next_cycle_store) {
>>          uint64_t curr_tsc;
>> +        uint8_t rebalance_load_trigger;
>>          struct pmd_auto_lb *pmd_alb = &pmd->dp->pmd_alb;
>>          if (pmd_alb->is_enabled && !pmd->isolated
>> @@ -8728,5 +8764,7 @@ dp_netdev_pmd_try_optimize(struct
>> dp_netdev_pmd_thread *pmd,
>>              }
>>
>> -            if (pmd_load >= ALB_PMD_LOAD_THRESHOLD) {
>> +            atomic_read_relaxed(&pmd_alb->rebalance_load_thresh,
>> +                                &rebalance_load_trigger);
>> +            if (pmd_load >= rebalance_load_trigger) {
>>                  atomic_count_inc(&pmd->pmd_overloaded);
>>              } else {
>> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
>> index 89a876796..76d528ea7 100644
>> --- a/vswitchd/vswitch.xml
>> +++ b/vswitchd/vswitch.xml
>> @@ -654,6 +654,6 @@
>>          <p>
>>           Configures PMD Auto Load Balancing that allows automatic assignment
>> of
>> -         RX queues to PMDs if any of PMDs is overloaded (i.e. processing 
>> cycles
>> -         > 95%).
>> +         RX queues to PMDs if any of PMDs is overloaded (i.e. a processing
>> +         cycles > <ref column="other_config" key="pmd-auto-lb-pmd-load"/>).
>>          </p>
>>          <p>
>> @@ -691,4 +691,29 @@
>>          </p>
>>        </column>
>> +      <column name="other_config" key="pmd-auto-lb-pmd-load"
>> +              type='{"type": "integer", "minInteger": 0, "maxInteger": 
>> 100}'>
>> +        <p>
>> +         Specifies the minimum pmd load threshold (% of used cycled) of
>> +         any non-isolated pmds when a PMD auto load balance may be
>> triggered.
>> +        </p>
>> +        <p>
>> +         The default value is <code>95%</code>.
>> +        </p>
>> +      </column>
>> +      <column name="other_config" key="pmd-auto-lb-improvement"
>> +              type='{"type": "integer", "minInteger": 0, "maxInteger": 
>> 100}'>
>> +        <p>
>> +         Specifies the minimum evaluated % improvement in load distribution
>> +         across the non-isolated pmds that will allow a PMD auto load 
>> balance
>> +         to occur.
>> +        </p>
>> +        <p>
>> +         Note, setting this parameter to 0 will always allow an auto load
>> +         balance to occur regardless of estimated improvement or not.
> 
> Is there any reason you would want the 0 option above? Surely it would just 
> become disruptive to the deployment.
> 
> I know you could argue an enforced minimum of 1% would not make much of a 
> difference, but at least it would not re-distribute regardless of a gain?
> Just curious as to the thinking behind it.
> 

I've seen some users who talk about checking load only and calling
rebalance every x mins. I would say it's not useful in most cases but
the improvement is just an estimate and I didn't want to add special
handling to disallow a user from being able to rebalance on load only.

> 
> Regards
> Ian
> 
>> +        </p>
>> +        <p>
>> +         The default value is <code>25%</code>.
>> +        </p>
>> +      </column>
>>        <column name="other_config" key="userspace-tso-enable"
>>                type='{"type": "boolean"}'>
>> --
>> 2.26.2
> 

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to