On 09/27/2018 07:48 AM, Sriharsha Basavapatna via dev wrote: > This is the third patch in the patch-set to support dynamic rebalancing > of offloaded flows. > > The dynamic rebalancing functionality is implemented in this patch. The > ukeys that are not scheduled for deletion are obtained and passed as input > to the rebalancing routine. The rebalancing is done in the context of > revalidation leader thread, after all other revalidator threads are > done with gathering rebalancing data for flows. > > For each netdev that is in OOR state, a list of flows - both offloaded > and non-offloaded (pending) - is obtained using the ukeys. For each netdev > that is in OOR state, the flows are grouped and sorted into offloaded and > pending flows. The offloaded flows are sorted in descending order of > pps-rate, while pending flows are sorted in ascending order of pps-rate. > > The rebalancing is done in two phases. In the first phase, we try to > offload all pending flows and if that succeeds, the OOR state on the device > is cleared. If some (or none) of the pending flows could not be offloaded, > then we start replacing an offloaded flow that has a lower pps-rate than > a pending flow, until there are no more pending flows with a higher rate > than an offloaded flow. The flows that are replaced from the device are > added into kernel datapath. > > A new OVS configuration parameter "offload-rebalance", is added to ovsdb. > The default value of this is "false". To enable this feature, set the > value of this parameter to "true", which provides packets-per-second > rate based policy to dynamically offload and un-offload flows. > > Note: This option can be enabled only when 'hw-offload' policy is enabled. > It also requires 'tc-policy' to be set to 'skip_sw'; otherwise, flow > offload errors (specifically ENOSPC error this feature depends on) reported > by an offloaded device are supressed by TC-Flower kernel module. > > Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapa...@broadcom.com> > Co-authored-by: Venkat Duvvuru <venkatkumar.duvv...@broadcom.com> > Signed-off-by: Venkat Duvvuru <venkatkumar.duvv...@broadcom.com> > Reviewed-by: Sathya Perla <sathya.pe...@broadcom.com> > ---
A few of minor comments below, nothing worth holding up a merge over - could be addressed in follow up if required. <snip> > > +static void > +udpif_run_flow_rebalance(struct udpif *udpif) > +{ > + long long int now = 0; > + > + /* Don't rebalance if OFFL_REBAL_INTVL_MSEC have not elapsed */ > + now = time_msec(); > + if (now < udpif->offload_rebalance_time + OFFL_REBAL_INTVL_MSEC) { > + return; > + } > + > + if (!netdev_any_oor()) { won't you continually check this when there are no oor netdevs as offload_rebalance_time is not updated - how about setting 'offload_rebalance_time = now' here too, or you want it to check continually? > + return; > + } > + > + VLOG_DBG("Offload rebalance: Found OOR netdevs"); Seems like this could easily be a persistent state, do you want to log it each time this code runs? perhaps just when the state changes? > + udpif->offload_rebalance_time = now; > + udpif_flow_rebalance(udpif); > +} > + > static void * > udpif_revalidator(void *arg) > { > @@ -933,6 +963,9 @@ udpif_revalidator(void *arg) > > dpif_flow_dump_destroy(udpif->dump); > seq_change(udpif->dump_seq); > + if (netdev_is_offload_rebalance_policy_enabled()) { > + udpif_run_flow_rebalance(udpif); > + } > > duration = MAX(time_msec() - start_time, 1); > udpif->dump_duration = duration; > @@ -977,7 +1010,7 @@ udpif_revalidator(void *arg) > > return NULL; > } > - > + > static enum upcall_type > classify_upcall(enum dpif_upcall_type type, const struct nlattr *userdata, > struct user_action_cookie *cookie) > @@ -1579,7 +1612,7 @@ handle_upcalls(struct udpif *udpif, struct upcall > *upcalls, > for (i = 0; i < n_ops; i++) { > opsp[n_opsp++] = &ops[i].dop; > } > - dpif_operate(udpif->dpif, opsp, n_opsp); > + dpif_operate(udpif->dpif, opsp, n_opsp, DPIF_OFFLOAD_AUTO); > for (i = 0; i < n_ops; i++) { > struct udpif_key *ukey = ops[i].ukey; > > @@ -1671,13 +1704,14 @@ ukey_create__(const struct nlattr *key, size_t > key_len, > ukey->state = UKEY_CREATED; > ukey->state_thread = ovsthread_id_self(); > ukey->state_where = OVS_SOURCE_LOCATOR; > - ukey->created = time_msec(); > + ukey->created = ukey->flow_time = time_msec(); > memset(&ukey->stats, 0, sizeof ukey->stats); > ukey->stats.used = used; > ukey->xcache = NULL; > > ukey->offloaded = false; > ukey->flow_time = 0; this is leftover from 2/3 and can be removed, flow_time is set a few lines above. > + ukey->in_netdev = NULL; > ukey->flow_packets = ukey->flow_backlog_packets = 0; > > ukey->key_recirc_id = key_recirc_id; > @@ -2329,7 +2363,7 @@ push_dp_ops(struct udpif *udpif, struct ukey_op *ops, > size_t n_ops) > for (i = 0; i < n_ops; i++) { > opsp[i] = &ops[i].dop; > } > - dpif_operate(udpif->dpif, opsp, n_ops); > + dpif_operate(udpif->dpif, opsp, n_ops, DPIF_OFFLOAD_AUTO); > > for (i = 0; i < n_ops; i++) { > struct ukey_op *op = &ops[i]; > @@ -2455,6 +2489,57 @@ reval_op_init(struct ukey_op *op, enum reval_result > result, > } > } > <snip> _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev