On 18 Oct 2018, at 18:13, Sriharsha Basavapatna via dev wrote:
With the current OVS offload design, when an offload-device fails to add a flow rule and returns an error, OVS adds the rule to the kernel datapath. The flow gets processed by the kernel datapath for the entire life of that flow. This is fine when an error is returned by the device due to lack of support for certain keys or actions.
But when an error is returned due to temporary conditions, such as lack of resources to add a flow rule, the flow continues to be processed by the kernel even when resources become available later. That is, those flows never get offloaded again. This problem becomes more pronounced when a flow that was initially offloaded has a smaller packet rate than a later flow that could not be offloaded due to lack of resources. This leads to inefficient use of HW resources and wasted host CPU cycles.
This patch-set addresses this issue by providing a way to detect temporary offload resource constraints (the Out-Of-Resource, or OOR, condition) and to selectively and dynamically offload flows with a higher packets-per-second (pps) rate. This dynamic rebalancing is done periodically on netdevs that are in OOR state, until resources become available to offload all pending flows.
The patch-set involves the following changes at a high level:
1. Detection of the Out-Of-Resources (OOR) condition on an offload-capable
   netdev.
2. Gathering flow offload selection criteria for all flows on an OOR netdev;
   i.e., the packets-per-second (pps) rate of both offloaded and
   non-offloaded (pending) flows.
3. Dynamically replacing offloaded flows that have a lower pps rate with
   non-offloaded flows that have a higher pps rate, on an OOR netdev. A new
   Open vSwitch configuration option, "offload-rebalance", enables this
   policy.
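The replacement step in (3) can be sketched roughly as follows (a Python simulation of the policy as described in this cover letter; the function name and the (flow, pps) tuples are hypothetical, and the actual patch-set implements this in C inside OVS):

```python
# Illustrative sketch of pps-based offload rebalancing, as described in
# the cover letter. Not OVS code; names and structures are hypothetical.

def rebalance(offloaded, pending):
    """Swap low-pps offloaded flows for high-pps pending flows.

    `offloaded` and `pending` are lists of (flow_id, pps) tuples.
    Returns the new (offloaded, pending) lists.
    """
    # Consider offloaded flows from the lowest pps rate upward and
    # pending (non-offloaded) flows from the highest pps rate downward.
    offloaded = sorted(offloaded, key=lambda f: f[1])
    pending = sorted(pending, key=lambda f: f[1], reverse=True)
    for i, (_, off_pps) in enumerate(offloaded):
        if not pending or pending[0][1] <= off_pps:
            break  # no pending flow would improve HW utilization
        # Evict the low-pps flow and offload the high-pps one instead.
        offloaded[i], pending[0] = pending[0], offloaded[i]
        pending.sort(key=lambda f: f[1], reverse=True)
    return sorted(offloaded, key=lambda f: f[1]), pending
```

A run with two offloaded flows and two pending flows would, for example, move a 1000-pps pending flow into the offloaded set in place of a 10-pps flow, while leaving a 5-pps pending flow where it is.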
Cost/benefit data points:
1. Rough cost of the new rebalancing, in terms of CPU time:
   Ran a test that replaced 256 low pps-rate flows (pings) with 256 high
   pps-rate flows (iperf), on a system with 4 CPUs (Intel Xeon E5 @ 2.40GHz;
   2 cores with HW threads enabled, rest disabled). The data showed that CPU
   utilization increased by about 20%. This increase occurs during the
   specific second in which rebalancing is done. Subsequently (from the next
   second), CPU utilization decreases significantly due to the offloading of
   higher pps-rate flows. So effectively there is a bump in CPU utilization
   at the time of rebalancing, which is more than compensated by reduced CPU
   utilization once the right flows get offloaded.
2. Rough benefits to the user in terms of offload performance:
   The benefit to the user is reduced CPU utilization on the host, since
   higher pps-rate flows get offloaded, replacing lower pps-rate flows.
   Replacing a single offloaded flood-ping flow with an iperf flow (multiple
   connections) shows that the CPU usage that was originally 100% on a
   single CPU (rebalancing disabled) goes down to 35% (rebalancing enabled).
   That is, CPU utilization decreased by 65% after rebalancing.
3. Circumstances under which the benefits would show up:
   The rebalancing benefits would show up once offload resources are
   exhausted and new flows with a higher pps rate are initiated, which would
   otherwise be handled by the kernel datapath, costing host CPU cycles.
   This can be observed using the 'ovs-appctl dpctl/dump-flows' command.
   Prior to rebalancing, any high pps-rate flows that could not be offloaded
   due to the resource crunch would show up in the output of 'dump-flows
   type=ovs', and after rebalancing such flows would appear in the output of
   'dump-flows type=offloaded'.
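As a tiny illustration of spotting that movement, one could count the offloaded entries in the dump-flows output (a sketch only; it assumes recent OVS releases, where offloaded flows carry an `offloaded:yes` attribute in the dump, and the helper name is hypothetical):

```python
# Hypothetical helper for inspecting `ovs-appctl dpctl/dump-flows` output.

def count_offloaded(dump_flows_output):
    """Count flows marked as offloaded in dump-flows output.

    Assumes the output format of recent OVS releases, where offloaded
    flows are annotated with `offloaded:yes`; adjust for your version.
    """
    return sum(1 for line in dump_flows_output.splitlines()
               if "offloaded:yes" in line)
```

Comparing this count before and after a rebalancing interval would show pending high-pps flows moving into the offloaded set.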
Before I review the individual patches (I hope to do this tomorrow), I have some general concerns/comments.
Once committed, will this feature be marked doubly experimental? I just want to make sure that if it goes in as part of HW offload, the experimental tag for this feature is not removed once HW offload itself becomes mainstream.
Currently, in OOR state both phases (insert and rebalance) are run. Rather than having offload-rebalance be true or false, maybe we could have disable, retry, and retry-rebalance. I think only retrying might be beneficial as well, without changing existing flows.
My main objection against offloading flows at a later stage is packet re-ordering. As soon as you move from the kernel datapath to hardware offload, packets might be sent out of order. This is mainly a problem for TCP streams, which do not have this problem if the stream is offloaded directly at the start.
To make this even worse, the current implementation has no protection against flows fighting to be offloaded, i.e. ping-ponging from HW to SW, causing a lot of out-of-order packets.
Based on the above, I was wondering if any tests were done to measure the out-of-order packets/jitter on rebalancing?
The last implementation item I have is that packet throughput through the kernel datapath is rather low, <200 Kpps. This low throughput might make it hard to determine which non-offloaded flow is best suited for HW offload. The kernel might drop packets for a flow with a far higher potential than the packets that do make it through the kernel. I do not have a solution for this, but I guess it is worth keeping in mind when this gets enabled.
As a general question, were other solutions considered to cope with inadequate resources? Maybe a solution that would give the operator more control over what gets offloaded? Like giving all configured flows a relative priority, either at individual flow creation or via some general overlay, i.e. all TCP flows for example.
Cheers,
Eelco
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev