Hi Ilya, Thanx for taking a look. Please see inline.

Thanx
Manu

On 18/06/18, 4:04 PM, "Ilya Maximets" <[email protected]> wrote:

> Hi

Hi, I just wanted to clarify a few things about the RSS hash. See inline.

One more thing:

Unlike the usual OVS bonding, this implementation doesn't support shifting
the load between ports. Am I right?
This could be an issue, because a few heavy flows could be mapped to a single
port while the other ports are underloaded. This would be a bad case for
tunnelling, where we have only a few heavy flows.
As I understand it, this version of bonding doesn't support any load
statistics.

[manu] Yes, that’s correct. This implementation does not yet support
accumulation of per-slave stats (in "struct bond_entry"). Since load
balancing is done without using the dp_hashed flows, rule-level stats can't
be used and bond_rebalance() won't take effect. I was planning to add
per-slave stats collection/accumulation in the OVS_ACTION_ATTR_LB_OUTPUT
handling. This will be done in another patch set.

Best regards, Ilya Maximets.

> Problem:
> --------
> In OVS-DPDK, flows with output over a bond interface of type “balance-tcp”
> (using a hash on TCP/UDP 5-tuple) get translated by the ofproto layer into
> "HASH" and "RECIRC" datapath actions. After recirculation, the packet is
> forwarded to the bond member port based on 8 bits of the datapath hash
> value computed through dp_hash. This causes performance degradation in the
> following ways:
>
> 1. L4-Hash computation in software is CPU intensive; it consumes
> considerable CPU cycles of the PMD.

RSS is in use in most cases in current master and 2.9. Details below.

[manu] OK, Thanx. I was working on an earlier version of OVS and didn’t
notice it while porting to master.

>
> 2. The recirculation of the packet implies another lookup of the packet’s
> flow key in the exact match cache (EMC) and potentially Megaflow classifier
> (DPCLS). This is the biggest cost factor.
>
> 3. The recirculated packets have a new “RSS” hash and compete with the
> original packets for the scarce number of EMC slots. This implies more
> EMC misses and potentially EMC thrashing causing costly DPCLS lookups.
>
> 4. The 256 extra megaflow entries per bond for dp_hash bond selection put
> additional load on the revalidation threads.
>
> Owing to this performance degradation, deployments stick to “balance-slb”
> bond mode even though it does not do active-active load balancing for
> VXLAN- and GRE-tunnelled traffic because all tunnel packets have the same
> source MAC address.
>
> Proposed optimization:
> ----------------------
> This proposal has 2 main optimizations in balance-tcp handling at egress.
>
> 1. When feasible, re-use the existing L4 RSS-hash of the packet for bond
> selection instead of computing another L4-hash in software.

This is already done. See commit 95a6cb3497c3 ("odp-execute: Reuse rss hash
in OVS_ACTION_ATTR_HASH."). It was done a year ago and, currently, if RSS is
available, it's used for OVS_ACTION_ATTR_HASH when handling balanced bonding.
So, at least, you should reword a lot of the RSS-related comments around the
code.

[manu] With this, I think OVS_ACTION_ATTR_HASH can be reused and only the
OVS_ACTION_ATTR_RECIRC action needs to be replaced with
OVS_ACTION_ATTR_LB_OUTPUT. So it will be "HASH + LB-OUTPUT" instead of the
existing "HASH + RECIRC". Will evaluate this and then send v2 diffs.

>
> 2. Introduce a new load-balancing output action instead of recirculation:
>
> Maintain one table per-bond (could just be an array of uint16's) and
> program it the same way internal flows are created today for each possible
> hash value (256 entries) from the ofproto layer. Use this table to
> load-balance flows as part of output action processing.
>
> Currently xlate_normal() -> output_normal() -> bond_update_post_recirc_rules()
> -> bond_may_recirc() and compose_output_action__() generate
> “dp_hash(hash_l4(0))” and “recirc(<RecircID>)” actions. In this case the
> RecircID identifies the bond. For the recirculated packets the ofproto layer
> installs megaflow entries that match on RecircID and masked dp_hash and send
> them to the corresponding output port.
>
> Instead, we will now generate a new action "lb_output(bond,<bond id>)" which
> combines hash computation (only if needed, else re-use RSS hash) and inline
> load-balancing over the bond. This action is used *only* for balance-tcp bonds
> in the OVS-DPDK datapath (the OVS kernel datapath remains unchanged).
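
For illustration only, the per-bond table from item 2 of the proposal could be sketched roughly as below. This is not code from the patch: the names lb_bond_table, lb_bond_table_fill() and lb_output_select() are hypothetical, and the round-robin fill is just an assumed stand-in for whatever bucket-to-member mapping the ofproto layer would actually program.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BOND_BUCKETS 256                 /* One bucket per 8-bit hash value. */
#define BOND_MASK (BOND_BUCKETS - 1)

/* Hypothetical per-bond table: bucket index -> member port number.
 * Not the actual OVS data structure. */
struct lb_bond_table {
    uint32_t bond_id;
    uint16_t member_port[BOND_BUCKETS];
};

/* Programs all 256 buckets.  Round-robin over the members is an assumption
 * made for this sketch only. */
static void
lb_bond_table_fill(struct lb_bond_table *t, uint32_t bond_id,
                   const uint16_t *members, size_t n_members)
{
    assert(n_members > 0);
    t->bond_id = bond_id;
    for (size_t i = 0; i < BOND_BUCKETS; i++) {
        t->member_port[i] = members[i % n_members];
    }
}

/* Picks the output port from the low 8 bits of the packet hash (the RSS
 * hash when available, otherwise a software L4 hash).  A single array
 * lookup, with no recirculation and no second EMC/DPCLS lookup. */
static uint16_t
lb_output_select(const struct lb_bond_table *t, uint32_t hash)
{
    return t->member_port[hash & BOND_MASK];
}
```

With three members {10, 11, 12}, a packet whose hash ends in 0x02 would go out on port 12. Rebalancing would then amount to rewriting individual buckets, which is where the per-bucket stats mentioned above would be needed.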
