On 12/22/21 06:38, Sriharsha Basavapatna wrote: > On Tue, Nov 2, 2021 at 1:48 AM Ilya Maximets <[email protected]> wrote: >> >> On 10/30/21 09:04, Sriharsha Basavapatna wrote: >>> On Mon, Aug 9, 2021 at 6:28 PM Ilya Maximets <[email protected]> wrote: >>>> >>>> There are cases where users might want simple forwarding or drop rules >>>> for all packets received from a specific port, e.g :: >>>> >>>> "in_port=1,actions=2" >>>> "in_port=2,actions=IN_PORT" >>>> "in_port=3,vlan_tci=0x1234/0x1fff,actions=drop" >>>> "in_port=4,actions=push_vlan:0x8100,set_field:4196->vlan_vid,output:3" >>>> >>>> There are also cases where complex OpenFlow rules can be simplified >>>> down to datapath flows with very simple match criteria. >>>> >>>> In theory, for very simple forwarding, OVS doesn't need to parse >>>> packets at all in order to follow these rules. "Simple match" lookup >>>> optimization is intended to speed up packet forwarding in these cases. >>>> >>>> Design: >>>> >>>> Due to various implementation constraints userspace datapath has >>>> following flow fields always in exact match (i.e. it's required to >>>> match at least these fields of a packet even if the OF rule doesn't >>>> need that): >>>> >>>> - recirc_id >>>> - in_port >>>> - packet_type >>>> - dl_type >>>> - vlan_tci (CFI + VID) - in most cases >>>> - nw_frag - for ip packets >>>> >>>> Not all of these fields are related to packet itself. We already >>>> know the current 'recirc_id' and the 'in_port' before starting the >>>> packet processing. It also seems safe to assume that we're working >>>> with Ethernet packets. So, for the simple OF rule we need to match >>>> only on 'dl_type', 'vlan_tci' and 'nw_frag'. >>>> >>>> 'in_port', 'dl_type', 'nw_frag' and 13 bits of 'vlan_tci' can be >>>> combined in a single 64bit integer (mark) that can be used as a >>>> hash in hash map. We are using only VID and CFI form the 'vlan_tci', >>>> flows that need to match on PCP will not qualify for the optimization. >>>> Workaround for matching on non-existence of vlan updated to match on >>>> CFI and VID only in order to qualify for the optimization. CFI is >>>> always set by OVS if vlan is present in a packet, so there is no need >>>> to match on PCP in this case. 'nw_frag' takes 2 bits of PCP inside >>>> the simple match mark. >>>> >>>> New per-PMD flow table 'simple_match_table' introduced to store >>>> simple match flows only. 'dp_netdev_flow_add' adds flow to the >>>> usual 'flow_table' and to the 'simple_match_table' if the flow >>>> meets following constraints: >>>> >>>> - 'recirc_id' in flow match is 0. >>>> - 'packet_type' in flow match is Ethernet. >>>> - Flow wildcards contains only minimal set of non-wildcarded fields >>>> (listed above). >>>> >>>> If the number of flows for current 'in_port' in a regular 'flow_table' >>>> equals number of flows for current 'in_port' in a 'simple_match_table', >>>> we may use simple match optimization, because all the flows we have >>>> are simple match flows. This means that we only need to parse >>>> 'dl_type', 'vlan_tci' and 'nw_frag' to perform packet matching. >>>> Now we make the unique flow mark from the 'in_port', 'dl_type', >>>> 'nw_frag' and 'vlan_tci' and looking for it in the 'simple_match_table'. >>>> On successful lookup we don't need to run full 'miniflow_extract()'. >>>> >>>> Unsuccessful lookup technically means that we have no suitable flow >>>> in the datapath and upcall will be required. So, in this case EMC and >>>> SMC lookups are disabled. We may optimize this path in the future by >>>> bypassing the dpcls lookup too. >>>> >>>> Performance improvement of this solution on a 'simple match' flows >>>> should be comparable with partial HW offloading, because it parses same >>>> packet fields and uses similar flow lookup scheme. >>>> However, unlike partial HW offloading, it works for all port types >>>> including virtual ones. >>>> >>>> Performance results when compared to EMC: >>>> >>>> Test setup: >>>> >>>> virtio-user OVS virtio-user >>>> Testpmd1 ------------> pmd1 ------------> Testpmd2 >>>> (txonly) x<------ pmd2 <------------ (mac swap) >>>> >>>> Single stream of 64byte packets. Actions: >>>> in_port=vhost0,actions=vhost1 >>>> in_port=vhost1,actions=vhost0 >>>> >>>> Stats collected from pmd1 and pmd2, so there are 2 scenarios: >>>> Virt-to-Virt : Testpmd1 ------> pmd1 ------> Testpmd2. >>>> Virt-to-NoCopy : Testpmd2 ------> pmd2 --->x Testpmd1. >>>> Here the packet sent from pmd2 to Testpmd1 is always dropped, because >>>> the virtqueue is full since Testpmd1 is in txonly mode and doesn't >>>> receive any packets. This should be closer to the performance of a >>>> VM-to-Phy scenario. >>>> >>>> Test performed on machine with Intel Xeon CPU E5-2690 v4 @ 2.60GHz. >>>> Table below represents improvement in throughput when compared to EMC. >>>> >>>> +----------------+------------------------+------------------------+ >>>> | | Default (-g -O2) | "-Ofast -march=native" | >>>> | Scenario +------------+-----------+------------+-----------+ >>>> | | GCC | Clang | GCC | Clang | >>>> +----------------+------------+-----------+------------+-----------+ >>>> | Virt-to-Virt | +18.9% | +25.5% | +10.8% | +16.7% | >>>> | Virt-to-NoCopy | +24.3% | +33.7% | +14.9% | +22.0% | >>>> +----------------+------------+-----------+------------+-----------+ >>>> >>>> For Phy-to-Phy case performance improvement should be even higher, but >>>> it's not the main use-case for this functionality. Performance >>>> difference for the non-simple flows is within a margin of error. >>>> >>>> Signed-off-by: Ilya Maximets <[email protected]> >>>> --- >>>> >>>> Version 3: >>>> - Removed artificial restriction on flow actions. Any set of actions >>>> is supported. Hence the patch re-named from "direct output" to >>>> "simple match". >>>> - VLAN VID+CFI added to the mark. Previous implementation from v1/v2 >>>> was not correct in a way that it ignored match on VLAN if match on >>>> non-existence of VLAN was requested. Workaround for netlink >>>> expression of the match on non-existance of VLAN improved to avoid >>>> match on PCP, so we have extra bits inside the mark to fit >>>> fragmentation flags. Technically, we can keep only match on CFI, >>>> but we have enough bits, so we don't need to reduce the mask >>>> further. We parsed VLAN header from a packet anyway in v1 and v2, >>>> so this doesn't really affect performance of the solution, but >>>> increases the variety of flows that can be optimized. >>>> - Since we're matching on VLAN, the optimization now works for flows >>>> with VLAN matches (VID+CFI, no PCP matches). >>>> - Added statistics, documentation and performance test results. >>>> >>>> Version 2 (Harsha): >>>> - Rebase on 2.15 >>>> - Added a coverage counter. >>>> - >>>> https://patchwork.ozlabs.org/project/openvswitch/patch/[email protected]/ >>>> >>>> Documentation/topics/dpdk/bridge.rst | 24 ++ >>>> NEWS | 3 + >>>> lib/dpif-netdev-avx512.c | 3 +- >>>> lib/dpif-netdev-perf.c | 41 ++-- >>>> lib/dpif-netdev-perf.h | 1 + >>>> lib/dpif-netdev-private-flow.h | 5 +- >>>> lib/dpif-netdev-private-thread.h | 13 +- >>>> lib/dpif-netdev-unixctl.man | 12 +- >>>> lib/dpif-netdev.c | 316 +++++++++++++++++++++++---- >>>> lib/flow.c | 22 +- >>>> lib/flow.h | 3 +- >>>> lib/netdev-offload-dpdk.c | 2 +- >>>> tests/dpif-netdev.at | 16 +- >>>> tests/nsh.at | 2 +- >>>> tests/pmd.at | 6 +- >>>> 15 files changed, 386 insertions(+), 83 deletions(-) >>>> >> >> <snip> >> >>> >>> Hi Ilya, >>> >>> We are trying to test this feature and running into an issue in which >>> the ofctl rule (in_port=1,actions=2) is resulting in a >>> datapath rule with the 'dl_dst' inserted in the match field. So the >>> datapath rule ends up failing the simple match criteria. >>> >>> The test case involves forwarding traffic from the VM (virtio) via the >>> vhost-user port and VxLAN tunnel in OVS. >>> >>> br-vhost br-vxlan >>> br-phy >>> vhost-user<-->VF VF-Rep<-->VxLAN uplink-port >>> >>> The simple match ofctl rule is added to the br-vhost bridge to forward >>> traffic to/from the vhost-user port. The problem disappears if the >>> vxlan bridge is removed. >>> >>> Debugging this further, I see this code in >>> xlate_actions() .... --> compose_output_action__(): >>> >>> } else if (terminate_native_tunnel(ctx, flow, wc, >>> &odp_tnl_port)) { >>> >>> In terminate_native_tunnel(): >>> >>> if (ovs_native_tunneling_is_on(ctx->xbridge->ofproto)) { >>> *tnl_port = tnl_port_map_lookup(flow, wc); >>> >>> When I dump wc->masks.dl_dst before and after tnl_port_map_lookup(), >>> it changes from all zeros to all 0xff (wc to em). >>> >>> So it seems like while there's a tunnel configuration, though the >>> ctx->xbridge in question is the vhost bridge, >>> ovs_native_tunneling_is_on() returns true and in tnl_port_map_lookup() >>> the wc mask gets updated for dl_dst. >>> >>> Thanks, >>> -Harsha >>> >> >> Hi. Thanks for reporting the issue. I think, I have a solution for it. >> Please, take a look at the following patch: >> >> https://patchwork.ozlabs.org/project/openvswitch/patch/[email protected]/ >> >> Bets regards, Ilya Maximets. > > Thanks Ilya, with the above fix we are able to see simple-match hits > with vxlan configuration too. > Acked-by: Sriharsha Basavapatna <[email protected]>
Thanks for review and all the testing! Applied. Best regards, Ilya Maximets. _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
