On Tue, Nov 2, 2021 at 1:48 AM Ilya Maximets <[email protected]> wrote: > > On 10/30/21 09:04, Sriharsha Basavapatna wrote: > > On Mon, Aug 9, 2021 at 6:28 PM Ilya Maximets <[email protected]> wrote: > >> > >> There are cases where users might want simple forwarding or drop rules > >> for all packets received from a specific port, e.g :: > >> > >> "in_port=1,actions=2" > >> "in_port=2,actions=IN_PORT" > >> "in_port=3,vlan_tci=0x1234/0x1fff,actions=drop" > >> "in_port=4,actions=push_vlan:0x8100,set_field:4196->vlan_vid,output:3" > >> > >> There are also cases where complex OpenFlow rules can be simplified > >> down to datapath flows with very simple match criteria. > >> > >> In theory, for very simple forwarding, OVS doesn't need to parse > >> packets at all in order to follow these rules. "Simple match" lookup > >> optimization is intended to speed up packet forwarding in these cases. > >> > >> Design: > >> > >> Due to various implementation constraints userspace datapath has > >> following flow fields always in exact match (i.e. it's required to > >> match at least these fields of a packet even if the OF rule doesn't > >> need that): > >> > >> - recirc_id > >> - in_port > >> - packet_type > >> - dl_type > >> - vlan_tci (CFI + VID) - in most cases > >> - nw_frag - for ip packets > >> > >> Not all of these fields are related to packet itself. We already > >> know the current 'recirc_id' and the 'in_port' before starting the > >> packet processing. It also seems safe to assume that we're working > >> with Ethernet packets. So, for the simple OF rule we need to match > >> only on 'dl_type', 'vlan_tci' and 'nw_frag'. > >> > >> 'in_port', 'dl_type', 'nw_frag' and 13 bits of 'vlan_tci' can be > >> combined in a single 64bit integer (mark) that can be used as a > >> hash in hash map. We are using only VID and CFI form the 'vlan_tci', > >> flows that need to match on PCP will not qualify for the optimization. > >> Workaround for matching on non-existence of vlan updated to match on > >> CFI and VID only in order to qualify for the optimization. CFI is > >> always set by OVS if vlan is present in a packet, so there is no need > >> to match on PCP in this case. 'nw_frag' takes 2 bits of PCP inside > >> the simple match mark. > >> > >> New per-PMD flow table 'simple_match_table' introduced to store > >> simple match flows only. 'dp_netdev_flow_add' adds flow to the > >> usual 'flow_table' and to the 'simple_match_table' if the flow > >> meets following constraints: > >> > >> - 'recirc_id' in flow match is 0. > >> - 'packet_type' in flow match is Ethernet. > >> - Flow wildcards contains only minimal set of non-wildcarded fields > >> (listed above). > >> > >> If the number of flows for current 'in_port' in a regular 'flow_table' > >> equals number of flows for current 'in_port' in a 'simple_match_table', > >> we may use simple match optimization, because all the flows we have > >> are simple match flows. This means that we only need to parse > >> 'dl_type', 'vlan_tci' and 'nw_frag' to perform packet matching. > >> Now we make the unique flow mark from the 'in_port', 'dl_type', > >> 'nw_frag' and 'vlan_tci' and looking for it in the 'simple_match_table'. > >> On successful lookup we don't need to run full 'miniflow_extract()'. > >> > >> Unsuccessful lookup technically means that we have no suitable flow > >> in the datapath and upcall will be required. So, in this case EMC and > >> SMC lookups are disabled. We may optimize this path in the future by > >> bypassing the dpcls lookup too. > >> > >> Performance improvement of this solution on a 'simple match' flows > >> should be comparable with partial HW offloading, because it parses same > >> packet fields and uses similar flow lookup scheme. > >> However, unlike partial HW offloading, it works for all port types > >> including virtual ones. > >> > >> Performance results when compared to EMC: > >> > >> Test setup: > >> > >> virtio-user OVS virtio-user > >> Testpmd1 ------------> pmd1 ------------> Testpmd2 > >> (txonly) x<------ pmd2 <------------ (mac swap) > >> > >> Single stream of 64byte packets. Actions: > >> in_port=vhost0,actions=vhost1 > >> in_port=vhost1,actions=vhost0 > >> > >> Stats collected from pmd1 and pmd2, so there are 2 scenarios: > >> Virt-to-Virt : Testpmd1 ------> pmd1 ------> Testpmd2. > >> Virt-to-NoCopy : Testpmd2 ------> pmd2 --->x Testpmd1. > >> Here the packet sent from pmd2 to Testpmd1 is always dropped, because > >> the virtqueue is full since Testpmd1 is in txonly mode and doesn't > >> receive any packets. This should be closer to the performance of a > >> VM-to-Phy scenario. > >> > >> Test performed on machine with Intel Xeon CPU E5-2690 v4 @ 2.60GHz. > >> Table below represents improvement in throughput when compared to EMC. > >> > >> +----------------+------------------------+------------------------+ > >> | | Default (-g -O2) | "-Ofast -march=native" | > >> | Scenario +------------+-----------+------------+-----------+ > >> | | GCC | Clang | GCC | Clang | > >> +----------------+------------+-----------+------------+-----------+ > >> | Virt-to-Virt | +18.9% | +25.5% | +10.8% | +16.7% | > >> | Virt-to-NoCopy | +24.3% | +33.7% | +14.9% | +22.0% | > >> +----------------+------------+-----------+------------+-----------+ > >> > >> For Phy-to-Phy case performance improvement should be even higher, but > >> it's not the main use-case for this functionality. Performance > >> difference for the non-simple flows is within a margin of error. > >> > >> Signed-off-by: Ilya Maximets <[email protected]> > >> --- > >> > >> Version 3: > >> - Removed artificial restriction on flow actions. Any set of actions > >> is supported. Hence the patch re-named from "direct output" to > >> "simple match". > >> - VLAN VID+CFI added to the mark. Previous implementation from v1/v2 > >> was not correct in a way that it ignored match on VLAN if match on > >> non-existence of VLAN was requested. Workaround for netlink > >> expression of the match on non-existance of VLAN improved to avoid > >> match on PCP, so we have extra bits inside the mark to fit > >> fragmentation flags. Technically, we can keep only match on CFI, > >> but we have enough bits, so we don't need to reduce the mask > >> further. We parsed VLAN header from a packet anyway in v1 and v2, > >> so this doesn't really affect performance of the solution, but > >> increases the variety of flows that can be optimized. > >> - Since we're matching on VLAN, the optimization now works for flows > >> with VLAN matches (VID+CFI, no PCP matches). > >> - Added statistics, documentation and performance test results. > >> > >> Version 2 (Harsha): > >> - Rebase on 2.15 > >> - Added a coverage counter. > >> - > >> https://patchwork.ozlabs.org/project/openvswitch/patch/[email protected]/ > >> > >> Documentation/topics/dpdk/bridge.rst | 24 ++ > >> NEWS | 3 + > >> lib/dpif-netdev-avx512.c | 3 +- > >> lib/dpif-netdev-perf.c | 41 ++-- > >> lib/dpif-netdev-perf.h | 1 + > >> lib/dpif-netdev-private-flow.h | 5 +- > >> lib/dpif-netdev-private-thread.h | 13 +- > >> lib/dpif-netdev-unixctl.man | 12 +- > >> lib/dpif-netdev.c | 316 +++++++++++++++++++++++---- > >> lib/flow.c | 22 +- > >> lib/flow.h | 3 +- > >> lib/netdev-offload-dpdk.c | 2 +- > >> tests/dpif-netdev.at | 16 +- > >> tests/nsh.at | 2 +- > >> tests/pmd.at | 6 +- > >> 15 files changed, 386 insertions(+), 83 deletions(-) > >> > > <snip> > > > > > Hi Ilya, > > > > We are trying to test this feature and running into an issue in which > > the ofctl rule (in_port=1,actions=2) is resulting in a > > datapath rule with the 'dl_dst' inserted in the match field. So the > > datapath rule ends up failing the simple match criteria. > > > > The test case involves forwarding traffic from the VM (virtio) via the > > vhost-user port and VxLAN tunnel in OVS. > > > > br-vhost br-vxlan > > br-phy > > vhost-user<-->VF VF-Rep<-->VxLAN uplink-port > > > > The simple match ofctl rule is added to the br-vhost bridge to forward > > traffic to/from the vhost-user port. The problem disappears if the > > vxlan bridge is removed. > > > > Debugging this further, I see this code in > > xlate_actions() .... --> compose_output_action__(): > > > > } else if (terminate_native_tunnel(ctx, flow, wc, > > &odp_tnl_port)) { > > > > In terminate_native_tunnel(): > > > > if (ovs_native_tunneling_is_on(ctx->xbridge->ofproto)) { > > *tnl_port = tnl_port_map_lookup(flow, wc); > > > > When I dump wc->masks.dl_dst before and after tnl_port_map_lookup(), > > it changes from all zeros to all 0xff (wc to em). > > > > So it seems like while there's a tunnel configuration, though the > > ctx->xbridge in question is the vhost bridge, > > ovs_native_tunneling_is_on() returns true and in tnl_port_map_lookup() > > the wc mask gets updated for dl_dst. > > > > Thanks, > > -Harsha > > > > Hi. Thanks for reporting the issue. I think, I have a solution for it. > Please, take a look at the following patch: > > https://patchwork.ozlabs.org/project/openvswitch/patch/[email protected]/ > > Bets regards, Ilya Maximets.
Thanks Ilya, with the above fix we are able to see simple-match hits with vxlan configuration too. Acked-by: Sriharsha Basavapatna <[email protected]> Thanks, -Harsha -- This electronic communication and the information and any files transmitted with it, or attached to it, are confidential and are intended solely for the use of the individual or entity to whom it is addressed and may contain information that is confidential, legally privileged, protected by privacy laws, or otherwise restricted from disclosure to anyone else. If you are not the intended recipient or the person responsible for delivering the e-mail to the intended recipient, you are hereby notified that any use, copying, distributing, dissemination, forwarding, printing, or copying of this e-mail is strictly prohibited. If you received this e-mail in error, please return the e-mail to the sender, delete it from your computer, and destroy any printed copy of it. _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
