On 10/30/21 09:04, Sriharsha Basavapatna wrote:
> On Mon, Aug 9, 2021 at 6:28 PM Ilya Maximets <[email protected]> wrote:
>>
>> There are cases where users might want simple forwarding or drop rules
>> for all packets received from a specific port, e.g ::
>>
>>   "in_port=1,actions=2"
>>   "in_port=2,actions=IN_PORT"
>>   "in_port=3,vlan_tci=0x1234/0x1fff,actions=drop"
>>   "in_port=4,actions=push_vlan:0x8100,set_field:4196->vlan_vid,output:3"
>>
>> There are also cases where complex OpenFlow rules can be simplified
>> down to datapath flows with very simple match criteria.
>>
>> In theory, for very simple forwarding, OVS doesn't need to parse
>> packets at all in order to follow these rules.  "Simple match" lookup
>> optimization is intended to speed up packet forwarding in these cases.
>>
>> Design:
>>
>> Due to various implementation constraints userspace datapath has
>> following flow fields always in exact match (i.e. it's required to
>> match at least these fields of a packet even if the OF rule doesn't
>> need that):
>>
>>   - recirc_id
>>   - in_port
>>   - packet_type
>>   - dl_type
>>   - vlan_tci (CFI + VID) - in most cases
>>   - nw_frag - for ip packets
>>
>> Not all of these fields are related to packet itself.  We already
>> know the current 'recirc_id' and the 'in_port' before starting the
>> packet processing.  It also seems safe to assume that we're working
>> with Ethernet packets.  So, for the simple OF rule we need to match
>> only on 'dl_type', 'vlan_tci' and 'nw_frag'.
>>
>> 'in_port', 'dl_type', 'nw_frag' and 13 bits of 'vlan_tci' can be
>> combined in a single 64bit integer (mark) that can be used as a
>> hash in hash map.  We are using only VID and CFI form the 'vlan_tci',
>> flows that need to match on PCP will not qualify for the optimization.
>> Workaround for matching on non-existence of vlan updated to match on
>> CFI and VID only in order to qualify for the optimization.  CFI is
>> always set by OVS if vlan is present in a packet, so there is no need
>> to match on PCP in this case.  'nw_frag' takes 2 bits of PCP inside
>> the simple match mark.
>>
>> New per-PMD flow table 'simple_match_table' introduced to store
>> simple match flows only.  'dp_netdev_flow_add' adds flow to the
>> usual 'flow_table' and to the 'simple_match_table' if the flow
>> meets following constraints:
>>
>>   - 'recirc_id' in flow match is 0.
>>   - 'packet_type' in flow match is Ethernet.
>>   - Flow wildcards contains only minimal set of non-wildcarded fields
>>     (listed above).
>>
>> If the number of flows for current 'in_port' in a regular 'flow_table'
>> equals number of flows for current 'in_port' in a 'simple_match_table',
>> we may use simple match optimization, because all the flows we have
>> are simple match flows.  This means that we only need to parse
>> 'dl_type', 'vlan_tci' and 'nw_frag' to perform packet matching.
>> Now we make the unique flow mark from the 'in_port', 'dl_type',
>> 'nw_frag' and 'vlan_tci' and looking for it in the 'simple_match_table'.
>> On successful lookup we don't need to run full 'miniflow_extract()'.
>>
>> Unsuccessful lookup technically means that we have no suitable flow
>> in the datapath and upcall will be required.  So, in this case EMC and
>> SMC lookups are disabled.  We may optimize this path in the future by
>> bypassing the dpcls lookup too.
>>
>> Performance improvement of this solution on a 'simple match' flows
>> should be comparable with partial HW offloading, because it parses same
>> packet fields and uses similar flow lookup scheme.
>> However, unlike partial HW offloading, it works for all port types
>> including virtual ones.
>>
>> Performance results when compared to EMC:
>>
>> Test setup:
>>
>>              virtio-user   OVS    virtio-user
>>   Testpmd1  ------------>  pmd1  ------------>  Testpmd2
>>   (txonly)       x<------  pmd2  <------------ (mac swap)
>>
>> Single stream of 64byte packets.  Actions:
>>   in_port=vhost0,actions=vhost1
>>   in_port=vhost1,actions=vhost0
>>
>> Stats collected from pmd1 and pmd2, so there are 2 scenarios:
>> Virt-to-Virt   :     Testpmd1 ------> pmd1 ------> Testpmd2.
>> Virt-to-NoCopy :     Testpmd2 ------> pmd2 --->x   Testpmd1.
>> Here the packet sent from pmd2 to Testpmd1 is always dropped, because
>> the virtqueue is full since Testpmd1 is in txonly mode and doesn't
>> receive any packets.  This should be closer to the performance of a
>> VM-to-Phy scenario.
>>
>> Test performed on machine with Intel Xeon CPU E5-2690 v4 @ 2.60GHz.
>> Table below represents improvement in throughput when compared to EMC.
>>
>>  +----------------+------------------------+------------------------+
>>  |                |    Default (-g -O2)    | "-Ofast -march=native" |
>>  |   Scenario     +------------+-----------+------------+-----------+
>>  |                |     GCC    |   Clang   |     GCC    |   Clang   |
>>  +----------------+------------+-----------+------------+-----------+
>>  | Virt-to-Virt   |    +18.9%  |   +25.5%  |    +10.8%  |   +16.7%  |
>>  | Virt-to-NoCopy |    +24.3%  |   +33.7%  |    +14.9%  |   +22.0%  |
>>  +----------------+------------+-----------+------------+-----------+
>>
>> For Phy-to-Phy case performance improvement should be even higher, but
>> it's not the main use-case for this functionality.  Performance
>> difference for the non-simple flows is within a margin of error.
>>
>> Signed-off-by: Ilya Maximets <[email protected]>
>> ---
>>
>> Version 3:
>>   - Removed artificial restriction on flow actions.  Any set of actions
>>     is supported.  Hence the patch re-named from "direct output" to
>>     "simple match".
>>   - VLAN VID+CFI added to the mark.  Previous implementation from v1/v2
>>     was not correct in a way that it ignored match on VLAN if match on
>>     non-existence of VLAN was requested.  Workaround for netlink
>>     expression of the match on non-existance of VLAN improved to avoid
>>     match on PCP, so we have extra bits inside the mark to fit
>>     fragmentation flags.  Technically, we can keep only match on CFI,
>>     but we have enough bits, so we don't need to reduce the mask
>>     further.  We parsed VLAN header from a packet anyway in v1 and v2,
>>     so this doesn't really affect performance of the solution, but
>>     increases the variety of flows that can be optimized.
>>   - Since we're matching on VLAN, the optimization now works for flows
>>     with VLAN matches (VID+CFI, no PCP matches).
>>   - Added statistics, documentation and performance test results.
>>
>> Version 2 (Harsha):
>>   - Rebase on 2.15
>>   - Added a coverage counter.
>>   - 
>> https://patchwork.ozlabs.org/project/openvswitch/patch/[email protected]/
>>
>>  Documentation/topics/dpdk/bridge.rst |  24 ++
>>  NEWS                                 |   3 +
>>  lib/dpif-netdev-avx512.c             |   3 +-
>>  lib/dpif-netdev-perf.c               |  41 ++--
>>  lib/dpif-netdev-perf.h               |   1 +
>>  lib/dpif-netdev-private-flow.h       |   5 +-
>>  lib/dpif-netdev-private-thread.h     |  13 +-
>>  lib/dpif-netdev-unixctl.man          |  12 +-
>>  lib/dpif-netdev.c                    | 316 +++++++++++++++++++++++----
>>  lib/flow.c                           |  22 +-
>>  lib/flow.h                           |   3 +-
>>  lib/netdev-offload-dpdk.c            |   2 +-
>>  tests/dpif-netdev.at                 |  16 +-
>>  tests/nsh.at                         |   2 +-
>>  tests/pmd.at                         |   6 +-
>>  15 files changed, 386 insertions(+), 83 deletions(-)
>>

<snip>

> 
> Hi Ilya,
> 
> We are trying to test this feature and running into an issue in which
> the ofctl rule (in_port=1,actions=2) is resulting in a
> datapath rule with the 'dl_dst' inserted in the match field. So the
> datapath rule ends up failing the simple match criteria.
> 
> The test case involves forwarding traffic from the VM (virtio) via the
> vhost-user port and VxLAN tunnel in OVS.
> 
>             br-vhost                          br-vxlan
>         br-phy
>         vhost-user<-->VF       VF-Rep<-->VxLAN            uplink-port
> 
> The simple match ofctl rule is added to the br-vhost bridge to forward
> traffic to/from the vhost-user port. The problem disappears if the
> vxlan bridge is removed.
> 
> Debugging this further, I see this code in
>         xlate_actions() .... --> compose_output_action__():
> 
>         } else if (terminate_native_tunnel(ctx, flow, wc,
>                                            &odp_tnl_port)) {
> 
>         In terminate_native_tunnel():
> 
>         if (ovs_native_tunneling_is_on(ctx->xbridge->ofproto)) {
>                 *tnl_port = tnl_port_map_lookup(flow, wc);
> 
> When I dump wc->masks.dl_dst before and after tnl_port_map_lookup(),
> it changes from all zeros to all 0xff (wc to em).
> 
> So it seems like while there's a tunnel configuration, though the
> ctx->xbridge in question is the vhost bridge,
> ovs_native_tunneling_is_on() returns true and in tnl_port_map_lookup()
> the wc mask gets updated for dl_dst.
> 
> Thanks,
> -Harsha
> 

Hi.  Thanks for reporting the issue.  I think, I have a solution for it.
Please, take a look at the following patch:
  
https://patchwork.ozlabs.org/project/openvswitch/patch/[email protected]/

Bets regards, Ilya Maximets.
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to