On 12/22/21 06:38, Sriharsha Basavapatna wrote:
> On Tue, Nov 2, 2021 at 1:48 AM Ilya Maximets <[email protected]> wrote:
>>
>> On 10/30/21 09:04, Sriharsha Basavapatna wrote:
>>> On Mon, Aug 9, 2021 at 6:28 PM Ilya Maximets <[email protected]> wrote:
>>>>
>>>> There are cases where users might want simple forwarding or drop rules
>>>> for all packets received from a specific port, e.g ::
>>>>
>>>>   "in_port=1,actions=2"
>>>>   "in_port=2,actions=IN_PORT"
>>>>   "in_port=3,vlan_tci=0x1234/0x1fff,actions=drop"
>>>>   "in_port=4,actions=push_vlan:0x8100,set_field:4196->vlan_vid,output:3"
>>>>
>>>> There are also cases where complex OpenFlow rules can be simplified
>>>> down to datapath flows with very simple match criteria.
>>>>
>>>> In theory, for very simple forwarding, OVS doesn't need to parse
>>>> packets at all in order to follow these rules.  "Simple match" lookup
>>>> optimization is intended to speed up packet forwarding in these cases.
>>>>
>>>> Design:
>>>>
>>>> Due to various implementation constraints userspace datapath has
>>>> following flow fields always in exact match (i.e. it's required to
>>>> match at least these fields of a packet even if the OF rule doesn't
>>>> need that):
>>>>
>>>>   - recirc_id
>>>>   - in_port
>>>>   - packet_type
>>>>   - dl_type
>>>>   - vlan_tci (CFI + VID) - in most cases
>>>>   - nw_frag - for ip packets
>>>>
>>>> Not all of these fields are related to packet itself.  We already
>>>> know the current 'recirc_id' and the 'in_port' before starting the
>>>> packet processing.  It also seems safe to assume that we're working
>>>> with Ethernet packets.  So, for the simple OF rule we need to match
>>>> only on 'dl_type', 'vlan_tci' and 'nw_frag'.
>>>>
>>>> 'in_port', 'dl_type', 'nw_frag' and 13 bits of 'vlan_tci' can be
>>>> combined in a single 64bit integer (mark) that can be used as a
>>>> hash in hash map.  We are using only VID and CFI form the 'vlan_tci',
>>>> flows that need to match on PCP will not qualify for the optimization.
>>>> Workaround for matching on non-existence of vlan updated to match on
>>>> CFI and VID only in order to qualify for the optimization.  CFI is
>>>> always set by OVS if vlan is present in a packet, so there is no need
>>>> to match on PCP in this case.  'nw_frag' takes 2 bits of PCP inside
>>>> the simple match mark.
>>>>
>>>> New per-PMD flow table 'simple_match_table' introduced to store
>>>> simple match flows only.  'dp_netdev_flow_add' adds flow to the
>>>> usual 'flow_table' and to the 'simple_match_table' if the flow
>>>> meets following constraints:
>>>>
>>>>   - 'recirc_id' in flow match is 0.
>>>>   - 'packet_type' in flow match is Ethernet.
>>>>   - Flow wildcards contains only minimal set of non-wildcarded fields
>>>>     (listed above).
>>>>
>>>> If the number of flows for current 'in_port' in a regular 'flow_table'
>>>> equals number of flows for current 'in_port' in a 'simple_match_table',
>>>> we may use simple match optimization, because all the flows we have
>>>> are simple match flows.  This means that we only need to parse
>>>> 'dl_type', 'vlan_tci' and 'nw_frag' to perform packet matching.
>>>> Now we make the unique flow mark from the 'in_port', 'dl_type',
>>>> 'nw_frag' and 'vlan_tci' and looking for it in the 'simple_match_table'.
>>>> On successful lookup we don't need to run full 'miniflow_extract()'.
>>>>
>>>> Unsuccessful lookup technically means that we have no suitable flow
>>>> in the datapath and upcall will be required.  So, in this case EMC and
>>>> SMC lookups are disabled.  We may optimize this path in the future by
>>>> bypassing the dpcls lookup too.
>>>>
>>>> Performance improvement of this solution on a 'simple match' flows
>>>> should be comparable with partial HW offloading, because it parses same
>>>> packet fields and uses similar flow lookup scheme.
>>>> However, unlike partial HW offloading, it works for all port types
>>>> including virtual ones.
>>>>
>>>> Performance results when compared to EMC:
>>>>
>>>> Test setup:
>>>>
>>>>              virtio-user   OVS    virtio-user
>>>>   Testpmd1  ------------>  pmd1  ------------>  Testpmd2
>>>>   (txonly)       x<------  pmd2  <------------ (mac swap)
>>>>
>>>> Single stream of 64byte packets.  Actions:
>>>>   in_port=vhost0,actions=vhost1
>>>>   in_port=vhost1,actions=vhost0
>>>>
>>>> Stats collected from pmd1 and pmd2, so there are 2 scenarios:
>>>> Virt-to-Virt   :     Testpmd1 ------> pmd1 ------> Testpmd2.
>>>> Virt-to-NoCopy :     Testpmd2 ------> pmd2 --->x   Testpmd1.
>>>> Here the packet sent from pmd2 to Testpmd1 is always dropped, because
>>>> the virtqueue is full since Testpmd1 is in txonly mode and doesn't
>>>> receive any packets.  This should be closer to the performance of a
>>>> VM-to-Phy scenario.
>>>>
>>>> Test performed on machine with Intel Xeon CPU E5-2690 v4 @ 2.60GHz.
>>>> Table below represents improvement in throughput when compared to EMC.
>>>>
>>>>  +----------------+------------------------+------------------------+
>>>>  |                |    Default (-g -O2)    | "-Ofast -march=native" |
>>>>  |   Scenario     +------------+-----------+------------+-----------+
>>>>  |                |     GCC    |   Clang   |     GCC    |   Clang   |
>>>>  +----------------+------------+-----------+------------+-----------+
>>>>  | Virt-to-Virt   |    +18.9%  |   +25.5%  |    +10.8%  |   +16.7%  |
>>>>  | Virt-to-NoCopy |    +24.3%  |   +33.7%  |    +14.9%  |   +22.0%  |
>>>>  +----------------+------------+-----------+------------+-----------+
>>>>
>>>> For Phy-to-Phy case performance improvement should be even higher, but
>>>> it's not the main use-case for this functionality.  Performance
>>>> difference for the non-simple flows is within a margin of error.
>>>>
>>>> Signed-off-by: Ilya Maximets <[email protected]>
>>>> ---
>>>>
>>>> Version 3:
>>>>   - Removed artificial restriction on flow actions.  Any set of actions
>>>>     is supported.  Hence the patch re-named from "direct output" to
>>>>     "simple match".
>>>>   - VLAN VID+CFI added to the mark.  Previous implementation from v1/v2
>>>>     was not correct in a way that it ignored match on VLAN if match on
>>>>     non-existence of VLAN was requested.  Workaround for netlink
>>>>     expression of the match on non-existance of VLAN improved to avoid
>>>>     match on PCP, so we have extra bits inside the mark to fit
>>>>     fragmentation flags.  Technically, we can keep only match on CFI,
>>>>     but we have enough bits, so we don't need to reduce the mask
>>>>     further.  We parsed VLAN header from a packet anyway in v1 and v2,
>>>>     so this doesn't really affect performance of the solution, but
>>>>     increases the variety of flows that can be optimized.
>>>>   - Since we're matching on VLAN, the optimization now works for flows
>>>>     with VLAN matches (VID+CFI, no PCP matches).
>>>>   - Added statistics, documentation and performance test results.
>>>>
>>>> Version 2 (Harsha):
>>>>   - Rebase on 2.15
>>>>   - Added a coverage counter.
>>>>   - 
>>>> https://patchwork.ozlabs.org/project/openvswitch/patch/[email protected]/
>>>>
>>>>  Documentation/topics/dpdk/bridge.rst |  24 ++
>>>>  NEWS                                 |   3 +
>>>>  lib/dpif-netdev-avx512.c             |   3 +-
>>>>  lib/dpif-netdev-perf.c               |  41 ++--
>>>>  lib/dpif-netdev-perf.h               |   1 +
>>>>  lib/dpif-netdev-private-flow.h       |   5 +-
>>>>  lib/dpif-netdev-private-thread.h     |  13 +-
>>>>  lib/dpif-netdev-unixctl.man          |  12 +-
>>>>  lib/dpif-netdev.c                    | 316 +++++++++++++++++++++++----
>>>>  lib/flow.c                           |  22 +-
>>>>  lib/flow.h                           |   3 +-
>>>>  lib/netdev-offload-dpdk.c            |   2 +-
>>>>  tests/dpif-netdev.at                 |  16 +-
>>>>  tests/nsh.at                         |   2 +-
>>>>  tests/pmd.at                         |   6 +-
>>>>  15 files changed, 386 insertions(+), 83 deletions(-)
>>>>
>>
>> <snip>
>>
>>>
>>> Hi Ilya,
>>>
>>> We are trying to test this feature and running into an issue in which
>>> the ofctl rule (in_port=1,actions=2) is resulting in a
>>> datapath rule with the 'dl_dst' inserted in the match field. So the
>>> datapath rule ends up failing the simple match criteria.
>>>
>>> The test case involves forwarding traffic from the VM (virtio) via the
>>> vhost-user port and VxLAN tunnel in OVS.
>>>
>>>             br-vhost                          br-vxlan
>>>         br-phy
>>>         vhost-user<-->VF       VF-Rep<-->VxLAN            uplink-port
>>>
>>> The simple match ofctl rule is added to the br-vhost bridge to forward
>>> traffic to/from the vhost-user port. The problem disappears if the
>>> vxlan bridge is removed.
>>>
>>> Debugging this further, I see this code in
>>>         xlate_actions() .... --> compose_output_action__():
>>>
>>>         } else if (terminate_native_tunnel(ctx, flow, wc,
>>>                                            &odp_tnl_port)) {
>>>
>>>         In terminate_native_tunnel():
>>>
>>>         if (ovs_native_tunneling_is_on(ctx->xbridge->ofproto)) {
>>>                 *tnl_port = tnl_port_map_lookup(flow, wc);
>>>
>>> When I dump wc->masks.dl_dst before and after tnl_port_map_lookup(),
>>> it changes from all zeros to all 0xff (wc to em).
>>>
>>> So it seems like while there's a tunnel configuration, though the
>>> ctx->xbridge in question is the vhost bridge,
>>> ovs_native_tunneling_is_on() returns true and in tnl_port_map_lookup()
>>> the wc mask gets updated for dl_dst.
>>>
>>> Thanks,
>>> -Harsha
>>>
>>
>> Hi.  Thanks for reporting the issue.  I think, I have a solution for it.
>> Please, take a look at the following patch:
>>   
>> https://patchwork.ozlabs.org/project/openvswitch/patch/[email protected]/
>>
>> Bets regards, Ilya Maximets.
> 
> Thanks Ilya, with the above fix we are able to see simple-match hits
> with vxlan configuration too.
> Acked-by: Sriharsha Basavapatna <[email protected]>

Thanks for review and all the testing!  Applied.

Best regards, Ilya Maximets.
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to