Some highlights in v3
=====================
Here is v3, with 2 major changes and further testing (including
many-flow tests). This took more effort than I expected, so the v3
publication has been delayed for a while.
The first major change is that the mark and id association is now done
with an array instead of a CMAP. This gives us a further performance
gain: it could be up to 70% now (please see the exact numbers below).
This change also makes the code a bit more complex, though, due to the
locking issue. RCU is used (I'm not quite sure it's used correctly,
though). For now, RCU only protects updates of the array base address
(due to reallocation); it doesn't protect changes to the array items
(array[i] = xx). I think that is buggy and I need to rethink it.
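Roughly, the pattern looks like the sketch below, using OVS's ovsrcu
API (this is a sketch only, with hypothetical names like flow_mark_map
and map_grow, not the actual patch code):

    /* Sketch only; assumes OVS's lib/ovs-rcu.h, lib/ovs-thread.h and
     * lib/util.h. */
    #include <stdint.h>
    #include <stdlib.h>
    #include <string.h>
    #include "ovs-rcu.h"
    #include "ovs-thread.h"
    #include "util.h"

    static struct ovs_mutex map_mutex = OVS_MUTEX_INITIALIZER;
    static OVSRCU_TYPE(struct dp_netdev_flow **) flow_mark_map;
    static size_t map_size;

    /* Reader side: safe against a concurrent reallocation, since the
     * base pointer is dereferenced through ovsrcu_get(). */
    static struct dp_netdev_flow *
    mark_to_flow(uint32_t mark)
    {
        struct dp_netdev_flow **map
            = ovsrcu_get(struct dp_netdev_flow **, &flow_mark_map);

        return mark < map_size ? map[mark] : NULL;
    }

    /* Writer side (called with map_mutex held): swap in a larger copy
     * and free the old base only after all RCU readers quiesce.  Note
     * this still does not make the item stores (map[i] = flow) or the
     * map_size read safe, which is exactly the part I suspect is
     * buggy. */
    static void
    map_grow(size_t new_size)
        OVS_REQUIRES(map_mutex)
    {
        struct dp_netdev_flow **old
            = ovsrcu_get(struct dp_netdev_flow **, &flow_mark_map);
        struct dp_netdev_flow **new = xcalloc(new_size, sizeof *new);

        if (old) {
            memcpy(new, old, map_size * sizeof *old);
            ovsrcu_postpone(free, old);
        }
        ovsrcu_set(&flow_mark_map, new);
        map_size = new_size;
    }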
The second major change is that a thread has been introduced to do the
real flow offload. This is to reduce the overhead of hw flow offload
installation/deletion on the datapath. See patch 9 for more detailed
info.
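The idea is a classic producer/consumer handoff: PMD threads only
enqueue a request, and a dedicated thread performs the expensive
rte_flow install/delete off the datapath. A sketch with hypothetical
names (not the actual patch 9 code; install_or_delete_rte_flow() stands
in for the real work):

    #include "openvswitch/list.h"
    #include "ovs-thread.h"
    #include "util.h"

    enum offload_op { OFFLOAD_ADD, OFFLOAD_DEL };

    struct offload_request {
        struct ovs_list node;       /* In offload_queue. */
        enum offload_op op;
        struct dp_netdev_flow *flow;
    };

    static struct ovs_mutex offload_mutex = OVS_MUTEX_INITIALIZER;
    static pthread_cond_t offload_cond = PTHREAD_COND_INITIALIZER;
    static struct ovs_list offload_queue
        = OVS_LIST_INITIALIZER(&offload_queue);

    /* Datapath side: a cheap enqueue instead of a blocking NIC op. */
    static void
    queue_offload(enum offload_op op, struct dp_netdev_flow *flow)
    {
        struct offload_request *req = xmalloc(sizeof *req);

        req->op = op;
        req->flow = flow;
        ovs_mutex_lock(&offload_mutex);
        ovs_list_push_back(&offload_queue, &req->node);
        xpthread_cond_signal(&offload_cond);
        ovs_mutex_unlock(&offload_mutex);
    }

    /* Offload thread: drains the queue and talks to the NIC. */
    static void *
    offload_thread_main(void *arg OVS_UNUSED)
    {
        for (;;) {
            struct offload_request *req;

            ovs_mutex_lock(&offload_mutex);
            while (ovs_list_is_empty(&offload_queue)) {
                ovs_mutex_cond_wait(&offload_cond, &offload_mutex);
            }
            req = CONTAINER_OF(ovs_list_pop_front(&offload_queue),
                               struct offload_request, node);
            ovs_mutex_unlock(&offload_mutex);

            install_or_delete_rte_flow(req->op, req->flow);
            free(req);
        }
        return NULL;
    }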
In the last discussion, it was suggested to use the RSS action to get
rid of the QUEUE action workaround. Unfortunately, it didn't work: the
flow creation failed with the MLX5 PMD driver, and that's the only
driver in DPDK that supports the RSS action so far.
I also tested many flows this time. The result is even more exciting:
it could be up to a 267% boost, with 512 megaflows (each of which has
another 512 exact-match flows, so it's 512 * 512 = 256K flows in
total), with one core and one queue doing PHY-PHY forwarding. For the
offload case, the performance stays the same as with one flow only,
because the cost of the mark-to-flow translation is constant no matter
how many flows are inserted (as long as they are all offloaded).
However, for vanilla OVS-DPDK, the more flows there are, the worse the
performance gets. In other words, the more flows, the bigger the
difference we will see.
There were many discussions on the last version. I'm sorry if I missed
some comments and didn't make the corresponding changes in v3. Please
let me know if I made such mistakes.
Below is the formal cover letter introduction, for those seeing this
patchset for the first time.
---
Hi,
Here is a joint work from Mellanox and Napatech to enable flow hw
offload with the DPDK generic flow interface (rte_flow).
The basic idea is to associate each flow with a mark id (a uint32_t
number). Later, we get the flow directly from the mark id, bypassing
the heavy EMC processing, including miniflow_extract.
The association is done with an array in patch 1. It also reuses the
flow APIs introduced while adding the tc offloads. The EMC bypass is
done in patch 2. The flow offload is done in patch 4, which mainly does
two things (see the sketch after this list):
- translate the OVS match to DPDK rte_flow patterns
- bind those patterns with a MARK action.
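In rte_flow terms, the translation boils down to a call like the
following simplified sketch (the eth spec/mask here are placeholders;
the real patch builds the full pattern chain from the OVS match):

    #include <rte_flow.h>

    static struct rte_flow *
    offload_example(uint16_t port_id, uint32_t mark_id,
                    const struct rte_flow_item_eth *eth_spec,
                    const struct rte_flow_item_eth *eth_mask)
    {
        struct rte_flow_attr attr = { .ingress = 1 };
        struct rte_flow_action_mark mark = { .id = mark_id };
        struct rte_flow_item pattern[] = {
            { .type = RTE_FLOW_ITEM_TYPE_ETH,
              .spec = eth_spec, .mask = eth_mask },
            { .type = RTE_FLOW_ITEM_TYPE_END },
        };
        struct rte_flow_action actions[] = {
            { .type = RTE_FLOW_ACTION_TYPE_MARK, .conf = &mark },
            { .type = RTE_FLOW_ACTION_TYPE_END },
        };
        struct rte_flow_error error;

        return rte_flow_create(port_id, &attr, pattern, actions, &error);
    }

The actual patches additionally append a QUEUE action as a workaround,
described further below.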
Afterwards, the NIC will set the mark id in every packet's mbuf when
the packet matches the flow. That's basically how we can get the flow
directly from the received mbuf.
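On the receive side, checking for the mark is cheap; with the DPDK of
this era, the MARK value is reported through the FDIR mbuf fields. A
sketch (mbuf_get_flow_mark is a hypothetical helper name):

    #include <stdbool.h>
    #include <stdint.h>
    #include <rte_mbuf.h>

    /* Return true and set '*mark' if the NIC marked this packet. */
    static inline bool
    mbuf_get_flow_mark(const struct rte_mbuf *m, uint32_t *mark)
    {
        if (m->ol_flags & PKT_RX_FDIR_ID) {
            *mark = m->hash.fdir.hi;
            return true;
        }
        return false;
    }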
While testing PHY-PHY forwarding with one core, one queue and one flow,
I got about a 70% performance boost. For PHY-vhost forwarding, I got
about a 50% performance boost. That's basically the performance I got
with v1, when tcp_flags was ignored. In summary, the CMAP-to-array
change gives us yet another 16% performance boost.
The major issue mentioned in v1 has also been worked around: the queue
index is never blindly set to 0 anymore, but is set to the rxq that
first receives the upcall packet.
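Concretely, assuming 'upcall_rxq_id' was recorded when the upcall
packet was received (patch 3), the extra action is just (a sketch;
set_queue_action is a hypothetical helper):

    #include <rte_flow.h>

    /* Fill in a QUEUE action steering matched packets to the rxq that
     * saw the upcall, instead of blindly using queue 0. */
    static void
    set_queue_action(struct rte_flow_action *action,
                     struct rte_flow_action_queue *queue,
                     uint16_t upcall_rxq_id)
    {
        queue->index = upcall_rxq_id;
        action->type = RTE_FLOW_ACTION_TYPE_QUEUE;
        action->conf = queue;
    }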
Note that hw offload is disabled by default; it can be enabled by:
$ ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
v3: - The mark and id association is done with an array instead of a CMAP.
    - Added a thread to do the hw offload operations.
    - Removed macros completely.
    - Dropped the patch to set FDIR_CONF, which was a workaround for
      some Intel NICs.
    - Added a debug patch to show all flow patterns we have created.
    - Misc fixes.
v2: - Worked around the queue action issue.
    - Fixed the tcp_flags being skipped issue, which also fixed the
      build warnings.
    - Fixed l2 patterns for Intel NICs.
    - Converted some macros to functions.
    - Did not hardcode the max number of flows/actions.
    - Rebased on top of the latest code.
Thanks.
--yliu
---
Finn Christensen (2):
netdev-dpdk: implement flow put with rte flow
netdev-dpdk: retry with queue action
Shachar Beiser (1):
dpif-netdev: record rx queue id for the upcall
Yuanhan Liu (6):
dpif-netdev: associate flow with a mark id
dpif-netdev: retrieve flow directly from the flow mark
netdev-dpdk: convert ufid to dpdk flow
netdev-dpdk: remove offloaded flow on deletion
netdev-dpdk: add debug for rte flow patterns
dpif-netdev: do hw flow offload in another thread
lib/dp-packet.h | 14 +
lib/dpif-netdev.c | 421 ++++++++++++++++++++++++++++-
lib/flow.c | 155 ++++++++---
lib/flow.h | 1 +
lib/netdev-dpdk.c | 776 +++++++++++++++++++++++++++++++++++++++++++++++++++++-
lib/netdev.c | 1 +
lib/netdev.h | 7 +
7 files changed, 1331 insertions(+), 44 deletions(-)
--
2.7.4