This patch set is the v2 implementation that combines the CD
(Cuckoo Distributor) and DFC (datapath flow cache) designs. Both of the
original patch sets refactor the datapath to avoid the costly sequential
subtable search.

Rebased on 4299145c10953b5ba125ba2a95caa18e554f3f85.
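As a rough illustration of how a signature-indexed cache can bypass the
sequential subtable search, here is a minimal Python sketch. It is not the
code from these patches: the names SignatureCache, classify and WAYS, the
16-bit signature, the eviction choice and the per-flow "match" callable are
all assumptions made for the example. The cache stores only a signature and
an index into a flow table, and every hit is verified against the flow
because signatures can collide.

```python
WAYS = 4  # entries per bucket (assumed associativity, not from the patches)


class SignatureCache:
    """Toy way-associative cache mapping a packet hash to a flow-table index."""

    def __init__(self, n_buckets):
        self.n_buckets = n_buckets
        # Each slot holds a (signature, flow_index) pair or None.
        self.buckets = [[None] * WAYS for _ in range(n_buckets)]

    def _split(self, pkt_hash):
        # Low bits pick the bucket; bits 16..31 form the signature.
        return pkt_hash % self.n_buckets, (pkt_hash >> 16) & 0xFFFF

    def insert(self, pkt_hash, flow_index):
        bucket_id, sig = self._split(pkt_hash)
        bucket = self.buckets[bucket_id]
        # Prefer a slot that already holds this signature.
        for i, entry in enumerate(bucket):
            if entry is not None and entry[0] == sig:
                bucket[i] = (sig, flow_index)
                return
        # Otherwise take the first empty slot.
        for i, entry in enumerate(bucket):
            if entry is None:
                bucket[i] = (sig, flow_index)
                return
        # Bucket full: evict a slot derived from the hash (stand-in policy).
        bucket[pkt_hash % WAYS] = (sig, flow_index)

    def lookup(self, pkt_hash):
        bucket_id, sig = self._split(pkt_hash)
        for entry in self.buckets[bucket_id]:
            if entry is not None and entry[0] == sig:
                return entry[1]  # candidate index; the caller must verify it
        return None


def classify(cache, flow_table, pkt_hash, pkt_key):
    """Return the cached flow, or None to fall back to the subtable search."""
    idx = cache.lookup(pkt_hash)
    if idx is not None:
        flow = flow_table[idx]
        if flow["match"](pkt_key):  # signatures can collide, so verify
            return flow
    return None
```

A miss in classify() means the caller falls back to the normal classifier
lookup and may then insert the result into the cache.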
CD and DFC patch sets:

CD: [PATCH v2 0/5] dpif-netdev: Cuckoo-Distributor implementation
https://mail.openvswitch.org/pipermail/ovs-dev/2017-October/340305.html

DFC: [PATCH] dpif-netdev: Refactor datapath flow cache
https://mail.openvswitch.org/pipermail/ovs-dev/2017-November/341066.html

1. The first commit is a rebase of Jan Scheurich's patch
   "[PATCH] dpif-netdev: Refactor datapath flow cache", with a couple of
   bug fixes.

2. The second commit incorporates CD's way-associative design into DFC
   to improve the hit rate.

3. The third commit changes the distributor to cache an index of a
   flow_table entry, which improves memory efficiency.

4. The fourth commit splits DFC into EMC and SMC for better
   organization. The lookup function is also rewritten to process
   packets in batches.

We ran a phy-2-phy test to evaluate the performance improvement of this
patch set. The traffic pattern is based on Billy's original TREX script:
https://mail.openvswitch.org/pipermail/ovs-dev/2018-March/345032.html

We augmented the script to generate a power-law distribution of flows, so
that flows have different bandwidths and access different subtables. For
example, n flows each have bandwidth w, n/4 flows each have bandwidth 2w,
n/9 flows each have bandwidth 3w, and so on (power-law distribution,
y = Cx^-2). For subtables, the second most accessed subtable receives 1/2
of the accesses of the most accessed subtable, the third most accessed
subtable receives 1/3, and so on (Zipf's law). The CD/DFC size is
1 million entries.
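The skew described above can be sketched in a few lines of Python. This is
not the augmented TREX script itself; the function names and the max_level
parameter are hypothetical and only reproduce the arithmetic of the two
distributions.

```python
def flow_classes(n, max_level):
    """Power law y = C * x^-2: n // k**2 flows run at bandwidth k * w.

    Returns a list of (flow_count, bandwidth_multiplier) pairs.
    max_level is a hypothetical cap on the bandwidth multiplier k.
    """
    classes = []
    for k in range(1, max_level + 1):
        count = n // (k * k)
        if count > 0:
            classes.append((count, k))
    return classes


def subtable_shares(n_subtables):
    """Zipf's law: the r-th most accessed subtable gets weight 1/r.

    Returns the normalized share of accesses for each subtable,
    ordered from most accessed to least accessed.
    """
    weights = [1.0 / r for r in range(1, n_subtables + 1)]
    total = sum(weights)
    return [w / total for w in weights]
```

For instance, flow_classes(900, 3) gives 900 flows at w, 225 flows at 2w and
100 flows at 3w, and subtable_shares(3) gives the second subtable half the
share of the first.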
The speedup results are listed below:

#flow     #subtable  speedup
1000      1          1.015523746
1000      5          1.032199838
1000      10         1.050814738
1000      20         1.081794454
10000     1          1.201704118
10000     5          1.31634144
10000     10         1.402493331
10000     20         1.531133279
100000    1          1.11088487
100000    5          1.458748559
100000    10         1.683044348
100000    20         2.034441401
1000000   1          1.004339563
1000000   5          1.256745291
1000000   10         1.444329892
1000000   20         1.666275853

Both flow traffic and subtable accesses are skewed. The table lists the
total numbers of flows and subtables.

The largest improvement occurs when all flows hit the DFC/CD and thus
bypass the megaflow cache, and when there are multiple subtables. When all
flows hit the EMC, or when the flow count exceeds the CD/DFC size, the
improvement shrinks.

v1->v2:
1. Add comments and follow code style for the cmap code (Ben's comment).
2. Fix a bug in the first commit that caused multiple unit tests to fail.
   Since DFC is per PMD rather than per port, the port mask should be
   included in the rule.
3. Add commit 4, which separates DFC into EMC and SMC (signature match
   cache) for easier optimization and readability.
4. In commit 4, refactor the DFC lookup to do batched lookup.
5. Rebase and other minor changes.

RFC->v1:
1. Rebase to master head.
2. The last commit is totally rewritten to use the flow_table as an
   indirect table. The CD/DFC distributor caches the index of the
   flow_table entry.
3. Incorporate commit 2 into commit 1 (Bhanu's comment).
4. Change DFC to be always on in commit 1 (Bhanu's comment).

RFC of this patch set:
https://mail.openvswitch.org/pipermail/ovs-dev/2018-January/343411.html

Yipeng Wang (3):
  dpif-netdev: Use way-associative cache
  dpif-netdev: use flow_table as indirect table
  dpif-netdev: Split DFC cache and code optimization

Jan Scheurich (1):
  dpif-netdev: Refactor datapath flow cache

 lib/cmap.c             |  73 +++++++++
 lib/cmap.h             |   5 +
 lib/dpif-netdev-perf.h |   1 +
 lib/dpif-netdev.c      | 426 ++++++++++++++++++++++++++++++++++---------------
 4 files changed, 375 insertions(+), 130 deletions(-)

-- 
2.7.4