On 1 Jun 2018, at 12:45, Harry van Haaren wrote:
Lookup consists of 3 main stages:
1) hashing of the packet miniflow based on subtable mf bits
2) lookup of that hash into cmap
3) verification of the rule based on the rule mf bits
Before this commit, the iteration of stages 1) and 3) was
totally independent, and the work done was duplicated,
because the bits in the subtable and the rule are
identical.
This commit optimizes away this duplicate work by caching
the data values retrieved from the packet miniflow, storing
them in a linear array that matches the subtable mf blocks.
Once the cmap lookup has completed, the *same* miniflow
must be iterated, and the cached miniflow block data is now
re-used. This avoids iterating the packet miniflow a
second time, and reduces the overhead of rule verification.
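A minimal sketch of the caching idea described above, not the code from
the patch. All names here (toy_miniflow, toy_subtable, toy_rule,
toy_gather_and_hash, toy_verify) are hypothetical, and the layout is an
assumed simplification: a miniflow is modelled as a 64-bit map plus a
dense array of 64-bit blocks, and the hash is a simple FNV-style mix
rather than whatever OVS actually uses. Stage 1 gathers the masked packet
blocks into a linear array while hashing; stage 3 compares a candidate
rule against that array without touching the packet miniflow again.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_BLOCKS 64

/* Hypothetical packet miniflow: bit N of 'map' set means block N is
 * present; present blocks are stored densely in 'values', in bit order. */
struct toy_miniflow {
    uint64_t map;
    uint64_t values[TOY_MAX_BLOCKS];
};

/* Hypothetical subtable: 'mf_bits' selects the blocks it matches on;
 * 'mask' holds one mask per selected block, in bit order. */
struct toy_subtable {
    uint64_t mf_bits;
    uint64_t mask[TOY_MAX_BLOCKS];
};

/* Hypothetical rule: pre-masked values, one per subtable block. */
struct toy_rule {
    uint64_t values[TOY_MAX_BLOCKS];
};

static unsigned
toy_popcount(uint64_t x)
{
    unsigned n = 0;

    for (; x; x &= x - 1) {
        n++;
    }
    return n;
}

/* Stage 1: walk the subtable bits once, store each masked packet block
 * in 'cache' (linear, matching the subtable block order) and hash it. */
static uint32_t
toy_gather_and_hash(const struct toy_miniflow *pkt,
                    const struct toy_subtable *st,
                    uint64_t *cache, size_t *n_blocks)
{
    uint64_t hash = 0xcbf29ce484222325ULL;
    uint64_t bits = st->mf_bits;
    size_t i = 0;

    while (bits) {
        uint64_t lowest = bits & (~bits + 1);   /* Isolate lowest set bit. */
        uint64_t value = 0;                     /* Absent blocks read as 0. */

        if (pkt->map & lowest) {
            value = pkt->values[toy_popcount(pkt->map & (lowest - 1))];
        }
        cache[i] = value & st->mask[i];
        hash = (hash ^ cache[i]) * 0x100000001b3ULL;
        i++;
        bits &= bits - 1;                       /* Clear lowest set bit. */
    }
    *n_blocks = i;
    return (uint32_t) (hash ^ (hash >> 32));
}

/* Stage 3: verify a candidate rule against the cached blocks only. */
static int
toy_verify(const struct toy_rule *rule, const uint64_t *cache,
           size_t n_blocks)
{
    for (size_t i = 0; i < n_blocks; i++) {
        if (rule->values[i] != cache[i]) {
            return 0;
        }
    }
    return 1;
}
```

In this sketch the caller would use the returned hash for the stage 2
lookup, then pass the same cache array to toy_verify() for every rule
found in that hash bucket.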
Performance:
This patch was tested using VSPerf with phy2phy_cont test,
varying the number of flows, and with EMC en/dis-abled.
Results are based on initial testing here - please verify
with your use-case and report back :)
Flows | EMC | Master mpps | +Patch mpps | Perf Gain
----------------------------------------------------
4K | 1 | ~15.1 | ~15.2 | + ~0.6 %
4K | 0 | ~14.6 | ~15.9 | + ~8.2 %
1M | 1 | ~13.5 | ~14.4 | + ~6.5 %
1M | 0 | ~14.6 | ~15.8 | + ~8.2 %
The approximately neutral performance with EMC enabled and
a low flow count (4K) is because DPCLS is rarely hit, as
EMC will match many of the flows. In cases where EMC is
disabled, we see a consistent performance improvement as
DPCLS is used for each packet match.
Input and feedback welcomed,
Hi Harry,
As I promised you to do some tests, here are my findings…
For the PVP test, which runs at wire speed (XL710 @40g), I do not see
much improvement. Note that the “Number of flows” is the number of
traffic flows; the number of installed OpenFlow rules is double that,
i.e. to and from the VM.
For details on the test see https://github.com/chaudron/ovs_perf
+-----------------+--------+--------+--------+--------+--------+--------+--------+
| Packet size /   |     64 |    128 |    256 |    512 |    768 |   1024 |   1514 |
| Number of flows |        |        |        |        |        |        |        |
+-----------------+--------+--------+--------+--------+--------+--------+--------+
| 10              | -1,36% | -0,18% | -2,33% | -1,87% | -0,32% |  2,19% |  0,37% |
| 1000            |  0,16% |  0,06% | -0,14% | -0,02% | -0,75% | -0,05% | -0,20% |
| 10000           | -0,12% |  0,07% |  0,14% |  1,14% |  1,49% |  0,29% |  1,33% |
| 100000          | -0,84% |  0,51% |  0,92% |  0,87% |  1,40% |  1,82% |  1,22% |
| 1000000         |  0,28% |  3,85% | -0,27% |  6,46% | -0,31% |  4,06% | -2,48% |
+-----------------+--------+--------+--------+--------+--------+--------+--------+
However, using a zero packet loss test (like VSPerf), I do see numbers
similar to those above:
+--------------------+-------------+-------------+-------------+-------------+-------------+-------------+
| Packet size        |          64 |         128 |         256 |         512 |        1024 |        1518 |
+--------------------+-------------+-------------+-------------+-------------+-------------+-------------+
| rx pkts/sec DELTA  |       9,58% |       6,02% |       3,66% |       8,96% |       5,02% |      -0,10% |
| rx pkts/sec MASTER | 1397142,077 |   1348985,6 | 1356478,542 | 1254775,104 | 1077042,154 | 768962,1308 |
| rx pkts/sec PATCH  | 1545202,762 | 1438959,781 | 1407962,838 | 1378228,358 | 1133955,096 | 768201,7808 |
+--------------------+-------------+-------------+-------------+-------------+-------------+-------------+
Note: This is a test that uses the NORMAL rule, with 100 MAC address
pairs, and 1M constant flows, with an additional 200K flows changing
every second. This tries to simulate a Mobile NFV scenario.
I’ll try to actually review the patch later this week if time permits,
as I’m preparing for PTO.
Cheers,
Eelco
<SNIP>