Any comments on this patchset? Adding Jan in CC.
At one of the recent bi-weekly meetings there was some interest in testing
this patchset together with the patch that avoids using the EMC for
recirculated packets. That patch is contained in the patchset at
https://mail.openvswitch.org/pipermail/ovs-dev/2017-July/335938.html

Thanks,
-Antonio

> -----Original Message-----
> From: [email protected] [mailto:[email protected]]
> On Behalf Of Wang, Yipeng1
> Sent: Tuesday, July 11, 2017 8:59 PM
> To: Darrell Ball <[email protected]>; [email protected]
> Subject: Re: [ovs-dev] [PATCH 0/5] dpif-netdev: Cuckoo-Distributor
> implementation
>
> Thank you Darrell for the comments.
>
> For those who are interested, this patch is mainly for improving the
> subtable lookup process when the subtable count is large. We have heard
> about use cases where the current sequential search of subtables is not
> efficient enough. With 30 subtables, this patch could achieve more than
> 2x speedup. Basically, a hash table is used to direct the packets to the
> correct subtable.
>
> We also plan a replacement policy mechanism for version 2; our initial
> results show another 7% improvement on top of the current CD for certain
> use cases.
>
> Please feel free to comment and share any thoughts on this patch.
>
> Thanks
> Yipeng
>
> > -----Original Message-----
> > From: Darrell Ball [mailto:[email protected]]
> > Sent: Friday, July 7, 2017 6:37 PM
> > To: Wang, Yipeng1 <[email protected]>; [email protected]
> > Subject: Re: [ovs-dev] [PATCH 0/5] dpif-netdev: Cuckoo-Distributor
> > implementation
> >
> > I just noticed this patch set has not had much discussion since the RFC
> > version.
> > It would be nice if the discussion can be revived.
> >
> > Thanks Darrell
> >
> >
> > On 6/13/17, 4:09 PM, "[email protected] on behalf of
> > [email protected]" <[email protected] on behalf of
> > [email protected]> wrote:
> >
> > From: Yipeng Wang <[email protected]>
> >
> > The Datapath Classifier uses tuple space search for flow classification.
> > The rules are arranged into a set of tuples/subtables (each with a
> > distinct mask). Each subtable is implemented as a hash table, and lookup
> > is done with flow keys formed by selecting the bits from the packet
> > header based on each subtable's mask. Tuple space search sequentially
> > searches each subtable until a match is found. With a large number of
> > subtables, a sequential search of the subtables could consume a lot of
> > CPU cycles. In a testbench with a uniform traffic pattern equally
> > distributed across 20 subtables, we measured that up to 65% of total
> > execution time is attributed to the megaflow cache lookup.
> >
> > This patch presents the idea of a two-layer hierarchical lookup, where a
> > low-overhead first level of indirection is accessed first; we call this
> > level the cuckoo distributor (CD). If a flow key has been inserted in the
> > flow table, the first level will indicate with high probability which
> > subtable to look into. A lookup is then performed on the second level
> > (the target subtable) to retrieve the result. If the key doesn’t have a
> > match there, we fall back to the sequential search of subtables. The
> > patch is partially inspired by earlier concepts proposed in
> > "simTable"[1] and "Cuckoo Filter"[2], and by DPDK's Cuckoo Hash
> > implementation.
> >
> > This patch can improve on the already existing Subtable Ranking when
> > traffic data has high entropy. Subtable Ranking helps minimize the
> > number of traversed subtables when most of the traffic hits the same
> > subtable. However, in the case of high-entropy traffic, such as traffic
> > coming from a physical port, multiple subtables could be hit with a
> > similar frequency. In this case the average number of subtable lookups
> > per hit would be much greater than 1. In addition, CD can adaptively
> > turn itself off when it finds that the traffic mostly hits one subtable.
> > Thus, CD will not add overhead when Subtable Ranking works well.
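
[Editorial illustration, not part of the patch] The two-level lookup described above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the code in lib/dpif-netdev.c: flow keys are reduced to a single 32-bit value, a direct-mapped signature table stands in for the cuckoo hash, the distributor is assumed to be filled lazily when the sequential search finds a match, and every name (cls_lookup, cd_entry, N_SUBTABLES, and so on) is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* All sizes and names below are hypothetical, chosen for illustration. */
#define N_SUBTABLES    4
#define SUBTABLE_SLOTS 64
#define CD_SLOTS       256
#define CD_EMPTY       0xff

struct subtable {
    uint32_t mask;                    /* Key bits this subtable matches on. */
    uint32_t keys[SUBTABLE_SLOTS];    /* Stored masked keys. */
    bool used[SUBTABLE_SLOTS];
};

struct cd_entry {
    uint16_t sig;                     /* Short signature of the flow key. */
    uint8_t subtable;                 /* Subtable index, or CD_EMPTY. */
};

struct classifier {
    struct subtable subtables[N_SUBTABLES];
    struct cd_entry cd[CD_SLOTS];     /* Stand-in for the cuckoo distributor. */
    unsigned probes;                  /* Subtable lookups performed so far. */
};

/* Simple 32-bit integer mixer (murmur3-style finalizer). */
static uint32_t hash32(uint32_t x)
{
    x ^= x >> 16; x *= 0x85ebca6bu;
    x ^= x >> 13; x *= 0xc2b2ae35u;
    x ^= x >> 16;
    return x;
}

static void cls_init(struct classifier *cls, const uint32_t masks[N_SUBTABLES])
{
    memset(cls, 0, sizeof *cls);
    for (int i = 0; i < N_SUBTABLES; i++) {
        cls->subtables[i].mask = masks[i];
    }
    for (int i = 0; i < CD_SLOTS; i++) {
        cls->cd[i].subtable = CD_EMPTY;
    }
}

/* Insert a rule: store the key masked with the subtable's mask
 * (open addressing with linear probing). */
static bool subtable_insert(struct subtable *st, uint32_t key)
{
    uint32_t masked = key & st->mask;
    uint32_t i = hash32(masked) % SUBTABLE_SLOTS;

    for (unsigned n = 0; n < SUBTABLE_SLOTS; n++, i = (i + 1) % SUBTABLE_SLOTS) {
        if (!st->used[i] || st->keys[i] == masked) {
            st->used[i] = true;
            st->keys[i] = masked;
            return true;
        }
    }
    return false;                     /* Subtable is full. */
}

static bool subtable_lookup(const struct subtable *st, uint32_t key)
{
    uint32_t masked = key & st->mask;
    uint32_t i = hash32(masked) % SUBTABLE_SLOTS;

    for (unsigned n = 0; n < SUBTABLE_SLOTS && st->used[i];
         n++, i = (i + 1) % SUBTABLE_SLOTS) {
        if (st->keys[i] == masked) {
            return true;
        }
    }
    return false;
}

/* Two-level lookup: consult the distributor first, then fall back to the
 * sequential search.  Returns the matching subtable index, or -1. */
static int cls_lookup(struct classifier *cls, uint32_t key)
{
    uint32_t h = hash32(key);
    struct cd_entry *e = &cls->cd[h % CD_SLOTS];
    uint16_t sig = (uint16_t) (h >> 16);

    /* First level: a signature match only hints at a subtable, so the
     * hint is still verified against that subtable. */
    if (e->subtable != CD_EMPTY && e->sig == sig) {
        cls->probes++;
        if (subtable_lookup(&cls->subtables[e->subtable], key)) {
            return e->subtable;
        }
    }

    /* Fallback: sequential tuple space search.  On a match, remember the
     * subtable in the distributor (assumed lazy-fill policy). */
    for (int i = 0; i < N_SUBTABLES; i++) {
        cls->probes++;
        if (subtable_lookup(&cls->subtables[i], key)) {
            e->sig = sig;
            e->subtable = (uint8_t) i;
            return i;
        }
    }
    return -1;
}
```

In this sketch, with a rule installed only in the third of four subtables, the first lookup of a matching key walks three subtables, while a repeated lookup of the same key costs a single subtable probe because the distributor points straight at it. A false or stale hint costs one extra probe before the sequential search takes over, which is why the hint has to be right with high probability for the scheme to pay off.
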
> >
> > Scheme:
> >
> >      -------
> >     |  CD   |
> >      -------
> >         \
> >          \
> >   -----  -----     -----
> >  |sub  ||sub  |...|sub  |
> >  |table||table|   |table|
> >   -----  -----     -----
> >
> > Evaluation:
> >
> > We create a set of rules with various src IPs. We feed traffic containing
> > various numbers of flows with various src IPs and dst IPs. All the flows
> > hit 10/20/30 rules, creating 10/20/30 subtables.
> >
> > The tables below show the preliminary continuous testing results (full
> > line speed test) we collected with a uni-directional phy-to-phy setup.
> > The machine we tested on is a Xeon E5 server running with 2.2GHz cores.
> > OvS runs with 1 PMD. We use Spirent as the hardware traffic generator.
> >
> > AVX2 data:
> > 20k flows:
> > no.subtable:    10         20         30
> > cd-ovs          4267332    3478251    3126763
> > orig-ovs        3260883    2174551    1689981
> > speedup         1.31x      1.60x      1.85x
> >
> > 100k flows:
> > no.subtable:    10         20         30
> > cd-ovs          4015783    3276100    2970645
> > orig-ovs        2692882    1711955    1302321
> > speedup         1.49x      1.91x      2.28x
> >
> > 1M flows:
> > no.subtable:    10         20         30
> > cd-ovs          3895961    3170530    2968555
> > orig-ovs        2683455    1646227    1240501
> > speedup         1.45x      1.92x      2.39x
> >
> > Scalar data:
> > 1M flows:
> > no.subtable:    10         20         30
> > cd-ovs          3658328    3028111    2863329
> > orig-ovs        2683455    1646227    1240501
> > speedup         1.36x      1.84x      2.31x
> >
> > [1] H. Lee and B. Lee, Approaches for improving tuple space search-based
> > table lookup, ICTC '15
> > [2] B. Fan, D. G. Andersen, M. Kaminsky, and M. D. Mitzenmacher,
> > Cuckoo Filter: Practically Better Than Bloom, CoNEXT '14
> >
> > This patch set is created based on commit
> > a13784ba95efeb5a1f77253df40d433a1ce60087
> >
> > The previous RFCs on the mailing list are at:
> > https://mail.openvswitch.org/pipermail/ovs-dev/2017-May/331834.html
> > https://mail.openvswitch.org/pipermail/ovs-dev/2017-April/330570.html
> >
> > Signed-off-by: Yipeng Wang <yipeng1.wang at intel.com>
> > Signed-off-by: Charlie Tai <charlie.tai at intel.com>
> > Co-authored-by: Charlie Tai <charlie.tai at intel.com>
> > Signed-off-by: Sameh Gobriel <sameh.gobriel at intel.com>
> > Co-authored-by: Sameh Gobriel <sameh.gobriel at intel.com>
> > Signed-off-by: Ren Wang <ren.wang at intel.com>
> > Co-authored-by: Ren Wang <ren.wang at intel.com>
> > Signed-off-by: Antonio Fischetti <antonio.fischetti at intel.com>
> > Co-authored-by: Antonio Fischetti <antonio.fischetti at intel.com>
> >
> >
> > Yipeng Wang (5):
> >   dpif-netdev: Basic CD feature with scalar lookup.
> >   dpif-netdev: Add AVX2 implementation for CD lookup.
> >   dpif-netdev: Add CD statistics
> >   dpif-netdev: Add adaptive CD mechanism
> >   unit-test: Add a delay for CD initialization.
> >
> >  lib/dpif-netdev.c     | 566 +++++++++++++++++++++++++++++++++++++++++++++++++-
> >  tests/ofproto-dpif.at |   3 +
> >  2 files changed, 558 insertions(+), 11 deletions(-)
> >
> > --
> > 1.9.1
> >
> > _______________________________________________
> > dev mailing list
> > [email protected]
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
