That's what I mean. It could go in the 'tests' directory.
On Tue, Nov 07, 2017 at 05:01:04PM +0000, Wang, Yipeng1 wrote:
> Thanks Ben,
>
> Do you mean to include the TRex script in the repo? Could you suggest
> more details, such as a suitable place to put this kind of test script?
>
> Thanks
>
> > -----Original Message-----
> > From: Ben Pfaff [mailto:[email protected]]
> > Sent: Friday, November 3, 2017 10:59 AM
> > To: Wang, Yipeng1 <[email protected]>
> > Cc: [email protected]
> > Subject: Re: [ovs-dev] [PATCH v2 0/5] dpif-netdev: Cuckoo-Distributor
> > implementation
> >
> > Is this something that should be included in the repo?
> >
> > On Fri, Nov 03, 2017 at 04:14:56PM +0000, Wang, Yipeng1 wrote:
> > > To make it easier for the code reviewers to build and test the
> > > patchset, a TRex profile that presents a very simple synthetic test
> > > case of random traffic with 20 different IP src and 50K different IP
> > > dst is attached. It can be used together with the rule set we
> > > mentioned in the cover letter to generate a uniform distribution of
> > > hits among the 20 subtables. This synthetic traffic pattern represents
> > > the worst-case scenario for the current subtable ranking method. We
> > > observe about 2x speedup vs. the original OvS in this case. Note that
> > > the patchset automatically detects whether there is a benefit to
> > > turning CD on or off to accommodate any traffic pattern, so when the
> > > subtable ranking works perfectly, CD will not be enabled and will not
> > > harm the performance.
> > >
> > > One can change the dstip and srcip_cnt variables to generate different
> > > numbers of flows and subtable count scenarios.
> > >
> > > ----
> > > import locale, sys, time
> > > from signal import *
> > >
> > > import stl_path
> > > from trex_stl_lib.api import *
> > >
> > > [tx_port, rx_port] = my_ports = [0, 1]
> > > tx_ports = [tx_port]
> > > rx_ports = [rx_port]
> > >
> > > global c
> > >
> > > # dst IP varies from 0.0.0.0 to 0.0.195.255, which is about 50k flows.
> > > # src IP varies from 1.0.0.0 to 20.0.0.0, which is 20 flows.
> > > # 50k * 20 is about 1M total flows
> > > dstip = "0.0.195.255"
> > > srcip_cnt = 20
> > > size = 64
> > >
> > > #create stream blocks. Each stream has one srcIP with various dstIP.
> > > #There are in total of 20 different srcIP.
> > > def make_streams():
> > >     streams = [
> > >         {"base_pkt":Ether()/IP(src="{}.0.0.0".format(i), tos=0x20)/UDP(),
> > >          "vm":[
> > >              STLVmFlowVar(name="ip_dst",min_value="0.0.0.0",max_value=dstip,size=4,op="random"),
> > >              STLVmWrFlowVar(fv_name="ip_dst",pkt_offset="IP.dst"),
> > >          ]
> > >         }
> > >         for i in range(1,srcip_cnt + 1)
> > >     ]
> > >     return streams
> > >
> > > if __name__ == "__main__":
> > >
> > >     c = STLClient(verbose_level = LoggerApi.VERBOSE_QUIET)
> > >     c.connect()
> > >
> > >     c.reset(ports = my_ports)
> > >     new_streams = make_streams()
> > >
> > >     for s in new_streams:
> > >         # 64 - 4 for FCS
> > >         pad = max(0, size - 4 - len(s["base_pkt"])) * 'x'
> > >         s["base_pkt"] = s["base_pkt"]/pad
> > >
> > >     pkts = [STLPktBuilder(pkt = s["base_pkt"], vm = s["vm"]) for s in new_streams]
> > >
> > >     #generate contiguous traffic. Each stream has equal bandwidth.
> > >     final_streams = [STLStream(packet = pkt, mode = STLTXCont(percentage = 100.0/len(pkts))) for pkt in pkts]
> > >     c.add_streams(final_streams, ports=[tx_port])
> > >     c.set_port_attr(my_ports, promiscuous = True)
> > >
> > >     #start the traffic
> > >     c.start(ports = tx_ports)
> > >     #wait for 20 seconds
> > >     time.sleep(20)
> > >     #write rx pps to stdout
> > >     sys.stdout.write(str("RX PPS: ")+str(int(c.get_stats(my_ports)[1]["rx_pps"])) + str("\n"))
> > >     #stop the traffic
> > >     c.stop(ports=my_ports)
> > >     c.disconnect()
> > >     c = None
> > > ----
> > >
> > >
> > > > -----Original Message-----
> > > > From: Wang, Yipeng1
> > > > Sent: Tuesday, October 31, 2017 4:40 PM
> > > > To: [email protected]
> > > > Cc: Wang, Yipeng1 <[email protected]>; Gobriel, Sameh
> > > > <[email protected]>; Fischetti, Antonio
> > > > <[email protected]>; [email protected];
> > > > [email protected]
> > > > Subject: [PATCH v2 0/5] dpif-netdev: Cuckoo-Distributor implementation
> > > >
> > > > The Datapath Classifier uses tuple space search for flow
> > > > classification. The rules are arranged into a set of tuples/subtables
> > > > (each with a distinct mask). Each subtable is implemented as a hash
> > > > table, and lookup is done with flow keys formed by selecting the bits
> > > > from the packet header based on each subtable's mask. Tuple space
> > > > search will sequentially search each subtable until a match is found.
> > > > With a large number of subtables, a sequential search of the
> > > > subtables could consume a lot of CPU cycles. In a testbench with a
> > > > uniform traffic pattern equally distributed across 20 subtables, we
> > > > measured that up to 65% of total execution time is attributed to the
> > > > megaflow cache lookup.
> > > >
> > > > This patch presents the idea of a two-layer hierarchical lookup,
> > > > where a low-overhead first level of indirection is accessed first; we
> > > > call this level the cuckoo distributor (CD).
> > > > If a flow key has been inserted in the flow table, the first level
> > > > will indicate with high probability which subtable to look into. A
> > > > lookup is then performed on the second level (the target subtable) to
> > > > retrieve the result. If the key doesn’t have a match, we revert to
> > > > the sequential search of subtables. The patch is partially inspired
> > > > by earlier concepts proposed in "simTable"[1] and "Cuckoo Filter"[2],
> > > > and by DPDK's Cuckoo Hash implementation.
> > > >
> > > > This patch can improve the already existing Subtable Ranking when
> > > > traffic data has high entropy. Subtable Ranking helps minimize the
> > > > number of traversed subtables when most of the traffic hits the same
> > > > subtable. However, in the case of high-entropy traffic, such as
> > > > traffic coming from a physical port, multiple subtables could be hit
> > > > with a similar frequency. In this case the average subtable lookups
> > > > per hit would be much greater than 1. In addition, CD can adaptively
> > > > turn off when it finds that the traffic mostly hits one subtable.
> > > > Thus, CD will not be an overhead when Subtable Ranking works well.
> > > >
> > > > Scheme:
> > > > CD is in front of the subtables. Packets are directed to the
> > > > corresponding subtable if they hit in CD, instead of searching each
> > > > subtable sequentially.
> > > >  -------
> > > > |  CD   |
> > > >  -------
> > > >     \
> > > >      \
> > > >  -----  -----     -----
> > > > |sub  ||sub  |...|sub  |
> > > > |table||table|   |table|
> > > >  -----  -----     -----
> > > >
> > > > Evaluation:
> > > > ----------
> > > > We create a set of rules with various src IP. We feed traffic
> > > > containing various numbers of flows with various src IP and dst IP.
> > > > All the flows hit 10/20/30 rules, creating 10/20/30 subtables. We
> > > > will explain the rule/traffic setup in detail later.
> > > >
> > > > The table below shows the preliminary continuous testing results
> > > > (full line speed test) we collected with a uni-directional phy-to-phy
> > > > setup. OvS runs with 1 PMD. We use Spirent as the hardware traffic
> > > > generator.
> > > >
> > > > Before v2 rebase:
> > > > ----
> > > > AVX2 data:
> > > > 20k flows:
> > > > no.subtable:  10        20        30
> > > > cd-ovs        4267332   3478251   3126763
> > > > orig-ovs      3260883   2174551   1689981
> > > > speedup       1.31x     1.60x     1.85x
> > > >
> > > > 100k flows:
> > > > no.subtable:  10        20        30
> > > > cd-ovs        4015783   3276100   2970645
> > > > orig-ovs      2692882   1711955   1302321
> > > > speedup       1.49x     1.91x     2.28x
> > > >
> > > > 1M flows:
> > > > no.subtable:  10        20        30
> > > > cd-ovs        3895961   3170530   2968555
> > > > orig-ovs      2683455   1646227   1240501
> > > > speedup       1.45x     1.92x     2.39x
> > > >
> > > > Scalar data:
> > > > 1M flows:
> > > > no.subtable:  10        20        30
> > > > cd-ovs        3658328   3028111   2863329
> > > > orig-ovs      2683455   1646227   1240501
> > > > speedup       1.36x     1.84x     2.31x
> > > >
> > > > After v2 rebase:
> > > > ----
> > > > After rebase for v1, we tested the 1M flows, 20 subtable case; the
> > > > results still hold.
> > > > 1M flows:
> > > > no.subtable:  20
> > > > cd-ovs        3066483
> > > > orig-ovs      1588049
> > > > speedup       1.93x
> > > >
> > > >
> > > > Test rules/traffic setup:
> > > > ----
> > > > To set up a test case with 20 subtables, the rule set we use is like
> > > > below:
> > > > tcp,nw_src=1.0.0.0/8, actions=output:1
> > > > udp,nw_src=2.0.0.0/9, actions=output:1
> > > > udp,nw_src=3.0.0.0/10,actions=output:1
> > > > udp,nw_src=4.0.0.0/11,actions=output:1
> > > > ...
> > > > udp,nw_src=18.0.0.0/25,actions=output:1
> > > > udp,nw_src=19.0.0.0/26,actions=output:1
> > > > udp,nw_src=20.0.0.0/27,actions=output:1
> > > >
> > > > Then for the traffic generator, we generate corresponding traffic
> > > > with src_ip varying from 1.0.0.0 to 20.0.0.0. For each src_ip, we
> > > > vary dst_ip over 50000 different values.
> > > > This will effectively generate 1M different flows hitting the 20
> > > > rules we created. And because of the different wildcarding bits in
> > > > nw_src, the 20 rules will belong to 20 subtables.
> > > > We use 64-byte packets across all tests.
> > > >
> > > > How to check if CD works or not for your use case:
> > > > ----
> > > > CD cannot improve throughput for all use cases. It targets use cases
> > > > where multiple subtables exist and the top-ranked subtable is not hit
> > > > by the vast majority of the traffic.
> > > >
> > > > One can use the $OVS_DIR/utilities/ovs-appctl
> > > > dpif-netdev/pmd-stats-show command to check CD statistics: hit/miss.
> > > > Another statistic also shown is "avg. subtable lookups per hit".
> > > > In our test case, the original OvS will have an average subtable
> > > > lookups value of about 10, because there are 20 subtables in total
> > > > and, on average, a hit happens after iterating half of them. In such
> > > > a case, iterating 10 subtables is very expensive.
> > > >
> > > > By using CD, this value will be close to 1, which means that on
> > > > average only 1 subtable needs to be iterated to hit the rule, which
> > > > removes a lot of overhead.
> > > >
> > > > Other statistics to note are "megaflow hits" and "emc hits".
> > > > If most packets hit the EMC, CD does not improve throughput much,
> > > > since CD is used to optimize megaflow search instead of EMC lookup.
> > > > If your test case has fewer than 8k flows, all of them may hit the
> > > > EMC.
> > > >
> > > > Note that CD is adaptively turned on/off according to the number of
> > > > subtables and their iteration pattern. If it finds there is not much
> > > > benefit, CD will turn itself off automatically.
> > > >
> > > >
> > > > References:
> > > > ----------
> > > > [1] H. Lee and B. Lee, Approaches for improving tuple space
> > > > search-based table lookup, ICTC '15
> > > > [2] B. Fan, D. G.
> > > > Andersen, M. Kaminsky, and M. D. Mitzenmacher, Cuckoo Filter:
> > > > Practically Better Than Bloom, CoNEXT '14
> > > >
> > > > The previous RFC on the mailing list is at:
> > > > https://mail.openvswitch.org/pipermail/ovs-dev/2017-April/330570.html
> > > >
> > > > v2: Rebase to master head.
> > > >     Add more testing details in cover letter.
> > > >     Change commit messages.
> > > >     Minor style changes to code.
> > > >     Fix build errors that happen without AVX and the DPDK library.
> > > >
> > > > Yipeng Wang (5):
> > > >   dpif-netdev: Basic CD feature with scalar lookup.
> > > >   dpif-netdev: Add AVX2 implementation for CD lookup.
> > > >   dpif-netdev: Add CD statistics
> > > >   dpif-netdev: Add adaptive CD mechanism
> > > >   unit-test: Add a delay for CD initialization.
> > > >
> > > >  lib/dpif-netdev.c     | 567 +++++++++++++++++++++++++++++++++++++++++++++++++-
> > > >  tests/ofproto-dpif.at |   3 +
> > > >  2 files changed, 560 insertions(+), 10 deletions(-)
> > > >
> > > > --
> > > > 2.7.4
> > >
> > > _______________________________________________
> > > dev mailing list
> > > [email protected]
> > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
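
For anyone reading the thread without the patch at hand, the two-level lookup described in the cover letter can be sketched roughly as below. This is only an illustration under stated assumptions, not the code in lib/dpif-netdev.c: the names Subtable, Classifier, build_classifier and average_probes are hypothetical, the flow key is reduced to a (src_ip, dst_ip) pair, and a plain Python dict stands in for the cuckoo distributor, so unlike a real signature-based structure it never returns a wrong subtable hint. The rule set mirrors the 20-subtable example (nw_src prefixes /8 through /27).

```python
from typing import Dict, List, Optional, Tuple

Key = Tuple[int, int]  # simplified flow key: (src_ip, dst_ip) as 32-bit integers


class Subtable:
    """One tuple-space subtable: a mask pair plus a hash table of masked keys."""

    def __init__(self, src_mask: int, dst_mask: int) -> None:
        self.src_mask = src_mask
        self.dst_mask = dst_mask
        self.rules: Dict[Key, str] = {}

    def _masked(self, key: Key) -> Key:
        return (key[0] & self.src_mask, key[1] & self.dst_mask)

    def insert(self, key: Key, action: str) -> None:
        self.rules[self._masked(key)] = action

    def lookup(self, key: Key) -> Optional[str]:
        return self.rules.get(self._masked(key))


class Classifier:
    """Sequential tuple space search with a first-level hint table in front."""

    def __init__(self, subtables: List[Subtable]) -> None:
        self.subtables = subtables
        self.cd: Dict[Key, int] = {}  # assumption: dict stands in for the CD
        self.probes = 0               # number of subtable lookups performed

    def lookup(self, key: Key) -> Optional[str]:
        # First level: ask the distributor which subtable to try.
        hint = self.cd.get(key)
        if hint is not None:
            self.probes += 1
            action = self.subtables[hint].lookup(key)
            if action is not None:
                return action
        # Miss in the first level: fall back to the sequential search
        # and remember which subtable matched.
        for index, subtable in enumerate(self.subtables):
            self.probes += 1
            action = subtable.lookup(key)
            if action is not None:
                self.cd[key] = index
                return action
        return None


def build_classifier(n_subtables: int = 20) -> Classifier:
    """Build rules i.0.0.0/(7+i) for i = 1..n, one rule per subtable."""
    subtables = []
    for i in range(1, n_subtables + 1):
        prefix_len = 7 + i
        src_mask = (0xFFFFFFFF << (32 - prefix_len)) & 0xFFFFFFFF
        subtable = Subtable(src_mask, 0)  # dst_ip is fully wildcarded
        subtable.insert((i << 24, 0), "output:1")
        subtables.append(subtable)
    return Classifier(subtables)


def average_probes(classifier: Classifier, keys: List[Key]) -> float:
    """Look up every key once and return the mean subtable lookups per key."""
    classifier.probes = 0
    for key in keys:
        classifier.lookup(key)
    return classifier.probes / len(keys)
```

With one flow per source address spread evenly over the 20 subtables, the first pass over the keys (distributor still empty) averages 10.5 subtable lookups per hit, in line with the "about 10" figure in the cover letter, and a second pass over the same keys averages 1.0.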
