Hi all,

Thanks Billy for the test. The skewed traffic generator is very helpful.

After investigating with Billy, we found that the flow count in the results
table is not correct. The number of flows shown in the table is half of the
real number, so the case where CD/DFC performs much worse than master head is
the 2 million (not 1 million) flow case. Since CD/DFC has 1 million entries, it
suffers a lot of misses in that case (and, like EMC misses, those misses hurt
total throughput).

We are working on a better test script (including skewed traffic across
various numbers of subtables) and also a new patch set. We will post version 2
and new results soon.

Thanks
Yipeng

>-----Original Message-----
>From: O Mahony, Billy
>Sent: Thursday, March 8, 2018 2:31 AM
>To: Wang, Yipeng1 <[email protected]>; [email protected]; Wang, 
>Yipeng1 <[email protected]>;
>[email protected]; Bodireddy, Bhanuprakash 
><[email protected]>; [email protected]
>Cc: Tai, Charlie <[email protected]>; Franck Baudin <[email protected]>
>Subject: RE: [ovs-dev] [PATCH 0/3] dpif-netdev: Combine CD and DFC patch for 
>datapath refactor
>
>Hi All,
>
>I have run some tests using a more realistic distribution of flows - see
>below - than what we normally test with.
>
>This is a phy-to-phy test with just port forwarding, but I think that is the
>best way to test the EMC as it avoids noise from other mechanisms, e.g. vhost
>and dpcls lookup. It uses 64B packets and 1 PMD.
>
>I also ran the tests with the EMC disabled.
>
>                             Baseline HoM 951cbaf 6/3/18
>                ||   emc-insert-prob 1/100  |     emc disabled
>offered   flows ||   rxd    emc    cycles/  |  rxd  emc      cycles/
>kpps            ||  kpps   hits       pkt   | kpps hits          pkt
>----------------++--------------------------+-----------------------
>14,000        8 || 10830   100%       212   | 7360   0%          311
>14,000      100 ||  9730   100%       236   | 7370   0%          311
>14,000    10000 ||  6545    69%       345   | 7370   0%          311
>14,000  1000000 ||  6202     3%       370   | 7370   0%          311
>
>                                 Combine CD and DFC patch
>                ||  emc-insert-prob 1/100   |     emc disabled
>offered   flows ||  rxd    emc  dfc  cycles | rxd   emc   dfc cycles
>kpps            || kpps   hits hits    /pkt | kpps hits  hits   /pkt
>----------------++--------------------------+-----------------------
>14,000        8 || 10930  100%   0%     210 | 8570   0%  100%    268
>14,000      100 || 10220  100%   0%     224 | 7800   0%  100%    294
>14,000    10000 || 8000    84%  16%     287 | 6770   0%  100%    339
>14,000  1000000 || 5921     7%  65%     387 | 6060   0%   72%    378
>
>In these scenarios the patch gives an advantage at lower numbers of flows,
>but this advantage is reversed at very high flow numbers - presumably as the
>DFC itself approaches capacity.
>
>Another interesting scenario to test would be the case of many short-lived
>flows - 1M flows and 200k new flows/sec, for example. This real use case was
>presented at the last OvS Con:
>https://www.slideshare.net/LF_OpenvSwitch/lfovs17ovsdpdk-for-nfv-go-live-feedback
>I'd hope to implement this test in due course.
>
>Below are some details on how and why the flow distribution was made for
>these tests.
>
>Regards,
>Billy
>
>
>All caches are designed on the assumption that, in the real world, access
>requests are not uniformly distributed.  By design they improve performance
>only in situations where some items are accessed more frequently than others.
>If this assumption does not hold then using a cache actually degrades
>performance.
>
>Currently we test the EMC with one of two scenarios, both of which break the
>above assumption:
>
>1) Round Robin 'Distribution':
>
>    The TG sends a packet from flow#1 then flow#2 up to flow#N then back to
>    flow#1 again.
>
>    Testing with this case gives results that under-state the benefit of the
>    EMC to the maximum extent possible.  By sending a packet from every other
>    flow in between every two packets from any given flow, the chances of the
>    flow having been evicted in the interim between those two packets are
>    maximized.  If a tester were to intentionally design a test to understate
>    the benefits of the EMC, it would use a round-robin flow distribution.
>
>2) Uniform Distribution:
>
>    The TG selects the flow of the next packet to send uniformly at random
>    from all the configured flows.
>
>    Testing with this case also gives results that under-state the benefit of
>    the EMC.  As each flow is equally likely to occur, unless the number of
>    flows is less than the number of EMC entries - i.e. when there are many
>    more flows than entries - there is likely no benefit from using the EMC.
>
>By testing only with flow distributions that break the fundamental assumption
>justifying the use of an EMC, we are consistently under-stating the benefits
>of flow caching.
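>
>For illustration, here is a minimal Python sketch (not the actual
>traffic-generator code; the generator names are hypothetical) of how these
>two schemes choose the flow for each transmitted packet:
>
>    import random
>
>    N_FLOWS = 10000
>
>    def round_robin_flows():
>        # flow#1, flow#2, ..., flow#N, flow#1, ...: between any two packets
>        # of a given flow, a packet from every other flow is sent, so the
>        # chance of eviction in the interim is maximised.
>        i = 0
>        while True:
>            yield i % N_FLOWS
>            i += 1
>
>    def uniform_flows():
>        # every flow is equally likely, so no flow is "hot"; once N_FLOWS
>        # exceeds the EMC size, hits become essentially accidental.
>        while True:
>            yield random.randrange(N_FLOWS)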
>
>A More Realistic Distribution:
>
>A more realistic distribution is almost certainly some kind of power-law or
>Pareto distribution.  In this kind of distribution the majority of packets
>belong to a small number of flows.  Think of the classic '80/20' rule.  Given
>a power-law distribution, if we rank all the flows by their packets-per-second
>we see that the flow with the Nth highest packets-per-second has a rate that
>is some fraction of that of the (N-1)th flow, for all N.  This kind of
>distribution is seen in the natural world in the ranking of word occurrence in
>a language, the population ranks of cities in various countries, corporation
>sizes, income rankings, ranks of numbers of people watching the same TV
>channel, and so on. [https://en.wikipedia.org/wiki/Zipf%27s_law]
>
>For example, using a Zipfian distribution with k=1 as the model for per-flow
>pps, the flow distribution would look like this (y-axis is not linear):
>
>        1000| *
>         500| *  *
>         333| *  *  *
>PPS      250| *  *  *  *
>           ..
>          10| *  *  *  *       *
>            +------------...-------------
>              1  2  3  4 ...  100
>              Flow (PPS Ranking)
>
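>A minimal sketch of that model (a hypothetical helper in Python, not the
>actual test script): under Zipf with exponent k=1 the flow ranked N gets 1/N
>of the rank-1 flow's rate, scaled so the rates sum to the offered load:
>
>    def zipf_pps(n_flows, offered_pps, k=1.0):
>        # weight of the rank-r flow is 1/r**k; rank 1 is the heaviest flow
>        weights = [1.0 / r ** k for r in range(1, n_flows + 1)]
>        total = sum(weights)
>        return [offered_pps * w / total for w in weights]
>
>    # e.g. 100 flows sharing an offered load of 14,000 kpps
>    rates = zipf_pps(100, 14000e3)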
>
>The Trex software-based traffic generator, with a small amount of additional
>scripting, does allow an approximation of such a distribution (the specific
>limitation is that instead of a pps rate for each individual flow, a bucket of
>flows must share the same pps value - for example, below all the flows are
>distributed across 8 streams to approximate this kind of power-law
>distribution).
>
>For example:
>
># start 14000kpps 10000flows
>Each stream has 14000/8 kpps but a different number of flows; each stream has
>about half as many flows as the previous one.
>
># flows per stream: [5019, 2509, 1254, 627, 313, 156, 78, 39] total: 9995
>
>The actual number of flows per stream is a close approximation of the precise
>distribution - usually it is out by just one or two. This is done to allow the
>use of prime numbers for the ranges of the elements of the 5-tuple. For
>example, incrementing over 29 IP addresses and 173 port numbers gives
>29*173=5017 unique flows, as 29 and 173 are prime. Using non-prime ranges
>would result in aliasing of the combinations, giving far fewer unique flows.
>
># stream#: 0 req. #flows: 5019 tuple_spans: [29, 173] diff:  2
># stream#: 1 req. #flows: 2509 tuple_spans: [13, 193] diff:  0
># stream#: 2 req. #flows: 1254 tuple_spans: [5, 251] diff:  1
># stream#: 3 req. #flows: 627 tuple_spans: [313, 2] diff:  1
># stream#: 4 req. #flows: 313 tuple_spans: [1, 313] diff:  0
># stream#: 5 req. #flows: 156 tuple_spans: [1, 157] diff:  1
># stream#: 6 req. #flows: 78 tuple_spans: [1, 79] diff:  1
># stream#: 7 req. #flows: 39 tuple_spans: [3, 13] diff:  0
># stream#: 0 src: 16.0.0.0 - 16.0.0.28 : 0 - 172 dst 48.0.0.0 - 48.0.0.1 : 0 - 1
># stream#: 1 src: 16.0.0.29 - 16.0.0.41 : 173 - 365 dst 48.0.0.0 - 48.0.0.1 : 0 - 1
># stream#: 2 src: 16.0.0.42 - 16.0.0.46 : 366 - 616 dst 48.0.0.0 - 48.0.0.1 : 0 - 1
># stream#: 3 src: 16.0.0.47 - 16.0.1.103 : 617 - 618 dst 48.0.0.0 - 48.0.0.1 : 0 - 1
># stream#: 4 src: 16.0.1.104 - 16.0.1.104 : 619 - 931 dst 48.0.0.0 - 48.0.0.1 : 0 - 1
># stream#: 5 src: 16.0.1.105 - 16.0.1.105 : 932 - 1088 dst 48.0.0.0 - 48.0.0.1 : 0 - 1
># stream#: 6 src: 16.0.1.106 - 16.0.1.106 : 1089 - 1167 dst 48.0.0.0 - 48.0.0.1 : 0 - 1
># stream#: 7 src: 16.0.1.107 - 16.0.1.109 : 1168 - 1180 dst 48.0.0.0 - 48.0.0.1 : 0 - 1
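>
>As a rough sketch of the bucketing described above (hypothetical helpers in
>Python, not the actual Trex script - the exact heuristic it uses is an
>assumption here): the requested flow count is split across 8 streams, each
>stream getting roughly half as many flows as the previous one, and each
>stream's flow count is then approximated by a product of two prime (hence
>coprime) tuple spans:
>
>    from math import gcd
>
>    def flows_per_stream(total_flows, n_streams=8):
>        # weights 1/2, 1/4, ..., 1/2**n sum to (1 - 1/2**n); scale so the
>        # whole split adds up to roughly total_flows.
>        scale = total_flows / (1.0 - 0.5 ** n_streams)
>        return [round(scale * 0.5 ** (i + 1)) for i in range(n_streams)]
>
>    def is_prime(n):
>        return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))
>
>    def tuple_spans(n_flows, max_ip_span=350):
>        # pick prime spans p (IP addresses) and q (ports) with p*q close to
>        # the requested per-stream flow count; because gcd(p, q) == 1,
>        # stepping both header fields together yields p*q unique 5-tuples
>        # rather than lcm(p, q) < p*q.
>        best, best_diff = (1, n_flows), float("inf")
>        for p in (x for x in range(2, max_ip_span) if is_prime(x)):
>            q = max(2, round(n_flows / p))
>            while not is_prime(q):
>                q += 1
>            diff = abs(p * q - n_flows)
>            if p != q and gcd(p, q) == 1 and diff < best_diff:
>                best, best_diff = (p, q), diff
>        return best
>
>    print(flows_per_stream(10000))  # roughly the [5019, 2509, ...] split above
>    print(tuple_spans(5019))        # e.g. (29, 173) -> 29*173 = 5017 flows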
>
>
>
>> -----Original Message-----
>> From: [email protected] [mailto:ovs-dev-
>> [email protected]] On Behalf Of yipeng wang
>> Sent: Tuesday, March 6, 2018 7:23 PM
>> To: [email protected]; Wang, Yipeng1 <[email protected]>;
>> [email protected]; Bodireddy, Bhanuprakash
>> <[email protected]>; [email protected]
>> Cc: Tai, Charlie <[email protected]>
>> Subject: [ovs-dev] [PATCH 0/3] dpif-netdev: Combine CD and DFC patch for
>> datapath refactor
>>
>> This patch set is the V1 implementation combining the CD and DFC designs.
>> Both patches intend to refactor the datapath to avoid costly sequential
>> subtable searches.
>>
>> CD and DFC patch sets:
>> CD: [PATCH v2 0/5] dpif-netdev: Cuckoo-Distributor   implementation
>> https://mail.openvswitch.org/pipermail/ovs-dev/2017-October/340305.html
>>
>> DFC: [PATCH] dpif-netdev: Refactor datapath flow cache
>> https://mail.openvswitch.org/pipermail/ovs-dev/2017-November/341066.html
>>
>> The first commit is a rebase of Jan Scheurich's patch of [PATCH] dpif-netdev:
>> Refactor datapath flow cache
>>
>> The second commit is to incorporate CD's way-associative design into DFC to
>> improve the hit rate.
>>
>> The third commit is to change the distributor to cache an index of flow_table
>> entry to improve memory efficiency.
>>
>>
>> RFC of this patch set:
>> https://mail.openvswitch.org/pipermail/ovs-dev/2018-January/343411.html
>>
>>
>>
>> RFC->V1:
>> 1. rebase to master head.
>> 2. The last commit is totally rewritten to use the flow_table as an
>>    indirect table. The CD/DFC distributor will cache the index of the
>>    flow_table entry.
>> 3. Incorporate commit 2 into commit 1. (Bhanu's comment)
>> 4. Change DFC to be always on in commit 1. (Bhanu's comment)
>>
>>
>> Yipeng Wang (2):
>>   dpif-netdev: Use way-associative cache
>>   use flow_table as indirect table
>>
>> Jan Scheurich (1):
>>   dpif-netdev: Refactor datapath flow cache
>>
>>  lib/cmap.c             |  62 +++++++++
>>  lib/cmap.h             |   5 +
>>  lib/dpif-netdev-perf.h |   1 +
>>  lib/dpif-netdev.c      | 359 +++++++++++++++++++++++++++++++++----------------
>>  4 files changed, 310 insertions(+), 117 deletions(-)
>>
>> --
>> 2.7.4
>>
>> _______________________________________________
>> dev mailing list
>> [email protected]
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
