Karl, thanks for all the answers and clarifications!
~23 Mpps - that’s indeed your NIC; we’re observing the same in CSIT.
One more point re host setup - can you confirm that CPU power management
is disabled, CPU TurboBoost is disabled, and all memory channels are populated?
It looks so, but I wanted to recheck as it’s not listed on your opening slide :)
> On 15 Feb 2017, at 21:55, Karl Rister <kris...@redhat.com> wrote:
> On 02/15/2017 03:28 PM, Maciek Konstantynowicz (mkonstan) wrote:
>> Thomas, many thanks for sending this.
>> Few comments and questions after reading the slides:
>> 1. s3 clarification - host and data plane thread setup - vswitch pmd
>> (data plane) thread placement
>> a. “1PMD/core (4 core)” - HT (SMT) disabled, 4 phy cores used for
>> vswitch, each with data plane thread.
>> b. “2PMD/core (2 core)” - HT (SMT) enabled, 2 phy cores, 4 logical
>> cores used for vswitch, each with data plane thread.
>> c. in both cases each data plane thread handling a single interface
>> - 2* physical, 2* vhost => 4 threads, all busy.
>> d. in both cases frames are dropped by vswitch or in vring due to
>> vswitch not keeping up - IOW testpmd in kvm guest is not DUT.
> That is the intent.
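> For reference, the two placements in 1a/1b could be expressed in
> OVS-DPDK roughly as below. This is a sketch: the core numbers and
> sibling layout are illustrative, not taken from the slides.
>
> ```shell
> # (a) 1 PMD/core, HT disabled: 4 physical cores, e.g. cores 2-5
> ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x3c
>
> # (b) 2 PMD/core, HT enabled: 2 physical cores plus their HT siblings,
> # e.g. cores 2,3 with siblings 26,27 on a hypothetical 24-core/48-thread host
> ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0xc00000c
> ```
>
> With 4 bits set, OVS spawns one PMD thread per enabled logical core,
> giving 4 busy data-plane threads in both cases.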
>> 2. s3 question - vswitch setup - it is unclear what is the forwarding
>> mode of each vswitch, as only srcIp changed in flows
>> a. flow or MAC learning mode?
> In OVS we program flow rules that pass bidirectional traffic between a
> physical and vhost port pair.
>> b. port to port crossconnect?
> In VPP we are using xconnect.
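> Concretely, the two forwarding setups described above might look like
> the following. Interface names, bridge name, and port numbers are
> illustrative placeholders, not the actual test configuration.
>
> ```shell
> # VPP: L2 cross-connect each physical port to its vhost-user peer
> vppctl set interface l2 xconnect TenGigabitEthernet5/0/0 VirtualEthernet0/0/0
> vppctl set interface l2 xconnect VirtualEthernet0/0/0 TenGigabitEthernet5/0/0
>
> # OVS: explicit port-to-port flow rules (no MAC learning), bidirectional
> ovs-ofctl add-flow br0 in_port=1,actions=output:3
> ovs-ofctl add-flow br0 in_port=3,actions=output:1
> ```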
>> 3. s3 comment - host and data plane thread setup
>> a. “2PMD/core (2 core)” case - thread placement may yield different
>> - physical interface threads as siblings vs.
>> - physical and virtual interface threads as siblings.
> In both OVS and VPP a physical interface thread is paired with a virtual
> interface thread on the same core.
>> b. “1PMD/core (4 core)” - one would expect these to be much higher
>> than “2PMD/core (2 core)”
>> - speculation: possibly due to "instruction load" imbalance
>> between threads.
>> - two types of thread with different "instruction load":
>> phy->vhost vs. vhost->phy
>> - "instruction load" = instr/pkt, instr/cycle (IPC efficiency).
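>> One way to check the speculated imbalance (a sketch; core numbers are
>> illustrative): pin perf to the PMD cores and compare per-thread
>> counters between the phy->vhost and vhost->phy cores.
>>
>> ```shell
>> # Count instructions and cycles on the PMD cores for a fixed window;
>> # perf reports insn-per-cycle (IPC) directly.
>> perf stat -C 2,3 -e instructions,cycles -- sleep 10
>> # instr/pkt = instructions / packets forwarded in the same window
>> # (packet counts from the vswitch interface stats).
>> ```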
>> 4. s4 comment - results look as expected for vpp
>> 5. s5 question - unclear why throughput doubled
>> a. e.g. for vpp from "11.16 Mpps" to "22.03 Mpps"
>> b. if only queues increased, and cpu resources did not, or have they?
>> 6. s6 question - similar to point 5 - unclear cpu and thread resources.
> Queues and cores increase together. In the host single queue used 4 PMD
> threads on 2 core, two queue uses 8 PMD threads on 4 cores, and three
> queue uses 12 PMD threads on 6 cores. In the guest we used 2, 4, and 6
> cores in testpmd without using sibling hyperthreads in order to avoid
> bottlenecks in the guest.
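> The scaling described above can be summarized as follows. Note the
> 4-PMD-threads-per-queue factor is inferred from the numbers given
> (2 physical + 2 vhost ports, 2 PMD threads per core), not stated as a
> formula in the slides.
>
> ```python
> def pmd_layout(queues):
>     """Host PMD threads/cores and guest cores per queue count."""
>     pmd_threads = 4 * queues        # one PMD thread per port per queue
>     host_cores = pmd_threads // 2   # 2 PMD threads (HT siblings) per core
>     guest_cores = 2 * queues        # testpmd cores, no sibling hyperthreads
>     return pmd_threads, host_cores, guest_cores
>
> for q in (1, 2, 3):
>     print(q, pmd_layout(q))   # (4,2,2), (8,4,4), (12,6,6)
> ```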
>> 7. s7 comment - anomaly for 3q (virtio multi-queue) for (srcMAC, dstMAC)
>> a. could be due to flow hashing inefficiency.
> That was our thinking and where we were going to look first.
> I think I have tracked the three queue issue to using too many mbufs for
> multi-queue as addressed in this commit:
> I originally used the suggestion of 131072 from this page:
> I'm now testing with 3 queue and 32768 mbufs and getting in excess of 23
> Mpps across all the flow configurations except the one with the hashing
> issue. For our hardware configuration we believe this is hardware
> limited and could potentially go faster (as mentioned on slide 6).
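> As a rough illustration of why the pool size matters: the per-mbuf
> footprint below assumes a typical DPDK default (2048 B data room plus
> 128 B headroom), not a figure from this thread.
>
> ```python
> MBUF_SIZE = 2048 + 128  # assumed per-mbuf buffer size, typical DPDK default
> for n in (131072, 32768):
>     print(f"{n} mbufs ~= {n * MBUF_SIZE // 2**20} MiB of pool memory")
> ```
>
> Roughly 272 MiB vs 68 MiB - the larger pool far exceeds LLC capacity,
> which is consistent with the oversized pool hurting multi-queue throughput.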
>>> On 15 Feb 2017, at 17:34, Thomas F Herbert <therb...@redhat.com
>>> <mailto:therb...@redhat.com>> wrote:
>>> Here are test results on VPP 17.01 compared with OVS/DPDK 2.6/1611
>>> performed by Karl Rister of Red Hat.
>>> This is PVP testing with 1, 2 and 3 queues. It is an interesting
>>> comparison with the CSIT results. Of particular interest is the drop
>>> off on the 3 queue results.
>>> *Thomas F Herbert*
>>> SDN Group
>>> Office of Technology
>>> *Red Hat*
>>> vpp-dev mailing list
>>> firstname.lastname@example.org <mailto:email@example.com>
> Karl Rister <kris...@redhat.com>