Hi Ilya,
Just wanted to check if you found anything interesting, or anything we can try.
Thanks,
Shahaji
On Wed, Jun 20, 2018 at 9:01 AM, Shahaji Bhosle <[email protected]> wrote:
> Thanks Ilya,
> Sorry for the confusion with the numbers; we used to get slightly different
> numbers on the two ports, so we were recording them per port. You have to
> compare it with the two per-port numbers combined.
>
>                          Queues     CPU mask   Mpps
> 17.11 testpmd            6 queues   0xfe       21.5 + 21.5
> OvS 2.9 + DPDK 17.11     6 queues   0xfe       15.5 + 15.5
> 16.11 testpmd            6 queues   0xfe       21.5 + 21.5
> OvS 2.7 + DPDK 16.11     6 queues   0xfe       17.4 + 17.4
>
> Thanks, Shahaji
>
> On Wed, Jun 20, 2018 at 8:34 AM, Ilya Maximets <[email protected]> wrote:
>> Ok, I'll look at the data later.
>>
>> But your testpmd results are much lower than your OVS results: 21.5 Mpps
>> for testpmd versus 33.8 Mpps for OVS. OVS should be slower than testpmd,
>> because it performs a lot of parsing and processing while testpmd does not.
>> You probably tested testpmd in a different environment or allocated fewer
>> resources to the PMD threads. Could you please recheck?
>>
>> What is your OVS configuration (pmd-cpu-mask, n_rxqs etc.)?
>> And what is your testpmd command line?
>>
>> On 20.06.2018 14:54, Shahaji Bhosle wrote:
>>> Thanks Ilya,
>>> Attaching the two perf reports... We did run testpmd on its own, and
>>> there were no red flags there. In some cases, like flowgen, 17.11
>>> performs much better than 16.11, but for the macswap case the numbers
>>> are below. Let me know if you cannot see the attached perf reports; I
>>> can just cut and paste them into the email if the attachments do not
>>> work. Sorry, I am not sure I can post these on any outside servers.
>>> Let me know.
>>> Thanks, Shahaji
>>>
>>> DPDK on Maia (macswap)   Rings      Mpps          Cycles/Packet
>>> 17.11 testpmd            6 queues   21.5 + 21.5   60
>>>                          1 queue    10.4 + 10.4   14
>>> 16.11 testpmd            6 queues   21.5 + 21.5   60
>>>                          1 queue    10.4 + 10.4   14
>>>
>>> On Wed, Jun 20, 2018 at 4:52 AM, Ilya Maximets <[email protected]> wrote:
>>>> Looking at your perf stats I see the following:
>>>>
>>>> OVS 2.7:
>>>>
>>>> ??.??% - dp_netdev_process_rxq_port
>>>>   |-- 93.36% - dp_netdev_input
>>>>   |-- ??.??% - netdev_rxq_recv
>>>>
>>>> OVS 2.9:
>>>>
>>>> 99.69% - dp_netdev_process_rxq_port
>>>>   |-- 79.45% - dp_netdev_input
>>>>   |-- 11.26% - dp_netdev_pmd_flush_output_packets
>>>>   |-- ??.??% - netdev_rxq_recv
>>>>
>>>> Could you please fill in the missing (??.??) values?
>>>> I got this data from the picture attached to the previous mail, but
>>>> pictures are still not allowed on the mailing list (i.e. they are
>>>> stripped). It would be good if you could upload your raw data to some
>>>> external resource and post the link here.
>>>>
>>>> Anyway, from the data I have, I can see that the total time spent in
>>>> "dp_netdev_input" and "dp_netdev_pmd_flush_output_packets" for 2.9 is
>>>> 90.71%, which is less than the 93.36% spent for 2.7. This means that
>>>> processing + sending became even faster or stayed at approximately the
>>>> same performance. We definitely need all the missing values to be sure,
>>>> but it seems that "netdev_rxq_recv()" could be the issue.
>>>>
>>>> To check whether DPDK itself causes the performance regression, I'd ask
>>>> you to run a pure PHY-PHY test with the testpmd app from DPDK 16.11 and
>>>> DPDK 17.11. Maybe it's a performance issue with the bnxt driver that
>>>> you're using. There were a lot of changes in that driver:
>>>>
>>>>   30 files changed, 17189 insertions(+), 3358 deletions(-)
>>>>
>>>> Best regards, Ilya Maximets.
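>>>>
>>>> P.S. For the PHY-PHY check, a baseline invocation would be roughly the
>>>> sketch below; the core list, PCI addresses, memory and queue counts are
>>>> only placeholders, so substitute whatever matches your bnxt ports and
>>>> core layout:
>>>>
>>>>   # PHY-PHY macswap forwarding sketch (DPDK 17.11-style options)
>>>>   ./testpmd -l 0-6 -n 4 -w 0000:03:00.0 -w 0000:03:00.1 --socket-mem 1024 -- \
>>>>       -i --forward-mode=macswap --rxq=6 --txq=6 --nb-cores=6
>>>>
>>>> Running the same invocation on both DPDK versions keeps everything
>>>> except the DPDK release and its bnxt driver constant.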
>>>>
>>>> On 20.06.2018 01:18, Shahaji Bhosle wrote:
>>>>> Hi Ilya,
>>>>> This issue is a release blocker for us; we just wanted to check whether
>>>>> you need more details from us. We can help with anything that would
>>>>> expedite or root-cause the problem.
>>>>> Please let us know.
>>>>> Thanks, Shahaji
>>>>>
>>>>> On Mon, Jun 18, 2018 at 10:20 AM Shahaji Bhosle <[email protected]> wrote:
>>>>>> Thanks Ilya, I will look at the commit, but now I am not sure how to
>>>>>> tell how much real work is being done; I would have liked polling
>>>>>> cycles to be counted as before and not towards packet processing.
>>>>>> That does explain it: as long as there are packets on the wire we are
>>>>>> always at 100%, so we basically cannot tell how efficiently the CPUs
>>>>>> are being used.
>>>>>> Thanks, Shahaji
>>>>>>
>>>>>> On Mon, Jun 18, 2018 at 10:07 AM, Ilya Maximets <[email protected]> wrote:
>>>>>>> Thanks for the data.
>>>>>>>
>>>>>>> I have to note additionally that the meaning of "processing cycles"
>>>>>>> changed significantly with the following commit:
>>>>>>>
>>>>>>> commit a2ac666d5265c01661e189caac321d962f54649f
>>>>>>> Author: Ciara Loftus <[email protected]>
>>>>>>> Date:   Mon Feb 20 12:53:00 2017 +0000
>>>>>>>
>>>>>>>     dpif-netdev: Change definitions of 'idle' & 'processing' cycles
>>>>>>>
>>>>>>>     Instead of counting all polling cycles as processing cycles, only count
>>>>>>>     the cycles where packets were received from the polling.
>>>>>>>
>>>>>>> This could explain the difference in the "PMD Processing Cycles"
>>>>>>> column, because successful "POLLING" cycles are now included in
>>>>>>> "PROCESSING".
>>>>>>>
>>>>>>> Best regards, Ilya Maximets.
>>>>>>>
>>>>>>> On 18.06.2018 16:31, Shahaji Bhosle wrote:
>>>>>>>> Hi Ilya,
>>>>>>>> Thanks for the quick reply.
>>>>>>>> Please find the numbers for our PHY-PHY test. Please note that with
>>>>>>>> OVS 2.9.1 + DPDK 17.11 even 10% of the load shown below makes the
>>>>>>>> processing cycles hit 100%, while 2.7 on our setup never goes above
>>>>>>>> 75% processing cycles. I am also attaching the perf report comparing
>>>>>>>> the two code bases, and I think the
>>>>>>>> "11.26%--dp_netdev_pmd_flush_output_packets" is causing us to take
>>>>>>>> the performance hit. Our testing is also SR-IOV, and the CPUs are
>>>>>>>> ARM A72 cores. We are happy to run more tests; it is not easy for us
>>>>>>>> to move back to OVS 2.8, but we would be happy to try more
>>>>>>>> experiments if it helps narrow this down further. Please note we
>>>>>>>> have also tried increasing the tx-flush-interval, and it helps a
>>>>>>>> little but is still not significant enough. Let us know.
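>>>>>>>>
>>>>>>>> For reference, the knob and the counters mentioned above are of the
>>>>>>>> following form; the interval value shown is simply the 50 we tried,
>>>>>>>> not a recommendation, and the exact stats output differs a bit
>>>>>>>> between 2.7 and 2.9:
>>>>>>>>
>>>>>>>>   # time-based output batching interval, in microseconds
>>>>>>>>   ovs-vsctl set Open_vSwitch . other_config:tx-flush-interval=50
>>>>>>>>   # reset and re-read the per-PMD idle/processing cycle split
>>>>>>>>   ovs-appctl dpif-netdev/pmd-stats-clear
>>>>>>>>   ovs-appctl dpif-netdev/pmd-stats-show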
>>>>>>>>
>>>>>>>> Thanks, Shahaji
>>>>>>>>
>>>>>>>> Setup:
>>>>>>>> IXIA<----SFP28--->Port 0 {(PF0)==[OVS+DPDK]==(PF1)} Port 1<-----SFP28---->IXIA
>>>>>>>>
>>>>>>>> release/version        config                            Test direction   Mpps          Ixia line rate (%)   PMD processing cycles (%)
>>>>>>>> OVS 2.9 + DPDK 17.11   OVS on Maia (PF0--PF1), no drop   port 1 to 2      31.3          85                   99.9
>>>>>>>>                                                          port 2 to 1      31.3          85                   99.9
>>>>>>>>                                                          bidirectional    15.5 + 15.5   42                   99.9
>>>>>>>>
>>>>>>>> OVS 2.7 + DPDK 16.11   OVS on Maia (PF0--PF1), no drop   port 1 to 2      33.8          90                   71
>>>>>>>>                                                          port 2 to 1      32.7          88                   70
>>>>>>>>                                                          bidirectional    17.4 + 17.4   47                   74
>>>>>>>>
>>>>>>>> On Mon, Jun 18, 2018 at 4:25 AM, Nitin Katiyar <[email protected]> wrote:
>>>>>>>>> Hi,
>>>>>>>>> We also experienced degradation from OVS 2.6/2.7 to OVS 2.8.2 (with
>>>>>>>>> DPDK 17.05.02). The drop is larger for the 64-byte packet size
>>>>>>>>> (~8-10%), even with a higher number of flows. I tried OVS 2.8 with
>>>>>>>>> DPDK 17.11 and it improved for larger packet sizes, but the 64-byte
>>>>>>>>> size is still the concern.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Nitin
>>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Ilya Maximets [mailto:[email protected]]
>>>>>>>>> Sent: Monday, June 18, 2018 1:32 PM
>>>>>>>>> To: [email protected]; [email protected]
>>>>>>>>> Subject: Re: [ovs-dev] 64Byte packet performance regression on 2.9 from 2.7
>>>>>>>>>
>>>>>>>>> CC: Shahaji Bhosle
>>>>>>>>>
>>>>>>>>> Sorry, missed you in CC list.
>>>>>>>>>
>>>>>>>>> Best regards, Ilya Maximets.
>>>>>>>>>
>>>>>>>>> On 15.06.2018 10:44, Ilya Maximets wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>> I just upgraded from OvS 2.7 + DPDK 16.11 to OvS 2.9 + DPDK 17.11
>>>>>>>>>>> and am running into a performance issue with the 64-byte packet
>>>>>>>>>>> rate. One interesting thing I notice is that even at very light
>>>>>>>>>>> load from IXIA the processing cycles on all the PMD threads run
>>>>>>>>>>> close to 100% of the CPU cycles on OvS 2.9, but on OvS 2.7 even
>>>>>>>>>>> under full load the processing cycles remain at 75% of the CPU
>>>>>>>>>>> cycles.
>>>>>>>>>>>
>>>>>>>>>>> Attaching the FlameGraphs of both versions; the only thing that
>>>>>>>>>>> pops out to me is that on 2.9 netdev_send() is now invoked via
>>>>>>>>>>> dp_netdev_pmd_flush_output_packets(), which seems to add another
>>>>>>>>>>> ~11% to the whole rx-to-tx path.
>>>>>>>>>>>
>>>>>>>>>>> I also tried setting the tx-flush-interval to 50 and higher, and
>>>>>>>>>>> it does seem to help, but not significantly enough to match the
>>>>>>>>>>> 2.7 performance.
>>>>>>>>>>>
>>>>>>>>>>> Any help or ideas would be really great. Thanks, Shahaji
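>>>>>>>>>>>
>>>>>>>>>>> P.S. In case it helps to reproduce the profiles on your side:
>>>>>>>>>>> FlameGraphs like the attached ones can be produced with an
>>>>>>>>>>> ordinary perf capture of the vswitchd PMD threads, roughly as
>>>>>>>>>>> below. The sampling rate and duration are only examples, and the
>>>>>>>>>>> last step assumes Brendan Gregg's FlameGraph scripts are in the
>>>>>>>>>>> current directory:
>>>>>>>>>>>
>>>>>>>>>>>   # sample call stacks of ovs-vswitchd (PMD threads included)
>>>>>>>>>>>   perf record -F 99 -g -p $(pidof ovs-vswitchd) -- sleep 30
>>>>>>>>>>>   # fold the stacks and render the flame graph
>>>>>>>>>>>   perf script | ./stackcollapse-perf.pl | ./flamegraph.pl > pmd_flame.svg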
>>>>>>>>>>
>>>>>>>>>> Hello, Shahaji.
>>>>>>>>>> Could you, please, describe your testing scenario in more detail?
>>>>>>>>>> Also, the mailing list filters attachments, so they are not
>>>>>>>>>> available. You need to publish them somewhere else or include them
>>>>>>>>>> as text inside the letter.
>>>>>>>>>>
>>>>>>>>>> About the performance itself: some performance degradation because
>>>>>>>>>> of output batching is expected for tests with a low number of
>>>>>>>>>> flows or simple PHY-PHY tests. It was mainly targeted at cases
>>>>>>>>>> with a relatively large number of flows, to amortize vhost-user
>>>>>>>>>> penalties (PHY-VM-PHY, VM-VM cases), and at OVS bonding cases.
>>>>>>>>>>
>>>>>>>>>> If your test involves vhost-user ports, then you should also
>>>>>>>>>> consider the vhost-user performance regression in stable DPDK
>>>>>>>>>> 17.11 caused by the fixes for CVE-2018-1059. Related bug:
>>>>>>>>>> https://dpdk.org/tracker/show_bug.cgi?id=48
>>>>>>>>>>
>>>>>>>>>> It would be good if you were able to test OVS 2.8 + DPDK 17.05.
>>>>>>>>>> There were too many changes since 2.7, so it will be hard to track
>>>>>>>>>> down the root cause otherwise.
>>>>>>>>>>
>>>>>>>>>> Best regards, Ilya Maximets.

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
