Resending email to cc csit and vpp lists
+csit-dev vpp-dev
Karl,
We had a discussion about the variability in today's CSIT meeting.
Initial testing has shown that CSIT-925
<https://jira.fd.io/browse/CSIT-925> is the most promising lead in
explaining the problem. The first recommendation is to disable all
unused plugins.
The CSIT team encourages you to repeat the tests with all but the
needed plugins disabled and is interested in hearing the results.
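For reference, disabling everything except the plugins a test actually
needs can be done in the plugins stanza of the VPP startup config. A
minimal sketch (the re-enabled plugin here is just an example; adjust
to whatever your tests require):

```
plugins {
    ## Disable every plugin by default ...
    plugin default { disable }
    ## ... then re-enable only what the test needs (example: DPDK)
    plugin dpdk_plugin.so { enable }
}
```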
In addition, Ray Kinsella from Intel and I spoke during the meeting.
He is interested in reviewing your test setup and comparing it with
what they are doing internally at Intel and in FD.io CSIT.
I will start a separate thread with you, Ray, et al. about
comparing configs etc.
--Tom
On 02/20/2018 06:20 PM, Billy McFall wrote:
Hey Karl,
Thomas was going to follow-up with you and Andrew at Andrew's next
NetPerf meeting on the variability you are seeing in your VPP testing.
Not sure if that meeting has happened yet, but I wanted to touch
base with you because there was a lot of discussion in today's VPP
call around some of the variability they are seeing in CSIT. Most of
what they are reporting is being seen in VPP 18.01 and not in VPP
17.10. I think you are seeing it across the last couple of releases,
so some of the points below may not address your issue. One thought:
I wonder how long their tests run. I think your tests run for 5
minutes, if I remember correctly.
Couple of points:
* First, not sure if you saw, but VPP 18.01.1 was released on
2/7/2018. I attached a diff of the CLI to VPP 17.10. Probably too
late since you already ran VPP 18.01 through its paces. I'll try
to get it out earlier next release.
* The FD.io CSIT 18.01 Report has been released (based on VPP 18.01.1):
o https://docs.fd.io/csit/rls1801/doc/
* During the VPP 18.01 testing, some performance degradation was
discovered. In CSIT, they always test with all plugins installed
(default for VPP). They tracked down an issue in NAT where a
NAT worker thread was doing periodic work even though
NAT wasn't enabled (VPP-1162
<https://jira.fd.io/browse/VPP-1162>). That fix, along with a VTS fix,
was pushed for VPP 18.01.1.
* They have identified a few additional issues, not yet fixed,
that seem to be causing some variability in the CSIT
environment. Not sure if these could be causing the deviation
you are seeing:
o Known Issues
<https://docs.fd.io/csit/rls1801/report/vpp_performance_tests/csit_release_notes.html#known-issues>
Particularly:
+ CSIT-925 <https://jira.fd.io/browse/CSIT-925> - With all
plugins loaded (default VPP startup config) rates vary
intermittently 3% to 5% across multiple test executions.
Not seen in VPP 17.10 (so may not be what you are seeing)
and not seen if all plugins except DPDK are disabled.
+ CSIT-926 <https://jira.fd.io/browse/CSIT-926> - NDR, PDR,
and MaxRates are 1% to 3% lower vs. rls1710
+ CSIT-927 <https://jira.fd.io/browse/CSIT-927> - vhost-user
lower NDR: virtio vring size is not properly negotiated to
1024; instead it's set to the default of 256. They don't think
the code changed, so they are looking into the test setup or
test environment.
* Sections 2.2.2.1 and 2.2.2.2 from the report have links (for
example see pretty ASCII format for 1t1c
<https://docs.fd.io/csit/rls1801/report/_static/vpp/performance-changes-ndr-1t1c-full.txt>)
to a text file with rates and stdev for the tests. There are links
for NDR and PDR and 1t1c and 2t2c.
* I remember from previous VPP calls that the FD.io CSIT 18.01
Report was also held up to complete some pre and post Meltdown and
Spectre fix tests, comparing performance before and after OS
patches. I searched for Spectre in the report and came up with
this link, but the tests that are pointed to don't exist, so this
may still be a work in progress.
o Impact of SpectreAndMeltdown Patches
<https://docs.fd.io/csit/rls1801/report/vpp_performance_tests/impact_spectreandmeltdown/index.html>
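On the CSIT-927 vring-size point above: the vring depth a vhost-user
backend can negotiate is bounded by the queue size the guest's virtio
device was created with, so the QEMU device arguments are one thing
worth checking in the test environment. A hypothetical invocation
fragment (socket path and ids are placeholders, not from your setup):

```
-chardev socket,id=char0,path=/tmp/vhost-user0.sock \
-netdev vhost-user,id=net0,chardev=char0 \
-device virtio-net-pci,netdev=net0,rx_queue_size=1024,tx_queue_size=1024
```

If rx_queue_size/tx_queue_size are left at their defaults, the device
comes up with 256-entry rings regardless of what the backend supports.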
Billy McFall
On Mon, Feb 5, 2018 at 2:36 PM, Karl Rister (via Google Sheets)
<drive-shares-nore...@google.com
<mailto:drive-shares-nore...@google.com>> wrote:
kris...@redhat.com <mailto:kris...@redhat.com> has invited you to
*comment on* the following spreadsheet:
vpp-comparison-18.01
<https://docs.google.com/spreadsheets/d/1jFoQZieTT93xikWcjZU08J1kF6qn_tjTTC0A3mu0yyo/edit?usp=sharing_eil&ts=5a78b247>
Here is the latest set of results we have for
VPP testing with release 18.01. We did a bunch of cleanup on how
the results are presented to hopefully make it easier to comprehend.
One thing that stands out to me is that VPP in general has much
higher variability between the recorded samples than OVS (the
exception being tests where OVS scored very low; the variability
there is quite high since small differences between each sample
are magnified). The general trend is that VPP variability is
increasing at 1M flows and it's a bit mixed at 256 and 10K flows.
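A quick way to quantify the "small differences are magnified" effect
described above is the coefficient of variation (stdev divided by
mean): the same absolute spread looks much larger relative to a small
mean. A minimal sketch with made-up throughput samples (illustrative
values only, not measured data):

```python
import statistics

def relative_variability(samples):
    """Coefficient of variation: sample stdev as a fraction of the mean."""
    return statistics.stdev(samples) / statistics.mean(samples)

# Hypothetical samples in Mpps; both sets have a similar absolute spread.
high_rate = [9.8, 10.1, 9.9, 10.2, 10.0]    # healthy rates
low_rate = [0.10, 0.13, 0.09, 0.12, 0.11]   # near-zero rates

# The low-rate set shows roughly 10x the relative variability.
print(f"high-rate CV: {relative_variability(high_rate):.3f}")
print(f"low-rate  CV: {relative_variability(low_rate):.3f}")
```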
--
*Billy McFall*
Networking Group
CTO Office
*Red Hat*
--
*Thomas F Herbert*
NFV and Fast Data Planes
Networking Group Office of the CTO
*Red Hat*