> -----Original Message-----
> From: Ilya Maximets <[email protected]>
> Sent: Thursday, May 23, 2024 10:37 PM
> To: Van Haaren, Harry <[email protected]>; Chaudron, Eelco
> <[email protected]>; Finn, Emma <[email protected]>
> Cc: [email protected]; [email protected]
> Subject: Re: [ovs-dev] [v3] odp-execute: Fix AVX checksum calculation.
> 
> On 5/23/24 13:03, Van Haaren, Harry wrote:
> >> On 21 May 2024, at 16:13, Emma Finn wrote:
> >>> The AVX implementation for calculating checksums was not handling
> >>> carry-over addition correctly in some cases.
> >>> This patch adds an additional shuffle to add 16-bit padding to the
> >>> final part of the calculation to handle such cases.
> >>> This commit also adds a unit test to fuzz test the actions
> >>> autovalidator.
> >>>
> >>> Signed-off-by: Emma Finn <[email protected]>
> >>> Reported-by: Eelco Chaudron <[email protected]>
> > <snip>
> >>> +AT_SETUP([datapath - Actions Autovalidator Fuzzy])
> >>> +AT_SKIP_IF([! $PYTHON3 -c "import scapy"], [], [])
> >>> +AT_SKIP_IF([! $PYTHON3 $srcdir/genpkts.py 10000 fuzzy > packets])
> >>
> >> Hi Emma,
> >
> > Hi Eelco, Ilya & Emma (&OVS community)
> >
> > As you know, checksums are a very "input sensitive" workload; that is
> > largely why they are useful (ideally, the hash represents the input set
> > ~uniquely). Internally, computing a checksum always involves carry bits
> > and expected/benign overflows; what this patch fixes is an unintended/bad
> > overflow. Testing these corner cases & boundary conditions is hard; for
> > me the only way to be confident in the code's correctness is
> > autovalidation and fuzzing.
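> >
> > To make that concrete, here is a minimal scalar sketch (plain Python,
> > not the AVX code itself) of the RFC 1071 one's-complement sum that
> > IPv4/TCP checksums use; the bug class here is exactly a mishandled
> > carry in the folding step:
> >
> >     def ones_complement_checksum(data: bytes) -> int:
> >         """Scalar reference for the Internet checksum (RFC 1071)."""
> >         if len(data) % 2:
> >             data += b"\x00"  # pad odd-length input with a zero byte
> >         total = 0
> >         for i in range(0, len(data), 2):
> >             total += (data[i] << 8) | data[i + 1]
> >         # Fold the carries back into the low 16 bits.  Note that one
> >         # pass is not always enough -- dropping a late carry is the
> >         # kind of corner case a vectorized path can get wrong.
> >         while total >> 16:
> >             total = (total & 0xFFFF) + (total >> 16)
> >         return ~total & 0xFFFF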
> 
> Hi.  Fuzzing is indeed a valid testing approach, and in the case of checksums
> it may be one of the most effective.  However, as already mentioned, it has a
> few disadvantages.  The main one is unreliability - the test doesn't reproduce
> the issue every time it runs.  Another is time - it takes about 3 minutes to
> run the test.
> 
> Fuzzing tests are useful, but they should not be part of the unit testing,
> especially if they are not guaranteed to reproduce the issues they are written
> for.  They should be run separately and continuously.  Once an issue is found
> through fuzzing, a new unit test for the found problem can be added to the
> unit test suite, so we have a guaranteed failure in case the same issue is
> re-introduced.  Fuzzing can't guarantee that.
> 
> This is the approach we use with oss-fuzz.  The oss-fuzz project is
> continuously fuzzing parts of OVS.  Once they find a bug, we fix it and add a
> regression test, either for that exact case or for a class of issues, if
> possible.
> 
> >
> >> As Ilya explained in the v2 reply, we do not like to have fuzzy tests
> >> in the test suite as they are hard to reproduce.
> >> (link:
> >> https://patchwork.ozlabs.org/project/openvswitch/patch/20240514134815[email protected]/#3313143)
> >
> > I understand that any randomness in CI is not desired, and I understand
> > that random tests can be hard to reproduce. The autovalidator tries to
> > mitigate the "reproduce" issue by dumping the packet contents on
> > failure, providing the erroring case to allow easy reproduction.
> > This test *should* pass 100% reliably, so no false positives are
> > expected. If not, we should fix the test to be 100% reliable.
> 
> This test can't produce false failures, I agree.  However, it does produce
> false successes fairly regularly.  For example, for me it doesn't fail the
> IPv4 check in about 2 out of 5 runs.  Making it reproduce the issue more
> reliably would mean increasing the number of samples for an already very
> long test.
> 
> >
> >> So my request is to send out a v4 with a test modeled around the
> >> 'userspace offload - ip csum offload' test case we have in
> >> tests/dpif-netdev.at, as per Ilya’s request.
> >
> > I'll suggest 3 paths forward and give my pro/con list for each;
> > whichever option the OVS community prefers, we can send as a v4:
> >
> > 1) Accept "SIMD code fix" part of patch only (no unit test, just fixes issue
> >    with local fuzz-testing during development)
> > 2) Build "specific-to-this-issue unit-test" only (regression tests against
> >    this specific issue, but nothing else; low value in my opinion, as it
> >    leaves a large input space untested)
> 
> A large input space will remain untested anyway.  A fuzzy test only covers a
> tiny fraction of it at a time.
> 
> There is also an advantage to the specific regression test - we can check the
> expected output.  In this case we can capture the resulting packet and check
> that the checksum is not only the same between the action implementations,
> but also that it is correct.  This adds value to the test and makes it useful
> even if some implementations are not available on a testing system.
> 
> We can take the string representation of the packet, replace the IP,
> re-generate the hex, and compare with what we got on the other side in the
> pcap.
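> 
> Roughly something like this (a sketch using scapy, which the test suite
> already depends on; the file names and IPs are illustrative):
> 
>     from scapy.all import Ether, IP, TCP, rdpcap
> 
>     def expected_bytes(ref_pkt: bytes, new_src: str, new_dst: str) -> bytes:
>         """Rewrite the IPs on a reference packet and let scapy
>         recompute the IPv4 and TCP checksums from scratch."""
>         pkt = Ether(ref_pkt)
>         pkt[IP].src = new_src
>         pkt[IP].dst = new_dst
>         del pkt[IP].chksum   # deleting the checksum fields makes scapy
>         del pkt[TCP].chksum  # recompute them when the packet is rebuilt
>         return bytes(pkt)
> 
>     # e.g. ref_pkt from 'ovs-ofctl compose-packet --bare', the captured
>     # packet from the pcap file written by the test:
>     # captured = bytes(rdpcap("p1.pcap")[0])
>     # assert captured == expected_bytes(ref_pkt, "192.0.2.1", "192.0.2.2")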
> 
> > 3) Accept "v3 patch as is, with fuzzing" (best test coverage, confidence in
> >    current & future code changes. Costs CI runtime on fuzz tests, and if
> >    failures occur they must be flagged & fixed, but they are *actual real
> >    issues* that should be fixed.)
> >
> > I think 2) is what is being asked for above, but I wanted to make sure the
> > tradeoffs are understood.  As above, please indicate which approach is
> > preferred and a v4 will be on the way.
> >
> 
> Option 4:
> 
> Build a specific unit test that will test for a regression of this
> particular issue.  Send the patch out.
> 
> Then move all the existing fuzzing tests into a separate test target, e.g.
> 'make check-fuzz-autovalidator' (maybe there is a better name), and add the
> fuzzy actions test there.  Make some CI (realistically, the Intel CI, because
> other CI systems do not have AVX512) run it periodically or semi-continuously
> and report failures to the mailing list, so fixes can be developed and
> regression tests added to the main unit test suite.
> 
> What do you think?
> 

Thanks for the response, Ilya. I have pushed the v4 with the regression test
for the checksum issue. The test is modeled on the 'userspace offload - ip csum
offload' test case, as suggested.

I did not include the fuzz-test refactor you described above. That is a
separate refactor/improvement, and I think it is more important to address the
checksum fixes first.

Thanks, 
Emma 
> 
> 
> FWIW, I collected a few samples that can be used for the regression test:
> 
> eth_src=8a:bf:7e:2f:05:84,eth_dst=0a:8f:39:4f:e0:73,dl_type=0x0800,
> nw_src=9.77.254.180,nw_dst=216.53.194.20,nw_proto=6,nw_ttl=64,nw_frag=no,
> tp_src=14303,tp_dst=1044,tcp_flags=ack
> 
> eth_src=8a:bf:7e:2f:05:84,eth_dst=0a:8f:39:4f:e0:73,dl_type=0x0800,
> nw_src=108.72.150.185,nw_dst=247.95.158.114,nw_proto=6,nw_ttl=64,nw_frag=no,
> tp_src=32598,tp_dst=7421,tcp_flags=ack
> 
> eth_src=8a:bf:7e:2f:05:84,eth_dst=0a:8f:39:4f:e0:73,dl_type=0x0800,
> nw_src=106.187.156.70,nw_dst=117.142.150.202,nw_proto=6,nw_ttl=64,nw_frag=no,
> tp_src=10122,tp_dst=1462,tcp_flags=ack
> 
> eth_src=8a:bf:7e:2f:05:84,eth_dst=0a:8f:39:4f:e0:73,dl_type=0x86dd,
> ipv6_src=2f8a:2076:3926:9e7:2d47:4bc9:9c7:17f3,
> ipv6_dst=7287:10dd:2fb9:41d5:3eb2:2c7a:11b0:6258,
> ipv6_label=0x51ac,nw_proto=6,nw_ttl=142,nw_frag=no,
> tp_src=20405,tp_dst=20662,tcp_flags=ack
> 
> eth_src=8a:bf:7e:2f:05:84,eth_dst=0a:8f:39:4f:e0:73,dl_type=0x86dd,
> ipv6_src=630c:659b:2561:38e1:6989:4f2b:3386:51f6,
> ipv6_dst=658e:6f2:29fc:35b:5fdf:6558:119:40bc,
> ipv6_label=0x721b,nw_proto=6,nw_ttl=20,nw_frag=no,
> tp_src=2632,tp_dst=15427,tcp_flags=ack
> 
> eth_src=8a:bf:7e:2f:05:84,eth_dst=0a:8f:39:4f:e0:73,dl_type=0x86dd,
> ipv6_src=16bf:5603:6d40:1620:2a8e:6a83:4348:6136,
> ipv6_dst=5003:74fd:47cd:6b65:353c:7fc1:5ac5:7d08,
> ipv6_label=0x7661,nw_proto=6,nw_ttl=214,nw_frag=no,
> tp_src=21024,tp_dst=23624,tcp_flags=ack
> 
> Feeding them into 'ovs-ofctl compose-packet --bare' produces valid packets
> that trigger the issue.
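> 
> For example, a quick harness along these lines (hypothetical; it assumes
> ovs-ofctl is on PATH and that --bare prints the packet as a bare hex
> string, as the existing tests use it) could turn the samples above into
> packets and inspect the checksums scapy parses out of them:
> 
>     import subprocess
>     from scapy.all import Ether, IP
> 
>     FLOWS = [
>         "eth_src=8a:bf:7e:2f:05:84,eth_dst=0a:8f:39:4f:e0:73,"
>         "dl_type=0x0800,nw_src=9.77.254.180,nw_dst=216.53.194.20,"
>         "nw_proto=6,nw_ttl=64,nw_frag=no,"
>         "tp_src=14303,tp_dst=1044,tcp_flags=ack",
>         # ... the remaining samples from above ...
>     ]
> 
>     for flow in FLOWS:
>         hex_pkt = subprocess.check_output(
>             ["ovs-ofctl", "compose-packet", "--bare", flow],
>             text=True).strip()
>         pkt = Ether(bytes.fromhex(hex_pkt))
>         # compose-packet fills in a correct checksum; this is the
>         # reference value the datapath output must match.
>         print("ip_csum=0x%04x" % pkt[IP].chksum)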
> 
> Best regards, Ilya Maximets.