> -----Original Message-----
> From: Ilya Maximets <[email protected]>
> Sent: Thursday, May 23, 2024 10:37 PM
> To: Van Haaren, Harry <[email protected]>; Chaudron, Eelco
> <[email protected]>; Finn, Emma <[email protected]>
> Cc: [email protected]; [email protected]
> Subject: Re: [ovs-dev] [v3] odp-execute: Fix AVX checksum calculation.
>
> On 5/23/24 13:03, Van Haaren, Harry wrote:
> >> On 21 May 2024, at 16:13, Emma Finn wrote:
> >>> The AVX implementation for calculating checksums was not handling
> >>> carry-over addition correctly in some cases.
> >>> This patch adds an additional shuffle to add 16-bit padding to the
> >>> final part of the calculation to handle such cases.
> >>> This commit also adds a unit test to fuzz-test the actions
> >>> autovalidator.
> >>>
> >>> Signed-off-by: Emma Finn <[email protected]>
> >>> Reported-by: Eelco Chaudron <[email protected]>
> >
> > <snip>
> >
> >>> +AT_SETUP([datapath - Actions Autovalidator Fuzzy])
> >>> +AT_SKIP_IF([! $PYTHON3 -c "import scapy"], [], [])
> >>> +AT_SKIP_IF([! $PYTHON3 $srcdir/genpkts.py 10000 fuzzy > packets])
> >>
> >> Hi Emma,
> >
> > Hi Eelco, Ilya & Emma (& OVS community),
> >
> > As you know, checksums are a very "input sensitive" workload; that is
> > kind of why they're useful (ideally, the hash ~uniquely represents the
> > input set). Internally, computing a hash always involves carry bits
> > and (expected/good) overflows; this patch, however, fixes an
> > unintended/bad overflow. Testing these corner cases & boundary
> > conditions is hard; for me, the only way to be confident in its
> > correctness is autovalidation and fuzzing.
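
(Aside for anyone following along: the scalar reference behaviour here is
the ones'-complement carry fold. A minimal Python sketch of that fold,
not the AVX code itself:

    def csum16(data: bytes) -> int:
        """Ones'-complement (Internet) checksum over 16-bit words."""
        if len(data) % 2:
            data += b"\x00"  # pad odd-length input with a zero byte
        total = 0
        for i in range(0, len(data), 2):
            total += (data[i] << 8) | data[i + 1]
        # Fold the carries back into the low 16 bits; mishandling this
        # final fold is the class of bug the patch fixes.
        while total >> 16:
            total = (total & 0xffff) + (total >> 16)
        return ~total & 0xffff

The AVX path performs this fold with vector shuffles, which is where the
extra 16-bit padding added by the patch comes in.)
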
> Hi. Fuzzing is indeed a valid testing approach, and in the case of
> checksums it may be one of the most effective. However, as already
> mentioned, it has a few disadvantages. The main one is unreliability:
> the test doesn't reproduce the issue every time it runs. Another is
> time: the test takes about 3 minutes to run.
>
> Fuzzing tests are useful, but they should not be part of the unit
> testing, especially if they are not guaranteed to reproduce the issues
> they are written for. They should be run separately and continuously.
> Once an issue is found through fuzzing, a new unit test for the found
> problem can be added to the unit test suite, so we have a guaranteed
> failure in case the same issue is re-introduced. Fuzzing can't
> guarantee that.
>
> This is the approach we use with oss-fuzz. The oss-fuzz project
> continuously fuzzes parts of OVS. Once they find a bug, we fix it and
> add a regression test, either for that exact case or for a class of
> issues, if possible.
>
> >> As Ilya explained in the v2 reply, we do not like to have fuzzy tests
> >> in the test suite as they are hard to reproduce.
> >> (link:
> >> https://patchwork.ozlabs.org/project/openvswitch/patch/20240514134815
> >> [email protected]/#3313143)
> >
> > I understand that any randomness in CI is not desired, and that random
> > tests can be hard to reproduce. The autovalidator tries to mitigate
> > the "reproduce" issue by dumping the packet contents on failure,
> > providing the erroring case to easily allow reproduction.
> > This test *should* pass 100% reliably, so no false positives are
> > expected. If not, we should fix the test to be 100% reliable.
>
> This test can't produce false failures, I agree. However, it does
> produce false successes fairly regularly. For example, for me it
> doesn't fail the IPv4 check in about 2 out of 5 runs. Making it
> reproduce the issue more reliably would mean increasing the number of
> samples for an already very long test.
>
> >> So my request is to send out a v4 with a test modeled around the
> >> 'userspace offload - ip csum offload' test case we have in
> >> tests/dpif-netdev.at, as per Ilya's request.
> >
> > I'll suggest three paths forward and give my pros/cons for each;
> > whichever option the OVS community prefers, we can send as a v4:
> >
> > 1) Accept only the "SIMD code fix" part of the patch (no unit test;
> >    just fixes the issue found with local fuzz-testing during
> >    development).
> > 2) Build a "specific-to-this-issue unit test" only (regression-tests
> >    against this specific issue, but nothing else; low value in my
> >    opinion, as it leaves a large input space untested).
>
> A large input space will remain untested anyway; a fuzzy test only
> covers a tiny fraction of it at a time.
>
> There is also an advantage to the specific regression test: we can
> check the expected output. In this case, we can capture the resulting
> packet and check that the checksum is not only the same between the
> action implementations, but also that it is correct. This adds value
> to the test and makes it useful even if some implementations are not
> available on a testing system.
>
> We can take the string representation of the packet, replace the IP,
> re-generate the hex, and compare it with what we got on the other side
> in the pcap.
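
(Something along those lines should work. A rough scapy sketch of the
comparison; the pcap path, the rewritten address, and the helper name
are placeholders of mine, not the v4 test itself:

    #!/usr/bin/env python3
    # Rebuild the expected packet after an IPv4 source rewrite and
    # compare it with what the datapath actually emitted into the pcap.
    from scapy.all import IP, TCP, Ether, raw, rdpcap

    def expected_after_ipv4_rewrite(pkt_bytes: bytes, new_src: str) -> bytes:
        pkt = Ether(pkt_bytes)
        pkt[IP].src = new_src
        # Clear the stored checksums so scapy recomputes them on
        # rebuild, giving a known-good reference that is independent
        # of any OVS action implementation.
        del pkt[IP].chksum
        del pkt[TCP].chksum
        return raw(pkt)

    # Packet fed into OVS (built here from one of the sample flows).
    original = raw(Ether(dst="0a:8f:39:4f:e0:73") /
                   IP(src="9.77.254.180", dst="216.53.194.20") /
                   TCP(sport=14303, dport=1044, flags="A"))
    captured = raw(rdpcap("p1.pcap")[0])  # placeholder: what OVS emitted
    assert captured == expected_after_ipv4_rewrite(original, "192.168.0.1")

That checks correctness, not just agreement between implementations.)
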
> > 3) Accept the "v3 patch as is, with fuzzing" (best test coverage and
> >    confidence in current & future code changes; CI runtime is spent on
> >    fuzz tests, and if failures occur they must be flagged & fixed, but
> >    they are *actual real issues* that should be fixed).
> >
> > I think 2) is what is being asked for above, but I wanted to make sure
> > the tradeoffs are understood. As above, please indicate which is the
> > preferred approach and a v4 will be on the way.
>
> Option 4:
>
> Build a specific unit test that tests for a regression of this
> particular issue. Send the patch out.
>
> Then move all the existing fuzzing tests into a separate test target,
> e.g. 'make check-fuzz-autovalidator' (maybe there is a better name),
> and add the fuzzy actions test there. Make some CI (realistically,
> Intel CI, because other CI systems do not have AVX512) run it
> periodically or semi-continuously and report failures to the mailing
> list, so fixes can be developed and regression tests added to the main
> unit test suite.
>
> What do you think?

Thanks for the response, Ilya. I have pushed the v4 with the regression
test for the checksum issue. The test is modeled on the 'userspace
offload - ip csum offload' test case, as suggested. I did not include
the fuzz-test refactor you described above; that is a
refactor/improvement, and I think it is more important to address the
checksum fixes first.

Thanks,
Emma

> FWIW, I collected a few samples that can be used for the regression
> test:
>
> eth_src=8a:bf:7e:2f:05:84,eth_dst=0a:8f:39:4f:e0:73,dl_type=0x0800,
> nw_src=9.77.254.180,nw_dst=216.53.194.20,nw_proto=6,nw_ttl=64,nw_frag=no,
> tp_src=14303,tp_dst=1044,tcp_flags=ack
>
> eth_src=8a:bf:7e:2f:05:84,eth_dst=0a:8f:39:4f:e0:73,dl_type=0x0800,
> nw_src=108.72.150.185,nw_dst=247.95.158.114,nw_proto=6,nw_ttl=64,nw_frag=no,
> tp_src=32598,tp_dst=7421,tcp_flags=ack
>
> eth_src=8a:bf:7e:2f:05:84,eth_dst=0a:8f:39:4f:e0:73,dl_type=0x0800,
> nw_src=106.187.156.70,nw_dst=117.142.150.202,nw_proto=6,nw_ttl=64,nw_frag=no,
> tp_src=10122,tp_dst=1462,tcp_flags=ack
>
> eth_src=8a:bf:7e:2f:05:84,eth_dst=0a:8f:39:4f:e0:73,dl_type=0x86dd,
> ipv6_src=2f8a:2076:3926:9e7:2d47:4bc9:9c7:17f3,
> ipv6_dst=7287:10dd:2fb9:41d5:3eb2:2c7a:11b0:6258,
> ipv6_label=0x51ac,nw_proto=6,nw_ttl=142,nw_frag=no,
> tp_src=20405,tp_dst=20662,tcp_flags=ack
>
> eth_src=8a:bf:7e:2f:05:84,eth_dst=0a:8f:39:4f:e0:73,dl_type=0x86dd,
> ipv6_src=630c:659b:2561:38e1:6989:4f2b:3386:51f6,
> ipv6_dst=658e:6f2:29fc:35b:5fdf:6558:119:40bc,
> ipv6_label=0x721b,nw_proto=6,nw_ttl=20,nw_frag=no,
> tp_src=2632,tp_dst=15427,tcp_flags=ack
>
> eth_src=8a:bf:7e:2f:05:84,eth_dst=0a:8f:39:4f:e0:73,dl_type=0x86dd,
> ipv6_src=16bf:5603:6d40:1620:2a8e:6a83:4348:6136,
> ipv6_dst=5003:74fd:47cd:6b65:353c:7fc1:5ac5:7d08,
> ipv6_label=0x7661,nw_proto=6,nw_ttl=214,nw_frag=no,
> tp_src=21024,tp_dst=23624,tcp_flags=ack
>
> Feeding them into 'ovs-ofctl compose-packet --bare' produces valid
> packets that trigger the issue.
>
> Best regards, Ilya Maximets.
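
(A quick way to sanity-check those sample flows before wiring them into
a test. A rough sketch, assuming 'compose-packet --bare' prints the
composed packet as bare hex on stdout:

    #!/usr/bin/env python3
    # Compose one of the sample flows and check that the checksums in
    # the composed packet match what scapy computes for the same bytes.
    import subprocess
    from scapy.all import IP, TCP, Ether, raw

    flow = ("eth_src=8a:bf:7e:2f:05:84,eth_dst=0a:8f:39:4f:e0:73,"
            "dl_type=0x0800,nw_src=9.77.254.180,nw_dst=216.53.194.20,"
            "nw_proto=6,nw_ttl=64,nw_frag=no,"
            "tp_src=14303,tp_dst=1044,tcp_flags=ack")

    out = subprocess.check_output(
        ["ovs-ofctl", "compose-packet", "--bare", flow], text=True)
    pkt = Ether(bytes.fromhex("".join(out.split())))

    composed = (pkt[IP].chksum, pkt[TCP].chksum)
    del pkt[IP].chksum   # force scapy to recompute both checksums
    del pkt[TCP].chksum
    rebuilt = Ether(raw(pkt))
    assert (rebuilt[IP].chksum, rebuilt[TCP].chksum) == composed

Handy for double-checking the inputs before they go into the test.)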
