On 5/23/24 13:03, Van Haaren, Harry wrote:
>> On 21 May 2024, at 16:13, Emma Finn wrote:
>>> The AVX implementation for calculating checksums was not
>>> handling carry-over addition correctly in some cases.
>>> This patch adds an additional shuffle to add 16-bit padding
>>> to the final part of the calculation to handle such cases.
>>> This commit also adds a unit test to fuzz test the actions
>>> autovalidator.
>>>
>>> Signed-off-by: Emma Finn <[email protected]>
>>> Reported-by: Eelco Chaudron <[email protected]>
> <snip>
>>> +AT_SETUP([datapath - Actions Autovalidator Fuzzy])
>>> +AT_SKIP_IF([! $PYTHON3 -c "import scapy"], [], [])
>>> +AT_SKIP_IF([! $PYTHON3 $srcdir/genpkts.py 10000 fuzzy > packets])
>>
>> Hi Emma,
>
> Hi Eelco, Ilya & Emma (& OVS community),
>
> As you know, checksums are a very "input sensitive" workload; that is kind
> of why they're useful (ideally, the hash ~uniquely represents the input set).
> Internally, computing a hash always involves carry bits and (expected/good)
> overflows; however, this patch fixes an unintended/bad overflow.
> Testing these corner cases & boundary conditions is hard; for me, the only
> way to be confident in its correctness is autovalidation and fuzzing.
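For context, the carry behaviour referred to above is the usual one's-complement fold from RFC 1071. A minimal scalar Python sketch (illustrative only - this is not the AVX512 implementation being fixed, and the function name is mine):

```python
def internet_checksum(data: bytes) -> int:
    """Scalar RFC 1071 one's-complement checksum (illustrative sketch)."""
    if len(data) % 2:           # pad odd-length input with a zero byte
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold the carry bits back into the low 16 bits.  A single fold can
    # itself produce a new carry, hence the loop -- this is exactly the
    # kind of carry-over handling that is easy to get wrong when the
    # computation is vectorized.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF
```

Whether a given input triggers an extra fold depends entirely on the data, which is why these corner cases are so hard to hit without fuzzing.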
Hi. Fuzzing is indeed a valid testing approach, and in the case of checksums
it may be one of the most effective. However, as already mentioned, it has a
few disadvantages. The main one is unreliability - the test doesn't reproduce
the issue every time it runs - and also time - it takes about 3 minutes to
run the test.

Fuzzing tests are useful, but they should not be part of the unit testing,
especially if they are not guaranteed to reproduce the issues they are
written for. They should be run separately and continuously. Once an issue is
found through fuzzing, a new unit test for the found problem can be added to
the unit test suite, so we have a guaranteed failure in case the same issue
is re-introduced. Fuzzing can't guarantee that.

This is the approach we use with oss-fuzz. The oss-fuzz project is
continuously fuzzing parts of OVS. Once they find a bug, we fix it and add a
regression test, either for that exact case or for a class of issues, if
possible.

>
>> As Ilya explained in the v2 reply, we do not like to have fuzzy tests in
>> the test suite as they are hard to reproduce.
>> (link:
>> https://patchwork.ozlabs.org/project/openvswitch/patch/[email protected]/#3313143)
>
> I understand that any randomness in CI is not desired, and understand that
> random tests can be hard to reproduce. The autovalidator tries to mitigate
> the "reproduce" issue by dumping the packet contents on failure, providing
> the erroring case to easily allow reproduction.
> This test *should* be 100% reliably passing, so no false positives are
> expected. If not, we should fix the test to be 100% reliable.

This test can't produce false failures, I agree. However, it does produce
false successes fairly regularly. For example, for me it doesn't fail the
IPv4 check in about 2 out of 5 runs. Making it more reliably reproduce the
issue would mean increasing the number of samples for an already very long
test.
>
>> So my request is to send out a v4 with a test modeled around the 'userspace
>> offload - ip csum offload' test case we have in tests/dpif-netdev.at, as
>> per Ilya’s request.
>
> I'll suggest 3 paths forward, and give my pro/cons list for each; whatever
> option is preferred by the OVS community we can send as a v4:
>
> 1) Accept the "SIMD code fix" part of the patch only (no unit test; this
>    just fixes the issue, with local fuzz-testing during development).
> 2) Build a "specific-to-this-issue unit test" only (regression tests
>    against this specific issue, but nothing else; low value in my opinion,
>    as it leaves a large input space untested).

A large input space will remain untested anyway. A fuzzy test only covers a
tiny fraction of it at a time.

There is also an advantage to the specific regression test - we can check
the expected output. In this case, we can capture the resulting packet and
check that the checksum is not only the same between the action
implementations, but also that it is correct. This adds value to the test
and makes it useful even if some implementations are not available on a
testing system. We can take the string representation of the packet,
replace the IP, re-generate the hex and compare with what we got on the
other side in the pcap.

> 3) Accept the "v3 patch as is, with fuzzing" (best test coverage and
>    confidence in current & future code changes; CI runtime is spent on
>    fuzz tests, and if failures occur, they must be flagged & fixed, but
>    they are *actual real issues* that should be fixed).
>
> I think 2) is what is being asked above, but wanted to make sure the
> tradeoffs are understood. As above, please indicate which is the preferred
> approach and a v4 will be on the way.
>

Option 4: Build a specific unit test that will test for regression of this
particular issue. Send the patch out. Then move all the existing fuzzing
tests into a separate test target, e.g. 'make check-fuzz-autovalidator'
(maybe there is a better name), and add the fuzzy actions test there.
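As a rough illustration of that "checksum is actually correct" check: given
the bare hex of a captured frame, the IPv4 header checksum can be recomputed
independently and compared with the stored value. The helper name and the
assumption of an untagged Ethernet/IPv4 frame are mine, not from the patch:

```python
def verify_ipv4_checksum(pkt_hex: str) -> bool:
    """Return True if the IPv4 header checksum in a captured frame is
    correct.  Assumes an untagged Ethernet frame given as bare hex
    (illustrative sketch, not code from the patch)."""
    frame = bytes.fromhex(pkt_hex)
    assert frame[12:14] == b"\x08\x00", "not an IPv4 frame"
    ihl = (frame[14] & 0x0F) * 4            # IPv4 header length in bytes
    header = frame[14:14 + ihl]
    # Zero out the checksum field (header bytes 10-11) and recompute
    # the RFC 1071 one's-complement sum over the header.
    zeroed = header[:10] + b"\x00\x00" + header[12:]
    total = 0
    for i in range(0, len(zeroed), 2):
        total += (zeroed[i] << 8) | zeroed[i + 1]
    while total >> 16:                      # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    expected = ~total & 0xFFFF
    stored = (header[10] << 8) | header[11]
    return stored == expected
```

With a check along these lines, the regression test stays meaningful even
when only the scalar action implementation is available on the testing
system, because correctness is verified against an independent computation
rather than against another implementation.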
Make some CI (realistically, Intel CI, because other CI systems do not have
AVX512) run it periodically or semi-continuously and report failures to the
mailing list, so fixes can be developed and regression tests added to the
main unit test suite.

What do you think?

FWIW, I collected a few samples that can be used for the regression test:

eth_src=8a:bf:7e:2f:05:84,eth_dst=0a:8f:39:4f:e0:73,dl_type=0x0800,
nw_src=9.77.254.180,nw_dst=216.53.194.20,nw_proto=6,nw_ttl=64,nw_frag=no,
tp_src=14303,tp_dst=1044,tcp_flags=ack

eth_src=8a:bf:7e:2f:05:84,eth_dst=0a:8f:39:4f:e0:73,dl_type=0x0800,
nw_src=108.72.150.185,nw_dst=247.95.158.114,nw_proto=6,nw_ttl=64,nw_frag=no,
tp_src=32598,tp_dst=7421,tcp_flags=ack

eth_src=8a:bf:7e:2f:05:84,eth_dst=0a:8f:39:4f:e0:73,dl_type=0x0800,
nw_src=106.187.156.70,nw_dst=117.142.150.202,nw_proto=6,nw_ttl=64,nw_frag=no,
tp_src=10122,tp_dst=1462,tcp_flags=ack

eth_src=8a:bf:7e:2f:05:84,eth_dst=0a:8f:39:4f:e0:73,dl_type=0x86dd,
ipv6_src=2f8a:2076:3926:9e7:2d47:4bc9:9c7:17f3,
ipv6_dst=7287:10dd:2fb9:41d5:3eb2:2c7a:11b0:6258,
ipv6_label=0x51ac,nw_proto=6,nw_ttl=142,nw_frag=no,
tp_src=20405,tp_dst=20662,tcp_flags=ack

eth_src=8a:bf:7e:2f:05:84,eth_dst=0a:8f:39:4f:e0:73,dl_type=0x86dd,
ipv6_src=630c:659b:2561:38e1:6989:4f2b:3386:51f6,
ipv6_dst=658e:6f2:29fc:35b:5fdf:6558:119:40bc,
ipv6_label=0x721b,nw_proto=6,nw_ttl=20,nw_frag=no,
tp_src=2632,tp_dst=15427,tcp_flags=ack

eth_src=8a:bf:7e:2f:05:84,eth_dst=0a:8f:39:4f:e0:73,dl_type=0x86dd,
ipv6_src=16bf:5603:6d40:1620:2a8e:6a83:4348:6136,
ipv6_dst=5003:74fd:47cd:6b65:353c:7fc1:5ac5:7d08,
ipv6_label=0x7661,nw_proto=6,nw_ttl=214,nw_frag=no,
tp_src=21024,tp_dst=23624,tcp_flags=ack

Feeding them into 'ovs-ofctl compose-packet --bare' produces valid packets
that trigger the issue.

Best regards,
Ilya Maximets.

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
