> -----Original Message-----
> From: dev <[email protected]> On Behalf Of David Marchand
> Sent: Thursday, December 2, 2021 12:21 PM
> To: Amber, Kumar <[email protected]>
> Cc: [email protected]; [email protected]; [email protected];
> [email protected]
> Subject: Re: [ovs-dev] [PATCH v3 2/4] system-dpdk: Use dummy-pmd port for
> packet injection.
> 
> On Wed, Dec 1, 2021 at 3:52 PM Amber, Kumar <[email protected]>
> wrote:
> > > diff --git a/tests/genpkts.py b/tests/genpkts.py new file mode 100755 
> > > index
> > > 0000000000..f64f786ccb
> > > --- /dev/null
> > > +++ b/tests/genpkts.py
> > > @@ -0,0 +1,56 @@
> > > +#!/usr/bin/env python3
> > > +
> > > +import sys
> > > +
> > > +from scapy.all import RandMAC, RandIP, RandIP6, RandShort, fuzz from
> > > +scapy.all import IPv6, Dot1Q, IP, Ether, UDP, TCP
> > > +
> > > +if len(sys.argv) < 2:
> > > +    print('usage: {} packets_count [fuzz]'.format(sys.argv[0]))
> > > +    sys.exit(1)
> > > +
> > > +tmpl = []
> > > +
> > > +if len(sys.argv) == 2:
> > > +    eth = Ether(dst='ff:ff:ff:ff:ff:ff')
> > > +    vlan = eth / Dot1Q(vlan=1)
> > > +    p = eth / IP() / TCP(sport=20, dport=80, flags='SA', window=8192)
> > > +    tmpl += [p.build().hex()]
> > > +    p = eth / IP() / UDP(sport=53, dport=53)
> > > +    tmpl += [p.build().hex()]
> > > +    p = eth / IP() / TCP(sport=20, dport=80, flags='S', window=8192)
> > > +    tmpl += [p.build().hex()]
> > > +    p = eth / IP() / UDP(sport=53, dport=53)
> > > +    tmpl += [p.build().hex()]
> > > +    p = vlan / IP() / UDP(sport=53, dport=53)
> > > +    tmpl += [p.build().hex()]
> > > +    p = vlan / IP() / TCP(sport=20, dport=80, flags='S', window=8192)
> > > +    tmpl += [p.build().hex()]
> >
> > Hardcoding the values here is not preferable as we wanted to test the
> optimized implementations
> > with various values contained inside the header.
> 
> Those hardcoded values comes from the pcap file that was previously used.
> If you want to add more protocols, it is easier with this patch as you
> only need to update some python script rather than rewrite a pcap
> file.

The PCAP was written by a script, which generated random MAC addresses;
  previously:  eth = Ether(src=RandMAC(), dst=RandMAC())

The same issue exists here with TCP sport/dport, which were previously randomly
generated, and are now being hard-coded to a small set of specific values;
  previously:    tcp = TCP(dport=RandShort(), sport=RandShort())

Amber correctly points out that hard-coding these values is a regression in 
testing.
Re-writing a PCAP file is easy, please refer to the current "mfex_fuzzy.py" 
script.
Updating hex ethernet addresses manually, or TCP ports manually instead of
automatically generating them is a step backwards.

I do not see the value add here, and I'm concerned that when regressions in 
these methods are pointed
out that these are not acted on, but instead we are told "this hardcoded method 
is better". It is not.


> > > +on_exit "pkill -f -x -9 '$PYTHON3 $srcdir/genpkts.py -1 fuzz'"
> > > +($PYTHON3 $srcdir/genpkts.py -1 fuzz | while read pkt; do
> > > +     ovs-appctl netdev-dummy/receive p1 "$pkt" || break
> > > + done) &
> > > +
> > >  AT_CHECK([ovs-appctl dpif-netdev/miniflow-parser-set autovalidator], [0],
> [dnl
> > > Miniflow extract implementation set to autovalidator.
> > >  ])
> > >
> > > -OVS_WAIT_UNTIL([test `ovs-vsctl get interface p1 statistics | grep -oP
> > > 'rx_packets=\s*\K\d+'` -ge 100000])
> >
> > We should increase the packet count to at-least 10x the current number to
> have a proper fuzzy testing and we have measured it would only take 10~ to 15~
> sec more. The current runtime did not catch issues when we purposely broke the
> implementation and by allowing to run for 10000 packets, it did catch the 
> induced
> error.
> 
> For me, the fuzzy testing does not have its place in a CI, because it
> is not reproducible.
> I let it in place and just made sure it would not reach the timeout.

Disagree here, Fuzzing is *ideal* for CI, as it tests different inputs 
continuously,
and each CI run improves the confidence in the system. A large number of open
source projects are actively doing large-scale fuzzing in CI instances.
e.g.; https://google.github.io/clusterfuzz/

If the fuzzing autotests fails in CI, it still flags that there is *an issue*. 
In our
case with AutoValidators for DPCLS and MFEX, it even prints a whole debug log
of "good" miniflow, as well as "bad" results of the optimized implementation.
These VLOG_ERR results would be hugely helpful in identifying & debugging.

I do not understand the motivation for disabling/limiting fuzzing in CI.


(Side note; there are methods to get reproducible PRNG built into unit tests,
see this project for example, where exactly that is achieved by printing the 
"seed"
value of the PRNG before test, allowing developers to reproduce the exact run:
https://nemequ.github.io/munit/#prng)


> On the system I used, this test takes 5s with 1k and timeouts with 10k.
> So I guess the 10s/15s evaluation is dependent on the system.
> 
> I prefer to stick to current value.
> 
> 
> Thanks.
> 
> --
> David Marchand

Removing automated randomness from Fuzzy testing, and removing/limiting 
fuzz-testing in CI runs
are both big steps backwards in my view, please reconsider what this patch is 
actually trying to achieve.

Regards, -Harry

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to