Ophir Munk <[email protected]> writes:

>> -----Original Message-----
>> From: Aaron Conole [mailto:[email protected]]
>> Sent: Thursday, September 06, 2018 11:56 AM
>> To: Ian Stokes <[email protected]>; Kevin Traynor
>> <[email protected]>; Ophir Munk <[email protected]>; Ferruh
>> Yigit <[email protected]>; Luca Boccassi <[email protected]>; Jeremy
>> Plsek <[email protected]>; Sugesh Chandran
>> <[email protected]>; Jean-Tsung Hsiao <[email protected]>;
>> Christian Trautman <[email protected]>; Ben Pfaff <[email protected]>;
>> Bala Sankaran <[email protected]>
>> Cc: [email protected]
>> Subject: [RFC] Federating the 0-day robot, and improving the testing
>> 
>> As of June, the 0-day robot has tested over 450 patch series.
>> Occasionally it spams the list (apologies for that), but for the
>> majority of the time it has caught issues before they made it to the
>> tree - so it's accomplishing the initial goal just fine.
>> 
>> I see lots of ways it can improve.  Currently, the bot runs on a light
>> system.  It takes ~20 minutes to complete a set of tests, including all
>> the checkpatch and rebuild runs.  That's not a big issue.  BUT, it does
>> mean that the machine isn't able to perform all the kinds of regression
>> tests that we would want.  I want to improve this in a way that various
>> contributors can bring their own hardware and regression tests to the
>> party.  In that way, various projects can detect potential issues
>> before they would ever land on the tree and it could flag functional
>> changes earlier in the process.
>> 
>
> First of all - lots of thanks for this great work. 
> A few questions/comments:
> 1. Are the tests mentioned above considered core/sanity tests to make
> sure the basic functionality is not broken?

Yes - actually, I haven't re-enabled reporting for make check, so right
now it's basically:

1. git am

2. checkpatch

3. make

If any of those steps fails, it gets reported.  As future work, we'll
re-enable reporting for the other checks.
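
For reference, each series goes through roughly the following.  This is
an approximation rather than the bot's literal scripts (those live in
the pw-ci repository), and names like series.mbox and the patches/
directory are just placeholders:

    # apply the series on top of current master
    git checkout master && git pull --ff-only
    git am series.mbox

    # run checkpatch against each patch in the series
    for p in patches/*.patch; do
        ./utilities/checkpatch.py "$p"
    done

    # bootstrap, configure, and build
    ./boot.sh && ./configure && make -j"$(nproc)"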

> 2. Is there a link to the tests which are executed? How can they be reviewed?

Documentation/topics/testing.rst covers the high-level overview
(including the testsuites run by doing make
check{-dpdk,-kernel,-kmod,-system-userspace,-ryu,-oftest,-valgrind,-lcov,})

The various tests are primarily wired up through m4, although they can
be written in any language provided there's a binary to execute.
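
If you want to run those suites locally, a hedged example (the -k
keyword below is just an illustration; the system-level suites need
root and take noticeably longer):

    # unit tests, no special privileges required
    make check

    # userspace datapath system tests (run as root)
    sudo make check-system-userspace

    # re-run only a subset of the unit tests, selected by keyword
    make check TESTSUITEFLAGS='-k dpctl'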

> 3. Is there a link to the tests results? How can they be viewed?

For the bot, right now, there isn't a link.  I think a dashboard would
probably be worthwhile to write.

> 4. Is the test environment documented? I think it would be beneficial
> if, in parallel to the 0-day robot, each vendor were able to build
> the same environment locally in order to test their patches before
> sending them.

Yes and no.  For example, the exact steps the bot takes are all
documented at:

https://github.com/orgcandman/pw-ci/blob/master/3rd-party/openvswitch/config.xml

But, as I wrote above, we currently report only failures from those steps.

> 5. I am interested in having Mellanox NICs take part in these
> tests. We will have some internal discussions regarding this, and then
> I will be more specific.

Awesome!  Look forward to hearing more.

>> I'm not sure of the best way to do that.  One thing I'll be doing is
>> updating the bot to push a series that successfully builds and passes
>> checkpatch to a special branch on a GitHub repository to kick off
>> Travis builds.  That will give us more complete regression coverage,
>> and we could be confident that a series won't break something major.
>
> I suggest tagging the daily regression series and making it publicly
> accessible. If anything is broken we should get an email notification
> and be able to bisect the tree (based on the tags) to find which commit
> is causing issues. It would be even better to have the bot do the bisect.

I'm not sure what you mean.  I don't think there should be anything to
bisect yet - but that's probably because I'm focused on the submission
side of testing.  Of course, a future effort would be some kind of full
regression run - I guess that's what you're referring to here.
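
That said, once nightly tags exist, the bisect you describe is mostly
standard git.  A rough sketch, where the tag names and the test command
are assumptions:

    # mark today's nightly as bad and the last known-good nightly as good
    git bisect start nightly-2018-09-06 nightly-2018-09-01

    # let git drive the search; a non-zero exit marks a commit as bad
    git bisect run sh -c './boot.sh && ./configure && make -j4 && make check'

    # clean up once the offending commit has been identified
    git bisect reset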

>> After that, I'm not sure how to notify
>> various alternate test infrastructures how to kick off their own tests using
>> the patched sources.
>> 
>> My goal is to get really early feedback on patch series.  I've sent
>> this out to the folks I know are involved in testing and test
>> discussions in the hopes that we can talk about how best to get more
>> CI happening.  The open questions:
>> 
>> 1. How can we notify various downstream consumers of OvS of these
>>    0-day builds?  Should we just rely on people rolling their own?
>>    Should there be a more formalized framework?  How will these other
>>    test frameworks report any kind of failures?
>> 
>> 2. What kinds of additional testing do we want to see the robot include?
>>    Should the test results be made available in general on some kind of
>>    public facing site?  Should it just stay as a "bleep bloop -
>>    failure!" marker?
>> 
>> 3. What other concerns should be addressed?
>
> I am looking forward to starting to run even just the basic tests to
> see how this whole framework works, then improving along the way. Can
> you please make sure to add the dpdk-latest and dpdk-hwol branches to
> the bot tests, in addition to the master branch?

Great news - see the test suite documentation for the checks that we
run.

We'll make sure that the dpdk branches are properly covered.
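
For anyone who wants to exercise those branches locally before the bot
covers them, the rough shape is the sketch below; the DPDK version and
paths are assumptions, and Documentation/intro/install/dpdk.rst has the
authoritative steps:

    # build DPDK (use the version the branch actually requires)
    export DPDK_DIR=$HOME/dpdk-17.11.4
    export DPDK_TARGET=x86_64-native-linuxapp-gcc
    export DPDK_BUILD=$DPDK_DIR/$DPDK_TARGET
    (cd "$DPDK_DIR" && make install T=$DPDK_TARGET DESTDIR=install)

    # build OVS from the dpdk-latest branch against that DPDK
    git checkout dpdk-latest
    ./boot.sh && ./configure --with-dpdk="$DPDK_BUILD" && make -j"$(nproc)"

    # with a suitable hugepage setup, run the DPDK system tests
    sudo make check-dpdk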