"Eelco Chaudron" <[email protected]> writes:

> On 6 Sep 2018, at 10:56, Aaron Conole wrote:
>
>> As of June, the 0-day robot has tested over 450 patch series.
>> Occasionally it spams the list (apologies for that), but for the
>> majority of the time it has caught issues before they made it to the
>> tree - so it's accomplishing the initial goal just fine.
>>
>> I see lots of ways it can improve.  Currently, the bot runs on a
>> light system.  It takes ~20 minutes to complete a set of tests,
>> including all the checkpatch and rebuild runs.  That's not a big
>> issue.  BUT, it does mean that the machine isn't able to perform
>> all the kinds of regression tests that we would want.  I want to
>> improve this in a way that various contributors can bring their own
>> hardware and regression tests to the party.  In that way, various
>> projects can detect potential issues before they would ever land on
>> the tree and it could flag functional changes earlier in the
>> process.
>>
>> I'm not sure the best way to do that.  One thing I'll be doing is
>> updating the bot to push a series that successfully builds and
>> passes checkpatch to a special branch on a GitHub repository to
>> kick off Travis builds.  That will give us more complete regression
>> coverage, and we could be confident that a series won't break
>> something major.  After that, I'm not sure how to notify various
>> alternate test infrastructures so they can kick off their own tests
>> using the patched sources.
>>
>> My goal is to get really early feedback on patch series.  I've sent
>> this out to the folks I know are involved in testing and test
>> discussions in the hopes that we can talk about how best to get
>> more CI happening.  The open questions:
>>
>> 1. How can we notify various downstream consumers of OvS of these
>>    0-day builds?  Should we just rely on people rolling their own?
>>    Should there be a more formalized framework?  How will these other
>>    test frameworks report any kind of failures?
>>
>> 2. What kinds of additional testing do we want to see the robot
>>    include?
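
Regarding question 1 above: if we end up relying on people rolling
their own, the bare minimum for a downstream tester would be to fetch
whatever branch the bot publishes and run their usual suite against
it.  Very rough sketch below, in Python just for illustration; the
mirror URL is the normal GitHub one used as a stand-in, and the branch
name is only a placeholder since there is no naming scheme yet.

#!/usr/bin/env python3
# Rough sketch only: check out a branch published by the 0-day robot
# and run the standard OvS unit testsuite against it.  The branch name
# ("series_450") is a placeholder, and the repository URL is a
# stand-in for wherever the bot actually pushes.
import subprocess

def run(cmd, cwd=None):
    print("+ " + " ".join(cmd))
    subprocess.run(cmd, cwd=cwd, check=True)

def test_series(branch, workdir="ovs"):
    run(["git", "clone", "https://github.com/openvswitch/ovs.git",
         workdir])
    run(["git", "checkout", branch], cwd=workdir)
    # Standard autotools build followed by "make check"; a downstream
    # tester would substitute their own regression suite here.
    run(["./boot.sh"], cwd=workdir)
    run(["./configure"], cwd=workdir)
    run(["make", "-j4"], cwd=workdir)
    run(["make", "check"], cwd=workdir)

if __name__ == "__main__":
    test_series("series_450")   # placeholder branch name

A more formalized framework would mostly be a question of standardizing
the branch naming and how results get reported back to the list.
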
>
> First of all thanks for the 0-day robot, I really like the idea…
>
> One thing I feel would really benefit the bot is some basic
> performance testing, like a PVP test for the kernel/DPDK datapath.
> This would help identify performance-impacting patches as they
> happen, rather than people figuring out after a release why their
> performance has dropped.

Yes - I hope to pull in the work you've done on ovs_perf to establish
some kind of baseline.

For this to make sense, I think we also need a pool of hardware that
we can benchmark on (hint hint to some of the folks on the CC list :).
Not for absolute numbers, but at least to detect significant changes.

I'm also not sure how to measure a 'problem.'  Do we run a test
pre-series and then again post-series?  In that case, we could slowly
degrade performance over time without anyone noticing.  Do we take a
baseline from the previous release and compare against that?  That
might make more sense, but I don't know what other problems come with
it.  What thresholds do we use for saying something is a regression?
How do we report it to developers?
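
For the threshold question, I'm imagining something as simple as
comparing the measured PVP throughput against a stored baseline (from
the previous release, or from a pre-series run) and flagging anything
that drops by more than a few percent.  Rough sketch only; the JSON
layout, the field names, and the 5% figure are placeholders for
discussion, not anything ovs_perf produces today:

#!/usr/bin/env python3
# Sketch of a regression check: compare fresh PVP throughput numbers
# against a stored baseline and flag drops beyond a relative threshold.
# The JSON layout and the 5% threshold are placeholder assumptions.
import json
import sys

THRESHOLD = 0.05   # flag a >5% drop as a regression (placeholder value)

def check(baseline_file, current_file):
    with open(baseline_file) as f:
        baseline = json.load(f)       # e.g. {"pvp_64b_mpps": 7.4}
    with open(current_file) as f:
        current = json.load(f)

    failures = []
    for test, base_value in baseline.items():
        cur_value = current.get(test)
        if cur_value is None:
            continue                  # test not run this time
        drop = (base_value - cur_value) / base_value
        if drop > THRESHOLD:
            failures.append("%s: %.2f -> %.2f (-%.1f%%)"
                            % (test, base_value, cur_value, drop * 100))
    return failures

if __name__ == "__main__":
    regressions = check(sys.argv[1], sys.argv[2])
    for line in regressions:
        print("possible regression: " + line)
    sys.exit(1 if regressions else 0)

Comparing against the previous release's numbers rather than a
per-series "before" run would at least avoid the slow-drift problem,
at the cost of requiring the benchmark hardware to stay identical
between runs.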

>>    Should the test results be made available in general on some kind
>>    of public-facing site?  Should it just stay as a "bleep bloop -
>>    failure!" marker?
>>
>> 3. What other concerns should be addressed?