On Wed, Sep 05, 2018 at 01:55:08PM -0700, Ben Pfaff wrote:
> On Wed, Sep 05, 2018 at 01:50:06PM +0200, Thomas Goirand wrote:
> > On 09/04/2018 11:06 PM, Ben Pfaff wrote:
> > > On Tue, Sep 04, 2018 at 09:20:45AM +0200, Thomas Goirand wrote:
> > >> On 09/02/2018 03:12 AM, Justin Pettit wrote:
> > >>>
> > >>>> On Sep 1, 2018, at 3:52 PM, Ben Pfaff <[email protected]> wrote:
> > >>>>
> > >>>> On Sat, Sep 01, 2018 at 01:23:32PM -0700, Justin Pettit wrote:
> > >>>>>
> > >>>>>> On Sep 1, 2018, at 12:21 PM, Thomas Goirand <[email protected]> wrote:
> > >>>>>>
> > >>>>>>
> > >>>>>> There is only one failure:
> > >>>>>>
> > >>>>>> 2633: ovn -- ACL rate-limited logging                 FAILED (ovn.at:6516)
> > >>>>>
> > >>>>> My guess is that this is meter-related. Can you send the 
> > >>>>> ovs-vswitchd.log and testsuite.log so I can take a look?
> > >>>>
> > >>>> It probably hasn't changed from what he sent the first time around.
> > >>>
> > >>> Yes, "testsuite.log" was in the original message, so I don't need that. 
> > >>>  Thomas, can you send me "ovs-vswitchd.log" and "ovn-controller.log"?  
> > >>> Does it consistently fail for you?
> > >>>
> > >>> --Justin
> > >>
> > >> Hi,
> > >>
> > >> As I blacklisted the above test, I uploaded to Sid, and now there are
> > >> a number of failures on non-Intel architectures:
> > >>
> > >> https://buildd.debian.org/status/package.php?p=openvswitch
> > >> https://buildd.debian.org/status/logs.php?pkg=openvswitch
> > >>
> > >> Ben, Justin, can you help me fix all of this?
> > > 
> > > Thanks for passing that along.
> > > 
> > > A lot of these failures seem to involve unexpected timeouts.  I wonder
> > > whether the buildds are so overloaded that some of the 10-second
> > > timeouts in the testsuite are just too short.  Usually, this is a
> > > generous timeout interval.
> > > 
> > > I sent a patch that should help to debug the problem by doing more 
> > > logging:
> > >         https://patchwork.ozlabs.org/patch/966087/
> > > 
> > > It won't help with tests that fully succeed, because the logs by default
> > > are discarded, but for tests that have a sequence of waits, in which one
> > > eventually fails, it will allow us to see how long the successful waits
> > > took.
> > > 
> > > Any chance you could apply that patch and try another build?  Feel free
> > > to wait for review, if you prefer.
> > > 
> > 
> > Hi,
> > 
> > I've just uploaded OVS with that patch. Thanks, I think it's a very good
> > idea. And indeed, it looks like the failing architectures are the slower
> > ones.
> 
> I'm pretty pleased with the theory myself, but the results tend to show
> that it wasn't the problem.  In most of the tests that eventually
> failed, the wait failure was preceded by other waits that succeeded
> immediately, and the longest wait I see is 3 seconds.  I'll look for
> other possible causes.

Most of the test failures seem related to the "asynchronous message
control" tests.  I haven't yet determined the reason for the failure,
but after some work I was able to reproduce it on my own system.
_______________________________________________
discuss mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss