> > *96* of *247* regressions failed
> 
> That is huge.

Agreed.

I think there's an experiment we should do, which I've discussed with a couple 
of others: redefine EXPECT_WITHIN on NetBSD to double or triple the time given, 
and see if it makes a difference.  Why?  Because NetBSD (or perhaps the 
instances we run it on) often just seems slow - particularly for things that 
hit the local filesystem.  For example (from recent runs):

  split-brain-favorite-child-policy.t
  Linux: 602 seconds
  NetBSD: 703 seconds

  nuke.t
  Linux: 79 seconds
  NetBSD: 181 seconds

  heald.t
  Linux: 145 seconds
  NetBSD: 157 seconds
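
Concretely, the experiment is a small change.  Here's a sketch of the shape
(illustrative only: the polling loop below stands in for the real
EXPECT_WITHIN in the harness, and the 3x multiplier is the knob to vary):

```shell
#!/bin/sh
# Sketch of a scaled EXPECT_WITHIN-style poller.  The function name,
# polling loop, and scale factor here are illustrative, not the harness's
# actual implementation; the point is where the multiplier would go.

TIMEOUT_SCALE=1
[ "$(uname -s)" = "NetBSD" ] && TIMEOUT_SCALE=3   # double or triple, per the experiment

expect_within () {
    # usage: expect_within <timeout-seconds> <expected> <command...>
    # Polls <command...> until its output equals <expected> or the
    # (scaled) timeout expires.
    timeout=$(( $1 * TIMEOUT_SCALE ))
    expected=$2
    shift 2
    deadline=$(( $(date +%s) + timeout ))
    while [ "$(date +%s)" -lt "$deadline" ]; do
        [ "$("$@")" = "$expected" ] && return 0
        sleep 1
    done
    return 1
}
```

If the NetBSD failure count drops sharply with the multiplier in place,
that's strong evidence the timeouts (not the code) are the problem.
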

Many of our tests are very timing-sensitive, running "close to the edge" in the 
sense of whether they'll pass consistently with the timeouts we use.  If such a 
test has a 90% chance of passing on Linux, it might have only a 50% chance of 
passing on NetBSD.  It doesn't take many tests like that before we start to see 
a scary number of NetBSD regression failures.
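
The arithmetic compounds fast.  A quick back-of-the-envelope (0.5^N chance
that a run with N such coin-flip tests passes cleanly):

```shell
# For N borderline tests that each pass 50% of the time on NetBSD, the
# chance the whole run passes is 0.5^N.  Purely illustrative numbers.
awk 'BEGIN {
    for (n = 1; n <= 5; n++)
        printf "%d borderline tests -> %g%% chance the whole run passes\n",
               n, 100 * 0.5 ^ n
}'
```

So even five tests in that state would make a clean NetBSD run rarer than
one in thirty.
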

I have no doubt that some of these failures are also real, most often race 
conditions exposed by differences between Linux and NetBSD process/thread 
scheduling.  Nonetheless, I think it would be useful to know how many are 
problems in the code we ship vs. in tests that are overly optimistic about 
how long various operations take.
_______________________________________________
Gluster-devel mailing list
[email protected]
http://www.gluster.org/mailman/listinfo/gluster-devel