On Fri, May 20, 2016 at 4:03 PM, Apekshit Sharma <[email protected]> wrote:
> That's a nice dashboard! I like the spark lines in the last column.
> Let me add creating a dashboard to my todos.
>
> I see denominator of ~400 in 2 day failure rate column. Do you really run
> the flaky tests about 200 times a day?
> Would be hard to get that kind of resource upstream :).
>

Yep, we do. It's not so hard - 200 times a day is only once every 7
minutes. We basically just run the flaky-tests job in a loop, and when
there aren't that many tests that are flaky, they only take a few minutes
total to run. For Kudu in particular, we also run four build
configurations in parallel, so it's really only 50 times per day (x 4
configs).

Overall cost on GCE spot instances is pretty cheap (couple hundred bucks a
month)

-Todd

> On Fri, May 20, 2016 at 3:43 PM, Todd Lipcon <[email protected]> wrote:
>
> > On Fri, May 20, 2016 at 1:17 PM, Matteo Bertozzi <[email protected]>
> > wrote:
> >
> > > any suggestion on how to make people aware of the tests being flaky?
> > >
> >
> > You guys might consider doing something like what we do for Apache Kudu
> > (incubating):
> >
> > http://dist-test.cloudera.org:8080/ has a dashboard (driven from our
> > flaky-tests job) which shows the percent flakiness of each test, as
> > well as a breakdown of pass/fail rates by revision. We don't
> > automatically email these to the list or anything, currently, but would
> > be pretty easy to set up a cron job to do so.
> >
> > The dashboard is very helpful for prioritizing the de-flaking of the
> > worst offenders, and also useful to quickly drill down and grab failure
> > logs from the flaky tests themselves.
> >
> > -Todd
> > --
> > Todd Lipcon
> > Software Engineer, Cloudera
> >
>
> --
>
> Regards
>
> Apekshit Sharma | Software Engineer, Cloudera | Palo Alto, California |
> 650-963-6311
>

--
Todd Lipcon
Software Engineer, Cloudera
