Without a nightly build, and with this many flaky tests, it is very hard to identify the breaking commits. We can use something like git bisect combined with multiple test runs per commit.
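A minimal sketch of the bisect-plus-repeated-runs idea, not something Hive QA provides today. It wraps an arbitrary test command so that `git bisect run` can call it: the command is run several times on each commit, and the commit is reported as bad once enough runs fail. The function name `bisect_verdict`, the defaults `runs=5` and `min_failures=1`, and the script name in the usage note are all assumptions made for illustration; only the exit code convention (0 means good, 1 means bad) comes from `git bisect run` itself.

```python
import subprocess
import sys


def bisect_verdict(cmd, runs=5, min_failures=1):
    """Run cmd up to `runs` times and return an exit code for `git bisect run`.

    Returns 1 (commit is bad) as soon as `min_failures` runs have failed,
    otherwise 0 (commit is good). Both defaults are illustrative assumptions.
    """
    failures = 0
    for _ in range(runs):
        # A non-zero exit status of the test command counts as one failure.
        if subprocess.run(cmd).returncode != 0:
            failures += 1
            if failures >= min_failures:
                return 1
    return 0


if __name__ == "__main__" and len(sys.argv) > 1:
    # Everything after the script name is treated as the test command.
    sys.exit(bisect_verdict(sys.argv[1:]))
```

Saved under a hypothetical name such as `bisect_flaky.py`, it could be used as `git bisect run python3 bisect_flaky.py <test command>`. With `min_failures=1` an intermittent breakage is caught, but a test that was already flaky before the suspect range can push the bisect in the wrong direction, so the threshold has to be chosen per test.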
There is a more elegant way to do this with nightly test runs:
https://issues.apache.org/jira/browse/HBASE-15917
https://builds.apache.org/job/HBASE-Find-Flaky-Tests/lastSuccessfulBuild/artifact/dashboard.html

This also helps to identify the flaky tests, and creates a continuously updated list of them.

> On Feb 23, 2018, at 6:55 PM, Sahil Takiar <takiar.sa...@gmail.com> wrote:
>
> +1
>
> Does anyone have suggestions about how to efficiently identify which commit
> is breaking a test? Is it just git-bisect or is there an easier way? Hive
> QA isn't always that helpful, it will say a test is failing for the past
> "x" builds, but that doesn't help much since Hive QA isn't a nightly build.
>
> On Thu, Feb 22, 2018 at 10:31 AM, Vihang Karajgaonkar <vih...@cloudera.com>
> wrote:
>
>> +1
>> Commenting on JIRA and giving a 24hr heads-up (excluding weekends) would be
>> good.
>>
>> On Thu, Feb 22, 2018 at 10:19 AM, Alan Gates <alanfga...@gmail.com> wrote:
>>
>>> +1.
>>>
>>> Alan.
>>>
>>> On Thu, Feb 22, 2018 at 8:25 AM, Thejas Nair <thejas.n...@gmail.com>
>>> wrote:
>>>
>>>> +1
>>>> I agree, this makes sense. The number of failures keeps increasing.
>>>> A 24 hour heads up in either case before revert would be good.
>>>>
>>>>
>>>> On Thu, Feb 22, 2018 at 2:45 AM, Peter Vary <pv...@cloudera.com> wrote:
>>>>
>>>>> I agree with Zoltan. The continuously breaking tests make it very hard to
>>>>> spot real issues.
>>>>> Any thoughts on doing it automatically?
>>>>>
>>>>>> On Feb 22, 2018, at 10:47 AM, Zoltan Haindrich <k...@rxd.hu> wrote:
>>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> In the last couple of weeks the number of broken tests has started to go
>>>>>> up... and even though I run bisect etc. from time to time, sometimes people
>>>>>> don’t react to my comments/tickets/etc.
>>>>>>
>>>>>> Because keeping this many failing tests makes it easier for a new one to
>>>>>> slip in... I think reverting the patch introducing the test failures would
>>>>>> also help in some cases.
>>>>>>
>>>>>> I think it would help a lot to prevent further test breaks to revert the
>>>>>> patch if any of the following conditions is met:
>>>>>>
>>>>>> C1) if the notification/comment about the fact that the patch indeed
>>>>>> broke a test has been unanswered for at least 24 hours.
>>>>>>
>>>>>> C2) if the patch has been in for 7 days, but the test failure is still not
>>>>>> addressed (note that in this case there might be a conversation about
>>>>>> fixing it... but in this case, enabling other people to work in a cleaner
>>>>>> environment is more important than a single patch - and if it can't be
>>>>>> fixed in 7 days... well, it might not get fixed in a month).
>>>>>>
>>>>>> I would like to also note that I've seen a few tickets which have been
>>>>>> picked up by people who were not involved in creating the original change -
>>>>>> and although the intention was good, they might miss the context of the
>>>>>> original patch and may "fix" the tests in the wrong way: accept a q.out
>>>>>> which is inappropriate or ignore the test...
>>>>>>
>>>>>> Would it be ok to implement this from now on? Because it makes my
>>>>>> efforts practically useless if people are not reacting…
>>>>>>
>>>>>> note: just to be on the same page - this is only about running a single
>>>>>> test which fails on its own - I feel that flaky tests are an entirely
>>>>>> different topic.
>>>>>>
>>>>>> cheers,
>>>>>>
>>>>>> Zoltan
>
> --
> Sahil Takiar
> Software Engineer
> takiar.sa...@gmail.com | (510) 673-0309