> Is Nightly now using a list of flakes?

Dashboard job was flaky yesterday, so the nightly didn't start using it. Looks like it's working fine now. Let me exclude flakies from the nightly job.
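For the record, here's roughly what I mean by "exclude flakies" -- just a sketch, not the actual job config. It assumes the Find-Flaky-Tests job publishes the flaky list as a plain-text artifact (I'm calling it flakies.txt here, one test class per line) and that the nightly hands surefire an excludes file; the real artifact name and wiring may differ:

    #!/usr/bin/env python3
    # Rough sketch only. Assumes the Find-Flaky-Tests job publishes the flaky
    # list as a plain-text artifact ("flakies.txt", one test class per line);
    # the real artifact name and format may differ.
    import urllib.request

    FLAKY_LIST_URL = ("https://builds.apache.org/job/HBase-Find-Flaky-Tests-"
                      "branch2.0/lastSuccessfulBuild/artifact/flakies.txt")  # hypothetical artifact


    def fetch_flaky_tests(url=FLAKY_LIST_URL):
        """Return the flaky test class names, one per line in the artifact."""
        with urllib.request.urlopen(url) as resp:
            text = resp.read().decode("utf-8")
        return [line.strip() for line in text.splitlines() if line.strip()]


    def write_excludes_file(tests, path="flaky-excludes.txt"):
        """Write surefire-style exclusion patterns, e.g. **/TestMultiParallel.java."""
        with open(path, "w") as out:
            for test in tests:
                # "client.TestMultiParallel" -> "**/TestMultiParallel.java"
                out.write("**/{}.java\n".format(test.split(".")[-1]))


    if __name__ == "__main__":
        flakies = fetch_flaky_tests()
        write_excludes_file(flakies)
        print("Wrote {} exclusion patterns".format(len(flakies)))

The nightly build would then run something like mvn test -Dsurefire.excludesFile=flaky-excludes.txt (surefire supports an excludes file via that property); the actual job may pass the exclusions differently.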
> Just took a look at the dashboard. Does this capture only failed runs or
> all runs?

Sorry, the question isn't clear. Runs of what? Here's an attempt to answer it as best I can understand it: the dashboard looks at the last X (X = 6 right now) runs of the nightly branch-2 job and collects the failing, hanging, and timed-out tests.

> I see that the following tests have failed 100% of the time for the last 30
> runs [1]. If this captures all runs, this isn't truly flaky, but rather a
> legitimate failure, right?
> Maybe this tool is used to see all test failures, but if not, I feel like
> we could/should remove a test from the flaky tests/excludes if it fails
> consistently so we can fix the root cause

This has come up a lot of times before. Yes, you're right: 100% failure = legitimate failure.

<rant>
We as a community suck at tracking nightly runs for failing tests and fixing them, otherwise we wouldn't have ~40 bad tests, right! In fact, we suck at fixing tests even when they're presented in a nice clean list (this dashboard). We just don't prioritize tests in our work. The general attitude is: tests are failing... meh... what's new, they've been failing for years. Instead of: oh, one test failed, find the cause and revert it!
So the real thing to change here is the attitude of the community towards tests. I am +1 for anything that'll promote/support that change.
</rant>

I think we can actually update the script to send a mail to dev@ when it encounters these 100% failing tests. Wanna try? :) (There's a rough sketch of what that could look like at the bottom of this mail, below the quoted thread.)

-- Appy

On Fri, Jan 12, 2018 at 11:29 AM, Zach York <zyork.contribut...@gmail.com> wrote:

> Just took a look at the dashboard. Does this capture only failed runs or
> all runs?
>
> I see that the following tests have failed 100% of the time for the last 30
> runs [1]. If this captures all runs, this isn't truly flaky, but rather a
> legitimate failure, right?
> Maybe this tool is used to see all test failures, but if not, I feel like
> we could/should remove a test from the flaky tests/excludes if it fails
> consistently so we can fix the root cause.
>
> [1]
> master.balancer.TestRegionsOnMasterOptions
> client.TestMultiParallel
> regionserver.TestRegionServerReadRequestMetrics
>
> Thanks,
> Zach
>
> On Fri, Jan 12, 2018 at 8:19 AM, Stack <st...@duboce.net> wrote:
>
> > Dashboard doesn't capture timed out tests, right Appy?
> > Thanks,
> > S
> >
> > On Thu, Jan 11, 2018 at 6:10 PM, Apekshit Sharma <a...@cloudera.com>
> > wrote:
> >
> > > https://builds.apache.org/job/HBase-Find-Flaky-Tests-
> > > branch2.0/lastSuccessfulBuild/artifact/dashboard.html
> > >
> > > @stack: when you branch out branch-2.0, let me know, i'll update the jobs
> > > to point to that branch so that it's helpful in release. Once release is
> > > done, i'll move them back to "branch-2".
> > >
> > > -- Appy

--
-- Appy
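P.S. Since I said "wanna try" above, here's a very rough sketch of the 100%-failure check plus the mail to dev@. The input shape (a map from test name to its results over the last few runs), the sender address, and the localhost SMTP relay are all assumptions for illustration; the real report script already has the per-run results, so most of this would just be the filter and the mail:

    #!/usr/bin/env python3
    # Rough sketch of the "mail dev@ about 100% failing tests" idea. The input
    # shape, sender address, and localhost SMTP relay are assumptions; the real
    # script already has per-run results in some form.
    import smtplib
    from email.mime.text import MIMEText


    def find_always_failing(results_by_test):
        """results_by_test: {"client.TestMultiParallel": ["FAILED", "FAILED", ...], ...}
        Returns tests that failed in every run they appeared in, i.e. not flaky,
        just plain broken."""
        return sorted(
            test for test, runs in results_by_test.items()
            if runs and all(result == "FAILED" for result in runs)
        )


    def mail_dev_list(always_failing, runs_examined,
                      sender="builds@example.org",          # placeholder sender
                      recipient="dev@hbase.apache.org"):
        """Send a plain-text summary; assumes a local SMTP relay is reachable."""
        if not always_failing:
            return  # nothing consistently broken, stay quiet
        body = (
            "The following tests failed in every one of the last %d nightly runs\n"
            "and are likely legitimate failures rather than flakies:\n\n%s\n"
            % (runs_examined, "\n".join("  " + t for t in always_failing))
        )
        msg = MIMEText(body)
        msg["Subject"] = "[NIGHTLY] Tests failing 100% of the time"
        msg["From"] = sender
        msg["To"] = recipient
        with smtplib.SMTP("localhost") as server:
            server.sendmail(sender, [recipient], msg.as_string())

If the filter comes back empty it stays quiet, so it would only nag the list when something is genuinely broken rather than merely flaky.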