On Wed, Nov 4, 2015 at 7:48 AM, Michael Henretty <[email protected]> wrote:
>
> On Wed, Nov 4, 2015 at 4:45 PM, Fabrice Desré <[email protected]> wrote:
>>
>> Can we *right now* identify the worst offenders by looking at the test
>> results/re-runs? You know that sheriffs will very quickly hide and
>> ignore tests that are really flaky.
>
> Yes, that's an important point. The problem is that you have to actually
> look at the logs of an individual chunk to see which tests failed. If a
> certain Gij test passes at least 1 out of its 5 given runs, it will not
> surface to Treeherder, which means we can't star it. Looking through each
> chunk log file (of which we have 40 per run) is doable, but more
> time-consuming and error-prone.
Jumping in on something I haven't been able to pay much attention to, so I may be missing context, but this sounds like it sets people up to assume that if something occasionally works, we're good to ship it, as opposed to: if it occasionally fails, we need to fix it. Seems to me that this needs to be flipped around very aggressively for these tests to provide much value.

- jst
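Since a test that eventually passes one of its retries never surfaces on Treeherder, the only record of the flakiness is buried in the raw chunk logs Michael mentions. Below is a minimal sketch of that kind of scan, assuming the logs are plain-text files in one directory and that each failed run prints a marker line like "TEST-UNEXPECTED-FAIL | <test path>" (a placeholder format for illustration, not necessarily what Gij actually emits):

    #!/usr/bin/env python3
    """Sketch: aggregate per-test failure counts across Gij chunk logs.

    Assumes each failed run of a test emits a line containing
    'TEST-UNEXPECTED-FAIL | <test>' -- the marker and format are
    assumptions, not the real Gij log format.
    """
    import collections
    import os
    import re
    import sys

    # Hypothetical failure marker; adjust to whatever the chunk logs emit.
    FAIL_LINE = re.compile(r"TEST-UNEXPECTED-FAIL \| (?P<test>\S+)")

    def count_failures(log_dir):
        """Return a Counter of {test_name: failure_count} over all logs."""
        failures = collections.Counter()
        for name in os.listdir(log_dir):
            path = os.path.join(log_dir, name)
            if not os.path.isfile(path):
                continue
            with open(path, errors="replace") as log:
                for line in log:
                    match = FAIL_LINE.search(line)
                    if match:
                        failures[match.group("test")] += 1
        return failures

    if __name__ == "__main__":
        log_dir = sys.argv[1] if len(sys.argv) > 1 else "."
        # Worst offenders first: tests that failed at least one of their
        # runs, even if a later retry passed and the chunk went green.
        for test, count in count_failures(log_dir).most_common():
            print("%5d  %s" % (count, test))

Run over the 40 chunk logs from a single push, this would rank tests by how often any of their 5 runs failed, which is roughly the "worst offenders" list Fabrice asked for, without waiting for a test to fail all of its retries and turn a chunk orange.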

