On Mon, Apr 5, 2010 at 11:27 PM, Brent Fulgham <[email protected]> wrote:
> On Apr 5, 2010, at 9:58 PM, Adam Barth wrote:
>> We had some trouble today keeping the tree green. In this email, I
>> present a post-mortem analysis of what happened and what we can learn
>> from these events. I've removed most of the names from this account
>> because the purpose isn't to assign blame but to document what
>> happened in the hopes that we can learn from it.
>
> Could rollout of patches be automated in some fashion, so that a
> previously-green tree becoming red could trigger a rollout of the last
> checkin?

We have the tools to do this currently, but the prevailing wisdom is
that we should have some human judgement involved in the process. We
still have enough test flakiness that false positives could roll out
perfectly good patches. In other cases, it's clear how to fix the tree
without rolling out.

> On the other hand, I've noticed that the varying speed of the various build
> bots makes it difficult to assess which patch might have triggered a break.
> It's not uncommon for some machines to be several patches behind others, and
> long test runs further exacerbate the problem.

The sheriffbot does a pretty good job of narrowing down the regression
window. Its algorithm is somewhat robust to flaky tests and other bits
of noise. We continue to refine it based on experience. For example,
even during today's complex overlapping failures, it correctly narrowed
the regression window to a set of five commits.

The main failure mode we're seeing currently is that if a test fails
80% of the time, sheriffbot will generate false positives because it
thinks that the test was fixed the one time it happens to be green. I
have some ideas for how to handle that case, but we'll need to
experiment some more.

Adam
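
To make the regression-window idea and the flaky-test failure mode above concrete, here is a minimal sketch in Python. It is not sheriffbot's actual algorithm: the function name `regression_window`, the `(revision, passed)` build records, and the `min_green_run` threshold are all hypothetical, and requiring several consecutive green builds is just one possible way to discount a lucky pass, not necessarily the approach being considered.

```python
from typing import List, Optional, Tuple

# Hypothetical record for one builder: (revision, passed), oldest first.
# Slow bots may skip revisions, so consecutive entries can be several
# commits apart.
Build = Tuple[int, bool]


def regression_window(
    builds: List[Build], min_green_run: int = 1
) -> Optional[Tuple[int, int]]:
    """Return (last_good_revision, first_bad_revision) for the current failure.

    Walks backwards from the newest build. A run of fewer than
    `min_green_run` consecutive green builds is treated as noise (a flaky
    pass) rather than as evidence that the tree was fixed. Returns None if
    the newest build is green or no sufficiently long green run is found.
    """
    if not builds or builds[-1][1]:
        return None

    first_bad = builds[-1][0]
    green_run = 0
    last_good = None
    for revision, passed in reversed(builds[:-1]):
        if passed:
            green_run += 1
            if last_good is None:
                last_good = revision  # newest build of this green run
            if green_run >= min_green_run:
                return (last_good, first_bad)
        else:
            # An older red build: discard the short green run seen after it
            # and extend the failure further back in history.
            green_run = 0
            last_good = None
            first_bad = revision
    return None


# A mostly-failing test that happened to pass once at revision 4.
history = [(1, True), (2, True), (3, False), (4, True), (5, False), (6, False)]

naive = regression_window(history)                    # trusts the lone green
robust = regression_window(history, min_green_run=2)  # ignores the lone green
```

With the default threshold the lone green at revision 4 is trusted, so the window is (4, 5) and revision 5 is blamed. Requiring two consecutive greens moves the window back to (2, 3). The commits to suspect are those after the last good revision up to and including the first bad one, which is why a bot that is several patches behind produces a wider window.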

