Ownership is a great concept. I started out planning LTTF as ownership-based. Unfortunately, the types of failures are scattered far and wide across the directories, some clustered, some not. After a few initial passes, I walked away thinking that it's not as simple as drawing the lines, and basically gave up. That's how the Finders/Fixers idea was born.
:DG<

On Tue, Oct 13, 2009 at 4:24 PM, Yaar Schnitman <[email protected]> wrote:
> I think ownership might actually help with flakiness.
> Today, in order to distinguish flakiness from real bugs, the gardener needs
> to have intimate knowledge of the relevant part of the code base and its
> history. That is beyond the capabilities of the average webkit gardener.
> Now, imagine a world where every layout test has an owner who can decide
> intelligently that the bug is flaky and advise the gardener what to do with
> it. Wouldn't it make gardening much easier?
> [Flakiness dashboard is very helpful in making the decision, but specialized
> knowledge topples generic statistics, especially if a test just started
> flaking]
> On Tue, Oct 13, 2009 at 1:21 PM, Julie Parent <[email protected]> wrote:
>>
>> I like the idea of ownership of groups of layout tests. Maybe these
>> directory "owners" could be more like the "finders"? An owner shouldn't
>> have to necessarily fix everything in a group/directory, but they should be
>> responsible for triaging and getting meaningful bugs filed for them, to
>> keep things moving along. (I volunteer for editing/)
>> Another complicating factor -
>> The state of the main Chrome tree has a lot of effect on the gardener. If
>> the tree is already filled with flakiness, then the webkit roll is likely to
>> show failures, which may or may not have been there before the roll. This
>> was largely the case in the situation pkasting was referring to: when he
>> took over as sheriff, he inherited a tree with a lot of flakiness not
>> reflected in test_expectations/disabled ui tests. I think very few (if any)
>> of the tests he added to test_expectations had anything to do with the roll.
>> Any policy we make needs to keep in mind that main tree sheriffs deal with
>> flakiness differently; some cross their fingers and hope it goes away, and
>> some do clean up.
>> Maybe we need to get better at enforcing (or automating)
>> adding flaky tests to expectations, so we at least have a clean slate
>> for gardeners to start with.
>> On Tue, Oct 13, 2009 at 11:53 AM, Stephen White <[email protected]>
>> wrote:
>>>
>>> I agree with Dimitri that we're fighting a losing battle here.
>>> In my last stint as gardener, I did informally what I proposed formally
>>> last time: I spent basically 1 full day just triaging failures from my 2
>>> days gardening. Not fixing, but just running tests locally, analyzing,
>>> grouping, creating bugs, assigning to appropriate people (when I knew who
>>> they were, cc'ing poor dglazkov when I didn't). So at least I didn't leave
>>> a monster bug with "layout tests broken by merge #foo" but at least grouped
>>> by area. That was manageable, but I don't know if another day would
>>> actually be enough for a meaningful amount of fixing.
>>> I also agree with Drew that actively fixing all the broken tests is
>>> usually beyond the skills of any one gardener.
>>> Perhaps we should start taking ownership of particular groups of layout
>>> tests? And maybe automatically assign them (or at least cc them), the same
>>> way Area-Foo causes automatic cc'ing in bugs.chromium.org (I think?) That
>>> way, the gardener wouldn't have to know who to assign to.
>>>
>>> I've basically taken responsibility for fixing all layout tests broken by
>>> Skia rolls, which can be pretty heavy on its own, but I'm willing to take
>>> ownership of a directory or two.
>>> BTW, the layout test flakiness dashboard has become an invaluable tool
>>> for analyzing failures: searching for a test by name is lightning-fast, and
>>> you can clearly see if a test has become flaky, on which platforms, and
>>> which WebKit merge was responsible, which can also help with grouping.
>>> (Props to Ojan for that).
>>> Also, it may be Gilbert-and-Sullivan-esque of me, but I think everyone
>>> who contributes patches to WebKit for chromium should be on the WebKit
>>> gardener rotation.
>>> Stephen
>>>
>>> On Tue, Oct 13, 2009 at 1:53 PM, Drew Wilson <[email protected]>
>>> wrote:
>>>>
>>>> I've been thinking quite a bit about this - I agree with Dimitri that the
>>>> current Sisyphean approach is unsustainable.
>>>> I don't think the right path is to ask the sheriffs to do the cleanup
>>>> themselves - for example, a webkit roll that breaks workers in some obscure
>>>> way is almost certainly beyond the ability of any random gardener to fix in
>>>> two days, especially when there may be multiple bugs.
>>>> A better solution would be to have the sheriff (or someone from LTTF)
>>>> assign the bugs to specific people, with a general rule that such bugs must
>>>> be fixed within two days (make these bugs the top priority over other
>>>> tasks). This allows for load balancing of bugs, and also makes sure that we
>>>> have the right people working on any specific bug.
>>>> -atw
>>>> On Tue, Oct 13, 2009 at 10:40 AM, Pam Greene <[email protected]> wrote:
>>>>>
>>>>> I don't think it's realistic to expect the gardener, or any one person,
>>>>> to be able to fix an arbitrary broken layout test in a reasonable period
>>>>> of time. That's certainly true for new tests, but even for regressions I
>>>>> often can't even tell for sure whether our results are correct, much less
>>>>> what to do if they're not.
>>>>> It's far more efficient to have the "right" person fix a test. (Of
>>>>> course, people should also strive to broaden their knowledge, but there's
>>>>> a limit to how much of that one can do in a week.) Never having broken
>>>>> layout tests is an excellent goal, but quite frankly I don't think it's
>>>>> one we should prioritize so high that we hobble other efforts and burn
>>>>> out developers.
>>>>> - Pam
>>>>> On Tue, Oct 13, 2009 at 10:31 AM, Dimitri Glazkov
>>>>> <[email protected]> wrote:
>>>>>>
>>>>>> I think we need to change something. I am not sure what -- I have
>>>>>> ideas, but -- I would appreciate some collective thinking on this.
>>>>>>
>>>>>> PROBLEM: We accumulate more test failures via WebKit rolls than we fix
>>>>>> with our LTTF effort. This ain't right.
>>>>>>
>>>>>> ANALYSIS:
>>>>>>
>>>>>> Ok, WebKit gardening is hard. So is fixing layout tests. You can't
>>>>>> call it a successful WebKit roll if it breaks layout tests. But we
>>>>>> don't revert WebKit rolls. It's a forward-only thing. And we want to
>>>>>> roll quickly, so that we can react to the next "big breaker" faster. So
>>>>>> we're stuck with a roll-now/clean-up-after deal. This sucks, because the
>>>>>> "clean-up-after" is rarely fully completed. Which brings failing
>>>>>> layout tests, which brings the suffering and spells asymptotic doom to
>>>>>> the LTTF effort.
>>>>>>
>>>>>> POSSIBLE SOLUTIONS:
>>>>>>
>>>>>> * Extend WebKit gardener's duties to 4 days. First two days you roll.
>>>>>> Next two days you fix layout tests. Not file bugs -- actually fix
>>>>>> them. The net result of 4 days should be 0 (or less!) new layout test
>>>>>> failures. This solution kind of expects the gardener to be part of
>>>>>> LTTF, which is not always the case. So it may not seem totally fair.
>>>>>>
>>>>>> * Assign LTTF folks specifically for test clean-up every day. The idea
>>>>>> here is to slant LTTF effort aggressively toward fixing newer
>>>>>> failures. This seems nice for the gardeners, but appears to separate
>>>>>> the action/responsibility dependency: no matter what you roll, the
>>>>>> LTTF elves will fix it.
>>>>>>
>>>>>> * [ your idea goes here ]
>>>>>>
>>>>>> TIMELINE:
>>>>>>
>>>>>> I would like for us to agree on a solution and make the necessary
>>>>>> changes to the process today. Tomorrow is a new day, full of
>>>>>> surprising changes upstream.
>>>>>>
>>>>>> :DG<
>>>
>>> --
>>> All truth passes through three stages. First, it is ridiculed. Second, it
>>> is violently opposed. Third, it is accepted as being self-evident. --
>>> Schopenhauer

--~--~---------~--~----~------------~-------~--~----~
Chromium Developers mailing list: [email protected]
View archives, change email options, or unsubscribe:
http://groups.google.com/group/chromium-dev
-~----------~----~----~----~------~----~------~--~---
