[chromium-dev] Re: [LTTF][WebKit Gardening]: Keeping up with the weeds.

Dimitri Glazkov Tue, 13 Oct 2009 22:10:09 -0700

Ownership is a great concept. I started out planning LTTF as
ownership-based. Unfortunately, the types of failures are scattered
far and wide across the directories, some clustered, some not. After a
few initial passes, I walked away thinking that it's not as simple as
drawing the lines and basically gave up. That's how the Finders/Fixers
idea was born.


:DG<

On Tue, Oct 13, 2009 at 4:24 PM, Yaar Schnitman <[email protected]> wrote:
> I think ownership might actually help with flakiness.
> Today, in order to distinguish flakiness from real bugs, the gardener needs
> to have intimate knowledge of the relevant part of the code base and its
> history. That is beyond the capabilities of the average webkit gardener.
> Now, imagine a world where every layout test has an owner who can decide
> intelligently that the bug is flaky and advise the gardener what to do with
> it. Wouldn't it make gardening much easier?
> [Flakiness dashboard is very helpful in making the decision, but specialized
> knowledge trumps generic statistics, especially if a test just started
> flaking]
> On Tue, Oct 13, 2009 at 1:21 PM, Julie Parent <[email protected]> wrote:
>>
>> I like the idea of ownership of groups of layout tests.  Maybe these
>> directory "owners" could be more like the "finders"?  An owner shouldn't
>> have to necessarily fix everything in a group/directory, but they should be
>> responsible for triaging and getting meaningful bugs filed for them, to
>> keep things moving along. (I volunteer for editing/)
>> Another complicating factor -
>> The state of the main Chrome tree has a lot of effect on the gardener.  If
>> the tree is already filled with flakiness, then the webkit roll is likely to
>> show failures, which may or may not have been there before the roll.  This
>> was largely the case in the situation pkasting was referring to: when he
>> took over as sheriff, he inherited a tree with a lot of flakiness not
>> reflected in test_expectations/disabled ui tests.  I think very few (if any)
>> of the tests he added to test_expectations had anything to do with the roll.
>> Any policy we make needs to keep in mind that main tree sheriffs deal with
>> flakiness differently; some cross their fingers and hope it goes away, and
>> some do clean up.  Maybe we need to get better at enforcing (or automating)
>> adding flaky tests to expectations, so we at least have a clean slate
>> for gardeners to start with.
>> On Tue, Oct 13, 2009 at 11:53 AM, Stephen White <[email protected]>
>> wrote:
>>>
>>> I agree with Dimitri that we're fighting a losing battle here.
>>> In my last stint as gardener, I did informally what I proposed formally
>>> last time:  I spent basically 1 full day just triaging failures from my 2
>>> days gardening.  Not fixing, but just running tests locally, analyzing,
>>> grouping, creating bugs, assigning to appropriate people (when I knew who
>>> they were, cc'ing poor dglazkov when I didn't).  So instead of leaving one
>>> monster bug with "layout tests broken by merge #foo", I at least left bugs
>>> grouped by area.  That was manageable, but I don't know if another day would
>>> actually be enough for a meaningful amount of fixing.
>>> I also agree with Drew that actively fixing all the broken tests is
>>> usually beyond the skills of any one gardener.
>>> Perhaps we should start taking ownership of particular groups of layout
>>> tests?  And maybe automatically assign them (or at least cc them), the same way
>>> Area-Foo causes automatic cc'ing in bugs.chromium.org (I think?)  That way,
>>> the gardener wouldn't have to know who to assign to.
>>>
>>> I've basically taken responsibility for fixing all layout tests broken by
>>> Skia rolls, which can be pretty heavy on its own, but I'm willing to take
>>> ownership of a directory or two.
>>> BTW, the layout test flakiness dashboard has become an invaluable tool
>>> for analyzing failures:  searching for a test by name is lightning-fast, and
>>> you can clearly see if a test has become flaky, on which platforms, and
>>> which WebKit merge was responsible, which can also help with grouping.
>>>  (Props to Ojan for that).
>>> Also, it may be Gilbert-and-Sullivan-esque of me, but I think everyone
>>> who contributes patches to WebKit for chromium should be on the WebKit
>>> gardener rotation.
>>> Stephen
>>>
>>> On Tue, Oct 13, 2009 at 1:53 PM, Drew Wilson <[email protected]>
>>> wrote:
>>>>
>>>> I've been thinking quite a bit about this - I agree with Dimitri that the
>>>> current Sisyphean approach is unsustainable.
>>>> I don't think the right path is to ask the sheriffs to do the cleanup
>>>> themselves - for example, a webkit roll that breaks workers in some obscure
>>>> way is almost certainly beyond the ability of any random gardener to fix in
>>>> two days, especially when there may be multiple bugs.
>>>> A better solution would be to have the sheriff (or someone from LTTF)
>>>> assign the bugs to specific people, with a general rule that such bugs must
>>>> be fixed within two days (make these bugs the top priority over other
>>>> tasks). This allows for load balancing of bugs, and also makes sure that we
>>>> have the right people working on any specific bug.
>>>> -atw
>>>> On Tue, Oct 13, 2009 at 10:40 AM, Pam Greene <[email protected]> wrote:
>>>>>
>>>>> I don't think it's realistic to expect the gardener, or any one person,
>>>>> to be able to fix an arbitrary broken layout test in a reasonable period
>>>>> of time. That's certainly true for new tests, but even for regressions I
>>>>> often can't even tell for sure whether our results are correct, much less
>>>>> what to do if they're not.
>>>>> It's far more efficient to have the "right" person fix a test. (Of
>>>>> course, people should also strive to broaden their knowledge, but there's
>>>>> a limit to how much of that one can do in a week.) Never having broken
>>>>> layout tests is an excellent goal, but quite frankly I don't think it's
>>>>> one we should prioritize so high that we hobble other efforts and burn
>>>>> out developers.
>>>>> - Pam
>>>>> On Tue, Oct 13, 2009 at 10:31 AM, Dimitri Glazkov
>>>>> <[email protected]> wrote:
>>>>>>
>>>>>> I think we need to change something. I am not sure what -- I have
>>>>>> ideas, but -- I would appreciate some collective thinking on this.
>>>>>>
>>>>>> PROBLEM: We accumulate more test failures via WebKit rolls than we fix
>>>>>> with our LTTF effort. This ain't right.
>>>>>>
>>>>>> ANALYSIS:
>>>>>>
>>>>>> Ok, WebKit gardening is hard. So is fixing layout tests. You can't
>>>>>> call it a successful WebKit roll if it breaks layout tests. But we
>>>>>> don't revert WebKit rolls. It's a forward-only thing. And we want to
>>>>>> roll quickly, so that we can react to the next "big breaker" faster. So
>>>>>> we're stuck with a roll-now/clean-up-after deal. This sucks, because the
>>>>>> "clean-up-after" is rarely fully completed. Which brings failing
>>>>>> layout tests, which brings the suffering and spells asymptotic doom to
>>>>>> the LTTF effort.
>>>>>>
>>>>>> POSSIBLE SOLUTIONS:
>>>>>>
>>>>>> * Extend WebKit gardener's duties to 4 days. First two days you roll.
>>>>>> Next two days you fix layout tests. Not file bugs -- actually fix
>>>>>> them. The net result of 4 days should be 0 (or less!) new layout test
>>>>>> failures. This solution kind of expects the gardener to be part of
>>>>>> LTTF, which is not always the case. So it may not seem totally fair.
>>>>>>
>>>>>> * Assign LTTF folks specifically for test clean-up every day. The idea
>>>>>> here is to slant LTTF effort aggressively toward fixing newer
>>>>>> failures. This seems nice for the gardeners, but appears to separate
>>>>>> the action/responsibility dependency: no matter what you roll, the
>>>>>> LTTF elves will fix it.
>>>>>>
>>>>>> * [ your idea goes here ]
>>>>>>
>>>>>> TIMELINE:
>>>>>>
>>>>>> I would like for us to agree on a solution and make the necessary
>>>>>> changes to the process today. Tomorrow is a new day, full of
>>>>>> surprising changes upstream.
>>>>>>
>>>>>> :DG<
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> All truth passes through three stages. First, it is ridiculed. Second, it
>>> is violently opposed. Third, it is accepted as being self-evident. --
>>> Schopenhauer
>>>
>>>
>>
>>
>>
>
>

--~--~---------~--~----~------------~-------~--~----~
Chromium Developers mailing list: [email protected] 
View archives, change email options, or unsubscribe: 
    http://groups.google.com/group/chromium-dev
-~----------~----~----~----~------~----~------~--~---
