On Feb 25, 2013, at 4:27 PM, Ryosuke Niwa <rn...@webkit.org> wrote:

> On Mon, Feb 25, 2013 at 3:36 PM, Glenn Adams <gl...@skynav.com> wrote:
> 
> On Mon, Feb 25, 2013 at 3:53 PM, Ryosuke Niwa <rn...@webkit.org> wrote:
> On Mon, Feb 25, 2013 at 2:48 PM, Dirk Pranke <dpra...@google.com> wrote:
> On Mon, Feb 25, 2013 at 2:05 PM, Ryosuke Niwa <rn...@webkit.org> wrote:
> > On Mon, Feb 25, 2013 at 1:39 PM, Dirk Pranke <dpra...@google.com> wrote:
> >> Are you suggesting we should land a "failling" baseline in the meantime?
> >
> >
> > No. I'm suggesting patch authors perform their due diligence and either ask
> > port maintainers to rebaseline or rebaseline tests themselves.
> >
> 
> I think either you misunderstood my question, or I am misunderstanding
> your answer. I'm not asking "who", I'm asking "what" ...
> 
> If we know some tests are failing, and when we fix a bug the tests
> will start passing again (but that patch might not land for quite some
> time), what should we (anyone) do in the meantime? Leave the tree red,
> land "incorrect" -expected baselines so that we can catch changes in
> behavior, or add lines to TestExpectations? Many of the lines you
> cited fell into the last category.
> 
> Either one of those two solutions would work (although I strictly advice we 
> do the latter) when there are failing tests and we need a fix in order for 
> those tests to pass but I'm not interested in discussing that matter at the 
> moment.
> 
> I'm specifically opposed to adding new entries to TestExpectations for the 
> sole purpose of rebaselining them later.
> 
> One of the modes that the new generic TE file permits is to specify a generic 
> Skip and overriding this with per-port Pass. This provides (IMO) a more 
> manageable way to land a patch that you expect will require per-port mods to 
> fully get the patch working on the primary ports. It also allows one to land 
> a patch containing tests where a subsequent patch will provide the 
> functionality to be tested.
> 
> I recently used both of these modes to sub-divide a large patch and to 
> incrementally verify (landing individual per-port fixes as needed) individual 
> ports.
> 
> Personally, I prefer this over the "just turn bots red" mode. I suspect the 
> turn bots red approach results in greater churn, due to rollouts and 
> re-landing, than the finer grain approach I outlined above. I agree it 
> requires the committer to follow through on other ports and eventually remove 
> the generic Skip and Pass rules (once it works everywhere), but we have to 
> assume here (IMO) that committers will do the right thing and take 
> responsibility for the process. [And if not, then remind them.]
> 
> So I for one, would vote against the "turn the bots red" approach. Overall, I 
> think that approach produces a higher interrupt load on our limited committer 
> and reviewer resources. I think it would be useful to give other procedures 
> an opportunity to be tried out so we can have more data for choosing between 
> alternative approaches.
> 
> This has been tried as many people including yourself are using this approach 
> in landing and rebaselining patches. In particular, a lot of new contributors 
> from Chromium port use this strategy.
> 
> When I used to work for Google, I periodically went through all entries in 
> TestExpectations files that needed rebaselines and manually rebaselined them 
> myself because people just leave entries like I've cited in another email all 
> the time.
> 
> Quite frankly, I don't want it to be my (or anyone but patch author's) job to 
> take care of all these stale entries people add.

The problem of Chromium in the early days was that all tests were rebaselined 
when they failed. Too many tests turned the bots red at this time and the 
Chromium folks didn't (couldn't?) take attention of all these failing tests. We 
lost the chance to fix a lot of these bugs immediately during this time. It 
took months to cover these tests in bug reports and fix them finally. Some 
tests may still report that everything is ok, even if they are failing - 
untracked. I see the current "needs rebaseline" approach better, since the 
tests are at least logged. I do not have an opinion how they should be logged.

Greetings,
Dirk

> 
> - R. Niwa
> 
> _______________________________________________
> webkit-dev mailing list
> webkit-dev@lists.webkit.org
> https://lists.webkit.org/mailman/listinfo/webkit-dev

_______________________________________________
webkit-dev mailing list
webkit-dev@lists.webkit.org
https://lists.webkit.org/mailman/listinfo/webkit-dev

Reply via email to