On Thu, Aug 16, 2012 at 2:32 PM, Filip Pizlo <[email protected]> wrote:
> On Aug 16, 2012, at 2:13 PM, Dirk Pranke wrote:
>
> > On Wed, Aug 15, 2012 at 6:02 PM, Filip Pizlo <[email protected]> wrote:
> >>
> >> 2) Possibility of the sheriff getting it wrong.
> >>
> >> (2) concerns me most. We're talking about using filenames to serve as a
> >> kind of unchecked comment. We already know that comments are usually bad
> >> because there is no protection against them going stale.
> >
> > Sheriffs can already get things wrong (and rebaseline when they
> > shouldn't). I believe that adding passing/failing to expected will
> > make things better in this regard, not worse.
>
> In what way do things become better? Because other people will see what
> the sheriff believed about the result?
>
> Can you articulate some more about what happens when you have both
> -expected and -failing?
>
> My specific concern is that after someone checks in a fix, we will have
> some sheriff accidentally misjudge the change in behavior to be a
> regression, and check in a -failing file. And then we end up in a world of
> confusion.
>
> This is why I think that just having -expected files is better. It is a
> kind of recognition that we're tracking changes in behavior, rather than
> comparing against some almighty notion of what it means to be correct.
>
> > Another idea/observation is that if we have multiple types of
> > expectation files, it might be easier to set up watchlists, e.g., "let
> > me know whenever a file gets checked into fast/forms with an -expected
> > or -failing result". It seems like this might be useful, but I'm not
> > sure.
> >
> >> In particular, to further clarify my position, if someone were to argue
> >> that Dirk's proposal would be a wholesale replacement for
> >> TestExpectations, then I would be more likely to be on board, since I
> >> very much like the idea of reducing the number of ways of doing things.
> >> Maybe that's a good way to reach compromise.
> >>
> >> Dirk, what value do you see in TestExpectations were your change to be
> >> landed? Do scenarios still exist where there would be a test for which
> >> (a) there is no -fail.* file, (b) the test is not skipped, and (c) it's
> >> marked with some annotation in TestExpectations? I'm most interested in
> >> the question of whether such scenarios exist, since in my experience,
> >> whenever a test is not rebased, is not skipped, and is marked as failing
> >> in TestExpectations, it ends up just causing gardening overhead later.
> >
> > This is a good question, because it is definitely my intent that this
> > change replace some existing practices, not add to them.
> >
> > Currently, the Chromium port uses TestExpectations entries for four
> > different kinds of things: tests we don't ever plan to fix (WONTFIX),
> > tests that we skip because not doing so causes other tests to break,
> > tests that fail (reliably), and tests that are flaky.
> >
> > Skipped files do not let you distinguish (programmatically) between
> > the first two categories, and so my plan is to replace Skipped files
> > with TestExpectations (using the new syntax discussed a month or so
> > ago) soon (next week or two at the latest).
> >
> > I would like to replace using TestExpectations for failing tests (at
> > least for tests that are expected to keep failing indefinitely because
> > someone isn't working on an active fix) with this new mechanism.
> >
> > That leaves flaky tests. One can debate what the right thing to do w/
> > flaky tests is here; I'm inclined to argue that flakiness is at least
> > as bad as failing, and we should probably be skipping them, but the
> > Chromium port has not yet actively tried this approach (I think other
> > ports probably have experience here, though).
> >
> > Does that help answer your question / sway you at all?
>
> Yes, it does - it answers my question, though it perhaps doesn't sway me.
> My concerns are still that:
>
> 1) Switching to skipping flaky tests wholesale in all ports would be
> great, and then we could get rid of the flakiness support.

This is not necessarily helpful in some cases. There are cases where some test
makes subsequent tests flaky, so skipping a flaky test doesn't fix the
problem. Instead, it just makes the next test flaky. The right fix, of course,
is to skip/fix the culprit, but pinpointing the test that makes subsequent
tests more flaky has proved to be a time-consuming and challenging task.

> 2) The WONTFIX mode in TestExpectations feels to me more like a statement
> that you're just trying to see if the test doesn't crash. Correct? Either
> way, it's confusing.

Yes, I'd argue that they should just be skipped, but that's bikeshedding for
another time.

- Ryosuke

