Re: [webkit-dev] A simpler proposal for handling failing tests WAS: A proposal for handling failing layout tests and TestExpectations

2012-08-22 Thread Dirk Pranke
On Tue, Aug 21, 2012 at 4:16 PM, Maciej Stachowiak m...@apple.com wrote: On Aug 21, 2012, at 3:23 PM, Ojan Vafai o...@chromium.org wrote: On Mon, Aug 20, 2012 at 6:03 PM, Maciej Stachowiak m...@apple.com wrote: Here's how I imagine the workflow when a sheriff or just innocent bystander

Re: [webkit-dev] A simpler proposal for handling failing tests WAS: A proposal for handling failing layout tests and TestExpectations

2012-08-21 Thread Ojan Vafai
On Mon, Aug 20, 2012 at 6:03 PM, Maciej Stachowiak m...@apple.com wrote: Here's how I imagine the workflow when a sheriff or just innocent bystander notices a deterministically failing test. Follow this two-step algorithm: 1) Are you confident that the new result is an improvement or no

Re: [webkit-dev] A simpler proposal for handling failing tests WAS: A proposal for handling failing layout tests and TestExpectations

2012-08-21 Thread Maciej Stachowiak
On Aug 21, 2012, at 3:23 PM, Ojan Vafai o...@chromium.org wrote: On Mon, Aug 20, 2012 at 6:03 PM, Maciej Stachowiak m...@apple.com wrote: Here's how I imagine the workflow when a sheriff or just innocent bystander notices a deterministically failing test. Follow this two-step algorithm:

Re: [webkit-dev] A simpler proposal for handling failing tests WAS: A proposal for handling failing layout tests and TestExpectations

2012-08-20 Thread Dirk Pranke
On Sat, Aug 18, 2012 at 8:31 PM, Filip Pizlo fpi...@apple.com wrote: On Aug 18, 2012, at 5:55 PM, Maciej Stachowiak m...@apple.com wrote: On Aug 18, 2012, at 5:11 PM, Filip Pizlo fpi...@apple.com wrote: Maybe at this point we can agree to let Dirk land some variant of this with whatever

Re: [webkit-dev] A simpler proposal for handling failing tests WAS: A proposal for handling failing layout tests and TestExpectations

2012-08-18 Thread Ryosuke Niwa
+1 to that. -expected.png / -failure.png is clearer than -expected.png / -previous.png or -expected.png / -correct.png. It's hard to grasp the difference between expected and correct unless you fully know how rebaselines work in layout tests. Also, this model has a nice one-to-one mapping with

Re: [webkit-dev] A simpler proposal for handling failing tests WAS: A proposal for handling failing layout tests and TestExpectations

2012-08-18 Thread Filip Pizlo
I like your idea of having both the result-we-currently-expect and the result-we-think-may-be-more-correct to be checked in. I still prefer Dirk's naming scheme though. I get the notion that expected always means literally what it seems to mean from the standpoint of whether the tooling is

Re: [webkit-dev] A simpler proposal for handling failing tests WAS: A proposal for handling failing layout tests and TestExpectations

2012-08-18 Thread Maciej Stachowiak
On Aug 18, 2012, at 1:08 AM, Filip Pizlo fpi...@apple.com wrote: I like your idea of having both the result-we-currently-expect and the result-we-think-may-be-more-correct to be checked in. I still prefer Dirk's naming scheme though. I think if we had both checked in, the

Re: [webkit-dev] A simpler proposal for handling failing tests WAS: A proposal for handling failing layout tests and TestExpectations

2012-08-18 Thread Filip Pizlo
Maybe at this point we can agree to let Dirk land some variant of this with whatever half-way sensible name (any of the options on the table are decent) and see how it works? It seems that the only thing anyone is disagreeing over is naming and which files to keep around, which is a much

Re: [webkit-dev] A simpler proposal for handling failing tests WAS: A proposal for handling failing layout tests and TestExpectations

2012-08-18 Thread Maciej Stachowiak
On Aug 18, 2012, at 5:11 PM, Filip Pizlo fpi...@apple.com wrote: Maybe at this point we can agree to let Dirk land some variant of this with whatever half-way sensible name (any of the options on the table are decent) and see how it works? It seems that the only thing anyone is

Re: [webkit-dev] A simpler proposal for handling failing tests WAS: A proposal for handling failing layout tests and TestExpectations

2012-08-18 Thread Filip Pizlo
On Aug 18, 2012, at 5:55 PM, Maciej Stachowiak m...@apple.com wrote: On Aug 18, 2012, at 5:11 PM, Filip Pizlo fpi...@apple.com wrote: Maybe at this point we can agree to let Dirk land some variant of this with whatever half-way sensible name (any of the options on the table are decent)

[webkit-dev] A simpler proposal for handling failing tests WAS: A proposal for handling failing layout tests and TestExpectations

2012-08-17 Thread Ojan Vafai
Asserting a test case is 100% correct is nearly impossible for a large percentage of tests. The main advantage it gives us is the ability to have -expected mean unsure. Let's instead only add -failing (i.e. no -passing). Leaving -expected to mean roughly what it does today to Chromium folk

Re: [webkit-dev] A simpler proposal for handling failing tests WAS: A proposal for handling failing layout tests and TestExpectations

2012-08-17 Thread Ryosuke Niwa
On Fri, Aug 17, 2012 at 4:55 PM, Ojan Vafai o...@chromium.org wrote: Asserting a test case is 100% correct is nearly impossible for a large percentage of tests. The main advantage it gives us is the ability to have -expected mean unsure. Let's instead only add -failing (i.e. no -passing).

Re: [webkit-dev] A simpler proposal for handling failing tests WAS: A proposal for handling failing layout tests and TestExpectations

2012-08-17 Thread Dirk Pranke
On Fri, Aug 17, 2012 at 5:01 PM, Ryosuke Niwa rn...@webkit.org wrote: On Fri, Aug 17, 2012 at 4:55 PM, Ojan Vafai o...@chromium.org wrote: Asserting a test case is 100% correct is nearly impossible for a large percentage of tests. The main advantage it gives us is the ability to have

Re: [webkit-dev] A simpler proposal for handling failing tests WAS: A proposal for handling failing layout tests and TestExpectations

2012-08-17 Thread Filip Pizlo
+1, contingent upon the following: are we agreeing that all current uses of TEXT, IMAGE, and so forth in TestExpectations should, in the *very near term* following Dirk's change, be turned into -failing files? -Filip On Aug 17, 2012, at 5:01 PM, Ryosuke Niwa rn...@webkit.org wrote: On Fri,

Re: [webkit-dev] A simpler proposal for handling failing tests WAS: A proposal for handling failing layout tests and TestExpectations

2012-08-17 Thread Dirk Pranke
All non-flaky failures, yes. Flaky tests would still require entries in the TestExpectations files at this time; discussion of how to treat them is a separate topic. -- Dirk On Fri, Aug 17, 2012 at 5:35 PM, Filip Pizlo fpi...@apple.com wrote: +1, contingent upon the following: are we agreeing

Re: [webkit-dev] A simpler proposal for handling failing tests WAS: A proposal for handling failing layout tests and TestExpectations

2012-08-17 Thread Ryosuke Niwa
That's my expectation although we probably can't do that for flaky tests :( e.g. sometimes fails with image diff. On Fri, Aug 17, 2012 at 5:35 PM, Filip Pizlo fpi...@apple.com wrote: +1, contingent upon the following: are we agreeing that all current uses of TEXT, IMAGE, and so forth in

Re: [webkit-dev] A simpler proposal for handling failing tests WAS: A proposal for handling failing layout tests and TestExpectations

2012-08-17 Thread Filip Pizlo
Then I am on board. We still do need to revisit the handling of flaky tests. The current approach is an absolute disaster. (I normally love exaggerating, but in this case, I feel no satisfaction in doing so because it is at best an understatement.) -Filip On Aug 17, 2012, at 5:36 PM, Dirk

Re: [webkit-dev] A simpler proposal for handling failing tests WAS: A proposal for handling failing layout tests and TestExpectations

2012-08-17 Thread Dirk Pranke
On Fri, Aug 17, 2012 at 5:43 PM, Filip Pizlo fpi...@apple.com wrote: Then I am on board. We still do need to revisit the handling of flaky tests. The current approach is an absolute disaster. (I normally love exaggerating, but in this case, I feel no satisfaction in doing so because it is

Re: [webkit-dev] A simpler proposal for handling failing tests WAS: A proposal for handling failing layout tests and TestExpectations

2012-08-17 Thread Ojan Vafai
+1 On Fri, Aug 17, 2012 at 5:36 PM, Ryosuke Niwa rn...@webkit.org wrote: That's my expectation although we probably can't do that for flaky tests :( e.g. sometimes fails with image diff. On Fri, Aug 17, 2012 at 5:35 PM, Filip Pizlo fpi...@apple.com wrote: +1, contingent upon the

Re: [webkit-dev] A simpler proposal for handling failing tests WAS: A proposal for handling failing layout tests and TestExpectations

2012-08-17 Thread Maciej Stachowiak
My understanding of the current proposal is this: 1) This applies to tests that fail deterministically, for reasons other than a crash or hang. 2) If the test has a new result that you're confident is a progression (or neither better nor worse), you simply update the -expected.txt file. 3) If

Re: [webkit-dev] A simpler proposal for handling failing tests WAS: A proposal for handling failing layout tests and TestExpectations

2012-08-17 Thread Ojan Vafai
That matches my understanding. Your proposed modification sounds fine to me. On Fri, Aug 17, 2012 at 6:40 PM, Maciej Stachowiak m...@apple.com wrote: My understanding of the current proposal is this: 1) This applies to tests that fail deterministically, for reasons other than a crash or

Re: [webkit-dev] A simpler proposal for handling failing tests WAS: A proposal for handling failing layout tests and TestExpectations

2012-08-17 Thread Dirk Pranke
I'm not sure if I like this idea or not. A couple of observations/questions ... 1) I wouldn't want to call it '-correct' unless we were sure it was correct; '-previous' is better in that regard 2) the issue with keeping a '-correct' in the tree is that it's quite possible for a previous correct

Re: [webkit-dev] A simpler proposal for handling failing tests WAS: A proposal for handling failing layout tests and TestExpectations

2012-08-17 Thread Filip Pizlo
So this is down to expected/failing and expected/previous? I must say that expected/failing feels less confusing. Easier to remember if I have to quickly recall what it means. -Filip On Aug 17, 2012, at 7:36 PM, Dirk Pranke dpra...@chromium.org wrote: I'm not sure if I like this idea or