On Tue, Aug 21, 2012 at 4:16 PM, Maciej Stachowiak m...@apple.com wrote:
On Aug 21, 2012, at 3:23 PM, Ojan Vafai o...@chromium.org wrote:
On Mon, Aug 20, 2012 at 6:03 PM, Maciej Stachowiak m...@apple.com wrote:
Here's how I imagine the workflow when a sheriff or just innocent
bystander
On Mon, Aug 20, 2012 at 6:03 PM, Maciej Stachowiak m...@apple.com wrote:
Here's how I imagine the workflow when a sheriff or just innocent
bystander notices a deterministically failing test. Follow this two-step
algorithm:
1) Are you confident that the new result is an improvement or no
On Aug 21, 2012, at 3:23 PM, Ojan Vafai o...@chromium.org wrote:
On Mon, Aug 20, 2012 at 6:03 PM, Maciej Stachowiak m...@apple.com wrote:
Here's how I imagine the workflow when a sheriff or just innocent bystander
notices a deterministically failing test. Follow this two-step algorithm:
On Sat, Aug 18, 2012 at 8:31 PM, Filip Pizlo fpi...@apple.com wrote:
On Aug 18, 2012, at 5:55 PM, Maciej Stachowiak m...@apple.com wrote:
On Aug 18, 2012, at 5:11 PM, Filip Pizlo fpi...@apple.com wrote:
Maybe at this point we can agree to let Dirk land some variant of this with
whatever
+1 to that. -expected.png / -failure.png is clearer than -expected.png /
-previous.png or -expected.png / -correct.png.
It's hard to grasp the difference between expected and correct unless
you fully know how rebaselines work in layout tests. Also, this model has
a nice one-to-one mapping with
I like your idea of having both the result-we-currently-expect and the
result-we-think-may-be-more-correct to be checked in. I still prefer Dirk's
naming scheme though.
I get the notion that expected always means literally what it seems to mean
from the standpoint of whether the tooling is
On Aug 18, 2012, at 1:08 AM, Filip Pizlo fpi...@apple.com wrote:
I like your idea of having both the result-we-currently-expect and the
result-we-think-may-be-more-correct to be checked in. I still prefer Dirk's
naming scheme though.
I think if we had both checked in, the
Maybe at this point we can agree to let Dirk land some variant of this with
whatever half-way sensible name (any of the options on the table are decent)
and see how it works?
It seems that the only thing anyone is disagreeing over is naming and which
files to keep around, which is a much
On Aug 18, 2012, at 5:11 PM, Filip Pizlo fpi...@apple.com wrote:
Maybe at this point we can agree to let Dirk land some variant of this with
whatever half-way sensible name (any of the options on the table are decent)
and see how it works?
It seems that the only thing anyone is
On Aug 18, 2012, at 5:55 PM, Maciej Stachowiak m...@apple.com wrote:
On Aug 18, 2012, at 5:11 PM, Filip Pizlo fpi...@apple.com wrote:
Maybe at this point we can agree to let Dirk land some variant of this with
whatever half-way sensible name (any of the options on the table are decent)
Asserting a test case is 100% correct is nearly impossible for a large
percentage of tests. The main advantage it gives us is the ability to have
-expected mean unsure.
Let's instead only add -failing (i.e. no -passing). Leaving -expected to
mean roughly what it does today to Chromium folk
On Fri, Aug 17, 2012 at 4:55 PM, Ojan Vafai o...@chromium.org wrote:
Asserting a test case is 100% correct is nearly impossible for a large
percentage of tests. The main advantage it gives us is the ability to have
-expected mean unsure.
Let's instead only add -failing (i.e. no -passing).
On Fri, Aug 17, 2012 at 5:01 PM, Ryosuke Niwa rn...@webkit.org wrote:
On Fri, Aug 17, 2012 at 4:55 PM, Ojan Vafai o...@chromium.org wrote:
Asserting a test case is 100% correct is nearly impossible for a large
percentage of tests. The main advantage it gives us is the ability to have
+1, contingent upon the following: are we agreeing that all current uses of
TEXT, IMAGE, and so forth in TestExpectations should be in the *very near term*
following Dirk's change be turned into -failing files?
-Filip
On Aug 17, 2012, at 5:01 PM, Ryosuke Niwa rn...@webkit.org wrote:
On Fri,
All non-flaky failures, yes.
Flaky tests would still require entries in the TestExpectations files
at this time; discussion of how to treat them is a separate topic.
-- Dirk
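For illustration, the split described above (non-flaky failures become checked-in -failing files, flaky tests keep their TestExpectations entries) could be sketched roughly as follows. This is only a sketch under stated assumptions, not the actual tooling: the `Failure` record, the `plan` function, the example test path, and the exact `-failing.txt` / `-failing.png` suffixes are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Failure:
    test: str    # test path, e.g. "fast/css/a.html" (hypothetical)
    kind: str    # "TEXT" or "IMAGE", as used in TestExpectations
    flaky: bool  # True if the test does not fail deterministically


# Assumed suffixes for the checked-in failing results (hypothetical).
FAILING_SUFFIX = {"TEXT": "-failing.txt", "IMAGE": "-failing.png"}


def plan(failure):
    """Decide where a known failure would be recorded."""
    if failure.flaky:
        # Flaky tests still need an entry in TestExpectations.
        return ("TestExpectations", failure.test)
    # Deterministic failures become a checked-in -failing file.
    stem = failure.test.rsplit(".", 1)[0]
    return ("baseline", stem + FAILING_SUFFIX[failure.kind])
```

Under these assumptions, a deterministic TEXT failure of fast/css/a.html would map to a fast/css/a-failing.txt file, while a flaky IMAGE failure would stay in TestExpectations.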
On Fri, Aug 17, 2012 at 5:35 PM, Filip Pizlo fpi...@apple.com wrote:
+1, contingent upon the following: are we agreeing
That's my expectation although we probably can't do that for flaky tests :(
e.g. sometimes fails with image diff.
On Fri, Aug 17, 2012 at 5:35 PM, Filip Pizlo fpi...@apple.com wrote:
+1, contingent upon the following: are we agreeing that all current uses
of TEXT, IMAGE, and so forth in
Then I am on board.
We still do need to revisit the handling of flaky tests. The current approach
is an absolute disaster. (I normally love exaggerating, but in this case, I
feel no satisfaction in doing so because it is at best an understatement.)
-Filip
On Aug 17, 2012, at 5:36 PM, Dirk
On Fri, Aug 17, 2012 at 5:43 PM, Filip Pizlo fpi...@apple.com wrote:
Then I am on board.
We still do need to revisit the handling of flaky tests. The current
approach is an absolute disaster. (I normally love exaggerating, but in this
case, I feel no satisfaction in doing so because it is
+1
On Fri, Aug 17, 2012 at 5:36 PM, Ryosuke Niwa rn...@webkit.org wrote:
That's my expectation although we probably can't do that for flaky tests :(
e.g. sometimes fails with image diff.
On Fri, Aug 17, 2012 at 5:35 PM, Filip Pizlo fpi...@apple.com wrote:
+1, contingent upon the
My understanding of the current proposal is this:
1) This applies to tests that fail deterministically, for reasons other than a
crash or hang.
2) If the test has a new result that you're confident is a progression (or
neither better nor worse), you simply update the -expected.txt file.
3) If
That matches my understanding. Your proposed modification sounds fine to me.
On Fri, Aug 17, 2012 at 6:40 PM, Maciej Stachowiak m...@apple.com wrote:
My understanding of the current proposal is this:
1) This applies to tests that fail deterministically, for reasons other
than a crash or
I'm not sure if I like this idea or not. A couple of observations/questions ...
1) I wouldn't want to call it '-correct' unless we were sure it was
correct; '-previous' is better in that regard
2) the issue with keeping a '-correct' in the tree is that it's quite
possible for a previous correct
So this is down to expected/failing and expected/previous?
I must say that expected/failing feels less confusing. Easier to remember if I
have to quickly recall what it means.
-Filip
On Aug 17, 2012, at 7:36 PM, Dirk Pranke dpra...@chromium.org wrote:
I'm not sure if I like this idea or