Re: [webkit-dev] IMAGE+TEXT WAS: TestExpectations syntax changes, last call (for a while, at least) ...

2012-06-14 Thread Dirk Pranke
On Thu, Jun 14, 2012 at 10:37 PM, Ojan Vafai  wrote:
>
>
> On Thu, Jun 14, 2012 at 9:20 PM, Maciej Stachowiak  wrote:
>>
>>
>> On Jun 14, 2012, at 9:06 PM, Adam Barth  wrote:
>>
>> > On Thu, Jun 14, 2012 at 9:02 PM, Ojan Vafai  wrote:
>> >>
>> >> Seems like it will be a common error to mark a reftest failure as
>> >> ImageOnlyFail and then be confused why it's not working, no?
>> >
>> > Maybe that can be solved with another name, like PixelOnlyFailure.
>
>
> I'm OK with trying it this way. We can always have another
> bikeshed fruitful discussion if it turns out to be a
> frequent cause of confusion in practice.
>
> Not sure one name is any more clear than the other. PixelOnlyFailure seems
> fine to me since others have expressed a preference in the past for Pixel
> over Image.
>

I can certainly see the advantages of the suggested scheme, but I
would want to sleep on this one, and possibly do some more data mining
to see if I can figure out exactly what percentage of Chromium's tests
are pixel tests marked as TEXT only failures or reftests are either
just IMAGE or just TEXT. (I'm sure Maciej is right and the percentage
is low, but I'd like to know how low). Further, once Chromium moves to
a world where failing baselines are checked in, the percentage is
probably somewhere between zero and inconsequential :).

I don't think PixelOnlyFailure is any better than ImageOnlyFailure.

>>
>> That sounds good. We could also make it an error to apply PixelOnlyFailure
>> (or what have you) to a text-only test, a reftest, or an audio test. Error
>> in the sense that it would be reported as a failure, with an informative
>> diagnostic saying it does not apply.
>
>
> We already have a mechanism for "errors" like this. They are reported when
> you run the tests or when you run with --lint-test-files. At least on the
> chromium bots, this runs as a separate step that turns red when when you
> cause a lint failure. That way errors get noticed and addressed quickly (the
> lint step takes ~2 seconds to run).

We also will catch these errors on upload through the style checker
(which is essentially just running --lint-test-files).

As a side note, I need to add the lint step to the b.w.o bots ...

-- Dirk
___
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev


Re: [webkit-dev] IMAGE+TEXT WAS: TestExpectations syntax changes, last call (for a while, at least) ...

2012-06-14 Thread Ojan Vafai
On Thu, Jun 14, 2012 at 9:20 PM, Maciej Stachowiak  wrote:

>
> On Jun 14, 2012, at 9:06 PM, Adam Barth  wrote:
>
> > On Thu, Jun 14, 2012 at 9:02 PM, Ojan Vafai  wrote:
> >>
> >> Seems like it will be a common error to mark a reftest failure as
> >> ImageOnlyFail and then be confused why it's not working, no?
> >
> > Maybe that can be solved with another name, like PixelOnlyFailure.
>

I'm OK with trying it this way. We can always have another
bikeshed fruitful discussion if it turns out to be a
frequent cause of confusion in practice.

Not sure one name is any more clear than the other. PixelOnlyFailure seems
fine to me since others have expressed a preference in the past for Pixel
over Image.


> That sounds good. We could also make it an error to apply PixelOnlyFailure
> (or what have you) to a text-only test, a reftest, or an audio test. Error
> in the sense that it would be reported as a failure, with an informative
> diagnostic saying it does not apply.
>

We already have a mechanism for "errors" like this. They are reported when
you run the tests or when you run with --lint-test-files. At least on the
chromium bots, this runs as a separate step that turns red when when you
cause a lint failure. That way errors get noticed and addressed quickly
(the lint step takes ~2 seconds to run).

Ojan
___
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev


Re: [webkit-dev] IMAGE+TEXT WAS: TestExpectations syntax changes, last call (for a while, at least) ...

2012-06-14 Thread Maciej Stachowiak

On Jun 14, 2012, at 9:06 PM, Adam Barth  wrote:

> On Thu, Jun 14, 2012 at 9:02 PM, Ojan Vafai  wrote:
>> 
>> Seems like it will be a common error to mark a reftest failure as
>> ImageOnlyFail and then be confused why it's not working, no?
> 
> Maybe that can be solved with another name, like PixelOnlyFailure.

That sounds good. We could also make it an error to apply PixelOnlyFailure (or 
what have you) to a text-only test, a reftest, or an audio test. Error in the 
sense that it would be reported as a failure, with an informative diagnostic 
saying it does not apply.

Regards,
Maciej

___
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev


Re: [webkit-dev] IMAGE+TEXT WAS: TestExpectations syntax changes, last call (for a while, at least) ...

2012-06-14 Thread Adam Barth
On Thu, Jun 14, 2012 at 9:02 PM, Ojan Vafai  wrote:
> On Thu, Jun 14, 2012 at 9:00 PM, Adam Barth  wrote:
>>
>> On Thu, Jun 14, 2012 at 8:56 PM, Ojan Vafai  wrote:
>> > On Thu, Jun 14, 2012 at 4:34 PM, Dirk Pranke 
>> > wrote:
>> >> On Thu, Jun 14, 2012 at 4:22 PM, Maciej Stachowiak 
>> >> wrote:
>> >> > On Jun 14, 2012, at 1:47 PM, Ryosuke Niwa  wrote:
>> >> > On Thu, Jun 14, 2012 at 1:44 PM, Peter Kasting
>> >> > 
>> >> > wrote:
>> >> >>
>> >> >> On Thu, Jun 14, 2012 at 1:39 PM, Elliot Poger 
>> >> >> wrote:
>> >> >>>
>> >> >>> Can someone please remind me why IMAGE+TEXT even exists?
>> >> >>>
>> >> >>> Wouldn't it be simpler to just mark a test as follows?
>> >> >>>
>> >> >>> IMAGE : allow image failure; go red if there is a text failure
>> >> >>> TEXT: allow text failure; go red if there is an image failure
>> >> >>> IMAGE TEXT: allow text and/or image failure
>> >> >>
>> >> >> The distinction is that IMAGE TEXT will allow image, text, or both
>> >> >> to
>> >> >> fail, thus making transitions among the three generate no events.
>> >> >>  IMAGE+TEXT says specifically that we expect both to fail and that
>> >> >> if
>> >> >> one
>> >> >> starts passing, someone should do something.  (For example, maybe
>> >> >> someone
>> >> >> checks in a partial rebaseline where they miss the image
>> >> >> expectations.)
>> >> >
>> >> >
>> >> > Not to bike-shed on anything, but I think we should rename Text and
>> >> > Image to
>> >> > TextOnly and ImageOnly. Every single person I know, including myself,
>> >> > had
>> >> > never got the distinction between IMAGE TEXT and IMAGE+TEXT without
>> >> > someone
>> >> > explaining it to him/her .
>> >> >
>> >> >
>> >> > I think IMAGE+TEXT is not a very useful distinction from TEXT either.
>> >> > I
>> >> > checked for uses of TEXT that is not IMAGE+TEXT in the Chromium
>> >> > TextExpectations, and it seems that nearly all instances fall into
>> >> > one
>> >> > of
>> >> > the two following categories:
>> >> >
>> >> > 1) text-only test, so IMAGE+TEXT would not have different semantics
>> >> > from
>> >> > TEXT (the vast majority)
>> >> > 2) Flaky test that may actually pass, so distinguishing what happens
>> >> > with
>> >> > the image result is of limited utility (most of these are also
>> >> > text-only
>> >> > tests; only a small subset even have an image result)
>> >> >
>> >> > Thus, I think Fail and ImageOnlyFail would be more useful and
>> >> > understandable
>> >> > categories than {TEXT, IMAGE, TEXT+IMAGE, TEXT IMAGE}. Fail would
>> >> > have
>> >> > the
>> >> > semantic that a text failure is expected, and image result if any can
>> >> > either
>> >> > pass or fail.
>> >>
>> >> This is perhaps true, but if it's okay I would like to treat that
>> >> feature request separately from the other syntactic changes we've been
>> >> discussing. So far the rest of the changes have not really implied any
>> >> changes to how we actually track which changes fail and how (note that
>> >> FAIL is different and we've fixed that separately from these changes
>> >> as well).
>> >
>> >
>> > Lets have the separate bikeshed. While this is less precise, I agree
>> > that
>> > Fail and ImageOnlyFail would capture the vast majority use-case and
>> > remove a
>> > frequent source of confusion and error. The big downside of this
>> > approach is
>> > that a text-only failure that also starts failing the pixel result make
>> > genuinely indicate a new bug. I think that happens rarely enough that
>> > I'm OK
>> > with it for the added simplicity.
>> >
>> > A couple open questions:
>> > -Does Fail also replace Audio? Seems reasonable to me.
>>
>> Yeah, audio tests can fail only in one way.
>>
>> > -What about reftest failures where there is no text comparison? I'd be
>> > fine
>> > with saying you can do Fail or ImageOnlyFail and they mean the same
>> > thing
>> > here.
>>
>> Similarly, I'd say that we should just Fail here.  Reftests can fail
>> only in one way.
>
>
> Seems like it will be a common error to mark a reftest failure as
> ImageOnlyFail and then be confused why it's not working, no?

Maybe that can be solved with another name, like PixelOnlyFailure.

Adam


>> In this view, ImageOnlyFail is a special case for pixel tests because
>> they're so fragile.
>>
>> Adam
>
>
___
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev


Re: [webkit-dev] IMAGE+TEXT WAS: TestExpectations syntax changes, last call (for a while, at least) ...

2012-06-14 Thread Ojan Vafai
On Thu, Jun 14, 2012 at 9:00 PM, Adam Barth  wrote:

> On Thu, Jun 14, 2012 at 8:56 PM, Ojan Vafai  wrote:
> > On Thu, Jun 14, 2012 at 4:34 PM, Dirk Pranke 
> wrote:
> >> On Thu, Jun 14, 2012 at 4:22 PM, Maciej Stachowiak 
> wrote:
> >> > On Jun 14, 2012, at 1:47 PM, Ryosuke Niwa  wrote:
> >> > On Thu, Jun 14, 2012 at 1:44 PM, Peter Kasting  >
> >> > wrote:
> >> >>
> >> >> On Thu, Jun 14, 2012 at 1:39 PM, Elliot Poger 
> >> >> wrote:
> >> >>>
> >> >>> Can someone please remind me why IMAGE+TEXT even exists?
> >> >>>
> >> >>> Wouldn't it be simpler to just mark a test as follows?
> >> >>>
> >> >>> IMAGE : allow image failure; go red if there is a text failure
> >> >>> TEXT: allow text failure; go red if there is an image failure
> >> >>> IMAGE TEXT: allow text and/or image failure
> >> >>
> >> >> The distinction is that IMAGE TEXT will allow image, text, or both to
> >> >> fail, thus making transitions among the three generate no events.
> >> >>  IMAGE+TEXT says specifically that we expect both to fail and that if
> >> >> one
> >> >> starts passing, someone should do something.  (For example, maybe
> >> >> someone
> >> >> checks in a partial rebaseline where they miss the image
> expectations.)
> >> >
> >> >
> >> > Not to bike-shed on anything, but I think we should rename Text and
> >> > Image to
> >> > TextOnly and ImageOnly. Every single person I know, including myself,
> >> > had
> >> > never got the distinction between IMAGE TEXT and IMAGE+TEXT without
> >> > someone
> >> > explaining it to him/her .
> >> >
> >> >
> >> > I think IMAGE+TEXT is not a very useful distinction from TEXT either.
> I
> >> > checked for uses of TEXT that is not IMAGE+TEXT in the Chromium
> >> > TextExpectations, and it seems that nearly all instances fall into one
> >> > of
> >> > the two following categories:
> >> >
> >> > 1) text-only test, so IMAGE+TEXT would not have different semantics
> from
> >> > TEXT (the vast majority)
> >> > 2) Flaky test that may actually pass, so distinguishing what happens
> >> > with
> >> > the image result is of limited utility (most of these are also
> text-only
> >> > tests; only a small subset even have an image result)
> >> >
> >> > Thus, I think Fail and ImageOnlyFail would be more useful and
> >> > understandable
> >> > categories than {TEXT, IMAGE, TEXT+IMAGE, TEXT IMAGE}. Fail would have
> >> > the
> >> > semantic that a text failure is expected, and image result if any can
> >> > either
> >> > pass or fail.
> >>
> >> This is perhaps true, but if it's okay I would like to treat that
> >> feature request separately from the other syntactic changes we've been
> >> discussing. So far the rest of the changes have not really implied any
> >> changes to how we actually track which changes fail and how (note that
> >> FAIL is different and we've fixed that separately from these changes
> >> as well).
> >
> >
> > Lets have the separate bikeshed. While this is less precise, I agree that
> > Fail and ImageOnlyFail would capture the vast majority use-case and
> remove a
> > frequent source of confusion and error. The big downside of this
> approach is
> > that a text-only failure that also starts failing the pixel result make
> > genuinely indicate a new bug. I think that happens rarely enough that
> I'm OK
> > with it for the added simplicity.
> >
> > A couple open questions:
> > -Does Fail also replace Audio? Seems reasonable to me.
>
> Yeah, audio tests can fail only in one way.
>
> > -What about reftest failures where there is no text comparison? I'd be
> fine
> > with saying you can do Fail or ImageOnlyFail and they mean the same thing
> > here.
>
> Similarly, I'd say that we should just Fail here.  Reftests can fail
> only in one way.
>

Seems like it will be a common error to mark a reftest failure as
ImageOnlyFail and then be confused why it's not working, no?


> In this view, ImageOnlyFail is a special case for pixel tests because
> they're so fragile.
>
> Adam
>
___
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev


Re: [webkit-dev] IMAGE+TEXT WAS: TestExpectations syntax changes, last call (for a while, at least) ...

2012-06-14 Thread Adam Barth
On Thu, Jun 14, 2012 at 8:56 PM, Ojan Vafai  wrote:
> On Thu, Jun 14, 2012 at 4:34 PM, Dirk Pranke  wrote:
>> On Thu, Jun 14, 2012 at 4:22 PM, Maciej Stachowiak  wrote:
>> > On Jun 14, 2012, at 1:47 PM, Ryosuke Niwa  wrote:
>> > On Thu, Jun 14, 2012 at 1:44 PM, Peter Kasting 
>> > wrote:
>> >>
>> >> On Thu, Jun 14, 2012 at 1:39 PM, Elliot Poger 
>> >> wrote:
>> >>>
>> >>> Can someone please remind me why IMAGE+TEXT even exists?
>> >>>
>> >>> Wouldn't it be simpler to just mark a test as follows?
>> >>>
>> >>> IMAGE : allow image failure; go red if there is a text failure
>> >>> TEXT: allow text failure; go red if there is an image failure
>> >>> IMAGE TEXT: allow text and/or image failure
>> >>
>> >> The distinction is that IMAGE TEXT will allow image, text, or both to
>> >> fail, thus making transitions among the three generate no events.
>> >>  IMAGE+TEXT says specifically that we expect both to fail and that if
>> >> one
>> >> starts passing, someone should do something.  (For example, maybe
>> >> someone
>> >> checks in a partial rebaseline where they miss the image expectations.)
>> >
>> >
>> > Not to bike-shed on anything, but I think we should rename Text and
>> > Image to
>> > TextOnly and ImageOnly. Every single person I know, including myself,
>> > had
>> > never got the distinction between IMAGE TEXT and IMAGE+TEXT without
>> > someone
>> > explaining it to him/her .
>> >
>> >
>> > I think IMAGE+TEXT is not a very useful distinction from TEXT either. I
>> > checked for uses of TEXT that is not IMAGE+TEXT in the Chromium
>> > TextExpectations, and it seems that nearly all instances fall into one
>> > of
>> > the two following categories:
>> >
>> > 1) text-only test, so IMAGE+TEXT would not have different semantics from
>> > TEXT (the vast majority)
>> > 2) Flaky test that may actually pass, so distinguishing what happens
>> > with
>> > the image result is of limited utility (most of these are also text-only
>> > tests; only a small subset even have an image result)
>> >
>> > Thus, I think Fail and ImageOnlyFail would be more useful and
>> > understandable
>> > categories than {TEXT, IMAGE, TEXT+IMAGE, TEXT IMAGE}. Fail would have
>> > the
>> > semantic that a text failure is expected, and image result if any can
>> > either
>> > pass or fail.
>>
>> This is perhaps true, but if it's okay I would like to treat that
>> feature request separately from the other syntactic changes we've been
>> discussing. So far the rest of the changes have not really implied any
>> changes to how we actually track which changes fail and how (note that
>> FAIL is different and we've fixed that separately from these changes
>> as well).
>
>
> Lets have the separate bikeshed. While this is less precise, I agree that
> Fail and ImageOnlyFail would capture the vast majority use-case and remove a
> frequent source of confusion and error. The big downside of this approach is
> that a text-only failure that also starts failing the pixel result make
> genuinely indicate a new bug. I think that happens rarely enough that I'm OK
> with it for the added simplicity.
>
> A couple open questions:
> -Does Fail also replace Audio? Seems reasonable to me.

Yeah, audio tests can fail only in one way.

> -What about reftest failures where there is no text comparison? I'd be fine
> with saying you can do Fail or ImageOnlyFail and they mean the same thing
> here.

Similarly, I'd say that we should just Fail here.  Reftests can fail
only in one way.

In this view, ImageOnlyFail is a special case for pixel tests because
they're so fragile.

Adam
___
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev


[webkit-dev] IMAGE+TEXT WAS: TestExpectations syntax changes, last call (for a while, at least) ...

2012-06-14 Thread Ojan Vafai
On Thu, Jun 14, 2012 at 4:34 PM, Dirk Pranke  wrote:

> On Thu, Jun 14, 2012 at 4:22 PM, Maciej Stachowiak  wrote:
> >
> > On Jun 14, 2012, at 1:47 PM, Ryosuke Niwa  wrote:
> >
> >
> > On Thu, Jun 14, 2012 at 1:44 PM, Peter Kasting 
> > wrote:
> >>
> >> On Thu, Jun 14, 2012 at 1:39 PM, Elliot Poger 
> wrote:
> >>>
> >>> Can someone please remind me why IMAGE+TEXT even exists?
> >>>
> >>> Wouldn't it be simpler to just mark a test as follows?
> >>>
> >>> IMAGE : allow image failure; go red if there is a text failure
> >>> TEXT: allow text failure; go red if there is an image failure
> >>> IMAGE TEXT: allow text and/or image failure
> >>
> >> The distinction is that IMAGE TEXT will allow image, text, or both to
> >> fail, thus making transitions among the three generate no events.
> >>  IMAGE+TEXT says specifically that we expect both to fail and that if
> one
> >> starts passing, someone should do something.  (For example, maybe
> someone
> >> checks in a partial rebaseline where they miss the image expectations.)
> >
> >
> > Not to bike-shed on anything, but I think we should rename Text and
> Image to
> > TextOnly and ImageOnly. Every single person I know, including myself, had
> > never got the distinction between IMAGE TEXT and IMAGE+TEXT without
> someone
> > explaining it to him/her .
> >
> >
> > I think IMAGE+TEXT is not a very useful distinction from TEXT either. I
> > checked for uses of TEXT that is not IMAGE+TEXT in the Chromium
> > TextExpectations, and it seems that nearly all instances fall into one of
> > the two following categories:
> >
> > 1) text-only test, so IMAGE+TEXT would not have different semantics from
> > TEXT (the vast majority)
> > 2) Flaky test that may actually pass, so distinguishing what happens with
> > the image result is of limited utility (most of these are also text-only
> > tests; only a small subset even have an image result)
> >
> > Thus, I think Fail and ImageOnlyFail would be more useful and
> understandable
> > categories than {TEXT, IMAGE, TEXT+IMAGE, TEXT IMAGE}. Fail would have
> the
> > semantic that a text failure is expected, and image result if any can
> either
> > pass or fail.
>
> This is perhaps true, but if it's okay I would like to treat that
> feature request separately from the other syntactic changes we've been
> discussing. So far the rest of the changes have not really implied any
> changes to how we actually track which changes fail and how (note that
> FAIL is different and we've fixed that separately from these changes
> as well).
>

Lets have the separate bikeshed. While this is less precise, I agree that
Fail and ImageOnlyFail would capture the vast majority use-case and remove
a frequent source of confusion and error. The big downside of this approach
is that a text-only failure that also starts failing the pixel result make
genuinely indicate a new bug. I think that happens rarely enough that I'm
OK with it for the added simplicity.

A couple open questions:
-Does Fail also replace Audio? Seems reasonable to me.
-What about reftest failures where there is no text comparison? I'd be fine
with saying you can do Fail or ImageOnlyFail and they mean the same thing
here.
___
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev