Re: [webkit-dev] importing test suites (was: CSS 2.1 Test Suite)

2012-04-11 Thread Dirk Pranke
On Wed, Apr 11, 2012 at 4:15 PM, Ryosuke Niwa  wrote:
> On Wed, Apr 11, 2012 at 4:05 PM, Jacob Goldstein  wrote:
>>
>> Don't people currently fix tests that are failing and that they didn't
>> author?
>
>
> We don't fix imported tests if you meant modifying html files, etc... to fix
> a broken test file.
>
>> If so, isn't that similar as generating a new ref file in that you first
>> have to understand what is being tested?  There would be the old pixel file
>> to go from, so in many cases this may not be too hard.
>
>
> Yes, but determining whether a pixel result is correct or not is easier than
> determining whether a reference file is correct AND adequately addresses all
> intended test purposes.

It may sometimes be easier to tell if a pixel result is correct or
not, but not always: I've stared at plenty of failing pixel tests
where there were no visible changes at all.

Moreover, I think that requiring a reference file to address "all
intended test purposes" is too high of a bar to allow it to replace a
PNG; I'm not even sure if it's a practical goal at all, especially
given the limited amount of experience we have with reference tests. I
think there is some lower bar of goodness that we can reach in
practice through experience.

-- Dirk
___
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev


Re: [webkit-dev] importing test suites (was: CSS 2.1 Test Suite)

2012-04-11 Thread Ryosuke Niwa
On Wed, Apr 11, 2012 at 4:05 PM, Jacob Goldstein  wrote:
>
> Don't people currently fix tests that are failing and that they didn't
> author?
>

We don't fix imported tests if you meant modifying html files, etc... to
fix a broken test file.

If so, isn't that similar as generating a new ref file in that you first
> have to understand what is being tested?  There would be the old pixel file
> to go from, so in many cases this may not be too hard.
>

Yes, but determining whether a pixel result is correct or not is easier
than determining whether a reference file is correct AND adequately
addresses all intended test purposes.

On the other hand, having the test author create the ref file is definitely
> preferable.  What would we do in the case where a test author is no longer
> available?
>

Someone needs to figure out the intended purpose of the test and make a
call. I've done quite few of those for editing tests we have (not imported
tests); I've converted them to dumpAsText and dump-as-markup tests.

- Ryosuke
___
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev


Re: [webkit-dev] importing test suites (was: CSS 2.1 Test Suite)

2012-04-11 Thread Dirk Pranke
On Wed, Apr 11, 2012 at 4:00 PM, Ryosuke Niwa  wrote:
> On Wed, Apr 11, 2012 at 3:52 PM, Jacob Goldstein  wrote:
>>
>> +1 on not introducing new pixel tests and allowing someone other than the
>> test author to create the -expected file.
>>
>> We may also be able to streamline some of this process by implementing
>> some helper scripts.  Ultimately, someone will still have to review new
>> files manually, but scripts should be able to speed up the process.
>
>
> -1 on that. As I said on other threads about this topic, determining whether
> a reference file adequately detect all bugs a test is intended to test is
> hard, and losing the test coverage at the cost of lowering maintenance cost
> is not necessary a good thing.
>
> Also, adding a reference file would mean that either we're adding -ref.html
> / -noref.html files or modifying reftest.list. If doing the former, then we
> can't use this approach in any directory where we use reftest.list at the
> moment because we explicitly prohibit mixing naming convention and
> reftest.list.
>
> Modifying reftest.list is essentially modifying the test suite, and it seems
> like there is a consensus that we don't want to do it.
>

This is a quibble compared to the first paragraph, but the last two
paragraphs are merely implementation details (especially if we think
we're generating reftest.list from  tags embedded in the tests).
I think if we agree that it's okay to add -ref.html / -noref.html
files for tests we can revisit what the best way to manage such a
process is. I think the initial guideline was established when we
thought that an imported test suite would come with all of the needed
reference files, and in such a case I agree we should leave it as
"stock" as possible.

-- Dirk
___
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev


Re: [webkit-dev] importing test suites (was: CSS 2.1 Test Suite)

2012-04-11 Thread Jacob Goldstein
I understand, all valid points.

Don't people currently fix tests that are failing and that they didn't author?  
If so, isn't that similar as generating a new ref file in that you first have 
to understand what is being tested?  There would be the old pixel file to go 
from, so in many cases this may not be too hard.

On the other hand, having the test author create the ref file is definitely 
preferable.  What would we do in the case where a test author is no longer 
available?




From: Ryosuke Niwa mailto:rn...@webkit.org>>
Date: Wed, 11 Apr 2012 16:00:44 -0700
To: Jacob Goldstein mailto:jac...@adobe.com>>
Cc: Dirk Pranke mailto:dpra...@chromium.org>>, Robert 
Hogan mailto:li...@roberthogan.net>>, 
"webkit-dev@lists.webkit.org<mailto:webkit-dev@lists.webkit.org>" 
mailto:webkit-dev@lists.webkit.org>>
Subject: Re: [webkit-dev] importing test suites (was: CSS 2.1 Test Suite)

On Wed, Apr 11, 2012 at 3:52 PM, Jacob Goldstein 
mailto:jac...@adobe.com>> wrote:
+1 on not introducing new pixel tests and allowing someone other than the
test author to create the -expected file.

We may also be able to streamline some of this process by implementing
some helper scripts.  Ultimately, someone will still have to review new
files manually, but scripts should be able to speed up the process.

-1 on that. As I said on other threads about this topic, determining whether a 
reference file adequately detect all bugs a test is intended to test is hard, 
and losing the test coverage at the cost of lowering maintenance cost is not 
necessary a good thing.

Also, adding a reference file would mean that either we're adding -ref.html / 
-noref.html files or modifying reftest.list. If doing the former, then we can't 
use this approach in any directory where we use reftest.list at the moment 
because we explicitly prohibit mixing naming convention and reftest.list.

Modifying reftest.list is essentially modifying the test suite, and it seems 
like there is a consensus that we don't want to do it.

- Ryosuke

___
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev


Re: [webkit-dev] importing test suites (was: CSS 2.1 Test Suite)

2012-04-11 Thread Ryosuke Niwa
On Wed, Apr 11, 2012 at 3:52 PM, Jacob Goldstein  wrote:

> +1 on not introducing new pixel tests and allowing someone other than the
> test author to create the -expected file.
>
> We may also be able to streamline some of this process by implementing
> some helper scripts.  Ultimately, someone will still have to review new
> files manually, but scripts should be able to speed up the process.
>

-1 on that. As I said on other threads about this topic, determining
whether a reference file adequately detect all bugs a test is intended to
test is hard, and losing the test coverage at the cost of lowering
maintenance cost is not necessary a good thing.

Also, adding a reference file would mean that either we're adding -ref.html
/ -noref.html files or modifying reftest.list. If doing the former, then we
can't use this approach in any directory where we use reftest.list at the
moment because we explicitly prohibit mixing naming convention and
reftest.list.

Modifying reftest.list is essentially modifying the test suite, and it
seems like there is a consensus that we don't want to do it.

- Ryosuke
___
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev


Re: [webkit-dev] importing test suites (was: CSS 2.1 Test Suite)

2012-04-11 Thread Jacob Goldstein
+1 on not introducing new pixel tests and allowing someone other than the
test author to create the -expected file.

We may also be able to streamline some of this process by implementing
some helper scripts.  Ultimately, someone will still have to review new
files manually, but scripts should be able to speed up the process.

Alan Stearns and I plan to present some ideas around W3C and WebKit test
integration at the contributors meeting to elicit more feedback and
recruit people to support the effort.




On 4/11/12 3:29 PM, "Dirk Pranke"  wrote:

>On Wed, Apr 11, 2012 at 12:31 PM, Robert Hogan 
>wrote:
>> On Wednesday 11 April 2012 20:27:23 Dirk Pranke wrote:
>>>
>>> What does "clean up the existing folder" entail?
>>>
>>
>> Just move any -expected.html file out of there and generate pixel
>>results.
>> Or move the test and its -expected.html into a folder in fast/css.
>
>Okay, following a conversation on #irc, I think I understand this
>better. There do appear to be differences of opinion over what to do
>here, so I will try to describe this from ground zero.
>
>The CSS test suite has ~9000 tests. the correctness of which is
>established manually (and visually). We would like to import these
>tests and run them automatically. Some small number of tests (a few
>hundred?) do have reference html, but most don't. Since most tests do
>not come with reference html or expected PNGs, we have to add one or
>the other.
>
>Currently we have added ~400 or so to the tree, and when we add them
>we have been writing -expected.html reference results to avoid
>generating pixel results where possible.
>
>This raises at least two questions, which have been confusing the
>discussion until now:
>
>1) What is the best way to add these tests to the tree? Should we add
>them one-at-a-time with -expected files that we have created? Should
>we add them all at once and SKIP the tests until we have generated
>-expected results for each test? Or should we only import subsets that
>have official -expected files?
>
>2) Is it acceptable for someone other than the test author to write
>-expected html references? What bar do we need to have to ensure that
>the reference file will catch the bug the initial test is intended to
>catch? Do we (or should we) maintain a similar bar for pixel tests
>today?
>
>Personally, I don't care that much about (1), as long as we can track
>the provenance of the tests (where they came from) ... I would be
>happy with us importing tests one at a time and adding our own
>results, and hopefully feeding them back up stream to the W3C.
>
>I care much more about (2): I believe we should not be introducing new
>tests that only have pixel tests for results unless absolutely
>necessary - the cost of maintaining pixel test results vastly
>outweighs the potential benefit we get by having a reference file that
>won't break the same way the test breaks. I believe it is reasonably
>for someone other than the test author to write a reference html for a
>test and get it to be good enough that it will provide more value than
>a pixel test result, and we should be encouraging that.
>
>I believe we're hoping to discuss this more broadly at the
>contributors meeting, but it might be good to discuss here first.
>Thoughts?
>
>-- Dirk
>___
>webkit-dev mailing list
>webkit-dev@lists.webkit.org
>http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev

___
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev


Re: [webkit-dev] importing test suites (was: CSS 2.1 Test Suite)

2012-04-11 Thread Robert Hogan
On Wednesday 11 April 2012 23:29:56 Dirk Pranke wrote:
> 
> 1) What is the best way to add these tests to the tree? Should we add
> them one-at-a-time with -expected files that we have created? Should
> we add them all at once and SKIP the tests until we have generated
> -expected results for each test? Or should we only import subsets that
> have official -expected files?

Here is a good example of this in action:

http://trac.webkit.org/changeset/113076

The tests all come unmodified from the CSS test suite, the -expected.html 
files were created by me.

The CSS test suite does not used the -expected.html semantic.
___
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev


[webkit-dev] importing test suites (was: CSS 2.1 Test Suite)

2012-04-11 Thread Dirk Pranke
On Wed, Apr 11, 2012 at 12:31 PM, Robert Hogan  wrote:
> On Wednesday 11 April 2012 20:27:23 Dirk Pranke wrote:
>>
>> What does "clean up the existing folder" entail?
>>
>
> Just move any -expected.html file out of there and generate pixel results.
> Or move the test and its -expected.html into a folder in fast/css.

Okay, following a conversation on #irc, I think I understand this
better. There do appear to be differences of opinion over what to do
here, so I will try to describe this from ground zero.

The CSS test suite has ~9000 tests. the correctness of which is
established manually (and visually). We would like to import these
tests and run them automatically. Some small number of tests (a few
hundred?) do have reference html, but most don't. Since most tests do
not come with reference html or expected PNGs, we have to add one or
the other.

Currently we have added ~400 or so to the tree, and when we add them
we have been writing -expected.html reference results to avoid
generating pixel results where possible.

This raises at least two questions, which have been confusing the
discussion until now:

1) What is the best way to add these tests to the tree? Should we add
them one-at-a-time with -expected files that we have created? Should
we add them all at once and SKIP the tests until we have generated
-expected results for each test? Or should we only import subsets that
have official -expected files?

2) Is it acceptable for someone other than the test author to write
-expected html references? What bar do we need to have to ensure that
the reference file will catch the bug the initial test is intended to
catch? Do we (or should we) maintain a similar bar for pixel tests
today?

Personally, I don't care that much about (1), as long as we can track
the provenance of the tests (where they came from) ... I would be
happy with us importing tests one at a time and adding our own
results, and hopefully feeding them back up stream to the W3C.

I care much more about (2): I believe we should not be introducing new
tests that only have pixel tests for results unless absolutely
necessary - the cost of maintaining pixel test results vastly
outweighs the potential benefit we get by having a reference file that
won't break the same way the test breaks. I believe it is reasonably
for someone other than the test author to write a reference html for a
test and get it to be good enough that it will provide more value than
a pixel test result, and we should be encouraging that.

I believe we're hoping to discuss this more broadly at the
contributors meeting, but it might be good to discuss here first.
Thoughts?

-- Dirk
___
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev