Re: [webkit-dev] are fuzzer tests appropriate layout tests?

2012-06-13 Thread Maciej Stachowiak

On Jun 13, 2012, at 1:32 PM, Geoffrey Garen wrote:

>> These tests regularly time out on the Chromium debug bots and occasionally
>> time out on the Apple Lion bots.
> 
> WebKit has a clear policy about this: Tests must be fast enough not to time 
> out. We can fix this issue by making these tests shorter. 
> 
> I don't really see the connection to an abstract debate about fuzzers. 
> Fuzzers can be short-running, and non-fuzzers can be long-running.

Also, if a fuzzer is deterministic (which it should be, if it's going to be in 
LayoutTests), there is probably a fairly mechanical way to split it into 
multiple faster tests.
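
One mechanical way to do such a split can be sketched as follows, in Python rather than the JavaScript the layout tests are written in. It assumes each fuzzer case is a pure function of its index, so every smaller test can run one contiguous slice of the indices. The names make_case, shard_ranges and run_shard are hypothetical and are not taken from any WebKit test.

```python
def make_case(index):
    """Hypothetical deterministic generator: the input depends only on index."""
    # A fixed arithmetic recipe stands in for whatever the real fuzzer builds.
    return "".join(chr(ord("a") + (index * 7 + k * 3) % 26) for k in range(8))


def shard_ranges(total_cases, shard_count):
    """Split range(total_cases) into shard_count contiguous, near-equal ranges."""
    base, extra = divmod(total_cases, shard_count)
    ranges = []
    start = 0
    for shard in range(shard_count):
        # The first `extra` shards take one additional case each.
        size = base + (1 if shard < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges


def run_shard(indices, check):
    """Run one shard and return the indices of the cases that failed check."""
    return [i for i in indices if not check(make_case(i))]
```

Each shard would then become its own test file, so a timeout or failure points at a slice of the cases rather than at the whole run.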

 - Maciej

___
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev


Re: [webkit-dev] are fuzzer tests appropriate layout tests?

2012-06-13 Thread Geoffrey Garen
> These tests regularly time out on the Chromium debug bots and occasionally
> time out on the Apple Lion bots.

WebKit has a clear policy about this: Tests must be fast enough not to time 
out. We can fix this issue by making these tests shorter. 

I don't really see the connection to an abstract debate about fuzzers. Fuzzers 
can be short-running, and non-fuzzers can be long-running.

Geoff


Re: [webkit-dev] are fuzzer tests appropriate layout tests?

2012-06-13 Thread Dirk Pranke
I guess I was saying two slightly different things ...

1) I have a strong bias for individual tests that are fast.
2) I have a strong bias for individual tests that are simple, focused,
easy to understand, and predictable. All other things being equal
(which of course they never are), I would prefer 100 different tests
that can fail individually to 1 test that tests 100 different things.

Of course you have to weigh this against coverage and establishing
correctness; I wouldn't want to lose coverage, either.

-- Dirk

On Wed, Jun 13, 2012 at 12:17 PM, Filip Pizlo wrote:
> Are we sure that we want to make this a general rule?
>
> We have two profitable fuzzers in fast/js that I believe deserve to be in 
> LayoutTests and should be run every time you make any JSC change:
>
> LayoutTests/fast/js/dfg-double-vote-fuzz.html
> LayoutTests/fast/js/dfg-poison-fuzz.html
>
> Both are somewhat long-running (I seem to recall some buzz about them being 
> marked either SLOW or TIMEOUT on Chromium) but both have caught lots of bugs 
> in the JSC optimizing JIT.  They generate ~1000 simple programs and eval 
> them, each program differing in the position of some evil operation.  When 
> you get a crash or a fail, it's pretty easy to use them to quickly identify 
> what went wrong since the offending code is nice and tidy.  On the other 
> hand, if it wasn't for their use of fuzzing, they would certainly have 
> reduced coverage because the exact shape of a program that would cause a 
> failure depends on the number of registers available and compiler heuristics,
> both of which can change with unrelated changes to the JIT or if you switch 
> hardware targets.
>
> So these tests are great for testing things like register allocation, OSR, 
> and type inference.  Even seemingly unrelated changes to JSC, or possibly 
> even JSC bindings, could either cause or reveal bugs that these tests would 
> catch.  Hence it would be bad if they were not part of the LayoutTests.  We 
> would lose coverage while gaining very little in return, since although these 
> tests are on the slower end of the execution time spectrum, the other fast/js 
> tests put together take much longer and probably don't catch as many juicy 
> bugs.  Certainly no other test in LayoutTests/fast/js does nearly as good
> a job of covering the code paths that deal with register allocation under
> register pressure, or type inference under evil control flow, in the presence 
> of an operation that would cause an OSR exit.
>
> More broadly, I think this is a question of test economics.  Does this 
> particular fuzzer test catch enough bugs to justify its run-time?  If yes 
> then we should keep it.  And if nobody can recall a time when the test saved 
> them from making a broken commit, or when it helped a bot watcher identify a 
> genuinely broken changeset, then we should probably get rid of it.
>
> -F
>
>
> On Jun 13, 2012, at 11:58 AM, Dirk Pranke wrote:
>
>> I agree that the fuzzer should be used to create dedicated layout
>> tests, but we shouldn't run the fuzzer itself as part of the layout
>> test regression. I would have no objection to it being a separate test
>> step.
>>
>> -- Dirk
>>
>> On Tue, Jun 12, 2012 at 5:17 PM, Ojan Vafai wrote:
>>> See https://bugs.webkit.org/show_bug.cgi?id=87772.
>>>
>>> It's great to use a fuzzer in order to find cases where we're broken and
>>> then make reduced layout tests from those. The viewspec-parser tests are
>>> themselves just a fuzzer though. Granted, they are deterministic by avoiding
>>> using an actual random function, but I don't think throwing randomly
>>> generated bits at a parser is appropriate for layout testing. If nothing
>>> else it's very slow.
>>>
>>> These tests regularly time out on the Chromium debug bots and occasionally
>>> time out on the Apple Lion bots. Even on the bots where they don't time out,
>>> they're slow. I don't think it makes sense to spend 1+ minutes running these 5
>>> tests when more targeted reductions could get the same effective coverage
>>> much faster.
>>>
>>> Am I wrong? If not, does anyone object to moving these tests over to
>>> ManualTests or just deleting them entirely?
>>>
>>> Ojan


Re: [webkit-dev] are fuzzer tests appropriate layout tests?

2012-06-13 Thread Filip Pizlo
Are we sure that we want to make this a general rule?

We have two profitable fuzzers in fast/js that I believe deserve to be in 
LayoutTests and should be run every time you make any JSC change:

LayoutTests/fast/js/dfg-double-vote-fuzz.html
LayoutTests/fast/js/dfg-poison-fuzz.html

Both are somewhat long-running (I seem to recall some buzz about them being 
marked either SLOW or TIMEOUT on Chromium) but both have caught lots of bugs in 
the JSC optimizing JIT.  They generate ~1000 simple programs and eval them, 
each program differing in the position of some evil operation.  When you get a 
crash or a fail, it's pretty easy to use them to quickly identify what went 
wrong since the offending code is nice and tidy.  On the other hand, if it 
wasn't for their use of fuzzing, they would certainly have reduced coverage 
because the exact shape of a program that would cause a failure depends on the
number of registers available and compiler heuristics, both of which can change
with unrelated changes to the JIT or if you switch hardware targets.
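
The shape of such a fuzzer can be sketched roughly as below. This is an illustration in Python, not the code of dfg-double-vote-fuzz.html or dfg-poison-fuzz.html: it assumes the "evil operation" is a switch from integer to floating-point arithmetic, and the names build_program and run_fuzzer are hypothetical. One small program is generated per position of the evil operation, evaluated, and checked against the expected result.

```python
def build_program(total_ops, evil_position):
    """Return source for a function whose 'evil' step sits at evil_position."""
    lines = ["def generated(x):"]
    for position in range(total_ops):
        if position == evil_position:
            # Assumed stand-in for the evil operation: int -> float arithmetic.
            lines.append("    x = x + 0.5")
        else:
            lines.append("    x = x + 1")
    lines.append("    return x")
    return "\n".join(lines)


def run_fuzzer(total_ops):
    """Evaluate one program per evil position; return the positions that failed."""
    failures = []
    for evil_position in range(total_ops):
        namespace = {}
        # exec plays the role that eval plays in the JavaScript tests.
        exec(build_program(total_ops, evil_position), namespace)
        expected = (total_ops - 1) + 0.5
        if namespace["generated"](0) != expected:
            failures.append(evil_position)
    return failures
```

Because every generated program is tiny and differs from its neighbours only in one position, a reported failing position identifies the offending program directly.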

So these tests are great for testing things like register allocation, OSR, and 
type inference.  Even seemingly unrelated changes to JSC, or possibly even JSC 
bindings, could either cause or reveal bugs that these tests would catch.  
Hence it would be bad if they were not part of the LayoutTests.  We would lose 
coverage while gaining very little in return, since although these tests are on 
the slower end of the execution time spectrum, the other fast/js tests put 
together take much longer and probably don't catch as many juicy bugs.  
Certainly no other test in LayoutTests/fast/js does nearly as good a job of
covering the code paths that deal with register allocation under register 
pressure, or type inference under evil control flow, in the presence of an 
operation that would cause an OSR exit.

More broadly, I think this is a question of test economics.  Does this 
particular fuzzer test catch enough bugs to justify its run-time?  If yes then 
we should keep it.  And if nobody can recall a time when the test saved them 
from making a broken commit, or when it helped a bot watcher identify a 
genuinely broken changeset, then we should probably get rid of it.

-F


On Jun 13, 2012, at 11:58 AM, Dirk Pranke wrote:

> I agree that the fuzzer should be used to create dedicated layout
> tests, but we shouldn't run the fuzzer itself as part of the layout
> test regression. I would have no objection to it being a separate test
> step.
> 
> -- Dirk
> 
> On Tue, Jun 12, 2012 at 5:17 PM, Ojan Vafai wrote:
>> See https://bugs.webkit.org/show_bug.cgi?id=87772.
>> 
>> It's great to use a fuzzer in order to find cases where we're broken and
>> then make reduced layout tests from those. The viewspec-parser tests are
>> themselves just a fuzzer though. Granted, they are deterministic by avoiding
>> using an actual random function, but I don't think throwing randomly
>> generated bits at a parser is appropriate for layout testing. If nothing
>> else it's very slow.
>> 
>> These tests regularly time out on the Chromium debug bots and occasionally
>> time out on the Apple Lion bots. Even on the bots where they don't time out,
>> they're slow. I don't think it makes sense to spend 1+ minutes running these 5
>> tests when more targeted reductions could get the same effective coverage
>> much faster.
>> 
>> Am I wrong? If not, does anyone object to moving these tests over to
>> ManualTests or just deleting them entirely?
>> 
>> Ojan


Re: [webkit-dev] are fuzzer tests appropriate layout tests?

2012-06-13 Thread Darin Adler
On Jun 13, 2012, at 12:12 PM, Dirk Pranke wrote:

> On Wed, Jun 13, 2012 at 12:05 PM, Darin Adler wrote:
>> On Jun 12, 2012, at 5:17 PM, Ojan Vafai wrote:
>> 
>>> It's great to use a fuzzer in order to find cases where we're broken and 
>>> then make reduced layout tests from those.
>> 
>> Generally we do require a test each time we fix a bug. So it’s a strategy 
>> for the project to always make reduced tests when we find a bug.
>> 
>> But using a fuzzer to find bugs and then making a regression test for each 
>> bug we find will not give us great coverage. We’d like tests that cover lots 
>> of the code paths in WebKit, even the ones without bugs.
>> 
>> I’m not saying we should necessarily keep fuzzer-style tests, but to replace 
>> them we would need to add tests with good coverage, going beyond regression 
>> tests for bugs that existed in the project at one point.
> 
> I have always been under the impression that LayoutTests were not just
> intended for preventing regressions to bugfixes, but that we should also be 
> adding tests to establish correctness (and hopefully achieve good coverage) 
> there.

That’s right. Did my words above give an impression to the contrary?

I am trying to say that we should be sure to keep good coverage when we remove 
a fuzzer-style test, possibly by adding tests that cover the same code in a 
different way.

I’m not making some kind of global statement about all the tests.

-- Darin


Re: [webkit-dev] are fuzzer tests appropriate layout tests?

2012-06-13 Thread Dirk Pranke
On Wed, Jun 13, 2012 at 12:05 PM, Darin Adler wrote:
> On Jun 12, 2012, at 5:17 PM, Ojan Vafai wrote:
>
>> It's great to use a fuzzer in order to find cases where we're broken and 
>> then make reduced layout tests from those.
>
>
> Generally we do require a test each time we fix a bug. So it’s a strategy for 
> the project to always make reduced tests when we find a bug.
>
> But using a fuzzer to find bugs and then making a regression test for each 
> bug we find will not give us great coverage. We’d like tests that cover lots 
> of the code paths in WebKit, even the ones without bugs.
>
> I’m not saying we should necessarily keep fuzzer-style tests, but to replace 
> them we would need to add tests with good coverage, going beyond regression 
> tests for bugs that existed in the project at one point.
>

I have always been under the impression that LayoutTests were not
just intended for preventing regressions to bugfixes, but that we
should also be adding tests to establish correctness (and hopefully
achieve good coverage) there. Is that not the case?

Certainly I would expect the imported test suites to be testing
correctness and not just be regression tests for bug fixes.

-- Dirk


Re: [webkit-dev] are fuzzer tests appropriate layout tests?

2012-06-13 Thread Darin Adler
On Jun 12, 2012, at 5:17 PM, Ojan Vafai wrote:

> It's great to use a fuzzer in order to find cases where we're broken and then 
> make reduced layout tests from those.


Generally we do require a test each time we fix a bug. So it’s a strategy for 
the project to always make reduced tests when we find a bug.

But using a fuzzer to find bugs and then making a regression test for each bug 
we find will not give us great coverage. We’d like tests that cover lots of the 
code paths in WebKit, even the ones without bugs.

I’m not saying we should necessarily keep fuzzer-style tests, but to replace 
them we would need to add tests with good coverage, going beyond regression 
tests for bugs that existed in the project at one point.

At one point, I remember Geoff Garen encouraging a fuzzer-type approach to 
making some tests for an SVG path parser, as an alternative to my plan of 
making a test that covered every branch in the parser code. I don’t remember 
what we ended up doing in that case.

-- Darin


Re: [webkit-dev] are fuzzer tests appropriate layout tests?

2012-06-13 Thread Dirk Pranke
I agree that the fuzzer should be used to create dedicated layout
tests, but we shouldn't run the fuzzer itself as part of the layout
test regression. I would have no objection to it being a separate test
step.

-- Dirk

On Tue, Jun 12, 2012 at 5:17 PM, Ojan Vafai wrote:
> See https://bugs.webkit.org/show_bug.cgi?id=87772.
>
> It's great to use a fuzzer in order to find cases where we're broken and
> then make reduced layout tests from those. The viewspec-parser tests are
> themselves just a fuzzer though. Granted, they are deterministic by avoiding
> using an actual random function, but I don't think throwing randomly
> generated bits at a parser is appropriate for layout testing. If nothing
> else it's very slow.
>
> These tests regularly time out on the Chromium debug bots and occasionally
> time out on the Apple Lion bots. Even on the bots where they don't time out,
> they're slow. I don't think it makes sense to spend 1+ minutes running these 5
> tests when more targeted reductions could get the same effective coverage
> much faster.
>
> Am I wrong? If not, does anyone object to moving these tests over to
> ManualTests or just deleting them entirely?
>
> Ojan


Re: [webkit-dev] are fuzzer tests appropriate layout tests?

2012-06-13 Thread Dan Bernstein

On Jun 12, 2012, at 5:17 PM, Ojan Vafai wrote:

> See https://bugs.webkit.org/show_bug.cgi?id=87772.
> 
> It's great to use a fuzzer in order to find cases where we're broken and then 
> make reduced layout tests from those. The viewspec-parser tests are 
> themselves just a fuzzer though. Granted, they are deterministic by avoiding 
> using an actual random function, but I don't think throwing randomly 
> generated bits at a parser is appropriate for layout testing. If nothing else 
> it's very slow.
> 
> These tests regularly time out on the Chromium debug bots and occasionally
> time out on the Apple Lion bots. Even on the bots where they don't time out,
> they're slow. I don't think it makes sense to spend 1+ minutes running these 5
> tests when more targeted reductions could get the same effective coverage 
> much faster.
> 
> Am I wrong? If not, does anyone object to moving these tests over to 
> ManualTests or just deleting them entirely?

I am not familiar with the viewspec-parser tests and their history, but I agree 
in principle that fuzzers and their raw output rarely make for the best way to 
regression test.


[webkit-dev] are fuzzer tests appropriate layout tests?

2012-06-12 Thread Ojan Vafai
See https://bugs.webkit.org/show_bug.cgi?id=87772.

It's great to use a fuzzer in order to find cases where we're broken and
then make reduced layout tests from those. The viewspec-parser tests are
themselves just a fuzzer though. Granted, they are deterministic by
avoiding using an actual random function, but I don't think throwing
randomly generated bits at a parser is appropriate for layout testing. If
nothing else it's very slow.
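
For readers unfamiliar with the technique, "deterministic by avoiding an actual random function" can be illustrated with a small seeded generator. The sketch below is in Python and is not how the viewspec-parser tests are actually written: the Lcg class, the ALPHABET string and generate_inputs are all hypothetical. The point is only that the same seed always yields the same inputs, so every run exercises the parser with identical data.

```python
class Lcg:
    """Tiny linear congruential generator: same seed, same sequence."""

    def __init__(self, seed):
        self.state = seed % 2**32

    def next_int(self, bound):
        # Widely used 32-bit LCG constants (multiplier 1664525, increment 1013904223).
        self.state = (1664525 * self.state + 1013904223) % 2**32
        # Use the higher bits, which are better distributed than the low ones.
        return (self.state >> 16) % bound


# Hypothetical alphabet; a real parser fuzzer would pick characters
# relevant to the grammar under test.
ALPHABET = "viewBox(),;0123456789 "


def generate_inputs(seed, count, length):
    """Return `count` pseudo-random strings that are fully determined by `seed`."""
    rng = Lcg(seed)
    return [
        "".join(ALPHABET[rng.next_int(len(ALPHABET))] for _ in range(length))
        for _ in range(count)
    ]
```

Calling generate_inputs twice with the same arguments returns identical lists, which is what makes a failure reproducible on any bot.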

These tests regularly time out on the Chromium debug bots and occasionally
time out on the Apple Lion bots. Even on the bots where they don't time out,
they're slow. I don't think it makes sense to spend 1+ minutes running these 5
tests when more targeted reductions could get the same effective coverage
much faster.

Am I wrong? If not, does anyone object to moving these tests over to
ManualTests or just deleting them entirely?

Ojan