Re: [webkit-dev] are fuzzer tests appropriate layout tests?
On Jun 13, 2012, at 1:32 PM, Geoffrey Garen wrote:

>> These tests regularly time out on the Chromium debug bots and occasionally
>> time out on the Apple Lion bots.
>
> WebKit has a clear policy about this: Tests must be fast enough not to time
> out. We can fix this issue by making these tests shorter.
>
> I don't really see the connection to an abstract debate about fuzzers.
> Fuzzers can be short-running, and non-fuzzers can be long-running.

Also, if a fuzzer is deterministic (which it should be, if it's going to be in LayoutTests), there is probably a fairly mechanical way to split it into multiple faster tests.

 - Maciej

___
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev
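A minimal sketch of what such a mechanical split might look like, assuming the fuzzer draws its inputs from a seeded deterministic generator. Every name here (`lcg`, `generate_cases`, `shard`) is hypothetical and not taken from any WebKit test; the point is only that a fixed seed yields a fixed case list, which can then be divided into non-overlapping shards that each run as a separate, shorter test.

```python
# Hypothetical sketch; none of these names come from WebKit.

def lcg(seed):
    """Deterministic pseudo-random stream (a simple linear congruential generator)."""
    state = seed
    while True:
        state = (state * 1103515245 + 12345) % 2**31
        yield state


def generate_cases(seed, count):
    """Return the complete, reproducible list of fuzz inputs for `seed`."""
    stream = lcg(seed)
    return [next(stream) % 256 for _ in range(count)]


def shard(cases, index, total):
    """Pick every `total`-th case starting at `index`, so shards never overlap."""
    return cases[index::total]


# One long-running test over 1000 cases becomes five tests of 200 cases each.
all_cases = generate_cases(seed=42, count=1000)
shards = [shard(all_cases, i, 5) for i in range(5)]
```

Because the case list depends only on the seed, each shard can regenerate it independently and run just its own slice, so total coverage stays the same while each individual test gets shorter.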
Re: [webkit-dev] are fuzzer tests appropriate layout tests?
> These tests regularly time out on the Chromium debug bots and occasionally
> time out on the Apple Lion bots.

WebKit has a clear policy about this: Tests must be fast enough not to time out. We can fix this issue by making these tests shorter.

I don't really see the connection to an abstract debate about fuzzers. Fuzzers can be short-running, and non-fuzzers can be long-running.

Geoff
Re: [webkit-dev] are fuzzer tests appropriate layout tests?
I guess I was saying two slightly different things ...

1) I have a strong bias for individual tests that are fast
2) I have a strong bias for individual tests that are simple, focused, easy to understand, and predictable.

All other things being equal (which of course they never are), I would prefer 100 different tests that can fail individually to 1 test that tests 100 different things.

Of course you have to weigh this against coverage and establishing correctness; I wouldn't want to lose coverage, either.

-- Dirk

On Wed, Jun 13, 2012 at 12:17 PM, Filip Pizlo wrote:
> Are we sure that we want to make this a general rule?
>
> We have two profitable fuzzers in fast/js that I believe deserve to be in
> LayoutTests and should be run every time you make any JSC change:
>
> LayoutTests/fast/js/dfg-double-vote-fuzz.html
> LayoutTests/fast/js/dfg-poison-fuzz.html
>
> Both are somewhat long-running (I seem to recall some buzz about them being
> marked either SLOW or TIMEOUT on Chromium) but both have caught lots of bugs
> in the JSC optimizing JIT. They generate ~1000 simple programs and eval
> them, each program differing in the position of some evil operation. When
> you get a crash or a failure, it's pretty easy to use them to quickly identify
> what went wrong since the offending code is nice and tidy. On the other
> hand, if it wasn't for their use of fuzzing, they would certainly have
> reduced coverage because the exact shape of a program that would cause a
> failure depends on the number of registers available and compiler heuristics,
> both of which can change with unrelated changes to the JIT or if you switch
> hardware targets.
>
> So these tests are great for testing things like register allocation, OSR,
> and type inference. Even seemingly unrelated changes to JSC, or possibly
> even JSC bindings, could either cause or reveal bugs that these tests would
> catch. Hence it would be bad if they were not part of the LayoutTests. We
> would lose coverage while gaining very little in return, since although these
> tests are on the slower end of the execution time spectrum, the other fast/js
> tests put together take much longer and probably don't catch as many juicy
> bugs. Certainly no other test in LayoutTests/fast/js does nearly as good
> a job in covering the code paths that deal with register allocation under
> register pressure, or type inference under evil control flow, in the presence
> of an operation that would cause an OSR exit.
>
> More broadly, I think this is a question of test economics. Does this
> particular fuzzer test catch enough bugs to justify its run-time? If yes
> then we should keep it. And if nobody can recall a time when the test saved
> them from making a broken commit, or when it helped a bot watcher identify a
> genuinely broken changeset, then we should probably get rid of it.
>
> -F
>
>
> On Jun 13, 2012, at 11:58 AM, Dirk Pranke wrote:
>
>> I agree that the fuzzer should be used to create dedicated layout
>> tests, but we shouldn't run the fuzzer itself as part of the layout
>> test regression. I would have no objection to it being a separate test
>> step.
>>
>> -- Dirk
>>
>> On Tue, Jun 12, 2012 at 5:17 PM, Ojan Vafai wrote:
>>> See https://bugs.webkit.org/show_bug.cgi?id=87772.
>>>
>>> It's great to use a fuzzer in order to find cases where we're broken and
>>> then make reduced layout tests from those. The viewspec-parser tests are
>>> themselves just a fuzzer though. Granted, they are deterministic by avoiding
>>> using an actual random function, but I don't think throwing randomly
>>> generated bits at a parser is appropriate for layout testing. If nothing
>>> else it's very slow.
>>>
>>> These tests regularly time out on the Chromium debug bots and occasionally
>>> time out on the Apple Lion bots. Even on the bots where they don't time out,
>>> they're slow. I don't think it makes sense to spend 1+ minutes running these 5
>>> tests when more targeted reductions could get the same effective coverage
>>> much faster.
>>>
>>> Am I wrong? If not, does anyone object to moving these tests over to
>>> ManualTests or just deleting them entirely?
>>>
>>> Ojan
Re: [webkit-dev] are fuzzer tests appropriate layout tests?
Are we sure that we want to make this a general rule?

We have two profitable fuzzers in fast/js that I believe deserve to be in LayoutTests and should be run every time you make any JSC change:

LayoutTests/fast/js/dfg-double-vote-fuzz.html
LayoutTests/fast/js/dfg-poison-fuzz.html

Both are somewhat long-running (I seem to recall some buzz about them being marked either SLOW or TIMEOUT on Chromium) but both have caught lots of bugs in the JSC optimizing JIT. They generate ~1000 simple programs and eval them, each program differing in the position of some evil operation. When you get a crash or a failure, it's pretty easy to use them to quickly identify what went wrong since the offending code is nice and tidy. On the other hand, if it wasn't for their use of fuzzing, they would certainly have reduced coverage because the exact shape of a program that would cause a failure depends on the number of registers available and compiler heuristics, both of which can change with unrelated changes to the JIT or if you switch hardware targets.

So these tests are great for testing things like register allocation, OSR, and type inference. Even seemingly unrelated changes to JSC, or possibly even JSC bindings, could either cause or reveal bugs that these tests would catch. Hence it would be bad if they were not part of the LayoutTests. We would lose coverage while gaining very little in return, since although these tests are on the slower end of the execution time spectrum, the other fast/js tests put together take much longer and probably don't catch as many juicy bugs. Certainly no other test in LayoutTests/fast/js does nearly as good a job in covering the code paths that deal with register allocation under register pressure, or type inference under evil control flow, in the presence of an operation that would cause an OSR exit.

More broadly, I think this is a question of test economics. Does this particular fuzzer test catch enough bugs to justify its run-time? If yes then we should keep it. And if nobody can recall a time when the test saved them from making a broken commit, or when it helped a bot watcher identify a genuinely broken changeset, then we should probably get rid of it.

-F

On Jun 13, 2012, at 11:58 AM, Dirk Pranke wrote:

> I agree that the fuzzer should be used to create dedicated layout
> tests, but we shouldn't run the fuzzer itself as part of the layout
> test regression. I would have no objection to it being a separate test
> step.
>
> -- Dirk
>
> On Tue, Jun 12, 2012 at 5:17 PM, Ojan Vafai wrote:
>> See https://bugs.webkit.org/show_bug.cgi?id=87772.
>>
>> It's great to use a fuzzer in order to find cases where we're broken and
>> then make reduced layout tests from those. The viewspec-parser tests are
>> themselves just a fuzzer though. Granted, they are deterministic by avoiding
>> using an actual random function, but I don't think throwing randomly
>> generated bits at a parser is appropriate for layout testing. If nothing
>> else it's very slow.
>>
>> These tests regularly time out on the Chromium debug bots and occasionally
>> time out on the Apple Lion bots. Even on the bots where they don't time out,
>> they're slow. I don't think it makes sense to spend 1+ minutes running these 5
>> tests when more targeted reductions could get the same effective coverage
>> much faster.
>>
>> Am I wrong? If not, does anyone object to moving these tests over to
>> ManualTests or just deleting them entirely?
>>
>> Ojan
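As a rough illustration of the generate-and-eval pattern described in this message (many small programs, each differing only in where one "evil" operation sits), here is a sketch in Python. It is not the code of dfg-double-vote-fuzz.html or dfg-poison-fuzz.html, which are JavaScript tests; the helper names and the choice of an int-to-float addition as the "evil" operation are assumptions made purely for illustration.

```python
# Hypothetical sketch of a position-varying generated test; not WebKit code.

def make_program(num_ops, evil_position):
    """Build source for f(x): num_ops additions, one of which is the 'evil' op."""
    lines = ["def f(x):", "    acc = x"]
    for i in range(num_ops):
        if i == evil_position:
            # The 'evil' op here changes acc from int to float mid-function.
            lines.append("    acc = acc + 0.5")
        else:
            lines.append("    acc = acc + 1")
    lines.append("    return acc")
    return "\n".join(lines)


def run_program(source, arg):
    """Evaluate the generated source and call the function it defines."""
    namespace = {}
    exec(source, namespace)
    return namespace["f"](arg)


def run_all(num_ops):
    """Run one program per evil position; return the positions that failed."""
    expected = (num_ops - 1) + 0.5
    failures = []
    for position in range(num_ops):
        if run_program(make_program(num_ops, position), 0) != expected:
            failures.append(position)
    return failures
```

A failing position identifies a single small generated program, which is what makes the offending code easy to inspect; the list of positions is also fixed, so the run is deterministic.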
Re: [webkit-dev] are fuzzer tests appropriate layout tests?
On Jun 13, 2012, at 12:12 PM, Dirk Pranke wrote:

> On Wed, Jun 13, 2012 at 12:05 PM, Darin Adler wrote:
>> On Jun 12, 2012, at 5:17 PM, Ojan Vafai wrote:
>>
>>> It's great to use a fuzzer in order to find cases where we're broken and
>>> then make reduced layout tests from those.
>>
>> Generally we do require a test each time we fix a bug. So it’s a strategy
>> for the project to always make reduced tests when we find a bug.
>>
>> But using a fuzzer to find bugs and then making a regression test for each
>> bug we find will not give us great coverage. We’d like tests that cover lots
>> of the code paths in WebKit, even the ones without bugs.
>>
>> I’m not saying we should necessarily keep fuzzer-style tests, but to replace
>> them we would need to add tests with good coverage, going beyond regression
>> tests for bugs that existed in the project at one point.
>
> I have always been under the impression that LayoutTests were not just
> intended for preventing regressions to bugfixes, but that we should also be
> adding tests to establish correctness (and hopefully achieve good coverage)
> there.

That’s right. Did my words above give an impression to the contrary?

I am trying to say that we should be sure to keep good coverage when we remove a fuzzer-style test, possibly by adding tests that cover the same code in a different way. I’m not making some kind of global statement about all the tests.

-- Darin
Re: [webkit-dev] are fuzzer tests appropriate layout tests?
On Wed, Jun 13, 2012 at 12:05 PM, Darin Adler wrote:
> On Jun 12, 2012, at 5:17 PM, Ojan Vafai wrote:
>
>> It's great to use a fuzzer in order to find cases where we're broken and
>> then make reduced layout tests from those.
>
>
> Generally we do require a test each time we fix a bug. So it’s a strategy for
> the project to always make reduced tests when we find a bug.
>
> But using a fuzzer to find bugs and then making a regression test for each
> bug we find will not give us great coverage. We’d like tests that cover lots
> of the code paths in WebKit, even the ones without bugs.
>
> I’m not saying we should necessarily keep fuzzer-style tests, but to replace
> them we would need to add tests with good coverage, going beyond regression
> tests for bugs that existed in the project at one point.
>

I have always been under the impression that LayoutTests were not just intended for preventing regressions of bug fixes, but that we should also be adding tests to establish correctness (and hopefully achieve good coverage) there.

Is that not the case? Certainly I would expect the imported test suites to be testing correctness and not just to be regression tests for bug fixes.

-- Dirk
Re: [webkit-dev] are fuzzer tests appropriate layout tests?
On Jun 12, 2012, at 5:17 PM, Ojan Vafai wrote:

> It's great to use a fuzzer in order to find cases where we're broken and then
> make reduced layout tests from those.

Generally we do require a test each time we fix a bug. So it’s a strategy for the project to always make reduced tests when we find a bug.

But using a fuzzer to find bugs and then making a regression test for each bug we find will not give us great coverage. We’d like tests that cover lots of the code paths in WebKit, even the ones without bugs.

I’m not saying we should necessarily keep fuzzer-style tests, but to replace them we would need to add tests with good coverage, going beyond regression tests for bugs that existed in the project at one point.

At one point, I remember Geoff Garen encouraging a fuzzer-type approach to making some tests for an SVG path parser, as an alternative to my plan of making a test that covered every branch in the parser code. I don’t remember what we ended up doing in that case.

-- Darin
Re: [webkit-dev] are fuzzer tests appropriate layout tests?
I agree that the fuzzer should be used to create dedicated layout tests, but we shouldn't run the fuzzer itself as part of the layout test regression. I would have no objection to it being a separate test step.

-- Dirk

On Tue, Jun 12, 2012 at 5:17 PM, Ojan Vafai wrote:
> See https://bugs.webkit.org/show_bug.cgi?id=87772.
>
> It's great to use a fuzzer in order to find cases where we're broken and
> then make reduced layout tests from those. The viewspec-parser tests are
> themselves just a fuzzer though. Granted, they are deterministic by avoiding
> using an actual random function, but I don't think throwing randomly
> generated bits at a parser is appropriate for layout testing. If nothing
> else it's very slow.
>
> These tests regularly time out on the Chromium debug bots and occasionally
> time out on the Apple Lion bots. Even on the bots where they don't time out,
> they're slow. I don't think it makes sense to spend 1+ minutes running these 5
> tests when more targeted reductions could get the same effective coverage
> much faster.
>
> Am I wrong? If not, does anyone object to moving these tests over to
> ManualTests or just deleting them entirely?
>
> Ojan
Re: [webkit-dev] are fuzzer tests appropriate layout tests?
On Jun 12, 2012, at 5:17 PM, Ojan Vafai wrote:

> See https://bugs.webkit.org/show_bug.cgi?id=87772.
>
> It's great to use a fuzzer in order to find cases where we're broken and then
> make reduced layout tests from those. The viewspec-parser tests are
> themselves just a fuzzer though. Granted, they are deterministic by avoiding
> using an actual random function, but I don't think throwing randomly
> generated bits at a parser is appropriate for layout testing. If nothing else
> it's very slow.
>
> These tests regularly time out on the Chromium debug bots and occasionally
> time out on the Apple Lion bots. Even on the bots where they don't time out,
> they're slow. I don't think it makes sense to spend 1+ minutes running these 5
> tests when more targeted reductions could get the same effective coverage
> much faster.
>
> Am I wrong? If not, does anyone object to moving these tests over to
> ManualTests or just deleting them entirely?

I am not familiar with the viewspec-parser tests and their history, but I agree in principle that fuzzers and their raw output rarely make for the best way to regression test.
[webkit-dev] are fuzzer tests appropriate layout tests?
See https://bugs.webkit.org/show_bug.cgi?id=87772.

It's great to use a fuzzer in order to find cases where we're broken and then make reduced layout tests from those. The viewspec-parser tests are themselves just a fuzzer though. Granted, they are deterministic by avoiding using an actual random function, but I don't think throwing randomly generated bits at a parser is appropriate for layout testing. If nothing else it's very slow.

These tests regularly time out on the Chromium debug bots and occasionally time out on the Apple Lion bots. Even on the bots where they don't time out, they're slow. I don't think it makes sense to spend 1+ minutes running these 5 tests when more targeted reductions could get the same effective coverage much faster.

Am I wrong? If not, does anyone object to moving these tests over to ManualTests or just deleting them entirely?

Ojan