Simon, are you suggesting that we should only use pixel results for ref tests? If not, then we still need to come to a conclusion on this tolerance issue.
Dirk, implementing --tolerance in NRWT isn't that hard, is it? Getting rid of --tolerance will take a lot of work to make sure all the pixel results that currently pass also pass with --tolerance=0. While I would support someone doing that work, I don't think we should block moving to NRWT on it.

Ojan

On Fri, Oct 8, 2010 at 1:03 PM, Simon Fraser <simon.fra...@apple.com> wrote:
> I think the best solution to this pixel matching problem is ref tests.
>
> How practical would it be to use ref tests for SVG?
>
> Simon
>
> On Oct 8, 2010, at 12:43 PM, Dirk Pranke wrote:
>
> > Jeremy is correct; the Chromium port has seen real regressions that
> > virtually no concept of a fuzzy match that I can imagine would've
> > caught.
> > new-run-webkit-tests doesn't currently support the tolerance concept
> > at all, and I am inclined to argue that it shouldn't.
> >
> > However, I frequently am wrong about things, so it's quite possible
> > that there are good arguments for supporting it that I'm not aware of.
> > I'm not particularly interested in working on a tool that doesn't do
> > what the group wants it to do, and I would like all of the other
> > WebKit ports to be running pixel tests by default (and
> > new-run-webkit-tests ;) ) since I think it catches bugs.
> >
> > As far as I know, the general sentiment on the list has been that we
> > should be running pixel tests by default, and the reason that we
> > aren't is largely due to the work involved in getting them back up to
> > date and keeping them up to date. I'm sure that fuzzy matching reduces
> > the work load, especially for the sort of mismatches caused by
> > differences in the text antialiasing.
> >
> > In addition, I have heard concerns that we'd like to keep fuzzy
> > matching because people might potentially get different results on
> > machines with different hardware configurations, but I don't know that
> > we have any confirmed cases of that (except for arguably the case of
> > different code paths for gpu-accelerated rendering vs. unaccelerated
> > rendering).
> >
> > If we made it easier to maintain the baselines (improved tooling like
> > Chromium's rebaselining tool, add reftest support, etc.) are there
> > still compelling reasons for supporting --tolerance-based testing as
> > opposed to exact matching?
> >
> > -- Dirk
> >
> > On Fri, Oct 8, 2010 at 11:14 AM, Jeremy Orlow <jor...@chromium.org> wrote:
> >> I'm not an expert on Pixel tests, but my understanding is that in Chromium
> >> (where we've always run with tolerance 0) we've seen real regressions that
> >> would have slipped by with something like tolerance 0.1. When you have
> >> 0 tolerance, it is more maintenance work, but if we can avoid regressions,
> >> it seems worth it.
> >> J
> >>
> >> On Fri, Oct 8, 2010 at 10:58 AM, Nikolas Zimmermann
> >> <zimmerm...@physik.rwth-aachen.de> wrote:
> >>>
> >>> On 08.10.2010 at 19:53, Maciej Stachowiak wrote:
> >>>
> >>>>
> >>>> On Oct 8, 2010, at 12:46 AM, Nikolas Zimmermann wrote:
> >>>>
> >>>>>
> >>>>> On 08.10.2010 at 00:44, Maciej Stachowiak wrote:
> >>>>>
> >>>>>>
> >>>>>> On Oct 7, 2010, at 6:34 AM, Nikolas Zimmermann wrote:
> >>>>>>
> >>>>>>> Good evening webkit folks,
> >>>>>>>
> >>>>>>> I've finished landing svg/ pixel test baselines, which pass with
> >>>>>>> --tolerance 0 on my 10.5 & 10.6 machines.
> >>>>>>> As the pixel testing is very important for the SVG tests, I'd like to
> >>>>>>> run them on the bots, experimentally, so we can catch regressions easily.
> >>>>>>>
> >>>>>>> Maybe someone with direct access to the leopard & snow leopard bots
> >>>>>>> could just run "run-webkit-tests --tolerance 0 -p svg" and mail me the
> >>>>>>> results?
> >>>>>>> If it passes, we could maybe run the pixel tests for the svg/
> >>>>>>> subdirectory on these bots?
> >>>>>>
> >>>>>> Running pixel tests would be great, but can we really expect the
> >>>>>> results to be stable cross-platform with tolerance 0? Perhaps we should
> >>>>>> start with a higher tolerance level.
> >>>>>
> >>>>> Sure, we could do that. But I'd really like to get a feeling for what's
> >>>>> problematic first. If we see 95% of the SVG tests pass with --tolerance 0,
> >>>>> and only a few need higher tolerances
> >>>>> (64bit vs. 32bit aa differences, etc.), I could come up with a per-file
> >>>>> pixel test tolerance extension to DRT, if it's needed.
> >>>>>
> >>>>> How about starting with just one build slave (say, Mac Leopard) that
> >>>>> runs the pixel tests for SVG, with --tolerance 0 for a while? I'd be happy
> >>>>> to identify the problems, and see
> >>>>> if we can make it work, somehow :-)
> >>>>
> >>>> The problem I worry about is that on future Mac OS X releases, rendering
> >>>> of shapes may change in some tiny way that is not visible but enough to
> >>>> cause failures at tolerance 0. In the past, such false positives arose from
> >>>> time to time, which is one reason we added pixel test tolerance in the first
> >>>> place. I don't think running pixel tests on just one build slave will help
> >>>> us understand that risk.
> >>>
> >>> I think we'd just update the baseline to the newer OS X release, then,
> >>> like it has been done for the tiger -> leopard, leopard -> snow leopard
> >>> switch?
> >>> platform/mac/ should always contain the newest release baseline; when
> >>> there are differences on leopard, the results go into
> >>> platform/mac-leopard/
> >>>
> >>>> Why not start with some low but non-zero tolerance (0.1?) and see if we
> >>>> can at least make that work consistently, before we try the bolder step of
> >>>> tolerance 0?
> >>>> Also, and as a side note, we probably need to add more build slaves to
> >>>> run pixel tests at all, since just running the test suite without pixel
> >>>> tests is already slow enough that the testers are often significantly behind
> >>>> the builders.
> >>>
> >>> Well, I thought about just running the pixel tests for the svg/
> >>> subdirectory as a separate step, hence my request for tolerance 0, as the
> >>> baseline passes without problems at least on my & Dirk's machine already.
> >>> I wouldn't want to argue running 20.000+ pixel tests with tolerance 0 as
> >>> first step :-) But the 1000 SVG tests might be fine with tolerance 0?
> >>>
> >>> Even tolerance 0.1 as default for SVG would be fine with me, as long as we
> >>> can get the bots to run the SVG pixel tests :-)
> >>>
> >>> Cheers,
> >>> Niko
> >>>
> >>> _______________________________________________
> >>> webkit-dev mailing list
> >>> webkit-dev@lists.webkit.org
> >>> http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev