Plan 3 seems like the best (and simplest) one until the infrastructure for the others (and/or a champion for fixing currently failing tests) is available.
What would it take to go with plan 3? I guess someone needs to rebaseline everything that's currently failing and check the new baselines in, and then someone (like bdash?) needs to flip a switch on the bots...? Did I miss anything?

Are there instructions anywhere on how to do the rebaselining? I've only ever created pixel baselines for Chromium before (where we have a pretty neat tool that pretty much does it for you).

J

On Fri, Jan 8, 2010 at 9:23 AM, Pam Greene <p...@chromium.org> wrote:
> And one very quick, short-term solution:
>
> 3. Generate new pixel results to match the current behavior, and check them
> in as hypothetically correct.
>
> And of course if someone notices an existing problem and fixes it, they
> check in corrected images then. It doesn't help find current problems, but
> those are being missed now anyway. It does let the tests be run again
> approximately immediately, even faster than waiting for test expectations
> functionality, so we can catch regressions moving forward.
>
> - Pam
>
> On Thu, Jan 7, 2010 at 5:01 PM, Ojan Vafai <o...@chromium.org> wrote:
>
>> On Thu, Jan 7, 2010 at 10:22 AM, Darin Adler <da...@apple.com> wrote:
>>
>>> On Jan 7, 2010, at 10:19 AM, Dimitri Glazkov wrote:
>>> > Are we planning to run pixel tests on the build bots?
>>>
>>> If we can get them green, we should. It’s a lot of work. We need a
>>> volunteer to do that work. We’ve tried before.
>>
>>
>> Two possible long-term solutions come to mind:
>> 1. Turn the bots orange on pixel failures. They still need fixing, but are
>> not as severe as text diff failures. I'm not a huge fan of this, but it's an
>> option.
>> 2. Add in a concept of expected failures and only turn the bots red for
>> *unexpected* failures. More details on this below.
>>
>> In chromium-land, there's an expectations file that lists expected
>> failures and allows for distinguishing different types of failures (e.g.
>> IMAGE vs. TEXT). It's like Skipped lists, but doesn't necessarily skip the
>> test. Fixing the expected failures still needs doing of course, but can be
>> done asynchronously. The primary advantage of this approach is that we can
>> turn on pixel tests, keep the bots green and avoid further regressions.
>>
>> Would something like that make sense for WebKit as a whole? To be clear,
>> we would be nearly as loath to add tests to this file as we are about
>> adding them to the Skipped lists. This just provides a way forward.
>>
>> While it's true that the bots used to be red more frequently with pixel
>> tests turned on, for the most part, there weren't significant pixel
>> regressions. Now, if you run the pixel tests on a clean build, there are a
>> number of failures and a very large number of hash-mismatches that are
>> within the failure tolerance level.
>>
>> -Ojan
>>
>> For reference, the format of the expectations file is something like this:
>>
>> // Fails the image diff but not the text diff.
>> fast/forms/foo.html = IMAGE
>>
>> // Fails just the text diff.
>> fast/forms/bar.html = TEXT
>>
>> // Fails both the image and text diffs.
>> fast/forms/baz.html = IMAGE+TEXT
>>
>> // Skips this test (e.g. because it hangs run-webkit-tests or causes other
>> tests to fail).
>> SKIP : fast/forms/foo1.html = IMAGE
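
To make the expectations idea quoted above a bit more concrete, here is a rough Python sketch of how a test runner might read a file in that format and decide the bot color. The names Expectation, parse_expectations and bot_color are made up for illustration, and the rule that a failure counts as "unexpected" when the observed failure kinds are not a subset of the listed ones is my assumption, not a description of what Chromium's tooling actually does.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expectation:
    """Hypothetical record for one line of the expectations file."""
    failures: frozenset   # e.g. frozenset({"IMAGE", "TEXT"})
    skip: bool = False    # True when the line carries the SKIP modifier


def parse_expectations(text):
    """Parse lines like 'SKIP : path = IMAGE+TEXT' into {path: Expectation}."""
    expectations = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("//"):
            continue  # blank line or comment
        left, _, kinds = line.partition("=")
        # Anything before the last ':' is treated as a modifier list.
        modifiers, _, path = left.rpartition(":")
        expectations[path.strip()] = Expectation(
            failures=frozenset(k.strip() for k in kinds.split("+") if k.strip()),
            skip="SKIP" in modifiers.split(),
        )
    return expectations


def bot_color(expectations, results):
    """results maps test path -> set of observed failure kinds (empty = pass).

    Assumption: the bot goes red only when a test fails in a way that is
    not listed for it; skipped tests are ignored.
    """
    for path, observed in results.items():
        expected = expectations.get(path, Expectation(frozenset()))
        if expected.skip:
            continue
        if not set(observed) <= expected.failures:
            return "red"
    return "green"


SAMPLE = """\
// Fails the image diff but not the text diff.
fast/forms/foo.html = IMAGE
// Fails both the image and text diffs.
fast/forms/baz.html = IMAGE+TEXT
SKIP : fast/forms/foo1.html = IMAGE
"""

expectations = parse_expectations(SAMPLE)
print(bot_color(expectations, {"fast/forms/foo.html": {"IMAGE"}}))  # green
print(bot_color(expectations, {"fast/forms/foo.html": {"TEXT"}}))   # red
```

With a rule like this, the known pixel failures could be listed once and fixed asynchronously, while any new IMAGE or TEXT failure would still turn the bots red.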
_______________________________________________
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev