Hi Philip,

Do you know how reftests are run in order to get that data?

I'm particularly curious about this Firefox-only failure:


It passes both in our automation and locally. I'm curious because I was the author of both that test (whoops) and the Firefox fix (bug 1449010).

Does it use the same mechanism as our automation to wait for image decodes and such? Is there any way to see the test images?

IIRC, one potential difference here is that Firefox blocks the load event for image loads but, unlike other browsers, doesn't decode images synchronously, so we may fire the load event without having painted the image. Our reftest harness uses internal APIs to ensure that the screenshot is taken with all the images decoded.
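(For what it's worth, a harness without access to browser internals could approximate this with the standard `img.decode()` promise, which resolves once the image data is ready to paint. This is only a sketch of the idea, not what either harness actually does, and `waitForImages` is a hypothetical helper name:)

```javascript
// Sketch: wait until every <img> on the page has been decoded before
// taking the screenshot. img.decode() returns a promise that resolves
// once the image is ready to paint; rejections (e.g. broken images)
// are swallowed so one bad image doesn't hang the harness.
async function waitForImages() {
  const imgs = Array.from(document.images);
  await Promise.all(imgs.map(img => img.decode().catch(() => {})));
}
```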

I suspect that can't be the cause of this test failure, since the image is really small and I would've expected it to get synchronously decoded anyway (we sync-decode if fast by default). But I'm no expert on how wpt.fyi is set up, hence the curiosity; I'd love to be able to see the screenshots of that test.

Thanks in advance,

 -- Emilio

On 10/13/18 9:27 AM, Philip Jägenstedt wrote:
On Sat, Oct 13, 2018, 09:17 Philip Jägenstedt <foo...@chromium.org> wrote:

On Thu, Oct 11, 2018, 22:34 Boris Zbarsky <bzbar...@mit.edu> wrote:

On 10/11/18 4:22 PM, Philip Jägenstedt wrote:
https://gist.github.com/foolip/a77c88e62aa3cfc461c2879f3e5d4855 is a
list of tests that fail in Firefox Nightly, but pass in stable
versions of Chrome, Edge and Safari.

Or more precisely have some sub-test that has that property, right?

Right, since there's no way to link to a subtest, in those cases I've
linked to the test and it might take some work to spot which subtest it
was. If this is a problem I could improve the report.

Thanks for filing the tracking bug. I hope there are some failures in here that point to problems that really affect web developers and can be fixed.

There's another caveat worth mentioning. Tests can be definitely passing or
definitely failing, but then there are various crash/error/timeout/etc.
results where the validity of the test is uncertain, or it's quite likely
to be a flake or infra issue. In my report I've been conservative and used
1 PASS + 3 FAIL as the criterion. Fiddling with these rules can reveal lots
more potential issues, and if you like I could provide reports on that too.

dev-platform mailing list
