Re: web-platform-tests that fail only in Firefox (from wpt.fyi data)
Thanks for this, Philip.

I have started raising bugs and marking them as blocking https://bugzilla.mozilla.org/show_bug.cgi?id=1498357.

David

On Fri, 14 Dec 2018 at 08:41, Philip Jägenstedt wrote:
> On Fri, Oct 19, 2018 at 2:42 PM Philip Jägenstedt wrote:
> >
> > On Wed, Oct 17, 2018 at 11:53 PM Boris Zbarsky wrote:
> > >
> > > On 10/13/18 3:27 AM, Philip Jägenstedt wrote:
> > > > Fiddling with these rules can reveal lots more potential issues, and if you like I could provide reports on that too.
> > >
> > > I would be pretty interested in that, yes. In particular, a report where there is 1 "not PASS and not FAIL" and 3 "PASS" would be pretty helpful, I suspect.
> >
> > Rerunning my script it's apparent that unreliable Edge results [1] lead to the same tests being considered lone failures or not for the other browsers. So, I've used the same set of runs for this report of what you suggested:
> > https://gist.github.com/foolip/e6014c9bcc8ca405219bf18542eb5d69
> >
> > It's not a long list, so I checked them all and they are timeouts. This is sometimes the failure mode for genuine problems, so looking over these might be valuable.
>
> Given the recent news [1] it won't be as relevant to consider the status of EdgeHTML for prioritization in other engines. Given that and the unreliable results, I've updated my script to consider only Chrome, Firefox and Safari. I also have the reports auto-updating on a daily basis:
>
> https://foolip.github.io/ad-hoc-wpt-results-analysis/chrome-lone-failures.html
> https://foolip.github.io/ad-hoc-wpt-results-analysis/firefox-lone-failures.html
> https://foolip.github.io/ad-hoc-wpt-results-analysis/safari-lone-failures.html
>
> [1] https://github.com/MicrosoftEdge/MSEdge/blob/master/README.md
___
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
Re: web-platform-tests that fail only in Firefox (from wpt.fyi data)
That's fantastic. There's a lot to triage, but hopefully it's well worth it. If you create any ad-hoc mapping between failures and bugs, please comment on https://github.com/web-platform-tests/wpt.fyi/issues/64 and perhaps we can populate the links from that data once the linking feature exists.

+Luke Bjerring FYI.

On Fri, Dec 14, 2018 at 4:09 PM David Burns wrote:
> Thanks for this, Philip.
>
> I have started raising bugs and marking them as blocking https://bugzilla.mozilla.org/show_bug.cgi?id=1498357.
>
> David
>
> On Fri, 14 Dec 2018 at 08:41, Philip Jägenstedt wrote:
>
>> On Fri, Oct 19, 2018 at 2:42 PM Philip Jägenstedt wrote:
>> >
>> > On Wed, Oct 17, 2018 at 11:53 PM Boris Zbarsky wrote:
>> > >
>> > > On 10/13/18 3:27 AM, Philip Jägenstedt wrote:
>> > > > Fiddling with these rules can reveal lots more potential issues, and if you like I could provide reports on that too.
>> > >
>> > > I would be pretty interested in that, yes. In particular, a report where there is 1 "not PASS and not FAIL" and 3 "PASS" would be pretty helpful, I suspect.
>> >
>> > Rerunning my script it's apparent that unreliable Edge results [1] lead to the same tests being considered lone failures or not for the other browsers. So, I've used the same set of runs for this report of what you suggested:
>> > https://gist.github.com/foolip/e6014c9bcc8ca405219bf18542eb5d69
>> >
>> > It's not a long list, so I checked them all and they are timeouts. This is sometimes the failure mode for genuine problems, so looking over these might be valuable.
>>
>> Given the recent news [1] it won't be as relevant to consider the status of EdgeHTML for prioritization in other engines. Given that and the unreliable results, I've updated my script to consider only Chrome, Firefox and Safari. I also have the reports auto-updating on a daily basis:
>>
>> https://foolip.github.io/ad-hoc-wpt-results-analysis/chrome-lone-failures.html
>> https://foolip.github.io/ad-hoc-wpt-results-analysis/firefox-lone-failures.html
>> https://foolip.github.io/ad-hoc-wpt-results-analysis/safari-lone-failures.html
>>
>> [1] https://github.com/MicrosoftEdge/MSEdge/blob/master/README.md
Re: web-platform-tests that fail only in Firefox (from wpt.fyi data)
On Fri, Dec 14, 2018 at 9:41 AM Philip Jägenstedt wrote:
>
> On Fri, Oct 19, 2018 at 2:42 PM Philip Jägenstedt wrote:
> >
> > On Wed, Oct 17, 2018 at 11:53 PM Boris Zbarsky wrote:
> > >
> > > On 10/13/18 3:27 AM, Philip Jägenstedt wrote:
> > > > Fiddling with these rules can reveal lots more potential issues, and if you like I could provide reports on that too.
> > >
> > > I would be pretty interested in that, yes. In particular, a report where there is 1 "not PASS and not FAIL" and 3 "PASS" would be pretty helpful, I suspect.
> >
> > Rerunning my script it's apparent that unreliable Edge results [1] lead to the same tests being considered lone failures or not for the other browsers. So, I've used the same set of runs for this report of what you suggested:
> > https://gist.github.com/foolip/e6014c9bcc8ca405219bf18542eb5d69
> >
> > It's not a long list, so I checked them all and they are timeouts. This is sometimes the failure mode for genuine problems, so looking over these might be valuable.
>
> Given the recent news [1] it won't be as relevant to consider the status of EdgeHTML for prioritization in other engines. Given that and the unreliable results, I've updated my script to consider only Chrome, Firefox and Safari. I also have the reports auto-updating on a daily basis:
> https://foolip.github.io/ad-hoc-wpt-results-analysis/chrome-lone-failures.html
> https://foolip.github.io/ad-hoc-wpt-results-analysis/firefox-lone-failures.html
> https://foolip.github.io/ad-hoc-wpt-results-analysis/safari-lone-failures.html
>
> [1] https://github.com/MicrosoftEdge/MSEdge/blob/master/README.md

And, to spell it out, the effect of that is to increase the number of product-specific failures for all three browsers by quite a lot. Firefox goes from ~700 to ~1300. Chrome went from ~300 to ~900, and I'm suggesting that we get to below 500 at least and stay there. (I suspect many failures are for trivial reasons, so it'll be easy to make progress in the beginning.)
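To illustrate why dropping Edge from the comparison raises the counts, here is a minimal sketch. It is not the script behind the reports; the result shape (a dict of test name to per-product status), the function name and the sample data are all made up for illustration. A test counts as a lone failure for a product when that product is FAIL and every other product under consideration is PASS, so removing a product can only loosen the condition for the remaining ones.

```python
def count_lone_failures(results, products):
    """Count, per product, the tests where it alone is FAIL and all others are PASS.

    `results` maps a test name to {product: status}; this shape is an assumption.
    """
    counts = {product: 0 for product in products}
    for statuses in results.values():
        for target in products:
            others = [p for p in products if p != target]
            if statuses.get(target) == "FAIL" and all(
                statuses.get(other) == "PASS" for other in others
            ):
                counts[target] += 1
    return counts


# Made-up sample data, not taken from wpt.fyi.
results = {
    "t1.html": {"firefox": "FAIL", "chrome": "PASS", "edge": "PASS", "safari": "PASS"},
    "t2.html": {"firefox": "FAIL", "chrome": "PASS", "edge": "TIMEOUT", "safari": "PASS"},
    "t3.html": {"firefox": "PASS", "chrome": "FAIL", "edge": "FAIL", "safari": "PASS"},
}

with_edge = count_lone_failures(results, ["chrome", "edge", "firefox", "safari"])
without_edge = count_lone_failures(results, ["chrome", "firefox", "safari"])
print(with_edge)
print(without_edge)
```

In the sample, t2.html and t3.html only become lone failures once Edge is ignored, because Edge's TIMEOUT or FAIL no longer disqualifies them.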
Re: web-platform-tests that fail only in Firefox (from wpt.fyi data)
On Fri, Oct 19, 2018 at 2:42 PM Philip Jägenstedt wrote:
>
> On Wed, Oct 17, 2018 at 11:53 PM Boris Zbarsky wrote:
> >
> > On 10/13/18 3:27 AM, Philip Jägenstedt wrote:
> > > Fiddling with these rules can reveal lots more potential issues, and if you like I could provide reports on that too.
> >
> > I would be pretty interested in that, yes. In particular, a report where there is 1 "not PASS and not FAIL" and 3 "PASS" would be pretty helpful, I suspect.
>
> Rerunning my script it's apparent that unreliable Edge results [1] lead to the same tests being considered lone failures or not for the other browsers. So, I've used the same set of runs for this report of what you suggested:
> https://gist.github.com/foolip/e6014c9bcc8ca405219bf18542eb5d69
>
> It's not a long list, so I checked them all and they are timeouts. This is sometimes the failure mode for genuine problems, so looking over these might be valuable.

Given the recent news [1] it won't be as relevant to consider the status of EdgeHTML for prioritization in other engines. Given that and the unreliable results, I've updated my script to consider only Chrome, Firefox and Safari. I also have the reports auto-updating on a daily basis:

https://foolip.github.io/ad-hoc-wpt-results-analysis/chrome-lone-failures.html
https://foolip.github.io/ad-hoc-wpt-results-analysis/firefox-lone-failures.html
https://foolip.github.io/ad-hoc-wpt-results-analysis/safari-lone-failures.html

[1] https://github.com/MicrosoftEdge/MSEdge/blob/master/README.md
Re: web-platform-tests that fail only in Firefox (from wpt.fyi data)
On 10/19/18 8:42 AM, Philip Jägenstedt wrote:
> That's a bit odd, the is in the markup and would be when running manually or under automation. Are you sure that explains the difference?

Yes. I filed https://github.com/web-platform-tests/wpt/issues/13625

-Boris
Re: web-platform-tests that fail only in Firefox (from wpt.fyi data)
On Wed, Oct 17, 2018 at 11:53 PM Boris Zbarsky wrote:
>
> On 10/13/18 3:27 AM, Philip Jägenstedt wrote:
> > Fiddling with these rules can reveal lots more potential issues, and if you like I could provide reports on that too.
>
> I would be pretty interested in that, yes. In particular, a report where there is 1 "not PASS and not FAIL" and 3 "PASS" would be pretty helpful, I suspect.

Rerunning my script it's apparent that unreliable Edge results [1] lead to the same tests being considered lone failures or not for the other browsers. So, I've used the same set of runs for this report of what you suggested:
https://gist.github.com/foolip/e6014c9bcc8ca405219bf18542eb5d69

It's not a long list, so I checked them all and they are timeouts. This is sometimes the failure mode for genuine problems, so looking over these might be valuable.

> By the way, I recently found some tests that fail when run directly but pass in the harness. :( For example
> http://w3c-test.org/html/infrastructure/common-dom-interfaces/collections/htmlallcollection.html
> fails various subtests in all browsers due to the being in the DOM when running directly. Not really sure what we can do with that.

That's a bit odd, the is in the markup and would be when running manually or under automation. Are you sure that explains the difference? If it does, then just removing it from the markup and adapting any affected tests would be the way to go. I updated the test pretty recently; if you're confident it's broken, can you file a wpt issue and assign it to me?

[1] https://github.com/web-platform-tests/results-collection/issues/563
Re: web-platform-tests that fail only in Firefox (from wpt.fyi data)
On 10/13/18 3:27 AM, Philip Jägenstedt wrote:
> Fiddling with these rules can reveal lots more potential issues, and if you like I could provide reports on that too.

I would be pretty interested in that, yes. In particular, a report where there is 1 "not PASS and not FAIL" and 3 "PASS" would be pretty helpful, I suspect.

By the way, I recently found some tests that fail when run directly but pass in the harness. :( For example
http://w3c-test.org/html/infrastructure/common-dom-interfaces/collections/htmlallcollection.html
fails various subtests in all browsers due to the being in the DOM when running directly. Not really sure what we can do with that.

-Boris
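The rule asked for here (one browser with a result that is neither PASS nor FAIL, and a PASS in the other three) could be sketched roughly as follows. This is only an illustration under stated assumptions: the per-test dict shape, the function name and the sample data are hypothetical, and treating a missing result as "no data" rather than as a non-result is a choice made for the sketch.

```python
def lone_non_results(results, target, others):
    """Return (test, status) pairs where `target` is neither PASS nor FAIL
    and every product in `others` is PASS.

    `results` maps a test name to {product: status}; this shape is an assumption.
    """
    found = []
    for name, statuses in sorted(results.items()):
        status = statuses.get(target)
        # Skip definite results, and skip tests with no recorded result at all.
        if status is None or status in ("PASS", "FAIL"):
            continue
        if all(statuses.get(product) == "PASS" for product in others):
            found.append((name, status))
    return found


# Made-up sample data, not taken from wpt.fyi.
results = {
    "a.html": {"firefox": "TIMEOUT", "chrome": "PASS", "edge": "PASS", "safari": "PASS"},
    "b.html": {"firefox": "FAIL", "chrome": "PASS", "edge": "PASS", "safari": "PASS"},
    "c.html": {"firefox": "CRASH", "chrome": "PASS", "edge": "ERROR", "safari": "PASS"},
}

print(lone_non_results(results, "firefox", ["chrome", "edge", "safari"]))
```

Only a.html is reported: b.html is a definite FAIL, and c.html is excluded because Edge did not pass either.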
Re: web-platform-tests that fail only in Firefox (from wpt.fyi data)
On Wed, Oct 17, 2018 at 2:03 PM Emilio Cobos Álvarez wrote:
>
> On 10/17/18 11:56 AM, James Graham wrote:
> > On 17/10/2018 10:12, James Graham wrote:
> >> On 17/10/2018 01:23, Emilio Cobos Álvarez wrote:
> >>> Hi Philip,
> >>>
> >>> Do you know how reftests are run in order to get that data?
> >>>
> >>> I'm particularly curious about this Firefox-only failure:
> >>>
> >>>    css/selectors/selection-image-001.html
> >>>
> >>> It passes both on our automation and locally. I'm curious because I was the author of that test (whoops) and the Firefox fix (bug 1449010).
> >>>
> >>> Does it use the same mechanism as our automation to wait for image decodes and such? Is there any way to see the test images?
> >>
> >> It's using the same harness as we use in gecko, so it should be giving the same results, but of course it's possible that there's some difference in the configuration that could cause different results for some tests.
> >>
> >> Unfortunately there isn't yet a way to see the images; because of the number of failures per run, and the number of runs, putting all the screenshots in the logs would be prohibitively large, but there is a plan to start uploading previously unseen screenshots to wpt.fyi [1]
> >
> > OK, I investigated this and it turns out that we accidentally started uploading tbpl-style logs with screenshots for full runs when we turned on taskcluster for PRs. So the screenshot is available through
> >
> > https://hg.mozilla.org/mozilla-central/raw-file/tip/layout/tools/reftest/reftest-analyzer.xhtml#logurl=https://taskcluster-artifacts.net/U6OIGr7ZTjurDYjy_KgyCg/0/public/results/log_tbpl.log
>
> Thanks! So it looks like the reftest screenshots are taken on inactive windows?
>
> We don't respect ::selection for inactive windows, so the failure now makes sense.
>
> Still I think there's something fishy there, but it may be related to the widget toolkit that is on wpt's CI or something...

Thanks, James, for accidentally storing screenshots in Taskcluster logs and figuring out how to use them with reftest-analyzer. That's great, and I'll pass along this tip to blink-dev as well :D
Re: web-platform-tests that fail only in Firefox (from wpt.fyi data)
On 10/17/18 11:56 AM, James Graham wrote:
> On 17/10/2018 10:12, James Graham wrote:
>> On 17/10/2018 01:23, Emilio Cobos Álvarez wrote:
>>> Hi Philip,
>>>
>>> Do you know how reftests are run in order to get that data?
>>>
>>> I'm particularly curious about this Firefox-only failure:
>>>
>>>    css/selectors/selection-image-001.html
>>>
>>> It passes both on our automation and locally. I'm curious because I was the author of that test (whoops) and the Firefox fix (bug 1449010).
>>>
>>> Does it use the same mechanism as our automation to wait for image decodes and such? Is there any way to see the test images?
>>
>> It's using the same harness as we use in gecko, so it should be giving the same results, but of course it's possible that there's some difference in the configuration that could cause different results for some tests.
>>
>> Unfortunately there isn't yet a way to see the images; because of the number of failures per run, and the number of runs, putting all the screenshots in the logs would be prohibitively large, but there is a plan to start uploading previously unseen screenshots to wpt.fyi [1]
>
> OK, I investigated this and it turns out that we accidentally started uploading tbpl-style logs with screenshots for full runs when we turned on taskcluster for PRs. So the screenshot is available through
>
> https://hg.mozilla.org/mozilla-central/raw-file/tip/layout/tools/reftest/reftest-analyzer.xhtml#logurl=https://taskcluster-artifacts.net/U6OIGr7ZTjurDYjy_KgyCg/0/public/results/log_tbpl.log

Thanks! So it looks like the reftest screenshots are taken on inactive windows?

We don't respect ::selection for inactive windows, so the failure now makes sense.

Still I think there's something fishy there, but it may be related to the widget toolkit that is on wpt's CI or something...

 -- Emilio
Re: web-platform-tests that fail only in Firefox (from wpt.fyi data)
On 17/10/2018 10:12, James Graham wrote:
> On 17/10/2018 01:23, Emilio Cobos Álvarez wrote:
>> Hi Philip,
>>
>> Do you know how reftests are run in order to get that data?
>>
>> I'm particularly curious about this Firefox-only failure:
>>
>>    css/selectors/selection-image-001.html
>>
>> It passes both on our automation and locally. I'm curious because I was the author of that test (whoops) and the Firefox fix (bug 1449010).
>>
>> Does it use the same mechanism as our automation to wait for image decodes and such? Is there any way to see the test images?
>
> It's using the same harness as we use in gecko, so it should be giving the same results, but of course it's possible that there's some difference in the configuration that could cause different results for some tests.
>
> Unfortunately there isn't yet a way to see the images; because of the number of failures per run, and the number of runs, putting all the screenshots in the logs would be prohibitively large, but there is a plan to start uploading previously unseen screenshots to wpt.fyi [1]

OK, I investigated this and it turns out that we accidentally started uploading tbpl-style logs with screenshots for full runs when we turned on taskcluster for PRs. So the screenshot is available through

https://hg.mozilla.org/mozilla-central/raw-file/tip/layout/tools/reftest/reftest-analyzer.xhtml#logurl=https://taskcluster-artifacts.net/U6OIGr7ZTjurDYjy_KgyCg/0/public/results/log_tbpl.log
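The link in this message is the reftest-analyzer page with the log location appended after `#logurl=`. A small sketch of building such a link for another log, assuming the same pattern holds (the helper name is made up, and the log URL is passed through unescaped because that is how it appears in the link above):

```python
# Analyzer page as it appears in the message above.
ANALYZER = (
    "https://hg.mozilla.org/mozilla-central/raw-file/tip/"
    "layout/tools/reftest/reftest-analyzer.xhtml"
)


def analyzer_url(log_url):
    """Return a reftest-analyzer link for a tbpl-style log (hypothetical helper)."""
    return f"{ANALYZER}#logurl={log_url}"


log_url = (
    "https://taskcluster-artifacts.net/U6OIGr7ZTjurDYjy_KgyCg/0/"
    "public/results/log_tbpl.log"
)
print(analyzer_url(log_url))
```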
Re: web-platform-tests that fail only in Firefox (from wpt.fyi data)
On 17/10/2018 01:23, Emilio Cobos Álvarez wrote:
> Hi Philip,
>
> Do you know how reftests are run in order to get that data?
>
> I'm particularly curious about this Firefox-only failure:
>
>    css/selectors/selection-image-001.html
>
> It passes both on our automation and locally. I'm curious because I was the author of that test (whoops) and the Firefox fix (bug 1449010).
>
> Does it use the same mechanism as our automation to wait for image decodes and such? Is there any way to see the test images?

It's using the same harness as we use in gecko, so it should be giving the same results, but of course it's possible that there's some difference in the configuration that could cause different results for some tests.

Unfortunately there isn't yet a way to see the images; because of the number of failures per run, and the number of runs, putting all the screenshots in the logs would be prohibitively large, but there is a plan to start uploading previously unseen screenshots to wpt.fyi [1]

Having said that, the infrastructure is all containerised and it's possible to repeat the run locally with relatively little effort. I'm happy to help out with that if you like.

[1] https://github.com/web-platform-tests/wpt.fyi/issues/57
Re: web-platform-tests that fail only in Firefox (from wpt.fyi data)
Hi Philip,

Do you know how reftests are run in order to get that data?

I'm particularly curious about this Firefox-only failure:

   css/selectors/selection-image-001.html

It passes both on our automation and locally. I'm curious because I was the author of that test (whoops) and the Firefox fix (bug 1449010).

Does it use the same mechanism as our automation to wait for image decodes and such? Is there any way to see the test images?

IIRC one potential difference here is that Firefox blocks the load event for image loads, but doesn't decode images synchronously unlike other browsers, so we may fire the load event but not paint the image. Our reftest harness uses internal APIs to ensure that the screenshot is taken with all the images decoded.

I suspect that can't be the cause of this test failure, since the image is really small and I would've expected it to get synchronously decoded anyway (we sync-decode if fast by default), but I'm no expert on how wpt.fyi is set up, thus the curiosity. I'd love to be able to see the screenshots of that test.

Thanks in advance,

 -- Emilio

On 10/13/18 9:27 AM, Philip Jägenstedt wrote:
> On Sat, Oct 13, 2018, 09:17 Philip Jägenstedt wrote:
>> On Thu, Oct 11, 2018, 22:34 Boris Zbarsky wrote:
>>> On 10/11/18 4:22 PM, Philip Jägenstedt wrote:
>>>> https://gist.github.com/foolip/a77c88e62aa3cfc461c2879f3e5d4855 is a list of tests that fail in Firefox Nightly, but pass in stable versions of Chrome, Edge and Safari.
>>>
>>> Or more precisely have some sub-test that has that property, right?
>>
>> Right, since there's no way to link to a subtest, in those cases I've linked to the test and it might take some work to spot which subtest it was. If this is a problem I could improve the report.
>>
>> Thanks for filing the tracking bug, I hope there are some failures in here that point to problems that really affect web developers that can be fixed.
>
> There's another crux worth mentioning. Tests can be definitely passing or definitely failing, but then there are various crash/error/timeout/etc results where the validity of the test is uncertain, or it's quite likely to be a flake or infra issue. In my report I've been conservative and used 1 FAIL + 3 PASS as the criterion. Fiddling with these rules can reveal lots more potential issues, and if you like I could provide reports on that too.
Re: web-platform-tests that fail only in Firefox (from wpt.fyi data)
On Sat, Oct 13, 2018, 09:17 Philip Jägenstedt wrote:
> On Thu, Oct 11, 2018, 22:34 Boris Zbarsky wrote:
>
>> On 10/11/18 4:22 PM, Philip Jägenstedt wrote:
>> > https://gist.github.com/foolip/a77c88e62aa3cfc461c2879f3e5d4855 is a list of tests that fail in Firefox Nightly, but pass in stable versions of Chrome, Edge and Safari.
>>
>> Or more precisely have some sub-test that has that property, right?
>
> Right, since there's no way to link to a subtest, in those cases I've linked to the test and it might take some work to spot which subtest it was. If this is a problem I could improve the report.
>
> Thanks for filing the tracking bug, I hope there are some failures in here that point to problems that really affect web developers that can be fixed.

There's another crux worth mentioning. Tests can be definitely passing or definitely failing, but then there are various crash/error/timeout/etc results where the validity of the test is uncertain, or it's quite likely to be a flake or infra issue. In my report I've been conservative and used 1 FAIL + 3 PASS as the criterion. Fiddling with these rules can reveal lots more potential issues, and if you like I could provide reports on that too.
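For readers following along, the conservative rule behind the list (a definite FAIL in one browser and a definite PASS in each of the other three) might look something like the sketch below. This is not the actual script; the dict-of-dicts result shape, the function name and the sample data are assumptions made for illustration.

```python
def lone_failures(results, target, others):
    """Return test names where `target` is FAIL and every product in `others` is PASS.

    `results` maps a test (or subtest) name to {product: status}; this shape is
    an assumption. Crash/error/timeout results never match, which is what makes
    the rule conservative.
    """
    found = []
    for name, statuses in sorted(results.items()):
        if statuses.get(target) != "FAIL":
            continue
        if all(statuses.get(product) == "PASS" for product in others):
            found.append(name)
    return found


# Made-up sample data, not taken from wpt.fyi.
results = {
    "a.html": {"firefox": "FAIL", "chrome": "PASS", "edge": "PASS", "safari": "PASS"},
    "b.html": {"firefox": "TIMEOUT", "chrome": "PASS", "edge": "PASS", "safari": "PASS"},
    "c.html": {"firefox": "FAIL", "chrome": "PASS", "edge": "FAIL", "safari": "PASS"},
}

print(lone_failures(results, "firefox", ["chrome", "edge", "safari"]))
```

b.html is left out because a timeout is not a definite failure, and c.html is left out because Edge fails too.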
Re: web-platform-tests that fail only in Firefox (from wpt.fyi data)
On Thu, Oct 11, 2018, 22:34 Boris Zbarsky wrote:
> On 10/11/18 4:22 PM, Philip Jägenstedt wrote:
> > https://gist.github.com/foolip/a77c88e62aa3cfc461c2879f3e5d4855 is a list of tests that fail in Firefox Nightly, but pass in stable versions of Chrome, Edge and Safari.
>
> Or more precisely have some sub-test that has that property, right?

Right, since there's no way to link to a subtest, in those cases I've linked to the test and it might take some work to spot which subtest it was. If this is a problem I could improve the report.

Thanks for filing the tracking bug, I hope there are some failures in here that point to problems that really affect web developers that can be fixed.
Re: web-platform-tests that fail only in Firefox (from wpt.fyi data)
I filed https://bugzilla.mozilla.org/show_bug.cgi?id=1498357 to track these failures.

-Boris
Re: web-platform-tests that fail only in Firefox (from wpt.fyi data)
On 10/11/18 4:22 PM, Philip Jägenstedt wrote:
> https://gist.github.com/foolip/a77c88e62aa3cfc461c2879f3e5d4855 is a list of tests that fail in Firefox Nightly, but pass in stable versions of Chrome, Edge and Safari.

Or more precisely have some sub-test that has that property, right?

Thank you for putting this list together.

-Boris