Re: [webkit-dev] Closing the loop on flaky tests (was Re: Flaky test hit list)
Thank you, Adam.

We should look into more options here, including some of the ones Maciej proposed. Alternatively, we could have just skipped the tests which are flaking so much. Alexey seemed to get CC'd on nearly every flake because they seemed to mostly be tests he wrote. :(

-eric

On Wed, Oct 20, 2010 at 6:01 PM, Adam Barth aba...@webkit.org wrote:
> I'll take care of it tonight.
>
> Adam
>
> On Wed, Oct 20, 2010 at 5:09 PM, Alexey Proskuryakov a...@webkit.org wrote:
>> I'm still getting CC'ed by commit queue. Any objections to removing Bugzilla editbugs privilege from commit-queue until this is resolved?
>>
>>> --- Comment #10 from WebKit Commit Bot commit-qu...@webkit.org 2010-10-20 17:01:11 PST ---
>>> The commit-queue encountered the following flaky tests while processing attachment 71284:
>>>
>>> transitions/transition-end-event-transform.html
>>> http/tests/appcache/fail-on-update-2.html
>>>
>>> Please file bugs against the tests. The author(s) of the test(s) have been CCed on this bug.
>>> The commit-queue is continuing to process your patch.
>>
>> - WBR, Alexey Proskuryakov

___
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev
Re: [webkit-dev] Closing the loop on flaky tests (was Re: Flaky test hit list)
I'm still getting CC'ed by commit queue. Any objections to removing Bugzilla editbugs privilege from commit-queue until this is resolved?

> --- Comment #10 from WebKit Commit Bot commit-qu...@webkit.org 2010-10-20 17:01:11 PST ---
> The commit-queue encountered the following flaky tests while processing attachment 71284:
>
> transitions/transition-end-event-transform.html
> http/tests/appcache/fail-on-update-2.html
>
> Please file bugs against the tests. The author(s) of the test(s) have been CCed on this bug.
> The commit-queue is continuing to process your patch.

- WBR, Alexey Proskuryakov
Re: [webkit-dev] Closing the loop on flaky tests (was Re: Flaky test hit list)
I'll take care of it tonight.

Adam

On Wed, Oct 20, 2010 at 5:09 PM, Alexey Proskuryakov a...@webkit.org wrote:
> I'm still getting CC'ed by commit queue. Any objections to removing Bugzilla editbugs privilege from commit-queue until this is resolved?
>
>> --- Comment #10 from WebKit Commit Bot commit-qu...@webkit.org 2010-10-20 17:01:11 PST ---
>> The commit-queue encountered the following flaky tests while processing attachment 71284:
>>
>> transitions/transition-end-event-transform.html
>> http/tests/appcache/fail-on-update-2.html
>>
>> Please file bugs against the tests. The author(s) of the test(s) have been CCed on this bug.
>> The commit-queue is continuing to process your patch.
>
> - WBR, Alexey Proskuryakov
[webkit-dev] Closing the loop on flaky tests (was Re: Flaky test hit list)
On Tue, Oct 19, 2010 at 8:42 AM, Alexey Proskuryakov a...@webkit.org wrote:
> On 15.10.2010, at 07:39, Eric Seidel wrote:
>> BTW, the commit-queue has started complaining publicly about flaky tests:
>> https://bugs.webkit.org/show_bug.cgi?id=47698#c5
>> Hopefully this will bring further awareness to the issue.
>
> I find this extremely annoying and offensive. Half of my bugmail is already about bugs that I'm not interested in.

Sorry Alexey, I certainly didn't intend to offend you. The problem we're trying to solve is that there is currently no feedback loop for authors of flaky tests. If someone writes a flaky test, there's no mechanism for them to find out about it. It just sticks around and causes pain for everyone else. The idea behind this change is to create a feedback loop whereby authors of flaky tests can discover that their tests are flaky.

Looking back at the history since this feature was enabled, it looks like you were CCed on 3 of the 4 bugs that encountered flaky tests. Here are the tests that flaked out:

1x http://trac.webkit.org/browser/trunk/LayoutTests/http/tests/appcache/404-manifest.html
2x http://trac.webkit.org/browser/trunk/LayoutTests/http/tests/appcache/insert-html-element-with-manifest-2.html

According to SVN, you did write both of these tests, so the tool is accurately computing the author.

This is triggering more often than we expected. I'm not sure whether that's a statistical aberration. Here's how we calculated how much traffic this tool would generate:

According to webkit-patch find-flaky-tests, the flakiest test fails about 7 times per 2000 revisions, which means it fails for 0.3% of test runs. The commit-queue lands about 30 patches per day, so that means the author of the flakiest test should get CCed on about one bug every ten days. Also, these bugs are close to the end of their lifecycle (because their patch is about to land), so they shouldn't generate more than 3 or 4 emails each. That boils down to about one or two emails per week for the flakiest test.

Now, that calculation is a very rough approximation, and we might have missed some important factors. We're certainly open to other suggestions for how to close the loop on flaky tests if this approach generates too much email.

Adam
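The traffic estimate in the message above can be reproduced as a short calculation. This is only a sketch of the arithmetic: the input figures are the ones quoted in the message, the function and parameter names are illustrative, and it assumes one test run per landed patch, which the message does not state explicitly.

```python
# Rough reproduction of the commit-queue mail estimate.
# Inputs come from the message; names are illustrative, and "one test
# run per landed patch" is an assumption made for this sketch.

def estimate_flaky_mail(failures, revisions, patches_per_day, emails_per_bug):
    """Return (failure_rate, days_between_ccs, emails_per_week)."""
    failure_rate = failures / revisions            # fraction of runs that flake
    ccs_per_day = failure_rate * patches_per_day   # expected CCs per day
    days_between_ccs = 1 / ccs_per_day
    emails_per_week = ccs_per_day * 7 * emails_per_bug
    return failure_rate, days_between_ccs, emails_per_week


if __name__ == "__main__":
    # The message gives "3 or 4" emails per bug, so show both.
    for emails_per_bug in (3, 4):
        rate, days, weekly = estimate_flaky_mail(7, 2000, 30, emails_per_bug)
        print(f"{rate:.2%} failure rate, one CC every {days:.1f} days, "
              f"{weekly:.1f} emails/week at {emails_per_bug} emails per bug")
```

With unrounded inputs the failure rate is 0.35% rather than 0.3%, a CC arrives roughly every nine and a half days, and the weekly total lands between two and three emails, so the figures in the message are on the low side of the same rough estimate.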
Re: [webkit-dev] Closing the loop on flaky tests (was Re: Flaky test hit list)
On 19.10.2010, at 11:16, Adam Barth wrote:
> Also, these bugs are close to the end of their lifecycle (because their patch is about to land), so they shouldn't generate more than 3 or 4 emails each. That boils down to about one or two emails per week for the flakiest test.

One e-mail (per week?) would perhaps make sense, even though a flaky test is sometimes flaky code, so the blame becomes misplaced. Getting 3-4 automated e-mails per bug seems to go overboard.

I agree that raising awareness of which tests or code areas are flaky seems useful. One problem I personally had was with digging up data on flakiness. The link for a dashboard that I found was http://test-results.appspot.com/dashboards/flakiness_dashboard.html - the URL was freezing my browser for several minutes on each move, and I couldn't make sense of what it was telling me UI-wise quickly enough. I'm not even sure how it's related to flakiness seen by commit queue, as it seems to be about chromium.

Is there a better data source that I missed?

- WBR, Alexey Proskuryakov
Re: [webkit-dev] Closing the loop on flaky tests (was Re: Flaky test hit list)
On Tue, Oct 19, 2010 at 11:44 AM, Alexey Proskuryakov a...@webkit.org wrote:
> I agree that raising awareness of which tests or code areas are flaky seems useful. One problem I personally had was with digging up data on flakiness. The link for a dashboard that I found was http://test-results.appspot.com/dashboards/flakiness_dashboard.html - the URL was freezing my browser for several minutes on each move, and I couldn't make sense of what it was telling me UI-wise quickly enough. I'm not even sure how it's related to flakiness seen by commit queue, as it seems to be about chromium.

That dashboard currently only supports the Chromium bots. If other bots successfully switch over to new-run-webkit-tests, we'll be able to easily add them to that dashboard.

The freezing issue is a recent one I plan on looking into soon. WebKit is ridiculously slow at rendering this HTML for some reason (it's a single large table).

The UI is very dense and confusing, but it gives you quite a bit of useful information. Here's some limited documentation on making sense of the dashboard UI: http://sites.google.com/a/chromium.org/dev/developers/testing/flakiness-dashboard

Ojan
Re: [webkit-dev] Closing the loop on flaky tests (was Re: Flaky test hit list)
webkit-patch find-flaky-tests can also show you what tests are recently flaky, but it's not as nice as the dashboard.

-eric

On Tue, Oct 19, 2010 at 12:06 PM, Ojan Vafai o...@chromium.org wrote:
> On Tue, Oct 19, 2010 at 11:44 AM, Alexey Proskuryakov a...@webkit.org wrote:
>> I agree that raising awareness of which tests or code areas are flaky seems useful. One problem I personally had was with digging up data on flakiness. The link for a dashboard that I found was http://test-results.appspot.com/dashboards/flakiness_dashboard.html - the URL was freezing my browser for several minutes on each move, and I couldn't make sense of what it was telling me UI-wise quickly enough. I'm not even sure how it's related to flakiness seen by commit queue, as it seems to be about chromium.
>
> That dashboard currently only supports the Chromium bots. If other bots successfully switch over to new-run-webkit-tests, we'll be able to easily add them to that dashboard.
>
> The freezing issue is a recent one I plan on looking into soon. WebKit is ridiculously slow at rendering this HTML for some reason (it's a single large table).
>
> The UI is very dense and confusing, but it gives you quite a bit of useful information. Here's some limited documentation on making sense of the dashboard UI: http://sites.google.com/a/chromium.org/dev/developers/testing/flakiness-dashboard
>
> Ojan
Re: [webkit-dev] Closing the loop on flaky tests (was Re: Flaky test hit list)
On Tue, Oct 19, 2010 at 11:44 AM, Alexey Proskuryakov a...@webkit.org wrote:
> On 19.10.2010, at 11:16, Adam Barth wrote:
>> Also, these bugs are close to the end of their lifecycle (because their patch is about to land), so they shouldn't generate more than 3 or 4 emails each. That boils down to about one or two emails per week for the flakiest test.
>
> One e-mail (per week?) would perhaps make sense, even though a flaky test is sometimes flaky code, so the blame becomes misplaced. Getting 3-4 automated e-mails per bug seems to go overboard.

Maybe the thing to do is CC the author of the flaky test for the one bug comment and then immediately unCC them. That way they don't see the rest of the traffic on the bug.

Adam

> I agree that raising awareness of which tests or code areas are flaky seems useful. One problem I personally had was with digging up data on flakiness. The link for a dashboard that I found was http://test-results.appspot.com/dashboards/flakiness_dashboard.html - the URL was freezing my browser for several minutes on each move, and I couldn't make sense of what it was telling me UI-wise quickly enough. I'm not even sure how it's related to flakiness seen by commit queue, as it seems to be about chromium.
>
> Is there a better data source that I missed?
>
> - WBR, Alexey Proskuryakov
Re: [webkit-dev] Closing the loop on flaky tests (was Re: Flaky test hit list)
On 19.10.2010, at 12:33, Adam Barth wrote:
> Maybe the thing to do is CC the author of the flaky test for the one bug comment and then immediately unCC them. That way they don't see the rest of the traffic on the bug.

That would still be two e-mails about a bug the person otherwise doesn't want to know about. I don't think that CC'ing is the right approach.

- WBR, Alexey Proskuryakov
Re: [webkit-dev] Closing the loop on flaky tests (was Re: Flaky test hit list)
On Tue, Oct 19, 2010 at 12:41 PM, Alexey Proskuryakov a...@webkit.org wrote:
> On 19.10.2010, at 12:33, Adam Barth wrote:
>> Maybe the thing to do is CC the author of the flaky test for the one bug comment and then immediately unCC them. That way they don't see the rest of the traffic on the bug.
>
> That would still be two e-mails about a bug the person otherwise doesn't want to know about. I don't think that CC'ing is the right approach.

Do you see changes to bugs when you get removed from the CC? Do you have another suggestion for how to provide feedback to authors of flaky tests?

Adam
Re: [webkit-dev] Closing the loop on flaky tests (was Re: Flaky test hit list)
On Tue, Oct 19, 2010 at 1:30 PM, Adam Barth aba...@webkit.org wrote:
> On Tue, Oct 19, 2010 at 12:41 PM, Alexey Proskuryakov a...@webkit.org wrote:
>> On 19.10.2010, at 12:33, Adam Barth wrote:
>>> Maybe the thing to do is CC the author of the flaky test for the one bug comment and then immediately unCC them. That way they don't see the rest of the traffic on the bug.
>>
>> That would still be two e-mails about a bug the person otherwise doesn't want to know about. I don't think that CC'ing is the right approach.
>
> Do you see changes to bugs when you get removed from the CC? Do you have another suggestion for how to provide feedback to authors of flaky tests?

Email the author directly? Doesn't need to go through bugs.webkit.org, does it?
Re: [webkit-dev] Closing the loop on flaky tests (was Re: Flaky test hit list)
On Oct 19, 2010, at 1:30 PM, Adam Barth wrote:
> On Tue, Oct 19, 2010 at 12:41 PM, Alexey Proskuryakov a...@webkit.org wrote:
>> On 19.10.2010, at 12:33, Adam Barth wrote:
>>> Maybe the thing to do is CC the author of the flaky test for the one bug comment and then immediately unCC them. That way they don't see the rest of the traffic on the bug.
>>
>> That would still be two e-mails about a bug the person otherwise doesn't want to know about. I don't think that CC'ing is the right approach.
>
> Do you see changes to bugs when you get removed from the CC? Do you have another suggestion for how to provide feedback to authors of flaky tests?

It looks like the bot is adding a comment to the bug with the patch that was being processed when flakiness was detected, not the one that originally landed the tests believed to be flaky. Is that right?

If so, that doesn't seem like a great way to notify the author of the original test. It seems like it would be better to comment in the bug that added the test.

To be fair, it's also possible that the new patch caused the flakiness, so a separate comment there could be useful. Perhaps it would be useful to determine if the test in question has a track record of flakiness. If not, then maybe the presumption should be that the patch is the problem, not the test. On the other hand, if the test has always been flaky, then the new patch probably has nothing to do with it.

Regards,
Maciej
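The track-record idea in the message above could be sketched roughly as follows. This is a hypothetical illustration, not how the commit-queue works: the `likely_culprit` function, the history counter, and the threshold of two prior flakes are all assumptions introduced here.

```python
from collections import Counter

# Hypothetical sketch of the "track record" heuristic: a test that has
# flaked before is presumed flaky itself; a test with no history points
# at the patch under test. The threshold is an arbitrary assumption.

def likely_culprit(test_path, flake_history, min_prior_flakes=2):
    """Return "test" if the test has a track record of flakiness, else "patch"."""
    prior_flakes = flake_history[test_path]  # Counter yields 0 for unseen tests
    return "test" if prior_flakes >= min_prior_flakes else "patch"


if __name__ == "__main__":
    # Illustrative history only; not real commit-queue data.
    history = Counter({
        "http/tests/appcache/fail-on-update-2.html": 5,
    })
    for path in ("http/tests/appcache/fail-on-update-2.html",
                 "transitions/transition-end-event-transform.html"):
        print(path, "->", likely_culprit(path, history))
```

A single threshold is the simplest possible rule; a real tool would probably also want to weigh how recent the prior flakes were.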
Re: [webkit-dev] Closing the loop on flaky tests (was Re: Flaky test hit list)
On Tue, Oct 19, 2010 at 1:45 PM, Maciej Stachowiak m...@apple.com wrote:
> On Oct 19, 2010, at 1:30 PM, Adam Barth wrote:
>> On Tue, Oct 19, 2010 at 12:41 PM, Alexey Proskuryakov a...@webkit.org wrote:
>>> On 19.10.2010, at 12:33, Adam Barth wrote:
>>>> Maybe the thing to do is CC the author of the flaky test for the one bug comment and then immediately unCC them. That way they don't see the rest of the traffic on the bug.
>>>
>>> That would still be two e-mails about a bug the person otherwise doesn't want to know about. I don't think that CC'ing is the right approach.
>>
>> Do you see changes to bugs when you get removed from the CC? Do you have another suggestion for how to provide feedback to authors of flaky tests?
>
> It looks like the bot is adding a comment to the bug with the patch that was being processed when flakiness was detected, not the one that originally landed the tests believed to be flaky. Is that right?
>
> If so, that doesn't seem like a great way to notify the author of the original test. It seems like it would be better to comment in the bug that added the test.
>
> To be fair, it's also possible that the new patch caused the flakiness, so a separate comment there could be useful. Perhaps it would be useful to determine if the test in question has a track record of flakiness. If not, then maybe the presumption should be that the patch is the problem, not the test. On the other hand, if the test has always been flaky, then the new patch probably has nothing to do with it.

Another option is to file a new bug about the flakiness and ping that bug when we observe the test flake out.

Adam
Re: [webkit-dev] Closing the loop on flaky tests (was Re: Flaky test hit list)
On Tue, Oct 19, 2010 at 1:51 PM, Adam Barth aba...@webkit.org wrote:
> Another option is to file a new bug about the flakiness and ping that bug when we observe the test flake out.

I've considered this before. We'd have to write a bit of bugzilla.py code to make this work though. :) That's probably the best long-term solution. We could then add links to these bugs in our "sorry we're slow, tests are flaky" messages too.

-eric
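The "one bug per flaky test" idea discussed above might look something like the sketch below. It uses an in-memory stand-in for the bug tracker; `FlakyTestTracker` and `report_flake` are hypothetical names for illustration and are not part of bugzilla.py or webkit-patch.

```python
from dataclasses import dataclass, field

# Hypothetical sketch: keep one tracking bug per flaky test and add a
# comment to it each time the test flakes. The tracker here is an
# in-memory stand-in, not the real bugzilla.py API.

@dataclass
class FlakyTestTracker:
    # Maps a test path to the comments on its (simulated) tracking bug.
    bugs: dict = field(default_factory=dict)

    def report_flake(self, test_path, attachment_id):
        """Record a flake; return True if this "filed" a new tracking bug."""
        is_new_bug = test_path not in self.bugs
        comments = self.bugs.setdefault(test_path, [])
        comments.append(f"Flaked while processing attachment {attachment_id}")
        return is_new_bug


if __name__ == "__main__":
    tracker = FlakyTestTracker()
    test = "http/tests/appcache/fail-on-update-2.html"
    print(tracker.report_flake(test, 71284))  # first sighting files a bug
    print(tracker.report_flake(test, 71284))  # later sightings only comment
    print(len(tracker.bugs[test]), "comments on the tracking bug")
```

The point of the design is that people who care about a given test can follow its tracking bug, while the patch author's bug only needs a link to it.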
Re: [webkit-dev] Closing the loop on flaky tests (was Re: Flaky test hit list)
Sorry, wrong account.

On Tue, Oct 19, 2010 at 1:59 PM, Eric Seidel esei...@google.com wrote:
> On Tue, Oct 19, 2010 at 1:45 PM, Maciej Stachowiak m...@apple.com wrote:
>> It looks like the bot is adding a comment to the bug with the patch that was being processed when flakiness was detected, not the one that originally landed the tests believed to be flaky. Is that right?
>
> Correct. The original message was intended as a notice to the person whose patch it was, explaining why their patch was taking so long. (Flaky tests often double, triple or more the total time it takes to commit a patch.)
>
>> If so, that doesn't seem like a great way to notify the author of the original test. It seems like it would be better to comment in the bug that added the test.
>
> Interesting possibility.
>
> What started this discussion is that last night we made the commit-queue CC the original author of the flaky test every time we posted one of these "we're slow to commit your patch because these tests are flaky" messages. 4 flaky tests were hit last night after we added that message, 3 of which were caused by tests authored by Alexey -- hence he had extra mail in his inbox this morning and this discussion ensued. As Adam noted, this was likely a statistical fluke.
>
> Commenting on the original bug is a good idea, assuming the original commit had a bug link.
>
>> To be fair, it's also possible that the new patch caused the flakiness, so a separate comment there could be useful. Perhaps it would be useful to determine if the test in question has a track record of flakiness. If not, then maybe the presumption should be that the patch is the problem, not the test. On the other hand, if the test has always been flaky, then the new patch probably has nothing to do with it.
>
> Definitely possible, but I've not ever seen this happen in practice. Generally either the commit-queue fails due to the new flakiness, or it gets landed and someone later finds and removes it. It would be rare to have the new patch be adding new flakiness and the old test author getting CC'd. Actually in that case, these CC's seem more useful, as the old test author would be made aware of changes causing his/her old test to go flaky.
>
> -eric