Yes, I think that a one-time focused push to triage old "flake" tags might have a long-term impact. We currently have 35 open bugs labeled "flake", with an average age of about 350 days, including a significant number of very old bugs that are almost certainly obsolete.
That does not mean all is well, because I think the solution is often to disable the flaky test to restore signal, so then we need to watch the ignored tests. I have used the "sickbay" tag on Jira, but we could choose a more self-explanatory one. It would also be better to pull the info directly from the code.

Kenn

On Mon, May 10, 2021 at 12:32 PM Ahmet Altay <[email protected]> wrote:

> Any suggestions on how to clean this up? We can organize a cleanup to
> reduce the numbers a bit. Ideally we need to find a way to prevent the
> future growth but a temporary reduction might make it easier for us to keep
> reviewing and closing new issues.
>
> On Mon, May 10, 2021 at 9:11 AM Brian Hulette <[email protected]> wrote:
>
>> In addition to stale flake jiras, I think there are also many tracking
>> tests that were disabled years ago due to flakiness.
>>
>> On Sat, May 8, 2021 at 1:39 PM Kenneth Knowles <[email protected]> wrote:
>>
>>> Oh the second chart is not automatically associated with the
>>> board/filter. Here is the correct link:
>>> https://issues.apache.org/jira/secure/ConfigureReport.jspa?projectOrFilterId=filter-12350547&periodName=daily&daysprevious=300&selectedProjectId=12319527&reportKey=com.atlassian.jira.jira-core-reports-plugin%3Aaverageage-report&atl_token=A5KQ-2QAV-T4JA-FDED_ea6ac783c727523cf6bfed04ba94ce91bb62da91_lin&Next=Next
>>>
>>> On Sat, May 8, 2021 at 1:37 PM Kenneth Knowles <[email protected]> wrote:
>>>
>>>> The second chart is clearly bad and getting worse. Our flake bugs are
>>>> not getting addressed in a timely manner.
>>>>
>>>> Zooming in on the first chart for the last 3 months you can see a
>>>> notable change:
>>>> https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=464&projectKey=BEAM&view=reporting&chart=cumulativeFlowDiagram&swimlane=1174&swimlane=1175&column=2038&column=2039&column=2040&days=90.
>>>> This will not change the average in the second chart very quickly.
>>>>
>>>> It may be just cleanup. That seems likely. Anecdotally, I have done a
>>>> lot of triage recently of failures and I know of only two severe flakes
>>>> (that you can count on seeing in a day/week). If so, then more cleanup
>>>> would be valuable. This is why I ran the second report: I suspected that
>>>> we had a lot of very old stale flake bugs that no one is looking at.
>>>>
>>>> Kenn
>>>>
>>>> On Fri, May 7, 2021 at 4:37 PM Ahmet Altay <[email protected]> wrote:
>>>>
>>>>> Thank you for sharing the charts.
>>>>>
>>>>> I know you are the messenger here, but I disagree with the message
>>>>> that flakes are getting noticeably better. The number of open issues
>>>>> looks quite large but at least stable. I will guess that some of those
>>>>> are stale and seemingly we did a cleanup in July 2020. We can try that
>>>>> again. Second chart shows a bad picture IMO. Issues staying open for
>>>>> 500-600 days on average sounds really long.
>>>>>
>>>>> On Fri, May 7, 2021 at 1:42 PM Kenneth Knowles <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Alright, I think it should be fixed. The underlying saved filter had
>>>>>> not been shared.
>>>>>>
>>>>>> Kenn
>>>>>>
>>>>>> On Fri, May 7, 2021 at 8:02 AM Brian Hulette <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> The first link doesn't work for me, I just see a blank page with
>>>>>>> some jira header and navbar. Do I need some additional permissions?
>>>>>>>
>>>>>>> If I click over to "Kanban Board" on the toggle at the top right I
>>>>>>> see a card with "Error: The requested board cannot be viewed because it
>>>>>>> either does not exist or you do not have permission to view it."
>>>>>>>
>>>>>>> Brian
>>>>>>>
>>>>>>> On Thu, May 6, 2021 at 5:56 PM Kenneth Knowles <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I spoke too soon?
>>>>>>>>
>>>>>>>> https://issues.apache.org/jira/secure/ConfigureReport.jspa?projectOrFilterId=project-12319527&periodName=daily&daysprevious=300&selectedProjectId=12319527&reportKey=com.atlassian.jira.jira-core-reports-plugin%3Aaverageage-report&atl_token=A5KQ-2QAV-T4JA-FDED_ea6ac783c727523cf6bfed04ba94ce91bb62da91_lin&Next=Next
>>>>>>>>
>>>>>>>> On Thu, May 6, 2021 at 5:54 PM Kenneth Knowles <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> I made a quick* Jira chart to see how we are doing at flakes:
>>>>>>>>>
>>>>>>>>> https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=464&projectKey=BEAM&view=reporting&chart=cumulativeFlowDiagram&swimlane=1174&swimlane=1175&column=2038&column=2039&column=2040
>>>>>>>>>
>>>>>>>>> Looking a lot better recently at resolving them! (whether these
>>>>>>>>> are new fixes or just resolving stale bugs, I love it)
>>>>>>>>>
>>>>>>>>> Kenn
>>>>>>>>>
>>>>>>>>> *AFAICT you need to make a saved search, then an agile board based
>>>>>>>>> on the saved search, then you can look at reports
