Yes, I think that a one-time focused push to triage old "flake" tags might
have a long-term impact. We currently have 35 open bugs labeled "flake",
with an average age of about 350 days, and a significant number of them
are very old and almost certainly obsolete.

That does not mean all is well: often the solution is to disable the
flaky test to restore signal, so we then need to keep watching the ignored
tests. I have used the "sickbay" tag on Jira, but we could choose a more
self-explanatory one. It would also be better to pull the info from the
code directly.
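
To make "pull the info from the code" concrete, here is a rough sketch of
what such a scan might look like. Everything in it is an assumption for
illustration, not something that exists in the repo today: the annotation
patterns (JUnit @Ignore, unittest/pytest skip decorators), the BEAM-NNNN key
format in the skip reason, and the helper names are all hypothetical.

```python
import re
from pathlib import Path

# Assumed markers for a disabled test: JUnit's @Ignore and the plain
# unittest / pytest skip decorators. Conditional skips (skipIf etc.) are
# deliberately not matched.
DISABLED_TEST = re.compile(
    r"@Ignore\b|@(?:unittest\.)?skip\b|@pytest\.mark\.skip\b")
# Assumed convention: the skip reason mentions a Jira key such as BEAM-1234.
JIRA_KEY = re.compile(r"\bBEAM-\d+\b")


def find_disabled_tests(lines):
    """Yield (line_number, jira_keys, text) for each line that disables a test."""
    for number, line in enumerate(lines, start=1):
        if DISABLED_TEST.search(line):
            yield number, JIRA_KEY.findall(line), line.strip()


def scan_tree(root, suffixes=(".java", ".py")):
    """Walk a source tree and yield (path, line_number, jira_keys, text)."""
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(encoding="utf-8", errors="replace")
            for number, keys, line in find_disabled_tests(text.splitlines()):
                yield str(path), number, keys, line
```

Disabled tests that come back with an empty key list would be the ones with
no tracking bug at all, which are probably the first to look at.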

Kenn

On Mon, May 10, 2021 at 12:32 PM Ahmet Altay <[email protected]> wrote:

> Any suggestions on how to clean this up? We can organize a cleanup to
> reduce the numbers a bit. Ideally we need to find a way to prevent the
> future growth but a temporary reduction might make it easier for us to keep
> reviewing and closing new issues.
>
> On Mon, May 10, 2021 at 9:11 AM Brian Hulette <[email protected]> wrote:
>
>> In addition to stale flake jiras, I think there are also many tracking
>> tests that were disabled years ago due to flakiness.
>>
>> On Sat, May 8, 2021 at 1:39 PM Kenneth Knowles <[email protected]> wrote:
>>
>>> Oh the second chart is not automatically associated with the
>>> board/filter. Here is the correct link:
>>> https://issues.apache.org/jira/secure/ConfigureReport.jspa?projectOrFilterId=filter-12350547&periodName=daily&daysprevious=300&selectedProjectId=12319527&reportKey=com.atlassian.jira.jira-core-reports-plugin%3Aaverageage-report&atl_token=A5KQ-2QAV-T4JA-FDED_ea6ac783c727523cf6bfed04ba94ce91bb62da91_lin&Next=Next
>>>
>>> On Sat, May 8, 2021 at 1:37 PM Kenneth Knowles <[email protected]> wrote:
>>>
>>>> The second chart is clearly bad and getting worse. Our flake bugs are
>>>> not getting addressed in a timely manner.
>>>>
>>>> Zooming in on the first chart for the last 3 months you can see a
>>>> notable change:
>>>> https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=464&projectKey=BEAM&view=reporting&chart=cumulativeFlowDiagram&swimlane=1174&swimlane=1175&column=2038&column=2039&column=2040&days=90.
>>>> This will not change the average in the second chart very quickly.
>>>>
>>>> It may be just cleanup. That seems likely. Anecdotally, I have done a
>>>> lot of failure triage recently and I know of only two severe flakes
>>>> (ones you can count on seeing in a day/week). If so, then more cleanup
>>>> would be valuable. This is why I ran the second report: I suspected that
>>>> we had a lot of very old stale flake bugs that no one is looking at.
>>>>
>>>> Kenn
>>>>
>>>> On Fri, May 7, 2021 at 4:37 PM Ahmet Altay <[email protected]> wrote:
>>>>
>>>>> Thank you for sharing the charts.
>>>>>
>>>>> I know you are the messenger here, but I disagree with the message
>>>>> that flakes are getting noticeably better. The number of open issues
>>>>> looks quite large but at least stable. I would guess that some of those
>>>>> are stale, and it seems we did a cleanup in July 2020. We can try that
>>>>> again. The second chart shows a bad picture IMO. Issues staying open for
>>>>> 500-600 days on average sounds really long.
>>>>>
>>>>> On Fri, May 7, 2021 at 1:42 PM Kenneth Knowles <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Alright, I think it should be fixed. The underlying saved filter had
>>>>>> not been shared.
>>>>>>
>>>>>> Kenn
>>>>>>
>>>>>> On Fri, May 7, 2021 at 8:02 AM Brian Hulette <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> The first link doesn't work for me; I just see a blank page with
>>>>>>> some Jira header and navbar. Do I need some additional permissions?
>>>>>>>
>>>>>>> If I click over to "Kanban Board" on the toggle at the top right I
>>>>>>> see a card with "Error: The requested board cannot be viewed because it
>>>>>>> either does not exist or you do not have permission to view it."
>>>>>>>
>>>>>>> Brian
>>>>>>>
>>>>>>> On Thu, May 6, 2021 at 5:56 PM Kenneth Knowles <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I spoke too soon?
>>>>>>>>
>>>>>>>>
>>>>>>>> https://issues.apache.org/jira/secure/ConfigureReport.jspa?projectOrFilterId=project-12319527&periodName=daily&daysprevious=300&selectedProjectId=12319527&reportKey=com.atlassian.jira.jira-core-reports-plugin%3Aaverageage-report&atl_token=A5KQ-2QAV-T4JA-FDED_ea6ac783c727523cf6bfed04ba94ce91bb62da91_lin&Next=Next
>>>>>>>>
>>>>>>>> On Thu, May 6, 2021 at 5:54 PM Kenneth Knowles <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> I made a quick* Jira chart to see how we are doing at flakes:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=464&projectKey=BEAM&view=reporting&chart=cumulativeFlowDiagram&swimlane=1174&swimlane=1175&column=2038&column=2039&column=2040
>>>>>>>>>
>>>>>>>>> Looking a lot better recently at resolving them! (whether these
>>>>>>>>> are new fixes or just resolving stale bugs, I love it)
>>>>>>>>>
>>>>>>>>> Kenn
>>>>>>>>>
>>>>>>>>> *AFAICT you need to make a saved search, then an agile board based
>>>>>>>>> on the saved search, then you can look at reports
>>>>>>>>>
>>>>>>>>
