Hi all,

We have a growing backlog of P1 issues and flaky-test issues. The emails are not very actionable.
If something is "P1" but is not addressed, it is not really P1. This is why we have a bot that automatically moves things from P2 to P3.

I think a test that is currently failing is P0, because it blocks the community in real time. It is an outage, and we should do something about it within minutes, or hours at the most. We have a policy that a flaky test is P1, because it causes enough trouble that we should do something about it within hours or days if we can (not weeks or months).

Usually, we just disable failing or flaky tests. Some of them get fixed, but usually the P1 flaky-test bug is left open.

Should these bugs be P2? Should we remove the "flake" tag when the test is disabled and rely on the "sickbay" tag? (We could also rename it to "disabled-test" for people who do not use Star Trek slang.)

What do you think? I am tired of seeing a large and growing list of P1s. I don't want to just hide the problem, but I want to focus on the important problems that we have a chance of solving.

Kenn
