*TL;DR: After a long triage session yesterday, I have a kind request for maintainers: it would be great if you could look at the "ready for maintainer review" PRs in your areas.*

PRs marked "ready for maintainer review" have passed initial triage and require your attention. The tool's main focus now is to move any pull requests that simply fail initial validation out of your view (Draft and Close). Also, Kaxil ran CodePilot reviews on many of those "ready" ones (and some others) yesterday, and we want to see if that is helpful for reviews as well.

I have been busy the last few weeks and it took longer than expected, but yesterday I finally completed a four-week loop of trying it and triaged all (!) 500+ open PRs in about 4 hours. During this triage, the number of open PRs decreased from 540 to 492. The average triage time was about 1 minute per PR during a focused session. Many PRs were skipped automatically because they do not need triage: triage is only for PRs from non-collaborators and for already triaged PRs that might need some action (like Draft / Not responded / Closed).

More details and instructions on how to provide feedback follow for interested parties.

------

*# Feedback*

I would also love to hear specific feedback from maintainers, reviewers, and contributors. I created the #auto-triage-feedback channel <https://apache-airflow.slack.com/archives/C0AQNS4DV2A> for that. I do not promise to engage with all feedback if there is a lot of it: the tools are in early stages, so there might be many issues and areas for improvement. At this stage let's gather the feedback and try to refine the tool so that more people can use it regularly, and possibly we can automate it further. Or maybe we will even learn that it does not help at all and only gets in the way.

*# Findings so far*

Some current findings (see the stats below):

* About 80% of our PRs currently come from external contributors (i.e. non-committers, non-collaborators) - that's a lot of work for maintainers.
* About 40% of the PRs marked as "done" are already merged (which is good), and most of those received responses and incorporated the triage comments, which is cool.
* About 60% of them were closed without being merged - some immediately, but mostly following this path: Draft -> Triage -> No response (more than 2 weeks) -> Close. This means those are really drive-by contributors.
* We have 127 PRs that seem ready for the "ready for maintainer review" label. It would be great if, in your reviews of contributor PRs in "your" areas, you focused on those.

*# Current status of the tool*

I have not asked others to participate much yet, but if anyone wants to try it, feel free to start using it with `breeze pr auto-triage --reviews-for-me`. This will only select PRs where CODEOWNERS automatically sets you as a reviewer or where you are mentioned.

For that, https://github.com/apache/airflow/pull/64669 will have to be merged first. That PR includes cumulative learning from about 20 smaller triage sessions I've done in the past weeks. Yes, a lot of changes and tweaks have accumulated - sorry for such a huge PR. This will likely finally stabilise, and I will then refactor the algorithms and split them into smaller pieces, so that we can proceed with more incremental updates. I also have a few things to add after yesterday's longer session, specifically cleaning up the algorithmic choices to better determine default actions. Next week I am at the PyCon LT conference, but I will focus mostly on incremental triaging and tweaking.

The tool is not perfect yet and requires careful choices, especially since we still have many flakes. I had to do more manual assessment than I would like - I hope we can stabilise them after the 3.0.0 release, which should make the tool more useful and, I hope, ready for others to participate. I am also going to look at the responses - I guess in some cases the triage was "unfair", and I am trying to optimise it. It's still far from full automation; it requires close human supervision (as expected at this stage). I am iterating **fast** on it, learning with every triage run while doing other things as well.
I will try to make it really simple to follow. We have a TUI mode that is good for testing and debugging (and possibly later for a focused review mode, which we already have but which is not as useful yet), but I found the CLI mode far more useful overall. The TUI is far too much of a distraction, but it might be cool if you want to focus on smaller groups of PRs to review, and later it can help with reviews. Andre Ahlert is already contributing some nice improvements there.

*# Stats*

I've also built the `pr stats` command: https://github.com/apache/airflow/pull/64667 - happy to receive reviews. The stats command still needs some tweaking and improvement, which will follow. I have also built stats that track the current status of triaged contributor PRs.

*## Triaged "final" state:*

In short, 40 out of 102 have already been merged after responding to triage, and 62 have been closed without merging (no response to triage).

https://ibb.co/Lz2pj259 - image was too large to attach

*## Current open PRs status*

* As of yesterday we had 492 open PRs
* 400 of those are contributor PRs
* 126 of those are "ready for maintainer review"
* 200 of those are already drafted and triaged, waiting for the contributor's response (128), or they are simply unfinished drafts.

These stats will change daily, and some things may still be missing there; I will track those and add them over the coming days (after Easter).

https://ibb.co/NHzdrQx - image was too large to attach

Also, if you have general feedback or comments on any of this, feel free to share them. I will pick it up after Easter - and for those who celebrate Easter, have a happy, AI-free, family-focused one. I certainly plan it this way.

J.
