An update on Apache Airflow's most recent attempt to improve our issue handling using some "new-ish" features available in GitHub: Issue Forms and GitHub Discussions.

TL;DR: We do not have results yet, but in Airflow we think that making full use of GitHub Issue Form templates and endorsing GitHub Discussions where appropriate (right at issue entry time) should help us a lot by "gating" issues better, improving their quality, and getting our users to "pre-classify" their issues (or indeed to use Discussions where that is the better choice).

*Problems we observed:*

* With the standard Markdown templates we had many issues where people did not provide useful information (version of Airflow, operating system, etc.).

* We had a number of issues where users would simply delete the Markdown template content straight away and replace it with their own issue description, without "reproducible steps", or would really just ask a question about their deployment problems without even attempting to investigate them. We ended up with many "discussion"-type questions posted as issues. We would then "convert" such issues into discussions, but that required a maintainer's comment and explanation. Mostly this happened because the users did not even know they could (and should) open a discussion instead.

* The Markdown templates were difficult to read and fill in. It was not clear what you should do with the parts which were relevant. We left instructions in the comments, which were sometimes kept and sometimes deleted. In general the issues were "structur-ish" rather than "structured".

* People often opened Airflow 1.10 issues even though it reached End-Of-Life in June (they should open discussions instead, which is still great: even though there is no way we will handle those issues, either we or other users can still help them with a workaround or even directing t)

*Discovery*

Then we discovered that we can already use "Forms for GitHub Issues" (a beta feature, but it is available to everyone now: https://docs.github.com/en/communities/using-templates-to-encourage-useful-issues-and-pull-requests/syntax-for-issue-forms).
This looked like not only a great way for users to structure and pre-classify their issues, but also a great tool for communicating with them. We believe it will allow us to address most of the problems above. Basically, instead of committers having to explain the rules over and over and instruct users on how to report issues, it is all explained clearly and nicely while the user enters the issue.

*Solution*

* We defined forms with required fields that cannot be "skipped". Also, when you do not fill in a "textarea" entry, it is marked as "Not Provided" rather than deleted.

* We added helpful comments and hints, as well as an explanation of the cases in which you should use GitHub Discussions instead of issues, with a direct link (for example, you cannot select a 1.10 version, and there is a link to open a GitHub Discussion instead).

* We added a logo and a welcoming message to "soften" the more formalized form entry.

* We made it clear that if there are no reproduction steps, users should open a GitHub Discussion instead.

* We added more issue types: we separated "Airflow Core", "Airflow Providers" (we have more than 70 providers that extend Airflow's capabilities) and "Airflow Docs" issues, automatically applying the correct labels so that we do not have to do that triage ourselves.

* We also created a "maintainer only" issue type which allows entering pretty much free-form information (for tasks, todos, etc.), and we added a required "checkbox" to confirm you are a maintainer, to discourage people from using it to raise their "free-form" questions.

We wanted it to be "easy" for committers to enter such a "free-form" issue but "not easy" for users to skip structured information, while at the same time guiding them to use "Discussions", where it is much "easier" to enter any content and ask questions.

What is more, this structured form will allow us to automate some things if we find it is needed.
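
To make the automation idea a bit more concrete, here is a minimal sketch in Python of what such a bot could build on. It is only an illustration under stated assumptions, not an existing Airflow bot. It assumes that a submitted form arrives as an issue body with one "### <field label>" heading per form field, and that an unfilled field shows up as a placeholder text. The field labels ("Apache Airflow version", "How to reproduce"), the placeholder texts and the function names are all hypothetical.

```python
import re
from typing import Dict, List, Optional

# Texts treated as "the user left this field empty". "_No response_" is the
# placeholder GitHub is assumed to render for an empty field; "Not Provided"
# is the wording used in this mail. Adjust to whatever the forms produce.
EMPTY_MARKERS = {"", "_no response_", "not provided"}


def parse_form_body(body: str) -> Dict[str, str]:
    """Split a rendered issue-form body into {field label: answer}.

    Assumes every field starts with a '### <label>' heading. An answer that
    itself contains a '### ' heading would be split as well.
    """
    fields: Dict[str, str] = {}
    label: Optional[str] = None
    lines: List[str] = []
    for line in body.splitlines():
        heading = re.match(r"^###\s+(.*\S)\s*$", line)
        if heading:
            if label is not None:
                fields[label] = "\n".join(lines).strip()
            label, lines = heading.group(1), []
        elif label is not None:
            lines.append(line)
    if label is not None:
        fields[label] = "\n".join(lines).strip()
    return fields


def is_missing(fields: Dict[str, str], label: str) -> bool:
    """True when the field is absent or only contains a placeholder."""
    return fields.get(label, "").strip().lower() in EMPTY_MARKERS


def suggest_action(body: str, required_label: str = "How to reproduce") -> str:
    """Suggest what a triage bot could do with a freshly opened issue."""
    fields = parse_form_body(body)
    if is_missing(fields, required_label):
        return "convert-to-discussion"
    return "keep-as-issue"
```

The sketch only returns a suggestion. Actually converting or closing the issue would go through the GitHub API and is left out here on purpose.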
For example, if someone submits an entry without providing "reproduction steps", we can write a bot to automatically convert such an issue into a discussion, or to automatically close an issue if someone who is not a maintainer opens a "free-form" one.

*Demo/Take a look yourselves*

You can try it yourself here: https://github.com/apache/airflow/issues/new/choose

And here are some screenshots of how our issue form entries look:

- https://ibb.co/Lxgx7xn
- https://ibb.co/0rYqm1Q
- https://ibb.co/fX45nBB
- https://ibb.co/dbTR9GV

The PR introducing it is here: https://github.com/apache/airflow/pull/17855

We really hope it will drive the issue quality up and, eventually, the issue number down.

J.

On Thu, Jul 15, 2021 at 5:17 PM Jarek Potiuk <ja...@potiuk.com> wrote:

> Good luck :).
>
> On Thu, Jul 15, 2021 at 3:53 PM Erik Ritter <erik...@apache.org> wrote:
>
>> Thank you all for the great advice and thoughts! I see a lot of alignment between items suggested here and things we've tried or thought about.
>>
>> Being proactive about closing issues against old versions of Superset is definitely something we want to do, but haven't gone through and done a mass issue closing yet. That could certainly help cull the issue count.
>>
>> We have set up some bots on github to close stale issues, but their timeline is probably too long (I think we wait 3 or even 6 months before closing a stale issue). Setting a more aggressive timeline for closing feels like a reasonable approach.
>>
>> A lot of the issues we see seem to be legitimate bugs/issues with the product, but Superset has such a large surface area (supporting dozens of database query engines) that many committers don't have the time or ability to replicate someone's environment to truly repro an issue. I'd imagine making it easier for new contributors to set up a dev environment and test with their own data could help solve this issue.
>> Our new developer onboarding docs/workflow is somewhat lacking, but enabling more community members to resolve issues could provide the incentives to improve it.
>>
>> Finally, I do think we have an issue with committers/PMC members paying attention to issues filed, probably because Github email spam is all too common, and we pretty much all have jobs that touch more than only Superset. As mentioned, assigning work isn't really possible for an open source Apache project, but I would hope that we could improve the triaging rate of issues at least. Perhaps gamifying in some way, or creating a volunteer group to do a bi-weekly or monthly triage session could help us start to get some control over the issues filed.
>>
>> Thanks again for all the great advice, and I'm glad to know it's not only us having these issues. We've got some plans in the works to improve this state, and hopefully in a few months I can follow up on this thread with a success story!
>>
>> Erik
>>
>> On 2021/07/09 12:01:06, Paul Angus <p...@angus.uk.com.INVALID> wrote:
>> > Hi Erik,
>> >
>> > My 2c - you (the project) may have to 'take a view' on some of the issues if they date back to older versions.
>> >
>> > In my experience fixing bugs obviously requires replicating the problem, fixing it and confirming that it's fixed, but if the issue was reported against an old version, then someone needs to replicate it in the old version (to be sure it's not PICNIC - Problem In Chair Not In Computer) then replicate it in the current version to check that it's still a bug, and _then_ start fixing it...
>> >
>> > Volunteers are unlikely to want to take on all that effort, so it does mean an ever increasing number of old bugs (as Jarek says, only stakeholders tend to allocate resources to that kind of thing).
>> >
>> > So I would start by seeing if you have a lot of really old bug reports and consider closing them as a matter of course.
>> > Huge bug lists also tend to put people off, as they don't know where to start and don't feel that they can even make a dent in the pile.
>> >
>> > Kind Regards
>> >
>> >
>> > Paul Angus
>> >
>> > -----Original Message-----
>> > From: Jarek Potiuk <ja...@potiuk.com>
>> > Sent: Friday, July 9, 2021 12:51 PM
>> > To: dev <dev@community.apache.org>
>> > Subject: Re: Issue Management in Apache Projects
>> >
>> > We are struggling with it as well in Apache Airflow.
>> >
>> > I can write about some of the things we actively do to try to bring it down (and we can see how it will work after some time).
>> > We have not succeeded yet (we also have ~800 issues opened) but we for example have ~130 opened PR and we used to have > 200 of them so we see the sign of improvement.
>> >
>> > * we triage and respond to the issues pretty quickly and "aggressively". I.e. when there is not enough information or the issue is very likely to be caused by an external factor, we close the issue explaining what's missing, what the author should do, what information should be provided and add info that it will re-open as soon as more information is provided. I found closing issues in this case works much better for motivation of the user to add more information (or save the hassle of maintaining status and closing the issue later).
>> >
>> > * we have automated stale-bot that closes inactive issues and PRs after (30 day inactivity = notice, + 7 day = closing)
>> >
>> > * when the user raises the issue which is a question, we actively redirect the user to "Discussions" rather than issue and .... close the issue :). We found "GitHub Discussions" pretty useful and active, and more and more users are opening discussions rather than issues. This keeps the "issues" down to some "real" issues.
>> >
>> > * we have a triage team that virtually meets from time to time and actively reviews, classifies the issues (adds labels) but also runs some stats on which areas are "under-staffed". They meet semi-regularly and discuss and send some summaries.
>> >
>> > * we continuously encourage new users to contribute and add more committers especially in the areas that are "under-staffed" (recently the UI committers "team" and "Kubernetes" team have greatly increased in capacity, and it immediately improved the situation there)
>> >
>> > * what helps there is that some of those committers are full-time employed or part-time paid as freelancers by important stakeholders in the project (Astronomer, Google). Also those stakeholders are fully aware of the value it brings, so they gladly pay the committers for their community effort, even if it is not directly responding to their needs (disclaimer - I am one of those freelancers that is part-time paid by the stakeholders)
>> >
>> > * the rule we have is that we do not need issues at all. People are encouraged (in the docs and workshops) to open directly PRs rather than issues
>> >
>> > * we added "Are you willing to submit PR?" question in the issue template. When the issue is relatively simple and the user says "yes" we assign the user to it. When the answer is missing - we actively ask the user if there is a will to submit the PR. More often than not, the users are willing to when encouraged (at least initially).
>> >
>> > * we mark the issues that are simple as "good-first-issue" which then lands in http://github.com/apache/airflow/contribute . More often than not we have people commenting "Hey I want to implement this, can you assign me?" which we do pretty immediately when they ask. That often works and we have new contributors :).
>> >
>> > * we have a "really quick to start" development environment for Airflow (called Breeze) that we continuously improve and try to make easier to start contributing.
>> >
>> > Last but not least. We put a lot of effort into training, guiding and encouraging new contributors to contribute to Airflow:
>> >
>> > * we run semi-regular workshops for new contributors - we **just** started Airflow Summit 2021 yesterday and for example today we have the "first time contributor's workshop" https://airflowsummit.org/sessions/2021/workshop-contributing-apache-airflow/ - 3 hours hands-on when we teach the new contributors how to contribute. This is I think 5th or 6th time we do it (we have a few physical events and over last 1.5 year we had I think 4 online ones). This time we have 20 people who signed up - from literally all over the world (and BTW. all proceeds from that cheap 50 USD workshop go to Apache Software Foundation as donation).
>> >
>> > * yesterday was a "community" day at the Summit where we had three talks encouraging people to contribute:
>> >
>> > https://airflowsummit.org/sessions/2021/contributing-journey-becoming-leading-contributor/ - the road of Kaxil, the PMC of Airflow through committership
>> > https://airflowsummit.org/sessions/2021/contributing-first-steps/ - the first steps by a fresh contributor to Airflow who shared his experiences
>> > https://airflowsummit.org/sessions/2021/dont-have-to-wait/ - "You don't have to wait for someone to fix it for you" - the talk from one of the committers to Airflow, Leah and her co-worker Rachel
>> >
>> > And we have quite few more talks for those who want to start contributing to Airflow:
>> >
>> > https://airflowsummit.org/sessions/2021/guide-airflow-architecture/ - The newcomer's guide to Airflow Architecture
>> >
>> > And finally, there are things we plan based on some upcoming features in GitHub:
>> >
>> > * we are eyeing very
>> > closely the new GitHub Issues introduced recently: https://github.blog/2021-06-23-introducing-new-github-issues/ . They seem to be much more developer-friendly and automation-friendly and they might help with better organizing/handling the issues. I am working with Github Issues Product Manager (we are going to have a meeting about it next week) to enable the new GitHub Issues for the whole Apache Software Foundation (I agreed that with Infra) and I hope very soon we will get it for all ASF projects (as an option to use)
>> >
>> > * we are waiting for Codespaces General Availability and our development environment is prepared to be used there out-of-the-box. This will make even easier path for new contributors to start contributing their code straight from the GitHub UI. https://github.com/features/codespaces.
>> >
>> > Sorry for such a long mail - this is basically a summary of ~ year of discussing and acting in this area.
>> >
>> > I hope some of those might be helpful :)
>> >
>> > J.
>> >
>> >
>> > On Thu, Jul 8, 2021 at 11:33 PM Christopher <ctubb...@apache.org> wrote:
>> >
>> > > Hi Erik,
>> > >
>> > > Do you have a good understanding of *why* there are more issues being opened than being closed? If so, that might hint at some possible solutions.
>> > >
>> > > For example, if you just don't have enough people to write code, then the PMC could focus on inviting new committers to try to grow the community, or mentoring new developers.
>> > >
>> > > If, on the other hand, the quality of the issues is poor, such that they aren't very actionable, you could ask for more information from the reporter, and add a label that shows its status, such as "waiting on reporter". If no response is given in a reasonable time, you can close old issues.
>> > > You can also try to address issue quality using GitHub issue templates:
>> > >
>> > > https://docs.github.com/en/communities/using-templates-to-encourage-useful-issues-and-pull-requests/configuring-issue-templates-for-your-repository
>> > >
>> > > You could also set up something to auto-close very old issues that haven't been updated in a long time, under the premise that they are probably not relevant anymore. If they are, they can always be re-opened.
>> > >
>> > > You can also use GitHub "projects" (which I see you're already using: https://github.com/apache/superset/projects) to help organize related tasks, so they can be closed when the overall project is done.
>> > >
>> > > If the problem is that your committers aren't paying attention to open issues, you can try to ping your community's dev@ list to remind people of how many issues are outstanding, as a way of encouraging people to help triage, close, and bring down the number. You could try to find other ways to "gamify" the count, too. But, ultimately, it comes down to volunteer effort.
>> > >
>> > > If the problem is that your committers are having trouble tracking the activity on GitHub, you can double check your mailing list configuration to ensure activity gets copied to a notifications@ or issues@ list that your committers can track (you can also configure them to go to dev@, but that tends to get spammy and redundant, especially for your committers who are happy seeing the notification dots and/or emails directly from GitHub).
>> > >
>> > > Ultimately, you'll need to figure out why the situation is the way it is, and address it accordingly.
>> > > You won't be able to force volunteer community members to participate to bring the number down, but perhaps there's ways to encourage them, depending on why it's happening in the first place.
>> > >
>> > > On Thu, Jul 8, 2021 at 5:04 PM Erik Ritter <erik.t.rit...@gmail.com> wrote:
>> > > >
>> > > > Hi all,
>> > > >
>> > > > I'm a PMC member for Apache Superset, and we've recently been struggling with the number of issues reported in our Github repo. We're currently at > 800 open issues, and are having trouble keeping up with responding and addressing all the user issues and feedback. We were curious if any other Apache projects had a way of managing Github issues that works for them. We were considering setting up a bot that assigns new issues to a random committer/PMC member, but are open to other ideas too. Thanks for your help and advice!
>> > > >
>> > > > Best,
>> > > > Erik Ritter
>> > >
>> > > ---------------------------------------------------------------------
>> > > To unsubscribe, e-mail: dev-unsubscr...@community.apache.org
>> > > For additional commands, e-mail: dev-h...@community.apache.org
>> > >
>> > >
>> >
>> > --
>> > +48 660 796 129
>> >
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@community.apache.org
>> For additional commands, e-mail: dev-h...@community.apache.org
>>
>>
>
> --
> +48 660 796 129
>

--
+48 660 796 129