Given that reaching consensus in both communities seems like a harder task than just deciding our policy. in the Beam side Why don't we just go ahead and vote around this + build the tool, and if the Flink guys are interested they can take it, no?
in the future we can share that code. On Wed, May 16, 2018 at 3:31 PM Piotr Nowojski <[email protected]> wrote: > The question is what would such tool offer on top of over a Github’s view of PR sorted by “least recently updated”: https://github.com/apache/flink/pulls?q=is%3Apr+is%3Aopen+sort%3Aupdated-asc <https://github.com/apache/flink/pulls?q=is:pr+is:open+sort:updated-asc> > ? Maybe it would be good enough to have a policy about waiting x months/weeks for a contributor to respond. If he doesn’t, we either take over PR we or close it. Having “clean” view of oldest PRs would be beneficial for reviewing PRs in ~FIFO order as part of community work. > Having > > On 16 May 2018, at 10:57, Fabian Hueske <[email protected]> wrote: > > > > Hi, > > > > I'm not objecting closing stale PRs. > > We have quite a few PRs with very little chance of being merged and I would > > certainly appreciate cleaning up those. > > However, I think we should not automate closing PRs for the reasons I gave > > before. > > > > A tool that reminds us of state PRs as proposed by Till seems like a good > > idea and might actually help to re-adjust priorities from time to time. > > > > @Yazdan: Right now there is no assignment happening. Committers decide > > which PRs they want to review and merge. > > > > Best, Fabian > > > > 2018-05-16 4:26 GMT+02:00 Yaz Sh <[email protected]>: > > > >> I have questions in this regard (you guys might have addressed it in this > >> email chain): > >> how PRs get assigned to a reviewer apart of a contributor tag someone? > >> what if PR never gets a reviewer attention and it became in-active due to > >> long review respond? should Bot assign a reviewer to a PR based on > >> reviewers interest (i.e., defined via tags) and notify he/she if PR is > >> waiting for review? > >> > >> > >> Cheers, > >> /Yazdan > >> > >> > >>> On May 15, 2018, at 12:06 PM, Thomas Weise <[email protected]> wrote: > >>> > >>> I like Till's proposal to notify the participants on the PR to PTAL. But > >> I > >>> would also suggest to auto-close when no action is taken, with a friendly > >>> reminder that PRs can be reopened anytime. > >>> > >>> The current situation with 350 open PRs may send a signal to contributors > >>> that it may actually be too much hassle to get a change committed in > >> Flink. > >>> Since the count keeps on rising, this is also not a past problem. Pruning > >>> inactive PRs may help, that will also give a more accurate picture of the > >>> lack of review bandwidth, if that is the root cause. > >>> > >>> Thanks, > >>> Thomas > >>> > >>> > >>> > >>> > >>> > >>> On Tue, May 15, 2018 at 8:24 AM, Ted Yu <[email protected]> wrote: > >>> > >>>> How does the bot decide whether the PR is waiting for reviews or is > >> being > >>>> abandoned by contributor ? > >>>> > >>>> How about letting the bot count the number of times contributor pings > >>>> committer(s) for review ? > >>>> When unanswered ping count crosses some threshold, say 3, the bot > >> publishes > >>>> the JIRA and PR somewhere. > >>>> > >>>> Cheers > >>>> > >>>> On Tue, May 15, 2018 at 8:19 AM, Till Rohrmann <[email protected]> > >>>> wrote: > >>>> > >>>>> I'm a bit torn here because I can see the pros and cons for both sides. > >>>>> > >>>>> Maybe a compromise could be to not have a closing but a monitoring bot > >>>>> which notifies us about inactive PRs. This could then trigger an > >>>>> investigation of the underlying problem and ultimately lead to a > >>>> conscious > >>>>> decision to close or keep the PR open. As such it is not strictly > >>>> necessary > >>>>> to have such a bot but it would at least remind us to make a decision > >>>> about > >>>>> older PRs with no activity. > >>>>> > >>>>> Cheers, > >>>>> Till > >>>>> > >>>>> On Tue, May 15, 2018 at 3:26 PM, Chesnay Schepler < [email protected]> > >>>>> wrote: > >>>>> > >>>>>> /So far I did it twice for older PRs. In both cases I didn’t get any > >>>>>> response and I even forgot in which PRs I had asked this question, so > >>>>> now I > >>>>>> can not even close them :S/ > >>>>>> > >>>>>> To be honest this sounds more like an issue with how your organize > >> your > >>>>>> work. No amount of closing PRs can fix that. > >>>>>> With GitBox we can assign reviewers to a PR, but I'm not sure whether > >>>> it > >>>>>> only allows committers to assign people. > >>>>>> Bookmarks or text files might help as well./ > >>>>>> / > >>>>>> > >>>>>> /Regarding only 30% blocked on contributor. I wonder what would be > >> this > >>>>>> number if we tried to ask in the rest of old PRs “Hey, are you still > >>>>>> interested in reviewing/merging this?”. If old PR is waiting for a > >>>>>> reviewer for X months, it doesn’t mean that’s not abandoned. Even if > >> it > >>>>>> doesn’t, reducing our overhead by those 30% of the PRs is something./ > >>>>>> > >>>>>> No doubt the number would be higher if we were to go back, but as i > >>>>>> explained earlier that is not a reason to close it. If a PR is > >>>> abandoned > >>>>>> because we messed up we should still /try /to get it in. > >>>>>> > >>>>>> /This is kind of whole point of what I was proposing. If the PR author > >>>> is > >>>>>> still there, and can respond/bump/interrupt the closing timeout, > >> that’s > >>>>>> great. Gives us even more sense of urgency to review it./ > >>>>>> > >>>>>> Unfortunately knowing that it is more urgent is irrelevant, as we > >>>>>> currently don't have the manpower to review them. Reviving them now > >>>> would > >>>>>> serve no purpose. The alternative is to wait until more people become > >>>>>> active reviewers. > >>>>>> > >>>>>> /As a last resort, closing PR after timeout is not the end of the > >>>> world. > >>>>>> It always can be reopened./ > >>>>>> > >>>>>> Let's be realistic here, it will not be reopened. > >>>>>> > >>>>>> > >>>>>> On 15.05.2018 14:21, Piotr Nowojski wrote: > >>>>>> > >>>>>>> I agree that we have other, even more important, problems with > >>>> reviewing > >>>>>>> PR and community, but that shouldn’t block us from trying to clean up > >>>>>>> things a little bit and minimise the effort needed for reviewing PRs. > >>>>> Now > >>>>>>> before reviewing/picking older PRs I had to ask this “Hey, are you > >>>> still > >>>>>>> interested in merging this?” manually and wait for the response. If > >> it > >>>>>>> doesn’t come, I have to remember to go back and close PR, which I of > >>>>> course > >>>>>>> forget to do. Bah, so far I did it twice for older PRs. In both cases > >>>> I > >>>>>>> didn’t get any response and I even forgot in which PRs I had asked > >>>> this > >>>>>>> question, so now I can not even close them :S Wasted effort and > >> wasted > >>>>> time > >>>>>>> on context switching for me and for everyone else that will have to > >>>>> scroll > >>>>>>> pass or look on those PR to check their status. > >>>>>>> > >>>>>>> Keep in mind that I am not proposing to close the PR automatically > >>>>>>> straight on after 3 months of inactivity. Only after asking a > >> question > >>>>>>> whether original contributor is still there and he is interested in > >>>> the > >>>>> PR > >>>>>>> to be reviewed. > >>>>>>> > >>>>>>> for Flink 1.5, I merged a contribution from PR #1990 after it was > >>>>>>>> requested a few times by users. > >>>>>>>> > >>>>>>> This is kind of whole point of what I was proposing. If the PR author > >>>> is > >>>>>>> still there, and can respond/bump/interrupt the closing timeout, > >>>> that’s > >>>>>>> great. Gives us even more sense of urgency to review it. On the other > >>>>> hand > >>>>>>> if there is no response (no interest from the author for whatever the > >>>>>>> reasons) and nobody so far has picked this PR to review/merge, what’s > >>>>> the > >>>>>>> point of keeping such PR open? As a last resort, closing PR after > >>>>> timeout > >>>>>>> is not the end of the world. It always can be reopened. > >>>>>>> > >>>>>>> Regarding only 30% blocked on contributor. I wonder what would be > >> this > >>>>>>> number if we tried to ask in the rest of old PRs “Hey, are you still > >>>>>>> interested in reviewing/merging this?”. If old PR is waiting for a > >>>>> reviewer > >>>>>>> for X months, it doesn’t mean that’s not abandoned. Even if it > >>>> doesn’t, > >>>>>>> reducing our overhead by those 30% of the PRs is something. > >>>>>>> > >>>>>>> Piotrek > >>>>>>> > >>>>>>> On 15 May 2018, at 10:10, Fabian Hueske <[email protected]> wrote: > >>>>>>>> > >>>>>>>> I'm with Chesnay on this issue. > >>>>>>>> Stale PRs, i.e., a PR where a contributor becomes inactive, are one > >>>> of > >>>>>>>> our > >>>>>>>> smallest issues, IMO. > >>>>>>>> > >>>>>>>> There are more reasons for the high number of PRs. > >>>>>>>> * Lack of timely reviews. > >>>>>>>> * Not eagerly closing PRs that have no or very little chance of > >> being > >>>>>>>> merged. Common reasons are > >>>>>>>> 1) lack of interest in the feature by committers, > >>>>>>>> 2) too extensive changes and hence time consuming reviews, or > >>>>>>>> 3) bad quality. > >>>>>>>> > >>>>>>>> For 1), there are lots of older JIRA issues, that have low priority > >>>> but > >>>>>>>> which are picked up by contributors. In the contribution guidelines > >>>> we > >>>>>>>> ask > >>>>>>>> contributors to let us know when they want to work on an issue. So > >>>> far > >>>>>>>> our > >>>>>>>> attitude has been, yes sure go ahead. I've seen very little attempts > >>>> of > >>>>>>>> warning somebody to work on issues that won't be easily merged. > >>>>>>>> Once a PR has been opened, we should also be honest and let > >>>>> contributors > >>>>>>>> know that it has no chance or might take a while to get reviewed. > >>>>>>>> For 2) this is typically not so much of an issue if the feature is > >>>>>>>> interesting. However, if 1) and 2) meet, chances to get a change in > >>>>> drop > >>>>>>>> even more. > >>>>>>>> > >>>>>>>> A common "strategy" to deal with PRs that fall into 1), 2), or 3) is > >>>> to > >>>>>>>> not > >>>>>>>> look at them or giving a shallow review. > >>>>>>>> Of course, contributors become unresponsive if we don't look at > >> their > >>>>> PRs > >>>>>>>> for weeks or months. But that's not their fault. > >>>>>>>> Instead, I think we should be honest and communicate the chances of > >> a > >>>>> PR > >>>>>>>> more clearly. > >>>>>>>> > >>>>>>>> Browsing over the list of open PRs, I feel that most older PRs fall > >>>>> into > >>>>>>>> the category of low-priority (or even unwanted) features. > >>>>>>>> Bug fixes or features that the committers care about usually make it > >>>>> into > >>>>>>>> the code base. > >>>>>>>> In case a contributor becomes inactive, committers often take over > >> an > >>>>>>>> push > >>>>>>>> a contribution over the line. > >>>>>>>> > >>>>>>>> So, I do not support an auto-close mechanism. > >>>>>>>> I think a better way to address the issue is better communication, > >>>> more > >>>>>>>> eagerly closing PRs with no chance, and putting more reviewing > >>>> effort. > >>>>>>>> IMO, we should only eagerly close PRs that have no chance of being > >>>>>>>> merged. > >>>>>>>> PRs with low-prio features might be picked up later (for Flink 1.5, > >> I > >>>>>>>> merged a contribution from PR #1990 after it was requested a few > >>>> times > >>>>> by > >>>>>>>> users). > >>>>>>>> > >>>>>>>> However, I think we could do a pass over the oldest PRs and check if > >>>> we > >>>>>>>> can > >>>>>>>> close a bunch. > >>>>>>>> There are quite a few contributions (many for flink-ml) that I don't > >>>>> see > >>>>>>>> a > >>>>>>>> chance for getting merged. > >>>>>>>> > >>>>>>>> Best, Fabian > >>>>>>>> > >>>>>>>> > >>>>>>>> - > >>>>>>>> > >>>>>>>> 2018-05-15 9:13 GMT+02:00 Chesnay Schepler <[email protected]>: > >>>>>>>> > >>>>>>>> -1 > >>>>>>>>> > >>>>>>>>> For clarification (since the original mail indicates otherwise), > >> the > >>>>>>>>> number of pull requests that this would affect is fairly small. > >>>>>>>>> Only about 25-30% of all open PRs are blocked on the contributor, > >>>> the > >>>>>>>>> remaining ones are actually blocked on the review. > >>>>>>>>> Thus is reject the premise that one has to search through that many > >>>>> PRs > >>>>>>>>> to > >>>>>>>>> find something to review, there is plenty. > >>>>>>>>> > >>>>>>>>> I believe it to be highly unfair for us to close PRs due to > >>>>> inactivity, > >>>>>>>>> when the primary cause has been /our /own inactivity. > >>>>>>>>> If a PR is opened and the first comment comes in 3 months late, > >>>> then I > >>>>>>>>> don't blame the contributor for not responding. > >>>>>>>>> IMO we owe it to the contributor to evaluate their PR, and if > >>>>> necessary > >>>>>>>>> bring it to a merge-able state (to a certain extend). > >>>>>>>>> > >>>>>>>>> There's also the fact that closing these PRs outright would waste a > >>>>> lot > >>>>>>>>> of > >>>>>>>>> good contributions. > >>>>>>>>> > >>>>>>>>> Finally, this solution is overkill for what we want to achieve. If > >>>> we > >>>>>>>>> want > >>>>>>>>> to make it easier to find PRs to review all we need is > >>>>>>>>> GitBox integration and tagging or PRs. That's it. We could have a > >>>>> /fully > >>>>>>>>> /tagged PR list /tomorrow/, if we really wanted to. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> On 15.05.2018 05:10, Ted Yu wrote: > >>>>>>>>> > >>>>>>>>> bq. this pull request requires a review, please simply write any > >>>>>>>>>> comment. > >>>>>>>>>> > >>>>>>>>>> Shouldn't the wording of such comment be known before hand ? > >>>>>>>>>> > >>>>>>>>>> Otherwise pull request waiting for committers' review may be > >>>>>>>>>> mis-classified. > >>>>>>>>>> > >>>>>>>>>> Cheers > >>>>>>>>>> > >>>>>>>>>> On Mon, May 14, 2018 at 7:59 PM, blues zheng <[email protected]> > >>>>> wrote: > >>>>>>>>>> > >>>>>>>>>> +1 for the proposal. > >>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> Best, > >>>>>>>>>>> blues > >>>>>>>>>>> On 05/14/2018 20:58, Ufuk Celebi wrote: > >>>>>>>>>>> Hey Piotr, > >>>>>>>>>>> > >>>>>>>>>>> thanks for bringing this up. I really like this proposal and also > >>>>> saw > >>>>>>>>>>> it work successfully at other projects. So +1 from my side. > >>>>>>>>>>> > >>>>>>>>>>> - I like the approach with a notification one week before > >>>>>>>>>>> automatically closing the PR > >>>>>>>>>>> - I think a bot will the best option as these kinds of things are > >>>>>>>>>>> usually followed enthusiastically in the beginning but eventually > >>>>>>>>>>> loose traction > >>>>>>>>>>> > >>>>>>>>>>> We can enable better integration with GitHub by using ASF GitBox > >>>>>>>>>>> (https://gitbox.apache.org/setup/) but we should discuss that in > >>>> a > >>>>>>>>>>> separate thread. > >>>>>>>>>>> > >>>>>>>>>>> – Ufuk > >>>>>>>>>>> > >>>>>>>>>>> On Mon, May 14, 2018 at 12:04 PM, Piotr Nowojski > >>>>>>>>>>> <[email protected]> wrote: > >>>>>>>>>>> > >>>>>>>>>>> Hey, > >>>>>>>>>>>> > >>>>>>>>>>>> We have lots of open pull requests and quite some of them are > >>>>>>>>>>>> > >>>>>>>>>>>> stale/abandoned/inactive. Often such old PRs are impossible to > >>>>> merge > >>>>>>>>>>> due > >>>>>>>>>>> to > >>>>>>>>>>> conflicts and it’s easier to just abandon and rewrite them. > >>>>> Especially > >>>>>>>>>>> there are some PRs which original contributor created long time > >>>> ago, > >>>>>>>>>>> someone else wrote some comments/review and… that’s about it. > >>>>> Original > >>>>>>>>>>> contributor never shown up again to respond to the comments. > >>>>>>>>>>> Regardless > >>>>>>>>>>> of > >>>>>>>>>>> the reason such PRs are clogging the GitHub, making it difficult > >>>> to > >>>>>>>>>>> keep > >>>>>>>>>>> track of things and making it almost impossible to find a little > >>>> bit > >>>>>>>>>>> old > >>>>>>>>>>> (for example 3+ months) PRs that are still valid and waiting for > >>>>>>>>>>> reviews. > >>>>>>>>>>> To do something like that, one would have to dig through tens or > >>>>>>>>>>> hundreds > >>>>>>>>>>> of abandoned PRs. > >>>>>>>>>>> > >>>>>>>>>>> What I would like to propose is to agree on some inactivity dead > >>>>> line, > >>>>>>>>>>>> > >>>>>>>>>>>> lets say 3 months. After crossing such deadline, PRs should be > >>>>>>>>>>> marked/commented as “stale”, with information like: > >>>>>>>>>>> > >>>>>>>>>>> “This pull request has been marked as stale due to 3 months of > >>>>>>>>>>>> > >>>>>>>>>>>> inactivity. It will be closed in 1 week if no further activity > >>>>>>>>>>> occurs. If > >>>>>>>>>>> you think that’s incorrect or this pull request requires a > >> review, > >>>>>>>>>>> please > >>>>>>>>>>> simply write any comment.” > >>>>>>>>>>> > >>>>>>>>>>> Either we could just agree on such policy and enforce it manually > >>>>>>>>>>>> (maybe > >>>>>>>>>>>> > >>>>>>>>>>>> with some simple tooling, like a simple script to list inactive > >>>>> PRs - > >>>>>>>>>>> seems > >>>>>>>>>>> like couple of lines in python by using PyGithub) or we could > >>>> think > >>>>>>>>>>> about > >>>>>>>>>>> automating this action. There are some bots that do exactly this > >>>>> (like > >>>>>>>>>>> this > >>>>>>>>>>> one: https://github.com/probot/stale < https://github.com/probot/ > >>>>> stale > >>>>>>>>>>>> > >>>>>>>>>>> ), > >>>>>>>>>>> but probably they would need to be adopted to limitations of our > >>>>>>>>>>> Apache > >>>>>>>>>>> repository (we can not add labels and we can not close the PRs > >> via > >>>>>>>>>>> GitHub). > >>>>>>>>>>> > >>>>>>>>>>> What do you think about it? > >>>>>>>>>>>> > >>>>>>>>>>>> Piotrek > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>> > >>>>>>> > >>>>>> > >>>>> > >>>> > >> > >>
