Re: Faster PR reviews

Sam Corbett Wed, 03 May 2017 09:38:30 -0700

I used GitHub to check the number of pull requests we've merged to a few of our projects over the last year in two-month chunks [1]. Each list begins today and works backwards:


brooklyn-server: 86, 79, 85, 68, 95, 78.
brooklyn-client: 7, 4, 3, 8, 2, 4.
brooklyn-library: 11, 12, 9, 7, 17, 8.
total: 104, 95, 97, 83, 114, 90.
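
As a quick check of the arithmetic, here is a minimal Python sketch that recomputes the totals from the per-project figures above. The variable names are illustrative only, and the mean and spread are simple summaries rather than a statistical test of consistency.

```python
# Merged-PR counts per two-month window, most recent first
# (figures copied from the list above).
merged = {
    "brooklyn-server": [86, 79, 85, 68, 95, 78],
    "brooklyn-client": [7, 4, 3, 8, 2, 4],
    "brooklyn-library": [11, 12, 9, 7, 17, 8],
}

# Sum across projects for each window.
totals = [sum(window) for window in zip(*merged.values())]

# Simple summaries of how much the totals vary between windows.
mean = sum(totals) / len(totals)
spread = max(totals) - min(totals)

print(totals)                    # [104, 95, 97, 83, 114, 90]
print(round(mean, 1), spread)    # 97.2 31
```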

It's my opinion that these numbers show a consistency in the rate at which we merge pull requests. Since a few committers have recently joined I expect the rate to increase over the next few months. I would like to know other figures like the average length of time between opening a pull request and a committer merging it and the average length of time before the first review comment. I'm sure there are many other interesting metrics that could inform this discussion.

Maybe there are just one or two cases that we have managed badly. Identifying those pull requests and understanding the cause of the delay to each one would be valuable.

Regarding the complexity of Brooklyn and speaking as someone that has used it in anger (sometimes literally) I would posit that many of us have been caught out several times by the corner cases and unexpected interactions between components that complexity so often implies. This leads to cautious reviews and slow turnaround times. Quick-scan reviews solve the speed at which we get code into the product. We need to think harder about how we reduce its overall complexity.

Sam

1. I used GitHub's interface to search for "is:pr merged:2017-03-04..2017-05-03", etc.
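
The remaining search strings can be generated rather than typed by hand. This is a minimal sketch that assumes each window ends the day before the next one starts and spans two calendar months; only the first query is taken from the footnote above, and the helper names are hypothetical.

```python
from datetime import date, timedelta


def months_back(d, n):
    """Return the date n calendar months before d, keeping the day of month.

    Assumption: d.day exists in the target month (true for the 3rd used here;
    days 29-31 would need extra handling).
    """
    total = d.year * 12 + (d.month - 1) - n
    return date(total // 12, total % 12 + 1, d.day)


def merged_queries(end, windows=6, months=2):
    """Build GitHub search strings for consecutive windows, newest first."""
    queries = []
    for _ in range(windows):
        start = months_back(end, months) + timedelta(days=1)
        queries.append(f"is:pr merged:{start.isoformat()}..{end.isoformat()}")
        end = start - timedelta(days=1)  # next window ends the day before
    return queries


for query in merged_queries(date(2017, 5, 3)):
    print(query)
```

The first string printed matches the one quoted in the footnote; the other five are whatever this assumed windowing produces, so they may differ slightly from the ranges actually searched.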



On 03/05/2017 09:41, Aled Sage wrote:

Hi Alex,
I agree with you that we have a problem with the reviewing of some of our PRs - it's a bad situation for all concerned when these PRs stay open for as long as they do.
I agree with your "eye-ball test" for a certain class of PR. I think where we probably disagree is where the line is for "low-risk" PRs. There are some examples where the "eye-ball test" would certainly have helped.
I deliberately expanded the scope of this discussion, because there's not one single solution - I'm not "turning this discussion into guidelines about PR". I'm adding to the discussion.
*Contributing Factors*
Breaking this down into the contributing factors, I believe those are:

1. Some PRs don't get enough attention from reviewers.

2. Not as much time as we'd like is spent by the community reviewing PRs.

3. The bar for being merged is too high (for at least some PRs).

4. Some PRs are very hard to review.
There are clearly examples of (1) where it's a no-brainer we should have given it more attention, commented and merged. Your proposal for the "eye-ball test" will help for some.
However, there are also examples of (1) that are caused by (4) - commonly due to the complexity of the code being changed (e.g. config-inheritance), sometimes made worse by it changing many things (so it's more daunting to review).
Given (2) and (3), it suggests we should spread that time across more PRs (i.e. some PRs that are getting a very thorough review could get less, and folk try to use that "saved" time on some other PRs). I'm not convinced that would actually happen in practice though!
*Solutions*
1. Add the "eye-ball test" to our reviewer guidelines (as described by Alex) - and adjust our perception of "low-risk" over time, as we see how it works.
2. Guidelines for PRs - what will make the reviewers' job considerably easier, so we can get things merged faster? For example, small and focused PRs, with good test coverage, linking to a jira issue where appropriate.
3. Invest more time simplifying Brooklyn (see below).


*Complexity of Brooklyn*
I've heard from quite a few people that certain areas of Brooklyn are far too hard to understand. Some people avoid reviewing PRs that touch them, because it's so hard to understand the implications - they focus their time on PRs that they feel more confident to review.
This is a symptom of an overly complex project. It would be great to find more time to simplify things - e.g. to delete things from Brooklyn, to make things more consistent, to refactor or even rewrite some sections, and to add more javadoc.
*Accepted Limitations to Timely Review*
PRs that make far-reaching changes to low-level details of Brooklyn will always require a thorough review. Clearly we should try to find the time for that promptly, but should always view them as high-risk.
*YOML*
If you insist on generalising YOML here, rather than a separate email thread specifically about it, then: we should have commented very quickly and discussed it promptly on the mailing list - at the level of whether we want it (ignoring much of its technical details). If it was pretty much anyone but Alex, then we should have commented saying:
   "Very interesting, but this is a huge amount of code to add and
   maintain in Brooklyn. Can you instead create a new github project
   for this library, so that it can be worked on and maintained
   separately? We'd then be interested to see how it can be used within
   Brooklyn. Can you close this PR and let us know when/where you
   create that library."
Like I said, that's for pretty much anyone but Alex. The difference is that Alex wrote the first version of our yaml/camp parsing and knows it better than anyone else. That original code definitely deserves a re-write: it's become increasingly complicated as the supported yaml has evolved. Alex has investigated different approaches and has come up with a way that could greatly improve that code, and be used in other places as well. Doing that in Brooklyn is simpler for him, because it can evolve in tandem to satisfy requirements of Brooklyn.
I therefore suggest we discuss YOML separately, rather than generalising.

Aled


On 03/05/2017 02:13, Alex Heneveld wrote:
Aled,

> *Light-weight Review*
> I agree with you - where PRs look sensible, low-risk and unit tested we should take more risk and merge them sooner (even if there's not been time for a thorough review by the community).
I'm saying something a little different: we should _try_ for a thorough review of *all* PRs. Which I think is uncontroversial.
> What should we do with a PR when we aren't able to review things in as much depth as we'd like?
This is the question I'm asking, to ensure we handle PRs in a good time frame. To summarise, I'm suggesting we make more of an effort, and that after a certain period of time (7 days max, less if it's simple?) we fall back to an "eyeball test", triaging the review to look at:
* clearly helpful & not obviously wrong
* low-risk / doesn't break compatibility
* good test coverage (and passing)
* likely to be maintained
If these can't be confirmed, the reviewer should say what they have doubts about, maybe suggest what the contributor could do to help, or appeal to other committers more familiar with an area. In any case get a discussion going.
If these do seem confirmed, I still suggest we don't merge immediately in the absence of a thorough review, but ping specific committers likely to be interested. If no thorough review after a few more days, _then_ merge.
I'm not suggesting any heavyweight process, but just enough to put healthy forces on us as reviewers.
This is not a theoretical question, nor is it restricted to the YOML PR. We're pretty good with most of our PRs and reviews but there are plenty of examples where we've dropped the ball. Look at [1], which is tiny and tests-only and took nine days to get a review. Or [2], which, yes, combines a few related-but-different things but is by no means a hard thing to review. It would take far more time to split that up into 3 branches, test those locally, then babysit each of those PRs than it would take for a reviewer to just get on with a review. It's been sitting there for 2 months and doesn't even have a comment.
This is not a good state of affairs. Turning this discussion into guidelines about PRs misses the point. If there's any change to our docs/process made as a result of this discussion I'd like to see the eyeball test added to a review process discussion.
Finally re YOML, there is an ML thread started when the issue was raised. There was chatter beforehand but it wasn't an easy thing to discuss until there was prototype code. The point is for 7 months there have been no comments in any of these places, even after I've run a public session explaining it and private sessions and the PR itself says how it can be tested and how it is insulated from the rest of the code (Thomas I think you missed that point). As there is an ML thread and an open issue, either of which would be a fine place to comment, but no one is commenting, the suggestion of a new separate ML thread to solve the problem is bizarre. I say this is _exactly_ the situation when we need guidelines for how we handle PRs that are not being reviewed in a timely way.
Best
Alex


[1] https://github.com/apache/brooklyn-server/pull/600
[2] https://github.com/apache/brooklyn-server/pull/575



On 02/05/2017 19:21, Aled Sage wrote:
Hi Alex,

Interesting question. A few initial thoughts:

*YOML*
YOML (PR #363) is an exceptional case - we should not use that as an example when discussing this meta-question. The PR is 12,000 lines (including comments/notes), and was not discussed on the mailing list before it was submitted. I suggest we have a separate email thread specifically about merging that PR, as there are certainly very useful things we'd get from YOML.
*Small PRs*
We should strongly encourage small focused PRs on a single thing, wherever possible. That will make review faster, easier and lower risk. For such PRs, we should strive for review+merge within days (7 days being an upper bound in normal circumstances, hopefully).
We can add some brief guidelines to this effect at http://brooklyn.apache.org/developers/how-to-contribute.html
*Changing low-level Brooklyn*
PRs that change low-level things in Brooklyn (e.g. changes to config-inheritance etc) deserve thorough review. They are high-risk as the unforeseen consequences of the changes can be very subtle, and break downstream blueprints that rely on old ways of doing things.
*Light-weight Review*
I agree with you - where PRs look sensible, low-risk and unit tested we should take more risk and merge them sooner (even if there's not been time for a thorough review by the community).
Aled


On 02/05/2017 15:50, Duncan Johnston Watt wrote:
Hi Alex
This is probably covered already but I guess there needs to be an impact assessment (by submitter?) before something is waved through by default.
Best

Duncan
On 2 May 2017 at 06:52, Alex Heneveld <[email protected]> wrote:
Hi Brooklyners-
As many of you know, my YOML PR #363 [1] has been open for a while. This sets up a foundation for giving better documentation and feedback and hugely simplifying how we do our parsing. However it's a very big PR. I'm eager to have people spend some time using it and ideally extending it -- but here I wanted to raise a meta-question:
*What should we do with a PR when we aren't able to review things in as much depth as we'd like?*


One option -- call it (A) -- is to say if we can't review things thoroughly in a reasonable timeframe, we do a lighter review and if the PR looks promising and safe we merge it.
The other option -- call it (B) -- is to leave PRs open for as long as it takes for us to do the complete review.
I think most people have been approaching this with a mindset of (B), and while that's great for code quality and shared code understanding, if we can't deliver on that quickly, it's frankly anti-social. The contributor has to deal with merge conflicts (and the rudeness of his or her contribution being ignored), and Brooklyn loses velocity. My PR is an extreme example but many have been affected by slow reviews, and I think the expectation that reviews have to be so thorough is part of the problem: it even discourages reviewers, as if you're not an expert in an area you probably don't feel qualified to review.
We have good test coverage so product risk of (A) is small, and we have great coders so I've no worry about us being able to solve problems that (A) might introduce. We should be encouraging reviewers to look at any area, and we need to solve the problem of slow reviews.
*I propose that the standard we apply is that we quickly either merge PRs or identify what the contributor needs to resolve.*
I'm all for thorough reviews and shared understanding, but if we can't do this quickly I suggest we are better to sacrifice those things rather than block contributions, stifle innovation, and discourage reviews by insisting on standards that we struggle to sustain.

As a general rule of thumb, maybe something like:

(1) After 7 days of no activity on a PR we go with an "eyeball test";
unless the following statement is untrue we say:

/I haven't done as much review as I'd like, but the code is clearly helpful, not risky or obviously wrong or breaking compatibility, it has good test coverage, and we can reasonably expect the contributor or committers to maintain it. Leaving open a bit longer in case someone else wants to review more but if nothing further in the next few days, let's merge./
(If there are committers who are likely to be specifically interested, call them out as CC.)

(2) After 3 more days, if no activity, merge it.
And we encourage _anyone_ to review anything. If the above response is the baseline, everyone in our community is qualified to do it or better and we'll be grateful!
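
To make the rule of thumb concrete, here is a minimal Python sketch of the decision it describes. It is only an illustration: the names `Action` and `next_action` are hypothetical, the 7-day and 3-day thresholds come from steps (1) and (2) above, and "eyeball_ok" stands for a reviewer's own judgement that the eyeball statement is true.

```python
from enum import Enum


class Action(Enum):
    WAIT = "wait"
    POST_EYEBALL_COMMENT = "post the eyeball comment and CC interested committers"
    RAISE_CONCERNS = "say what the contributor needs to resolve"
    MERGE = "merge"


EYEBALL_AFTER_DAYS = 7  # step (1): days of no activity before the eyeball test
MERGE_AFTER_DAYS = 3    # step (2): further idle days before merging


def next_action(days_idle, eyeball_posted, eyeball_ok):
    """Suggest the next step for a PR.

    days_idle:      days since the last activity on the PR
    eyeball_posted: whether the eyeball comment has already been made
    eyeball_ok:     reviewer's judgement that the eyeball statement holds
    """
    if not eyeball_posted:
        if days_idle < EYEBALL_AFTER_DAYS:
            return Action.WAIT
        # Either vouch for the PR or identify what needs resolving.
        return Action.POST_EYEBALL_COMMENT if eyeball_ok else Action.RAISE_CONCERNS
    # Eyeball comment already posted: merge once nobody has objected.
    if days_idle >= MERGE_AFTER_DAYS:
        return Action.MERGE
    return Action.WAIT


print(next_action(days_idle=8, eyeball_posted=False, eyeball_ok=True).value)
print(next_action(days_idle=3, eyeball_posted=True, eyeball_ok=True).value)
```

Any new activity on the PR resets `days_idle`, so a late thorough review or an objection naturally takes the PR out of the merge path.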

Best
Alex


[1]  https://github.com/apache/brooklyn-server/pull/363
