On 8 February 2015 at 09:55:42, Karthik Kambatla (ka...@cloudera.com) wrote:
> On Fri, Feb 6, 2015 at 6:14 PM, Colin P. McCabe <cmcc...@apache.org> wrote:
>> I think it's healthy to have lots of JIRAs that are "patch available."
>> It means that there is a lot of interest in the project and people
>> want to contribute. It would be unhealthy if JIRAs that really needed
>> to get in were not getting in. But beyond a few horror stories, that
>> usually doesn't seem to happen.
>>
>> I agree that we should make an effort to review things that come from
>> new contributors. I always set aside some time each week to look
>> through the new JIRAs on the list and review ones that I feel like I
>> can do.
>>
>> I think the "patch manager" for a patch should be the person who
>> submitted it. As Chris suggested, if nobody is reviewing, email
>> people who reviewed earlier and ask why. Or email the list and ask if
>> this is the right approach, and bring attention to the issue.
>>
> It is definitely great if contributors could reach out to potential
> reviewers and follow up. However, newer contributors find it hard to figure
> out whom to reach out to, and leaving it to them is not very welcoming.
> Also, people often find one issue, post a fix out of goodwill, and move on.
> They might not be motivated enough to be the "patch manager" for it. I
> understand that if it is an important patch, someone else will take it up
> and get it in. If it is not, or if it is duplicated, that JIRA/patch needs
> to be cleaned up.
Having just looked through the set of JIRAs which I filed but didn't submit
patches for, they fall into the category "there's something not right here, but
I'm not prepared to invest the time to get a patch in, as that patch is at risk
of being neglected. Why bother?".
Which means that queue length isn't a good metric of health, not if it
suppresses the creation/submission of new patches.
Indeed, given that foundational bit of queue theory, "a queue forms in a
channel if the egress rate is less than the ingress rate", the metric of a
functional process should really be channel throughput, ideally in a steady
state where patches are being applied, returned for rework, rejected, or
postponed for sound reasons ("this is a post-Java-7 feature"), with the
postponed ones being returned to later.
That's what we should be measuring then: throughput (higher=better) and latency
(lower=better).
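Both numbers could be pulled from JIRA's "patch available" and resolution
timestamps. Here's a rough sketch of the arithmetic, in no way real code:
every name in it is invented, and you'd have to export the timestamps from
JIRA first:

import java.util.List;

/** Sketch only: review throughput and latency from exported JIRA timestamps. */
public class ReviewQueueMetrics {

  /** One resolved patch: when it went "patch available", when it was resolved. */
  public static class Review {
    final long patchAvailableMillis;
    final long resolvedMillis;

    Review(long patchAvailableMillis, long resolvedMillis) {
      this.patchAvailableMillis = patchAvailableMillis;
      this.resolvedMillis = resolvedMillis;
    }
  }

  private static final double DAY_MILLIS = 24.0 * 60 * 60 * 1000;

  /** Throughput: patches resolved per day within the window [from, to). */
  public static double throughputPerDay(List<Review> resolved, long from, long to) {
    int count = 0;
    for (Review r : resolved) {
      if (r.resolvedMillis >= from && r.resolvedMillis < to) {
        count++;
      }
    }
    return count / ((to - from) / DAY_MILLIS);
  }

  /** Latency: mean days from "patch available" to resolution. */
  public static double meanLatencyDays(List<Review> resolved) {
    if (resolved.isEmpty()) {
      return 0;
    }
    double totalDays = 0;
    for (Review r : resolved) {
      totalDays += (r.resolvedMillis - r.patchAvailableMillis) / DAY_MILLIS;
    }
    return totalDays / resolved.size();
  }
}

Rising throughput and falling latency would mean the process is working; a
long queue on its own doesn't tell you which side of the balance is off.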
I put some time in on Saturday looking at issues. As well as closing some of
my own and others' as WONTFIX, I did some detailed work on a couple.
It took ~30 minutes to review a patch rigorously. One from an active
contributor was bigger (HADOOP-11042), but they were already familiar with the
testing expectations and knew their way around the codebase. Another was a
smaller patch (HADOOP-3619), but because the patch was further from what we
like, it took more effort to provide feedback, and it will need another
go-around. That's not the fault of the contributor; it's just that he wasn't
familiar with the codebase and some of our test needs. I could have just fixed
the code there and then, but I'd then be unable to review the patch myself. I
also think that by helping the patch submitter complete the patch, they learn
more about the Hadoop dev process, so future patches will be better.
That was me this weekend then: two patches nurtured along the way, and an
implicit commitment to review their next iterations. One more hour booked out
of my calendar for later this month.
Which is why I would love to see what we can do about speeding up that review
process with better tooling, Gerrit being one whose name keeps coming up. I
know the argument against it, that it "splits discussion", but is that worse
than "languishes without any feedback on the JIRA at all"?
> Getting a release out (for a release manager) is already some work. Adding
> more responsibilities to the RM (a voluntary role) makes it less enticing.
> And distributing the work among multiple workers (with domain knowledge)
> might be more efficient.
I think finding someone to volunteer as patch manager would be equally
unenticing. We effectively have some: Chris N looks after Windows problems
across the codebase, AW does bash, I do some of the build and object store
stuff. But it's invariably "when there's spare time", and with a single
individual worrying about the problem, the throughput in that area is now a
metric of how much spare time that person has. The queue length is therefore a
function of the other work the individual is doing and their other personal
commitments. At the very least, we need to identify >1 person who cares about
each area, and have them collaborate. Having better tooling to aid that
collaboration is also important. JIRA and manual patch viewing isn't enough,
at least for me trying to review some work in my spare time.
So: where next? Can we set Gerrit up as an optional mechanism for reviewing
things, so that those of us who do want to experiment with using it to review
patches can try it?
-steve