Hi JB,

You're right; I was thinking more about changes to core when talking about the technical-excellence bar.

I think there still needs to be some bar for new features and extensions, but I also think it can be much lower, since nobody breaks anything by merging them. An example of where we still need a bar is tests: if a new IO has a test that the reviewer thinks will be flaky, that flaky test will cause problems for _every_ Beam committer, and it's fair to ask for the test to be changed.

Given that the bar is lower for new extensions, I think we need a good way of marking these things so that Beam users know they are not as mature as other parts of Beam. Traditionally we've used @Experimental, but @Experimental has been overloaded to mean other things as well. Maybe we need to introduce a new annotation?
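To make that concrete, here is a rough sketch of what such an annotation could look like (the name, retention, and targets are hypothetical, purely to seed discussion):

    import java.lang.annotation.Documented;
    import java.lang.annotation.ElementType;
    import java.lang.annotation.Retention;
    import java.lang.annotation.RetentionPolicy;
    import java.lang.annotation.Target;

    /**
     * Marks an API as a recent contribution that has not yet been hardened to
     * the maturity bar of the rest of Beam. Unlike {@code Experimental}, it
     * says nothing about whether the API surface itself may still change.
     *
     * Hypothetical sketch only; the name and semantics are up for discussion.
     */
    @Documented
    @Retention(RetentionPolicy.CLASS)
    @Target({ElementType.TYPE, ElementType.METHOD, ElementType.PACKAGE})
    public @interface Incubating {
      /** Optional pointer to the issue tracking graduation criteria. */
      String trackingIssue() default "";
    }

Something like this would let @Experimental keep its original meaning (the API surface may change incompatibly) while still signaling maturity to users and tools.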
Reuven

On Tue, Feb 20, 2018 at 5:48 AM, Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
> Hi Reuven,
>
> I agree with all your points, except maybe on the level of the bar, especially for new features (like extensions or IOs). While PRs on the core should be heavily reviewed, I'm more in favor of merging feature PRs pretty fast, even if they're not perfect. It's not a technical topic; it's really a contribution and community topic.
>
> Thanks anyway, the dashboard is a good idea!
>
> Regards
> JB
>
> On Feb 19, 2018, at 19:33, Reuven Lax <re...@google.com> wrote:
>>
>> There have been a number of threads on code reviews (most recently in a "State of the project" email). Those threads have died out without much resolution, but I'm not sure the concerns have gone away.
>>
>> First of all, I'm of the opinion that a code-review bar for Beam commits is critical to the success of the project. This is a system with many subtle semantics, which might not be obvious at first glance. Beam pipelines process user data, and certain bugs can corrupt user data and aggregations, something to avoid at all costs if we want Beam to be trusted. Finally, Beam pipelines often run at extremely high scale; while many of our committers have a strong intuition for what can go wrong at high scale, not everybody who wants to contribute will have that experience.
>>
>> However, we also cannot afford to let our policy get in the way of building a community. We *must* remain a friendly place to develop and contribute.
>>
>> When I look at the concerns people have had about code reviews (and I've browsed most PRs this past year), I see a few common threads:
>>
>> *Review Latency*
>>
>> Latency on code reviews can be too high. At various times folks (most recently Ahmet and I) have tried to regularly look for stale PRs and ping them, but latency still remains high.
>>
>> *Pedantic*
>>
>> Overly pedantic comments (change variable names, etc.) can be frustrating. The PR author can feel like they are being forced to make meaningless changes just so the reviewer will allow the merge. Note that this is sometimes in the eye of the beholder; the reviewer may not think all these comments are pedantic.
>>
>> *Don't Do This*
>>
>> Sometimes a reviewer rejects an entire PR, saying the thing should not be done at all. Various reasons are given: this won't scale, this will break backwards compatibility, this will break a specific runner, etc. The PR author may not always understand or agree with these reasons, and this can leave hurt feelings.
>>
>> I would like to open a discussion about ways of making our code-review policy more welcoming. I'll seed the discussion with a few ideas:
>>
>> *Code Review Dashboard and Automation*
>>
>> We should invest in adding a code-review dashboard to our site, tracking stale PRs by reviewer. Quick turnaround on code reviews is essential to building community, so all Beam committers should consider reviewing code as important as their own coding. Spark has built a PR dashboard (https://spark-prs.appspot.com/) which they've found better than GitHub's dashboard; we could easily fork it. There are also tools that will automatically ping reviewers (mention-bot and hey there are two such tools). We can also make sure that new PRs are auto-assigned a reviewer (e.g. https://github.com/imsky/pull-review).
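>>
>> Even a small cron job could drive the stale-PR pinging until a real dashboard exists. A rough, dependency-free sketch that lists open Beam PRs, least recently updated first (the class name is made up; a real tool would parse the JSON rather than regex-scrape it):
>>
>>     import java.io.BufferedReader;
>>     import java.io.InputStreamReader;
>>     import java.net.HttpURLConnection;
>>     import java.net.URL;
>>     import java.util.regex.Matcher;
>>     import java.util.regex.Pattern;
>>
>>     /** Naive sketch: print open apache/beam PRs, stalest first. */
>>     public class StalePrPing {
>>       public static void main(String[] args) throws Exception {
>>         // GitHub REST v3: sort=updated, direction=asc puts stale PRs first.
>>         URL url = new URL("https://api.github.com/repos/apache/beam/pulls"
>>             + "?state=open&sort=updated&direction=asc&per_page=10");
>>         HttpURLConnection conn = (HttpURLConnection) url.openConnection();
>>         // Unauthenticated requests are rate-limited; a real job would
>>         // authenticate with a token.
>>         conn.setRequestProperty("Accept", "application/vnd.github.v3+json");
>>         StringBuilder body = new StringBuilder();
>>         try (BufferedReader in = new BufferedReader(
>>             new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
>>           String line;
>>           while ((line = in.readLine()) != null) {
>>             body.append(line);
>>           }
>>         }
>>         // Crude extraction: only PR html_urls match this URL shape.
>>         Matcher m = Pattern
>>             .compile("\"html_url\"\\s*:\\s*\"(https://github\\.com/apache/beam/pull/\\d+)\"")
>>             .matcher(body);
>>         while (m.find()) {
>>           System.out.println(m.group(1));
>>         }
>>       }
>>     }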
>>
>> *Code Review Response SLA*
>>
>> It would be great if we could agree on a response-time SLA for Beam code reviews. The response might be "I am unable to do the review until next week"; even that is better than no response at all.
>>
>> *Guideline Document*
>>
>> I think we should have a guideline document explaining common reasons a reviewer might reject an approach in a PR, e.g. "this will cause scaling problems," "this will cause problems for the XXX runner," "this is backwards incompatible." Reviewers can point to this doc as part of their comments, along with extra flavor, e.g. "as per the guideline doc, this will cause scaling problems, and here's why."
>>
>> *Guidelines on Comments*
>>
>> Not everyone agrees on which comments are pedantic, which makes it hard to have specific guidelines here. One general guideline we might adopt: if it would take less time for the reviewer to make the changes themselves, it's not an appropriate comment; the reviewer should fix those issues in a follow-on PR. This adds a bit more burden on reviewers, but IMO it is worth it to keep Beam a friendly environment. This is especially important for first-time contributors, who might otherwise lose interest. If the author is a seasoned Beam contributor, we can expect more of them.
>>
>> We should also make sure that these fixups serve as educational moments for the new contributor: "Thanks for the contribution! I'll be making some changes during the merge so that the code stays consistent across the codebase - keep an eye on them."
>>
>> Would love to hear more thoughts.
>>
>> Reuven