Apache has a procedure for major pieces of code that are developed off-list. It is called “IP clearance”[1]. It’s fairly onerous, and I believe it would apply even if the people who developed the code were committers to the project.
So, another good reason to develop code on-list. Julian [1] http://incubator.apache.org/ip-clearance/ > On Jan 3, 2019, at 5:35 PM, Jonathan Wei <jon...@apache.org> wrote: > > Thanks for raising these concerns! > > My initial thoughts: > - I agree that separation of design review and code-level review for major > changes would be more efficient > > - I agree that a clear, more formalized process for handling major changes > would be helpful for contributors: > - Define what is considered a major change > - Define a standard proposal structure, KIP-style proposal format sounds > good to me > > - I think it's too rigid to have a policy of "no code at all with the > initial proposal" > - Code samples can be useful references for understanding aspects of a > design > - In some cases it's necessary to run experiments to fully understand a > problem and determine an appropriate design, or to determine whether > something is even worth doing before committing to the work of fleshing out > a proposal, prototype code is a natural outcome of that and I'm not against > someone providing such code for reference > - I tend to view design/code as things that are often developed > simultaneously in an intertwined way > >> Let's not be naive this is very rare that a contributor will accept that > his work is to be thrown, usually devs takes coding as personal creation > and they get attached to it. > > If we have a clear review process that emphasizes the need for early > consensus building, with separate design and code review, then I feel we've > done enough and don't need a hard rule against having some code linked with > the initial proposal. If a potential contributor then still wants to go > ahead and write a lot of code that may be rejected or change significantly, > the risks were made clear. > >> Once code is written hard to think abstract. > > I can see the validity of the concern, but I personally don't see it as a > huge risk. My impression from the Druid PR reviews I've seen is that our > reviewers are able to keep abstract design vs. implementation details > separate and consider alternate designs when reviewing. > > To summarize I think it's probably enough to have a policy along the lines > of: > - Create more formalized guidelines for proposals and what changes require > proposals > - Separate design and code review for major changes, with design review > first, code-level review after reaching consensus on the design. > - Code before the design review is completed is just for reference, not > regarded as a candidate for review/merging. > > - Jon > > > On Thu, Jan 3, 2019 at 12:48 PM Slim Bouguerra <slim.bougue...@gmail.com> > wrote: > >> On Thu, Jan 3, 2019 at 12:16 PM Clint Wylie <clint.wy...@imply.io> wrote: >> >>> I am definitely biased in this matter as an owner of another large PR >> that >>> wasn't preceded by a direct proposal or dev list discussion, and in >> general >>> I agree that proposal first is usually better, but I think in some rarer >>> cases approaching a problem code first *is* the most appropriate way to >>> have a discussion. >> >> >> I am wondering here what is the case where code first is better? >> In general when you are writing code you have an idea about what you want >> to change, why you want to change and why you want to change it. >> I do not see what is wrong with sharing this primitive ideas and thoughts >> as an abstract proposal (at least to avoid overlapping) >> >> I see nothing wrong with it so long as the author >>> accepts that the PR is treated as a combined proposal and proof of >> concept, >>> and fair game to be radically changed via discussion or even rejected, >>> which sounds like Gian's attitude on the matter and is mine as well with >> my >>> compression stuff. >> >> >> Let's not be naive this is very rare that a contributor will accept that >> his work is to be thrown, usually devs takes coding as personal creation >> and they get attached to it. >> To my point you can take a look on some old issue in the Druid forum >> https://github.com/apache/incubator-druid/pull/3755#issuecomment-265667690 >> and am sure other communities have similar problems. >> So leaving the door open to some side cases is not a good idea in my >> opinion and will lead to similar issue in the future. >> >> This seems to me especially likely to happen in cases >>> where an approach still needs proven to be a viable idea *to the author*, >>> so that a much more productive discussion can be had in the first place. >>> >>> I think there is a trade off, I don't think we want to discourage >>> experimentation by walling it off behind mandatory discussions before it >>> can even start, but I do think formalizing the process for large changes >> is >>> a good thing, especially since we probably can't expect the wider >> community >>> to have the same attitude towards a large PR getting discarded as a >>> committer might. I think the Kafka approach is reasonable, a bit more >>> formal than our design review process but not overbearing. >> >> >> Can you please explain what is overbearing ? what can be changed to make it >> easy ? >> Most of the points are kind of the actual questions that you want to >> address before hand anyway isn't ? >> >> >>> Going code first >>> should be in general discouraged, but when it does happen, it seems like >>> opening DIP/an issue/starting a mailing list thread or whatever we go >> with >>> to have a more high level design discussion alongside the reference PR >>> could alleviate some of these complaints? >> >> >> What are the complaints ? >> >> >>> +1 for "DIP" heh, I think making >>> them in the form of github issues is probably appropriate, with a dev >> list >>> thread to announce them perhaps? >>> >> >> I think github issue with [Proposal] header like >> https://github.com/apache/incubator-druid/issues/4349 is good to me, >> >> Thanks! >> >> >>> On Thu, Jan 3, 2019 at 10:28 AM Slim Bouguerra <bs...@apache.org> wrote: >>> >>>> Thanks everyone for interacting with this thread. >>>> >>>> The fact that i, Roman, Jihoon and others in the past (FJ >>>> >>> >> https://groups.google.com/forum/#!msg/druid-user/gkUEsAYIfBA/6B2GJdLkAgAJ) >>>> raised this point indicates that PRs without a proposal are indeed an >>> issue >>>> and we need to solve it. >>>> >>>> Something Similar to KIP maybe called DIPs is fine with me. >>>> What i strive to see is the following: >>>> >>>> [Step 1] formalize what is the kind of work that needs a formal >>> Proposal, I >>>> think Roman and Jihoon has already covered that pretty well. am +1 on >>> that. >>>> >>>> >>> >> https://lists.apache.org/thread.html/9b30d893bdb6bb633cf6a9a700183ffb5b98f115330531a55328ac77@%3Cdev.druid.apache.org%3E >>>> >>>> I am strongly in favor of the separation of Proposal Review and (later) >>>> Code review PRs. My main reasons: >>>> Most importantly code reviewing will introduce lot of noise and will >>>> ultimately make the GitHub page unreadable. >>>> Avoid overlapping of work. >>>> Once code is written hard to think abstract. >>>> Separate page for Design review later can always be used it as a Design >>>> document that is readable and code free-ish. >>>> As i said the goal of this first round is to see if the community agree >>>> about such change, then make the process of design more inclusive thus >>>> other contributors can submit a counter proposals. >>>> >>>> [Step 2] IF everybody agree about that point Step 2 is to define which >>>> medium is used to Publish a primitive form of a CODE FREE Abstract >>> Proposal >>>> containing at least the following bullet points. >>>> - The problem description and motivation >>>> - Overview of the proposed change >>>> - Operational impact (compatibility/ plans to upgrades) public API >>> changes, >>>> configuration changes, algorithm, and so on >>>> - Expected benefits and drawbacks >>>> - Rationale and alternatives >>>> - Estimate Time to Deliver if possible. >>>> >>>> The way i think this can be is a Github issue where member of the >>> community >>>> will interact via comments and the author will be updating the >>> description >>>> in the light of comments provided by the community. >>>> >>>> During and near the end of the design discussions the author/s can >> start >>>> writing POCs to help guide the review process this naturally will be a >>> Pull >>>> request with actual code. >>>> >>>> *Now the most important thing is that we need to agree that any work >> that >>>> does not align with this formal process will be ignored and the author >>> will >>>> be asked to start with a DIP* >>>> *That is what i meant with “If it didn’t happen on the mailing list, >> it >>>> didn’t happen.”* >>>> >>>> Thanks and happy coding! >>>> >>>> On Wed, Jan 2, 2019 at 6:47 PM Gian Merlino <g...@apache.org> wrote: >>>> >>>>> One of the advantages I see with a more formal process is (like Kafka >>>> KIPs) >>>>> is that it levels the playing field a bit and sets some ground rules >>> for >>>>> working together. In a way it can help encourage contributions by >>> making >>>> it >>>>> clear what is expected of potential contributors. >>>>> >>>>> We have a design review process today that is not as formal as KIPs, >>> but >>>> is >>>>> somewhat heavier than the one you describe. Maybe we could tweak our >>>>> current one by starting to do design reviews separately from PRs. >> i.e., >>>> for >>>>> anything that meets our 'design review' criteria, do that on the dev >>> list >>>>> or in a separate issue, and keep the PR focused on code-level stuff. >>> That >>>>> way we don't end up trying to do both at once. And it makes it easier >>> to >>>>> start talking about design before the code is ready, which would be >>>> better. >>>>> >>>>> On Wed, Jan 2, 2019 at 6:10 PM Julian Hyde <jh...@apache.org> wrote: >>>>> >>>>>> It’s really hard to say no to a contribution when someone has put >> in >>> a >>>>>> significant amount of work. >>>>>> >>>>>> The following approach is simple and works really well: Before you >>>> start >>>>>> work, log a case, describing the problem. When you have some ideas >>>> about >>>>>> design, add those to the case. When you have a code branch, add its >>> URL >>>>> to >>>>>> the case. And so forth. At any point in the proceedings, people can >>>> chime >>>>>> in with their opinions. >>>>>> >>>>>> In my opinion, a formal “design review” process is not necessary. >>> Just >>>>>> build consensus iteratively, by starting the conversation early in >>> the >>>>>> process. >>>>>> >>>>>> Julian >>>>>> >>>>>> >>>>>>> On Jan 2, 2019, at 12:37 PM, Gian Merlino <g...@apache.org> >> wrote: >>>>>>> >>>>>>> In this particular case: please consider the PR as a proposal. >>> Don't >>>>> feel >>>>>>> like just because there is code there that takes a certain >>> approach, >>>>> that >>>>>>> the approach is somehow sacred. I had to implement something to >>>>>> crystallize >>>>>>> my own thinking about how the problem could be approached. I >> won't >>> be >>>>>>> disappointed if, as a community, we decide a different direction >> is >>>>>> better >>>>>>> and the code all gets thrown away. That's one of the reasons >> that I >>>>>> removed >>>>>>> the 0.14.0 milestone that was added to the patch. (I don't want >> to >>>> rush >>>>>> it, >>>>>>> nor do I think that's a good idea.) >>>>>>> >>>>>>> In general: Sounds like we could do with some more formalization >>>> around >>>>>>> what a proposal looks like, which sorts of changes need one, and >>> when >>>>> in >>>>>>> the dev cycle it is appropriate. FWIW I think Kafka's process is >>> more >>>>> or >>>>>>> less fine, and would be okay with adopting it for Druid if people >>>> like >>>>>> it. >>>>>>> Right now our standards for what requires a "design review" are >>> very >>>>>>> similar to the Kafka community standards for what requires a KIP, >>> so >>>> we >>>>>>> have some familiarity with those concepts. However we don't >>> separate >>>> PR >>>>>>> review and proposal discussion as strictly as they do, which >> seems >>> to >>>>> be >>>>>>> the foundation for the feeling of exclusion that is being felt >>> here. >>>>>>> >>>>>>> Separately: I just redid the description on >>>>>>> https://github.com/apache/incubator-druid/pull/6794 to be more >>>>>> proposal-y. >>>>>>> I followed the KIP style: >>>>>>> >>>>>> >>>>> >>>> >>> >> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals >>>>>> . >>>>>>> Please refresh the page and see if it looks more useful. >>>>>>> >>>>>>> Gian >>>>>>> >>>>>>> On Wed, Jan 2, 2019 at 10:52 AM Julian Hyde <jh...@apache.org> >>>> wrote: >>>>>>> >>>>>>>> Slim, >>>>>>>> >>>>>>>> I agree with your points that offline development is bad for >>>>> community. >>>>>>>> But I don’t think you need much mentor help. You have raised >> valid >>>>>> issues >>>>>>>> and the Druid community needs to decide what its development >>>> practices >>>>>>>> should be. >>>>>>>> >>>>>>>> Julian >>>>>>>> >>>>>>>> >>>>>>>>> On Jan 2, 2019, at 10:29 AM, Slim Bouguerra <bs...@apache.org> >>>>> wrote: >>>>>>>>> >>>>>>>>> Hello everyone and hope you all have very good holidays. >>>>>>>>> >>>>>>>>> First, this email is not directed on the author or the PR >>>>>>>>> https://github.com/apache/incubator-druid/pull/6794 it self, >>> but >>>> i >>>>>> see >>>>>>>>> this PR as a perfect example. >>>>>>>>> >>>>>>>>> One of the foundation of Apache Way or what i would simply call >>>> open >>>>>>>> source >>>>>>>>> community driven development is that "Technical decisions are >>>>>> discussed, >>>>>>>>> decided, and archived publicly. >>>>>>>>> developpement" >>>>>>>>> Which means that big technical changes such as the one brought >>> by >>>>>> #/6794 >>>>>>>>> should have started as a proposal and round of discussions >> about >>>> the >>>>>>>> major >>>>>>>>> changes designs not as 11K line of code. >>>>>>>>> I believe such openness will promote a lot of good benefits >> such >>>> as: >>>>>>>>> >>>>>>>>> - ensures community health and growth. >>>>>>>>> - ensures everyone can participate not only the authors and his >>>>>>>> co-workers. >>>>>>>>> - ensures that the project is driven by the community and not a >>>> given >>>>>>>>> company or an individual. >>>>>>>>> - ensures that there is consensus (not saying 100% agreement;) >>>>> however >>>>>> it >>>>>>>>> means that all individuals will accept the current progress on >>> the >>>>>>>> project >>>>>>>>> until some better proposal is put forth. >>>>>>>>> >>>>>>>>> Personally such BIG offline PR makes me feel excluded and >> doesn't >>>>> give >>>>>>>> me a >>>>>>>>> sense that i belong to a community at all. >>>>>>>>> >>>>>>>>> To prevent such off list development i think as a Druid >> Community >>>> we >>>>>> need >>>>>>>>> to stick to the apache way “If it didn’t happen on the mailing >>>> list, >>>>> it >>>>>>>>> didn’t happen.” >>>>>>>>> >>>>>>>>> I would appreciate if some of the Apache mentor help with this. >>>>>>>>> Thanks >>>>>>>> >>>>>>>> >>>>>>>> >>>> --------------------------------------------------------------------- >>>>>>>> To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org >>>>>>>> For additional commands, e-mail: dev-h...@druid.apache.org >>>>>>>> >>>>>>>> >>>>>> >>>>>> >>>>>> >> --------------------------------------------------------------------- >>>>>> To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org >>>>>> For additional commands, e-mail: dev-h...@druid.apache.org >>>>>> >>>>>> >>>>> >>>> >>> >> >> >> -- >> >> B-Slim >> _______/\/\/\_______/\/\/\_______/\/\/\_______/\/\/\_______/\/\/\_______ >> --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org For additional commands, e-mail: dev-h...@druid.apache.org