Small contributions don’t need any design review, whereas large contributions need significant review. I don’t think we should require an additional step for those (many) small contributions. But who decides whether a contribution fits into the small or large category?
I think the solution is for authors to log a case (or send an email to dev) before they start work on any contribution. Then committers can request a more heavy-weight process if they think it is needed. Julian > On Jan 7, 2019, at 11:24 AM, Gian Merlino <g...@apache.org> wrote: > > It sounds like splitting design from code review is a common theme in a few > of the posts here. How does everyone feel about making a point of > encouraging design reviews to be done as issues, separate from the pull > request, with the expectations that (1) the design review issue > ("proposal") should generally appear somewhat _before_ the pull request; > (2) pull requests should _not_ have design review happen on them, meaning > there should no longer be PRs with design review tags, and we should move > the design review approval process to the issue rather than the PR. > > For (1), even if we encourage design review discussions to start before a > pull request appears, I don't see an issue with them running concurrently > for a while at some point. > > On Thu, Jan 3, 2019 at 5:35 PM Jonathan Wei <jon...@apache.org> wrote: > >> Thanks for raising these concerns! >> >> My initial thoughts: >> - I agree that separation of design review and code-level review for major >> changes would be more efficient >> >> - I agree that a clear, more formalized process for handling major changes >> would be helpful for contributors: >> - Define what is considered a major change >> - Define a standard proposal structure, KIP-style proposal format sounds >> good to me >> >> - I think it's too rigid to have a policy of "no code at all with the >> initial proposal" >> - Code samples can be useful references for understanding aspects of a >> design >> - In some cases it's necessary to run experiments to fully understand a >> problem and determine an appropriate design, or to determine whether >> something is even worth doing before committing to the work of fleshing out >> a proposal, prototype code is a natural outcome of that and I'm not against >> someone providing such code for reference >> - I tend to view design/code as things that are often developed >> simultaneously in an intertwined way >> >>> Let's not be naive this is very rare that a contributor will accept that >> his work is to be thrown, usually devs takes coding as personal creation >> and they get attached to it. >> >> If we have a clear review process that emphasizes the need for early >> consensus building, with separate design and code review, then I feel we've >> done enough and don't need a hard rule against having some code linked with >> the initial proposal. If a potential contributor then still wants to go >> ahead and write a lot of code that may be rejected or change significantly, >> the risks were made clear. >> >>> Once code is written hard to think abstract. >> >> I can see the validity of the concern, but I personally don't see it as a >> huge risk. My impression from the Druid PR reviews I've seen is that our >> reviewers are able to keep abstract design vs. implementation details >> separate and consider alternate designs when reviewing. >> >> To summarize I think it's probably enough to have a policy along the lines >> of: >> - Create more formalized guidelines for proposals and what changes require >> proposals >> - Separate design and code review for major changes, with design review >> first, code-level review after reaching consensus on the design. >> - Code before the design review is completed is just for reference, not >> regarded as a candidate for review/merging. >> >> - Jon >> >> >> On Thu, Jan 3, 2019 at 12:48 PM Slim Bouguerra <slim.bougue...@gmail.com> >> wrote: >> >>> On Thu, Jan 3, 2019 at 12:16 PM Clint Wylie <clint.wy...@imply.io> >> wrote: >>> >>>> I am definitely biased in this matter as an owner of another large PR >>> that >>>> wasn't preceded by a direct proposal or dev list discussion, and in >>> general >>>> I agree that proposal first is usually better, but I think in some >> rarer >>>> cases approaching a problem code first *is* the most appropriate way to >>>> have a discussion. >>> >>> >>> I am wondering here what is the case where code first is better? >>> In general when you are writing code you have an idea about what you want >>> to change, why you want to change and why you want to change it. >>> I do not see what is wrong with sharing this primitive ideas and thoughts >>> as an abstract proposal (at least to avoid overlapping) >>> >>> I see nothing wrong with it so long as the author >>>> accepts that the PR is treated as a combined proposal and proof of >>> concept, >>>> and fair game to be radically changed via discussion or even rejected, >>>> which sounds like Gian's attitude on the matter and is mine as well >> with >>> my >>>> compression stuff. >>> >>> >>> Let's not be naive this is very rare that a contributor will accept that >>> his work is to be thrown, usually devs takes coding as personal creation >>> and they get attached to it. >>> To my point you can take a look on some old issue in the Druid forum >>> >> https://github.com/apache/incubator-druid/pull/3755#issuecomment-265667690 >>> and am sure other communities have similar problems. >>> So leaving the door open to some side cases is not a good idea in my >>> opinion and will lead to similar issue in the future. >>> >>> This seems to me especially likely to happen in cases >>>> where an approach still needs proven to be a viable idea *to the >> author*, >>>> so that a much more productive discussion can be had in the first >> place. >>>> >>>> I think there is a trade off, I don't think we want to discourage >>>> experimentation by walling it off behind mandatory discussions before >> it >>>> can even start, but I do think formalizing the process for large >> changes >>> is >>>> a good thing, especially since we probably can't expect the wider >>> community >>>> to have the same attitude towards a large PR getting discarded as a >>>> committer might. I think the Kafka approach is reasonable, a bit more >>>> formal than our design review process but not overbearing. >>> >>> >>> Can you please explain what is overbearing ? what can be changed to make >> it >>> easy ? >>> Most of the points are kind of the actual questions that you want to >>> address before hand anyway isn't ? >>> >>> >>>> Going code first >>>> should be in general discouraged, but when it does happen, it seems >> like >>>> opening DIP/an issue/starting a mailing list thread or whatever we go >>> with >>>> to have a more high level design discussion alongside the reference PR >>>> could alleviate some of these complaints? >>> >>> >>> What are the complaints ? >>> >>> >>>> +1 for "DIP" heh, I think making >>>> them in the form of github issues is probably appropriate, with a dev >>> list >>>> thread to announce them perhaps? >>>> >>> >>> I think github issue with [Proposal] header like >>> https://github.com/apache/incubator-druid/issues/4349 is good to me, >>> >>> Thanks! >>> >>> >>>> On Thu, Jan 3, 2019 at 10:28 AM Slim Bouguerra <bs...@apache.org> >> wrote: >>>> >>>>> Thanks everyone for interacting with this thread. >>>>> >>>>> The fact that i, Roman, Jihoon and others in the past (FJ >>>>> >>>> >>> >> https://groups.google.com/forum/#!msg/druid-user/gkUEsAYIfBA/6B2GJdLkAgAJ) >>>>> raised this point indicates that PRs without a proposal are indeed an >>>> issue >>>>> and we need to solve it. >>>>> >>>>> Something Similar to KIP maybe called DIPs is fine with me. >>>>> What i strive to see is the following: >>>>> >>>>> [Step 1] formalize what is the kind of work that needs a formal >>>> Proposal, I >>>>> think Roman and Jihoon has already covered that pretty well. am +1 on >>>> that. >>>>> >>>>> >>>> >>> >> https://lists.apache.org/thread.html/9b30d893bdb6bb633cf6a9a700183ffb5b98f115330531a55328ac77@%3Cdev.druid.apache.org%3E >>>>> >>>>> I am strongly in favor of the separation of Proposal Review and >> (later) >>>>> Code review PRs. My main reasons: >>>>> Most importantly code reviewing will introduce lot of noise and will >>>>> ultimately make the GitHub page unreadable. >>>>> Avoid overlapping of work. >>>>> Once code is written hard to think abstract. >>>>> Separate page for Design review later can always be used it as a >> Design >>>>> document that is readable and code free-ish. >>>>> As i said the goal of this first round is to see if the community >> agree >>>>> about such change, then make the process of design more inclusive >> thus >>>>> other contributors can submit a counter proposals. >>>>> >>>>> [Step 2] IF everybody agree about that point Step 2 is to define >> which >>>>> medium is used to Publish a primitive form of a CODE FREE Abstract >>>> Proposal >>>>> containing at least the following bullet points. >>>>> - The problem description and motivation >>>>> - Overview of the proposed change >>>>> - Operational impact (compatibility/ plans to upgrades) public API >>>> changes, >>>>> configuration changes, algorithm, and so on >>>>> - Expected benefits and drawbacks >>>>> - Rationale and alternatives >>>>> - Estimate Time to Deliver if possible. >>>>> >>>>> The way i think this can be is a Github issue where member of the >>>> community >>>>> will interact via comments and the author will be updating the >>>> description >>>>> in the light of comments provided by the community. >>>>> >>>>> During and near the end of the design discussions the author/s can >>> start >>>>> writing POCs to help guide the review process this naturally will be >> a >>>> Pull >>>>> request with actual code. >>>>> >>>>> *Now the most important thing is that we need to agree that any work >>> that >>>>> does not align with this formal process will be ignored and the >> author >>>> will >>>>> be asked to start with a DIP* >>>>> *That is what i meant with “If it didn’t happen on the mailing list, >>> it >>>>> didn’t happen.”* >>>>> >>>>> Thanks and happy coding! >>>>> >>>>> On Wed, Jan 2, 2019 at 6:47 PM Gian Merlino <g...@apache.org> wrote: >>>>> >>>>>> One of the advantages I see with a more formal process is (like >> Kafka >>>>> KIPs) >>>>>> is that it levels the playing field a bit and sets some ground >> rules >>>> for >>>>>> working together. In a way it can help encourage contributions by >>>> making >>>>> it >>>>>> clear what is expected of potential contributors. >>>>>> >>>>>> We have a design review process today that is not as formal as >> KIPs, >>>> but >>>>> is >>>>>> somewhat heavier than the one you describe. Maybe we could tweak >> our >>>>>> current one by starting to do design reviews separately from PRs. >>> i.e., >>>>> for >>>>>> anything that meets our 'design review' criteria, do that on the >> dev >>>> list >>>>>> or in a separate issue, and keep the PR focused on code-level >> stuff. >>>> That >>>>>> way we don't end up trying to do both at once. And it makes it >> easier >>>> to >>>>>> start talking about design before the code is ready, which would be >>>>> better. >>>>>> >>>>>> On Wed, Jan 2, 2019 at 6:10 PM Julian Hyde <jh...@apache.org> >> wrote: >>>>>> >>>>>>> It’s really hard to say no to a contribution when someone has put >>> in >>>> a >>>>>>> significant amount of work. >>>>>>> >>>>>>> The following approach is simple and works really well: Before >> you >>>>> start >>>>>>> work, log a case, describing the problem. When you have some >> ideas >>>>> about >>>>>>> design, add those to the case. When you have a code branch, add >> its >>>> URL >>>>>> to >>>>>>> the case. And so forth. At any point in the proceedings, people >> can >>>>> chime >>>>>>> in with their opinions. >>>>>>> >>>>>>> In my opinion, a formal “design review” process is not necessary. >>>> Just >>>>>>> build consensus iteratively, by starting the conversation early >> in >>>> the >>>>>>> process. >>>>>>> >>>>>>> Julian >>>>>>> >>>>>>> >>>>>>>> On Jan 2, 2019, at 12:37 PM, Gian Merlino <g...@apache.org> >>> wrote: >>>>>>>> >>>>>>>> In this particular case: please consider the PR as a proposal. >>>> Don't >>>>>> feel >>>>>>>> like just because there is code there that takes a certain >>>> approach, >>>>>> that >>>>>>>> the approach is somehow sacred. I had to implement something to >>>>>>> crystallize >>>>>>>> my own thinking about how the problem could be approached. I >>> won't >>>> be >>>>>>>> disappointed if, as a community, we decide a different >> direction >>> is >>>>>>> better >>>>>>>> and the code all gets thrown away. That's one of the reasons >>> that I >>>>>>> removed >>>>>>>> the 0.14.0 milestone that was added to the patch. (I don't want >>> to >>>>> rush >>>>>>> it, >>>>>>>> nor do I think that's a good idea.) >>>>>>>> >>>>>>>> In general: Sounds like we could do with some more >> formalization >>>>> around >>>>>>>> what a proposal looks like, which sorts of changes need one, >> and >>>> when >>>>>> in >>>>>>>> the dev cycle it is appropriate. FWIW I think Kafka's process >> is >>>> more >>>>>> or >>>>>>>> less fine, and would be okay with adopting it for Druid if >> people >>>>> like >>>>>>> it. >>>>>>>> Right now our standards for what requires a "design review" are >>>> very >>>>>>>> similar to the Kafka community standards for what requires a >> KIP, >>>> so >>>>> we >>>>>>>> have some familiarity with those concepts. However we don't >>>> separate >>>>> PR >>>>>>>> review and proposal discussion as strictly as they do, which >>> seems >>>> to >>>>>> be >>>>>>>> the foundation for the feeling of exclusion that is being felt >>>> here. >>>>>>>> >>>>>>>> Separately: I just redid the description on >>>>>>>> https://github.com/apache/incubator-druid/pull/6794 to be more >>>>>>> proposal-y. >>>>>>>> I followed the KIP style: >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals >>>>>>> . >>>>>>>> Please refresh the page and see if it looks more useful. >>>>>>>> >>>>>>>> Gian >>>>>>>> >>>>>>>> On Wed, Jan 2, 2019 at 10:52 AM Julian Hyde <jh...@apache.org> >>>>> wrote: >>>>>>>> >>>>>>>>> Slim, >>>>>>>>> >>>>>>>>> I agree with your points that offline development is bad for >>>>>> community. >>>>>>>>> But I don’t think you need much mentor help. You have raised >>> valid >>>>>>> issues >>>>>>>>> and the Druid community needs to decide what its development >>>>> practices >>>>>>>>> should be. >>>>>>>>> >>>>>>>>> Julian >>>>>>>>> >>>>>>>>> >>>>>>>>>> On Jan 2, 2019, at 10:29 AM, Slim Bouguerra < >> bs...@apache.org> >>>>>> wrote: >>>>>>>>>> >>>>>>>>>> Hello everyone and hope you all have very good holidays. >>>>>>>>>> >>>>>>>>>> First, this email is not directed on the author or the PR >>>>>>>>>> https://github.com/apache/incubator-druid/pull/6794 it >> self, >>>> but >>>>> i >>>>>>> see >>>>>>>>>> this PR as a perfect example. >>>>>>>>>> >>>>>>>>>> One of the foundation of Apache Way or what i would simply >> call >>>>> open >>>>>>>>> source >>>>>>>>>> community driven development is that "Technical decisions are >>>>>>> discussed, >>>>>>>>>> decided, and archived publicly. >>>>>>>>>> developpement" >>>>>>>>>> Which means that big technical changes such as the one >> brought >>>> by >>>>>>> #/6794 >>>>>>>>>> should have started as a proposal and round of discussions >>> about >>>>> the >>>>>>>>> major >>>>>>>>>> changes designs not as 11K line of code. >>>>>>>>>> I believe such openness will promote a lot of good benefits >>> such >>>>> as: >>>>>>>>>> >>>>>>>>>> - ensures community health and growth. >>>>>>>>>> - ensures everyone can participate not only the authors and >> his >>>>>>>>> co-workers. >>>>>>>>>> - ensures that the project is driven by the community and >> not a >>>>> given >>>>>>>>>> company or an individual. >>>>>>>>>> - ensures that there is consensus (not saying 100% >> agreement;) >>>>>> however >>>>>>> it >>>>>>>>>> means that all individuals will accept the current progress >> on >>>> the >>>>>>>>> project >>>>>>>>>> until some better proposal is put forth. >>>>>>>>>> >>>>>>>>>> Personally such BIG offline PR makes me feel excluded and >>> doesn't >>>>>> give >>>>>>>>> me a >>>>>>>>>> sense that i belong to a community at all. >>>>>>>>>> >>>>>>>>>> To prevent such off list development i think as a Druid >>> Community >>>>> we >>>>>>> need >>>>>>>>>> to stick to the apache way “If it didn’t happen on the >> mailing >>>>> list, >>>>>> it >>>>>>>>>> didn’t happen.” >>>>>>>>>> >>>>>>>>>> I would appreciate if some of the Apache mentor help with >> this. >>>>>>>>>> Thanks >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>> --------------------------------------------------------------------- >>>>>>>>> To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org >>>>>>>>> For additional commands, e-mail: dev-h...@druid.apache.org >>>>>>>>> >>>>>>>>> >>>>>>> >>>>>>> >>>>>>> >>> --------------------------------------------------------------------- >>>>>>> To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org >>>>>>> For additional commands, e-mail: dev-h...@druid.apache.org >>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >>> >>> -- >>> >>> B-Slim >>> _______/\/\/\_______/\/\/\_______/\/\/\_______/\/\/\_______/\/\/\_______ >>> >> --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org For additional commands, e-mail: dev-h...@druid.apache.org