Apache has a procedure for major pieces of code that are developed off-list. It 
is called “IP clearance”[1]. It’s fairly onerous, and I believe it would apply 
even if the people who developed the code were committers to the project.

So, another good reason to develop code on-list.

Julian

[1] http://incubator.apache.org/ip-clearance/

> On Jan 3, 2019, at 5:35 PM, Jonathan Wei <jon...@apache.org> wrote:
> 
> Thanks for raising these concerns!
> 
> My initial thoughts:
> - I agree that separation of design review and code-level review for major
> changes would be more efficient
> 
> - I agree that a clear, more formalized process for handling major changes
> would be helpful for contributors:
>  - Define what is considered a major change
>  - Define a standard proposal structure, KIP-style proposal format sounds
> good to me
> 
> - I think it's too rigid to have a policy of "no code at all with the
> initial proposal"
>  - Code samples can be useful references for understanding aspects of a
> design
>  - In some cases it's necessary to run experiments to fully understand a
> problem and determine an appropriate design, or to determine whether
> something is even worth doing before committing to the work of fleshing out
> a proposal, prototype code is a natural outcome of that and I'm not against
> someone providing such code for reference
>  - I tend to view design/code as things that are often developed
> simultaneously in an intertwined way
> 
>> Let's not be naive this is very rare that a contributor will accept that
> his work is to be thrown, usually devs takes coding as personal creation
> and they get attached to it.
> 
> If we have a clear review process that emphasizes the need for early
> consensus building, with separate design and code review, then I feel we've
> done enough and don't need a hard rule against having some code linked with
> the initial proposal. If a potential contributor then still wants to go
> ahead and write a lot of code that may be rejected or change significantly,
> the risks were made clear.
> 
>> Once code is written hard to think abstract.
> 
> I can see the validity of the concern, but I personally don't see it as a
> huge risk. My impression from the Druid PR reviews I've seen is that our
> reviewers are able to keep abstract design vs. implementation details
> separate and consider alternate designs when reviewing.
> 
> To summarize I think it's probably enough to have a policy along the lines
> of:
> - Create more formalized guidelines for proposals and what changes require
> proposals
> - Separate design and code review for major changes, with design review
> first, code-level review after reaching consensus on the design.
> - Code before the design review is completed is just for reference, not
> regarded as a candidate for review/merging.
> 
> - Jon
> 
> 
> On Thu, Jan 3, 2019 at 12:48 PM Slim Bouguerra <slim.bougue...@gmail.com>
> wrote:
> 
>> On Thu, Jan 3, 2019 at 12:16 PM Clint Wylie <clint.wy...@imply.io> wrote:
>> 
>>> I am definitely biased in this matter as an owner of another large PR
>> that
>>> wasn't preceded by a direct proposal or dev list discussion, and in
>> general
>>> I agree that proposal first is usually better, but I think in some rarer
>>> cases approaching a problem code first *is* the most appropriate way to
>>> have a discussion.
>> 
>> 
>> I am wondering here what is the case where code first is better?
>> In general when you are writing code you have an idea about what you want
>> to change, why you want to change and why you want to change it.
>> I do not see what is wrong with sharing this primitive ideas and thoughts
>> as an abstract proposal (at least to avoid overlapping)
>> 
>> I see nothing wrong with it so long as the author
>>> accepts that the PR is treated as a combined proposal and proof of
>> concept,
>>> and fair game to be radically changed via discussion or even rejected,
>>> which sounds like Gian's attitude on the matter and is mine as well with
>> my
>>> compression stuff.
>> 
>> 
>> Let's not be naive this is very rare that a contributor will accept that
>> his work is to be thrown, usually devs takes coding as personal creation
>> and they get attached to it.
>> To my point you can take a look on some old issue in the Druid forum
>> https://github.com/apache/incubator-druid/pull/3755#issuecomment-265667690
>> and am sure other communities have similar problems.
>> So leaving the door open to some side cases is not a good idea in my
>> opinion and will lead to similar issue in the future.
>> 
>> This seems to me especially likely to happen in cases
>>> where an approach still needs proven to be a viable idea *to the author*,
>>> so that a much more productive discussion can be had in the first place.
>>> 
>>> I think there is a trade off, I don't think we want to discourage
>>> experimentation by walling it off behind mandatory discussions before it
>>> can even start, but I do think formalizing the process for large changes
>> is
>>> a good thing, especially since we probably can't expect the wider
>> community
>>> to have the same attitude towards a large PR getting discarded as a
>>> committer might. I think the Kafka approach is reasonable, a bit more
>>> formal than our design review process but not overbearing.
>> 
>> 
>> Can you please explain what is overbearing ? what can be changed to make it
>> easy ?
>> Most of the points are kind of the actual questions that you want to
>> address before hand anyway isn't ?
>> 
>> 
>>> Going code first
>>> should be in general discouraged, but when it does happen, it seems like
>>> opening DIP/an issue/starting a mailing list thread or whatever we go
>> with
>>> to have a more high level design discussion alongside the reference PR
>>> could alleviate some of these complaints?
>> 
>> 
>> What are the complaints ?
>> 
>> 
>>> +1 for "DIP" heh, I think making
>>> them in the form of github issues is probably appropriate, with a dev
>> list
>>> thread to announce them perhaps?
>>> 
>> 
>> I think  github issue with [Proposal] header like
>> https://github.com/apache/incubator-druid/issues/4349 is good to me,
>> 
>> Thanks!
>> 
>> 
>>> On Thu, Jan 3, 2019 at 10:28 AM Slim Bouguerra <bs...@apache.org> wrote:
>>> 
>>>> Thanks everyone for interacting with this thread.
>>>> 
>>>> The fact that i, Roman, Jihoon  and others in the past (FJ
>>>> 
>>> 
>> https://groups.google.com/forum/#!msg/druid-user/gkUEsAYIfBA/6B2GJdLkAgAJ)
>>>> raised this point indicates that PRs without a proposal are indeed an
>>> issue
>>>> and we need to solve it.
>>>> 
>>>> Something Similar to KIP maybe called DIPs is fine with me.
>>>> What i strive to see is the following:
>>>> 
>>>> [Step 1] formalize what is the kind of work that needs a formal
>>> Proposal, I
>>>> think Roman and Jihoon has already covered that pretty well. am +1 on
>>> that.
>>>> 
>>>> 
>>> 
>> https://lists.apache.org/thread.html/9b30d893bdb6bb633cf6a9a700183ffb5b98f115330531a55328ac77@%3Cdev.druid.apache.org%3E
>>>> 
>>>> I am strongly in favor of the separation of Proposal Review and (later)
>>>> Code review PRs. My  main reasons:
>>>> Most importantly code reviewing will introduce lot of noise and will
>>>> ultimately make  the GitHub page unreadable.
>>>> Avoid overlapping of work.
>>>> Once code is written hard to think abstract.
>>>> Separate page for Design review later can always be used it as a Design
>>>> document that is readable and code free-ish.
>>>> As i said the goal of this first round is to see if the community agree
>>>> about such change, then make the process of design more inclusive thus
>>>> other contributors can submit a counter proposals.
>>>> 
>>>> [Step 2] IF everybody agree about that point Step 2 is to define which
>>>> medium is used to Publish a primitive form of a CODE FREE Abstract
>>> Proposal
>>>> containing at least the following bullet points.
>>>> - The problem description and motivation
>>>> - Overview of the proposed change
>>>> - Operational impact (compatibility/ plans to upgrades) public API
>>> changes,
>>>> configuration changes, algorithm, and so on
>>>> - Expected benefits and drawbacks
>>>> - Rationale and alternatives
>>>> - Estimate Time to Deliver if possible.
>>>> 
>>>> The way i think this can be is a Github issue where member of the
>>> community
>>>> will interact via comments and the author will be updating the
>>> description
>>>> in the light of comments provided by the community.
>>>> 
>>>> During and near the end of the design discussions the author/s can
>> start
>>>> writing POCs to help guide the review process this naturally will be a
>>> Pull
>>>> request with actual code.
>>>> 
>>>> *Now the most important thing is that we need to agree that any work
>> that
>>>> does not align with this formal process will be ignored and the author
>>> will
>>>> be asked to start with a DIP*
>>>> *That is what i meant with  “If it didn’t happen on the mailing list,
>> it
>>>> didn’t happen.”*
>>>> 
>>>> Thanks and happy coding!
>>>> 
>>>> On Wed, Jan 2, 2019 at 6:47 PM Gian Merlino <g...@apache.org> wrote:
>>>> 
>>>>> One of the advantages I see with a more formal process is (like Kafka
>>>> KIPs)
>>>>> is that it levels the playing field a bit and sets some ground rules
>>> for
>>>>> working together. In a way it can help encourage contributions by
>>> making
>>>> it
>>>>> clear what is expected of potential contributors.
>>>>> 
>>>>> We have a design review process today that is not as formal as KIPs,
>>> but
>>>> is
>>>>> somewhat heavier than the one you describe. Maybe we could tweak our
>>>>> current one by starting to do design reviews separately from PRs.
>> i.e.,
>>>> for
>>>>> anything that meets our 'design review' criteria, do that on the dev
>>> list
>>>>> or in a separate issue, and keep the PR focused on code-level stuff.
>>> That
>>>>> way we don't end up trying to do both at once. And it makes it easier
>>> to
>>>>> start talking about design before the code is ready, which would be
>>>> better.
>>>>> 
>>>>> On Wed, Jan 2, 2019 at 6:10 PM Julian Hyde <jh...@apache.org> wrote:
>>>>> 
>>>>>> It’s really hard to say no to a contribution when someone has put
>> in
>>> a
>>>>>> significant amount of work.
>>>>>> 
>>>>>> The following approach is simple and works really well: Before you
>>>> start
>>>>>> work, log a case, describing the problem. When you have some ideas
>>>> about
>>>>>> design, add those to the case. When you have a code branch, add its
>>> URL
>>>>> to
>>>>>> the case. And so forth. At any point in the proceedings, people can
>>>> chime
>>>>>> in with their opinions.
>>>>>> 
>>>>>> In my opinion, a formal “design review” process is not necessary.
>>> Just
>>>>>> build consensus iteratively, by starting the conversation early in
>>> the
>>>>>> process.
>>>>>> 
>>>>>> Julian
>>>>>> 
>>>>>> 
>>>>>>> On Jan 2, 2019, at 12:37 PM, Gian Merlino <g...@apache.org>
>> wrote:
>>>>>>> 
>>>>>>> In this particular case: please consider the PR as a proposal.
>>> Don't
>>>>> feel
>>>>>>> like just because there is code there that takes a certain
>>> approach,
>>>>> that
>>>>>>> the approach is somehow sacred. I had to implement something to
>>>>>> crystallize
>>>>>>> my own thinking about how the problem could be approached. I
>> won't
>>> be
>>>>>>> disappointed if, as a community, we decide a different direction
>> is
>>>>>> better
>>>>>>> and the code all gets thrown away. That's one of the reasons
>> that I
>>>>>> removed
>>>>>>> the 0.14.0 milestone that was added to the patch. (I don't want
>> to
>>>> rush
>>>>>> it,
>>>>>>> nor do I think that's a good idea.)
>>>>>>> 
>>>>>>> In general: Sounds like we could do with some more formalization
>>>> around
>>>>>>> what a proposal looks like, which sorts of changes need one, and
>>> when
>>>>> in
>>>>>>> the dev cycle it is appropriate. FWIW I think Kafka's process is
>>> more
>>>>> or
>>>>>>> less fine, and would be okay with adopting it for Druid if people
>>>> like
>>>>>> it.
>>>>>>> Right now our standards for what requires a "design review" are
>>> very
>>>>>>> similar to the Kafka community standards for what requires a KIP,
>>> so
>>>> we
>>>>>>> have some familiarity with those concepts. However we don't
>>> separate
>>>> PR
>>>>>>> review and proposal discussion as strictly as they do, which
>> seems
>>> to
>>>>> be
>>>>>>> the foundation for the feeling of exclusion that is being felt
>>> here.
>>>>>>> 
>>>>>>> Separately: I just redid the description on
>>>>>>> https://github.com/apache/incubator-druid/pull/6794 to be more
>>>>>> proposal-y.
>>>>>>> I followed the KIP style:
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
>> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals
>>>>>> .
>>>>>>> Please refresh the page and see if it looks more useful.
>>>>>>> 
>>>>>>> Gian
>>>>>>> 
>>>>>>> On Wed, Jan 2, 2019 at 10:52 AM Julian Hyde <jh...@apache.org>
>>>> wrote:
>>>>>>> 
>>>>>>>> Slim,
>>>>>>>> 
>>>>>>>> I agree with your points that offline development is bad for
>>>>> community.
>>>>>>>> But I don’t think you need much mentor help. You have raised
>> valid
>>>>>> issues
>>>>>>>> and the Druid community needs to decide what its development
>>>> practices
>>>>>>>> should be.
>>>>>>>> 
>>>>>>>> Julian
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> On Jan 2, 2019, at 10:29 AM, Slim Bouguerra <bs...@apache.org>
>>>>> wrote:
>>>>>>>>> 
>>>>>>>>> Hello everyone and hope you all have very good holidays.
>>>>>>>>> 
>>>>>>>>> First, this email is not directed on the author or the PR
>>>>>>>>> https://github.com/apache/incubator-druid/pull/6794  it self,
>>> but
>>>> i
>>>>>> see
>>>>>>>>> this PR as a perfect example.
>>>>>>>>> 
>>>>>>>>> One of the foundation of Apache Way or what i would simply call
>>>> open
>>>>>>>> source
>>>>>>>>> community driven development is that "Technical decisions are
>>>>>> discussed,
>>>>>>>>> decided, and archived publicly.
>>>>>>>>> developpement"
>>>>>>>>> Which means that big technical  changes such as the one brought
>>> by
>>>>>> #/6794
>>>>>>>>> should have started as a proposal and round of discussions
>> about
>>>> the
>>>>>>>> major
>>>>>>>>> changes designs not as 11K line of code.
>>>>>>>>> I believe such openness will promote a lot of good benefits
>> such
>>>> as:
>>>>>>>>> 
>>>>>>>>> - ensures community health and growth.
>>>>>>>>> - ensures everyone can participate not only the authors and his
>>>>>>>> co-workers.
>>>>>>>>> - ensures that the project is driven by the community and not a
>>>> given
>>>>>>>>> company or an individual.
>>>>>>>>> - ensures that there is consensus (not saying 100% agreement;)
>>>>> however
>>>>>> it
>>>>>>>>> means that all individuals will accept the current progress on
>>> the
>>>>>>>> project
>>>>>>>>> until some better proposal is put forth.
>>>>>>>>> 
>>>>>>>>> Personally such BIG offline PR makes me feel excluded and
>> doesn't
>>>>> give
>>>>>>>> me a
>>>>>>>>> sense that i belong to  a community at all.
>>>>>>>>> 
>>>>>>>>> To prevent such off list development i think as a Druid
>> Community
>>>> we
>>>>>> need
>>>>>>>>> to stick to the apache way “If it didn’t happen on the mailing
>>>> list,
>>>>> it
>>>>>>>>> didn’t happen.”
>>>>>>>>> 
>>>>>>>>> I would appreciate if some of the Apache mentor help with this.
>>>>>>>>> Thanks
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>> ---------------------------------------------------------------------
>>>>>>>> To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
>>>>>>>> For additional commands, e-mail: dev-h...@druid.apache.org
>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
>>>>>> For additional commands, e-mail: dev-h...@druid.apache.org
>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
>> 
>> 
>> --
>> 
>> B-Slim
>> _______/\/\/\_______/\/\/\_______/\/\/\_______/\/\/\_______/\/\/\_______
>> 


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
For additional commands, e-mail: dev-h...@druid.apache.org

Reply via email to