Thanks for the great suggestion.

+1 for this proposal.

Regards,
Chiwan Park

> On May 13, 2016, at 1:44 AM, Nick Dimiduk <ndimi...@apache.org> wrote:
> 
> For what it's worth, this is very close to how HBase attempts to manage the
> community load. We break out components (in Jira), with a list of named
> component maintainers. Actually, having components alone has given a Big
> Bang for the buck because when properly labeled, it makes it really easy
> for part-timers to channel their efforts with precision.
> 
> As a Flink user, I'm +1 for this proposal as well :)
> 
> On Thursday, May 12, 2016, Aljoscha Krettek <aljos...@apache.org> wrote:
> 
>> +1
>> 
>> The ideas seem good and the proposed number of components seems reasonable.
>> With this, we should then also clean up the JIRA to make it actually usable.
>> 
>> On Thu, 12 May 2016 at 18:09 Stephan Ewen <se...@apache.org> wrote:
>> 
>>> All maintainer candidates are only proposals so far. No indication of lead
>>> or anything so far.
>>>
>>> Let's first see if we agree on the structure proposed here, and if we take
>>> the components as suggested here or if we refine the list.
>>> On 12.05.2016 at 17:45, "Robert Metzger" <rmetz...@apache.org> wrote:
>>> 
>>>> tl;dr: +1
>>>> 
>>>> I also like the proposal a lot. Our community is growing at quite a fast
>>>> pace and we need to have some structure in place to still keep track of
>>>> everything going on.
>>>>
>>>> I'm happy to see that the proposal mentions cleaning up our JIRA. This is
>>>> something that has been annoying me for quite a while, but it's too big to
>>>> do alone. If maintainers could take care of their components, we should
>>>> already have covered a lot there.
>>>>
>>>> One question regarding the "chair" or "lead" role for components: Is the
>>>> first name in the list of maintainers the lead?
>>>>
>>>> I would actually suggest waiting until all proposed maintainers have agreed
>>>> to the proposal. It doesn't make sense to make somebody a maintainer of
>>>> something if they disagree or are not aware of it.
>>>> 
>>>> 
>>>> 
>>>> 
>>>> On Thu, May 12, 2016 at 2:13 PM, Maximilian Michels <m...@apache.org> wrote:
>>>> 
>>>>> +1 for the initiative. With a better process we will improve the
>>>>> quality of the Flink development and give ourselves more time to focus.
>>>>>
>>>>> Could we have another category "Infrastructure"? This would concern
>>>>> things like CI, nightly deployment of snapshots/documentation, ASF
>>>>> Infra communication. Robert and I could be the initial maintainers
>>>>> for that.
>>>>> 
>>>>> On Thu, May 12, 2016 at 1:52 PM, Stephan Ewen <se...@apache.org> wrote:
>>>>>> Yes, Matthias, that was supposed to be you.
>>>>>> Sorry from another guy who frequently has his name misspelled ;-)
>>>>>> 
>>>>>> On Thu, May 12, 2016 at 1:27 PM, Matthias J. Sax <mj...@apache.org> wrote:
>>>>>> 
>>>>>>> +1 from my side.
>>>>>>> 
>>>>>>> Happy to be the maintainer for Storm-Compatibility (at least I guess
>>>>>>> it's me, even though the correct spelling would be with two 't's :P)
>>>>>>> 
>>>>>>> -Matthias
>>>>>>> 
>>>>>>> On 05/12/2016 12:56 PM, Till Rohrmann wrote:
>>>>>>>> +1 for the proposal
>>>>>>>> On May 12, 2016 12:13 PM, "Stephan Ewen" <se...@apache.org> wrote:
>>>>>>>> 
>>>>>>>>> Yes, Gabor Gevay, that did refer to you!
>>>>>>>>> 
>>>>>>>>> Sorry for the ambiguity...
>>>>>>>>> 
>>>>>>>>> On Thu, May 12, 2016 at 10:46 AM, Márton Balassi <balassi.mar...@gmail.com> wrote:
>>>>>>>>> 
>>>>>>>>>> +1 for the proposal
>>>>>>>>>> @ggevay: I do think that it refers to you. :)
>>>>>>>>>> 
>>>>>>>>>> On Thu, May 12, 2016 at 10:40 AM, Gábor Gévay <gga...@gmail.com> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Hello,
>>>>>>>>>>> 
>>>>>>>>>>> There are at least three Gábors in the Flink community, :) so
>>>>>>>>>>> assuming that the Gábor in the list of maintainers of the DataSet API
>>>>>>>>>>> is referring to me, I'll be happy to do it. :)
>>>>>>>>>>> 
>>>>>>>>>>> Best,
>>>>>>>>>>> Gábor G.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 2016-05-10 11:24 GMT+02:00 Stephan Ewen <se...@apache.org>:
>>>>>>>>>>>> Hi everyone!
>>>>>>>>>>>> 
>>>>>>>>>>>> We propose to establish some lightweight structures in the Flink open
>>>>>>>>>>>> source community and development process, to help us better handle the
>>>>>>>>>>>> increased interest in Flink (mailing list and pull requests), while not
>>>>>>>>>>>> overwhelming the committers, and giving users and contributors a good
>>>>>>>>>>>> experience.
>>>>>>>>>>>>
>>>>>>>>>>>> This proposal is triggered by the observation that we are reaching the
>>>>>>>>>>>> limits of where the current community can support users and guide new
>>>>>>>>>>>> contributors. The proposal below is based on observations and ideas from
>>>>>>>>>>>> Till, Robert, and me.
>>>>>>>>>>>> 
>>>>>>>>>>>> ========
>>>>>>>>>>>> Goals
>>>>>>>>>>>> ========
>>>>>>>>>>>> 
>>>>>>>>>>>> We try to achieve the following:
>>>>>>>>>>>>
>>>>>>>>>>>>  - Pull requests get handled in a timely fashion.
>>>>>>>>>>>>  - New contributors are better integrated into the community.
>>>>>>>>>>>>  - The community feels empowered on the mailing list, but questions that
>>>>>>>>>>>>    need the attention of someone with deep knowledge of a certain part of
>>>>>>>>>>>>    Flink get that attention.
>>>>>>>>>>>>  - At the same time, the committers that are knowledgeable about many core
>>>>>>>>>>>>    parts do not get completely overwhelmed.
>>>>>>>>>>>>  - We don't overlook threads that report critical issues.
>>>>>>>>>>>>  - We always have a pretty good overview of the status of certain parts of
>>>>>>>>>>>>    the system:
>>>>>>>>>>>>      -> What are the frequently encountered known issues
>>>>>>>>>>>>      -> What are the most frequently requested features
>>>>>>>>>>>>
>>>>>>>>>>>> 
>>>>>>>>>>>> ========
>>>>>>>>>>>> Problems
>>>>>>>>>>>> ========
>>>>>>>>>>>> 
>>>>>>>>>>>> Looking into the process, there are two big issues:
>>>>>>>>>>>>
>>>>>>>>>>>> (1) Up to now, we have been relying on the fact that everything just
>>>>>>>>>>>> "organizes itself", driven by best effort. That assumes that everyone feels
>>>>>>>>>>>> equally responsible for every part, question, and contribution. At the
>>>>>>>>>>>> current state, this is impossible to maintain; it overwhelms the committers
>>>>>>>>>>>> and contributors.
>>>>>>>>>>>>
>>>>>>>>>>>> Example: Pull requests are picked up by whoever wants to pick them up. Pull
>>>>>>>>>>>> requests that are a lot of work, have little chance of getting in, or relate
>>>>>>>>>>>> to less active components are sometimes not picked up. When contributors are
>>>>>>>>>>>> pretty loaded already, it may happen that no one eventually feels
>>>>>>>>>>>> responsible to pick up a pull request, and it falls through the cracks.
>>>>>>>>>>>>
>>>>>>>>>>>> (2) There is no good overview of the known shortcomings, efforts, and
>>>>>>>>>>>> requested features for different parts of the system. This information
>>>>>>>>>>>> exists in various people's heads, but is not easily accessible for new
>>>>>>>>>>>> people. The Flink JIRA is not well maintained, so it is not easy to draw
>>>>>>>>>>>> insights from it.
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> ===========
>>>>>>>>>>>> The Proposal
>>>>>>>>>>>> ===========
>>>>>>>>>>>> 
>>>>>>>>>>>> Since we are building a parallel system, the natural solution seems to be:
>>>>>>>>>>>> partition the workload ;-)
>>>>>>>>>>>>
>>>>>>>>>>>> We propose to define a set of components for Flink. Each component is
>>>>>>>>>>>> maintained or tracked by one or more people - let's call them maintainers.
>>>>>>>>>>>> It is important to note that we don't suggest the maintainers as an
>>>>>>>>>>>> authoritative role, but simply as committers or contributors that visibly
>>>>>>>>>>>> step up for a certain component, and mainly track and drive the efforts
>>>>>>>>>>>> pertaining to that component.
>>>>>>>>>>>>
>>>>>>>>>>>> It is also important to realize that we do not want to suggest that people
>>>>>>>>>>>> get less involved with certain parts and components because they are not
>>>>>>>>>>>> the maintainers. We simply want to make sure that each pull request or
>>>>>>>>>>>> question or contribution has in the end one person (or a small set of
>>>>>>>>>>>> people) responsible for catching and tracking it, if it was not worked on
>>>>>>>>>>>> by the pro-active community.
>>>>>>>>>>>>
>>>>>>>>>>>> For some components, having multiple maintainers will be helpful. In that
>>>>>>>>>>>> case, one maintainer should be the "chair" or "lead" and make sure that no
>>>>>>>>>>>> issue of that component gets lost between the multiple maintainers.
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> A maintainer's role is:
>>>>>>>>>>>> -----------------------------
>>>>>>>>>>>>
>>>>>>>>>>>>  - Have an overview of which of the open pull requests relate to their
>>>>>>>>>>>>    component
>>>>>>>>>>>>  - Drive the pull requests relating to the component to resolution
>>>>>>>>>>>>      => Moderate the decision whether the feature should be merged
>>>>>>>>>>>>      => Make sure the pull request gets a shepherd.
>>>>>>>>>>>>           In many cases, the maintainers would shepherd it themselves.
>>>>>>>>>>>>      => In case the shepherd becomes inactive, the maintainers need to
>>>>>>>>>>>>         find a new shepherd.
>>>>>>>>>>>>
>>>>>>>>>>>>  - Have an overview of the known issues of their component
>>>>>>>>>>>>  - Have an overview of the frequently requested features of their
>>>>>>>>>>>>    component
>>>>>>>>>>>>
>>>>>>>>>>>>  - Have an overview of which contributors are doing very good work in
>>>>>>>>>>>>    their component, would be candidates for committers, and should be
>>>>>>>>>>>>    mentored towards that.
>>>>>>>>>>>>
>>>>>>>>>>>>  - Resolve email threads that have been brought to their attention
>>>>>>>>>>>>    because deeper component knowledge is required for that thread.
>>>>>>>>>>>>
>>>>>>>>>>>> A maintainer's role is NOT:
>>>>>>>>>>>> ----------------------------------
>>>>>>>>>>>>
>>>>>>>>>>>>  - Review all pull requests of that component
>>>>>>>>>>>>  - Answer every mail with questions about that component
>>>>>>>>>>>>  - Fix all bugs and implement all features of that component
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> We imagine the following way that the community and the maintainers interact:
>>>>>>>>>>>> -----------------------------------------------------------------------------
>>>>>>>>>>>> 
>>>>>>>>>>>>  - Pull requests should be tagged by component. Since we cannot add labels
>>>>>>>>>>>>    at this point, we need to rely on the following:
>>>>>>>>>>>>     => The pull request opener should name the pull request like
>>>>>>>>>>>>        "[FLINK-XXX] [component] Title"
>>>>>>>>>>>>     => Components can be (re)tagged by adding special comments in the
>>>>>>>>>>>>        pull request ("==> component client")
>>>>>>>>>>>>     => With some luck, GitHub and Apache Infra will allow us to use labels
>>>>>>>>>>>>        at some point
>>>>>>>>>>>>
>>>>>>>>>>>>  - When pull requests are associated with a component, the maintainers
>>>>>>>>>>>>    will manage them (decision whether to add, find shepherd, catch dropped
>>>>>>>>>>>>    pull requests)
>>>>>>>>>>>>
>>>>>>>>>>>>  - We assume that maintainers frequently reach out to other community
>>>>>>>>>>>>    members and ask them if they want to shepherd a pull request.
>>>>>>>>>>>>
>>>>>>>>>>>>  - On the mailing list, everyone should feel equally empowered to answer
>>>>>>>>>>>>    and discuss. If at some point in the discussion, some deep technical
>>>>>>>>>>>>    knowledge about a component is required, the maintainer(s) should be
>>>>>>>>>>>>    drawn into the discussion. Because the mailing list infrastructure has
>>>>>>>>>>>>    no support for tagging threads, here are some simple workarounds:
>>>>>>>>>>>>
>>>>>>>>>>>>    => One possibility is to put the maintainers' mail addresses on cc for
>>>>>>>>>>>>       the thread, so they get the mail not just via the mailing list
>>>>>>>>>>>>    => Another way would be to post something like "+maintainer runtime" in
>>>>>>>>>>>>       the thread and the "runtime" maintainers would have a filter/alert
>>>>>>>>>>>>       on these keywords in their mail program.
>>>>>>>>>>>>
>>>>>>>>>>>>  - We assume that maintainers will reach out to community members that are
>>>>>>>>>>>>    very active and helpful in a component, and will ask them if they want
>>>>>>>>>>>>    to be added as maintainers. That will make it visible that those people
>>>>>>>>>>>>    are experts for that part of Flink.
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> ======================================
>>>>>>>>>>>> Maintainers: Committers and Contributors
>>>>>>>>>>>> ======================================
>>>>>>>>>>>> 
>>>>>>>>>>>> It helps if maintainers are committers (since we want them to resolve pull
>>>>>>>>>>>> requests, which often involves merging them).
>>>>>>>>>>>>
>>>>>>>>>>>> Components with multiple maintainers can easily have non-committer
>>>>>>>>>>>> contributors in addition to committer contributors.
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> ======
>>>>>>>>>>>> JIRA
>>>>>>>>>>>> ======
>>>>>>>>>>>> 
>>>>>>>>>>>> Ideally, JIRA can be used to get an overview of the known issues of each
>>>>>>>>>>>> component and the common feature requests. Unfortunately, the Flink JIRA
>>>>>>>>>>>> is quite unorganized right now.
>>>>>>>>>>>>
>>>>>>>>>>>> A natural follow-up effort of this proposal would be to define in JIRA the
>>>>>>>>>>>> same components as we defined here, and have the maintainers keep JIRA
>>>>>>>>>>>> meaningful for that particular component. That would allow us to easily
>>>>>>>>>>>> generate some tables out of JIRA (like top known issues per component, most
>>>>>>>>>>>> requested features) and post them on the dev list once in a while as a
>>>>>>>>>>>> "state of the union" report.
>>>>>>>>>>>>
>>>>>>>>>>>> Initial assignment of issues to components should be made by the people
>>>>>>>>>>>> opening the issue. The maintainer of that tagged component needs to change
>>>>>>>>>>>> the tag if the component was classified incorrectly.
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> ======================================
>>>>>>>>>>>> Initial Components and Maintainers Suggestion
>>>>>>>>>>>> ======================================
>>>>>>>>>>>> 
>>>>>>>>>>>> Below is a suggestion of how to define components for Flink. One goal of
>>>>>>>>>>>> the division was to make it obvious for the majority of questions and
>>>>>>>>>>>> contributions to which component they would relate. Otherwise, if many
>>>>>>>>>>>> contributions had fuzzy component associations, we would again not solve
>>>>>>>>>>>> the issue of having clear responsibilities for who would track the
>>>>>>>>>>>> progress and resolution.
>>>>>>>>>>>>
>>>>>>>>>>>> We also looked at each component and wrote the names of some people who we
>>>>>>>>>>>> thought were natural experts for the components, and thus natural
>>>>>>>>>>>> candidates for maintainers.
>>>>>>>>>>>>
>>>>>>>>>>>> **These names are only a starting point for discussion.**
>>>>>>>>>>>>
>>>>>>>>>>>> Once agreed upon, the components and names of maintainers should be kept in
>>>>>>>>>>>> the wiki and updated as components change and people step up or down.
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> *DataSet API* (*Fabian, Greg, Gabor*)
>>>>>>>>>>>>  - Including Hadoop compat. parts
>>>>>>>>>>>> 
>>>>>>>>>>>> *DataStream API* (*Aljoscha, Max, Stephan*)
>>>>>>>>>>>> 
>>>>>>>>>>>> *Runtime*
>>>>>>>>>>>>  - Distributed Coordination (JobManager/TaskManager, Akka) (*Till*)
>>>>>>>>>>>>  - Local Runtime (Memory Management, State Backends, Tasks/Operators)
>>>>>>>>>>>>    (*Stephan*)
>>>>>>>>>>>>  - Network (*Ufuk*)
>>>>>>>>>>>> 
>>>>>>>>>>>> *Client/Optimizer* (*Fabian*)
>>>>>>>>>>>> 
>>>>>>>>>>>> *Type system / Type extractor* (Timo)
>>>>>>>>>>>> 
>>>>>>>>>>>> *Cluster Management* (Yarn, Mesos, Docker, ...) (*Max, Robert*)
>>>>>>>>>>>> 
>>>>>>>>>>>> *Libraries*
>>>>>>>>>>>>  - Gelly (*Vasia, Greg*)
>>>>>>>>>>>>  - ML (*Till, Theo*)
>>>>>>>>>>>>  - CEP (*Till*)
>>>>>>>>>>>>  - Python (*Chesnay*)
>>>>>>>>>>>> 
>>>>>>>>>>>> *Table API & SQL* (*Fabian, Vasia, Timo, Chengxiang*)
>>>>>>>>>>>> 
>>>>>>>>>>>> *Streaming Connectors* (*Robert*, *Aljoscha*)
>>>>>>>>>>>> 
>>>>>>>>>>>> *Batch Connectors and Input/Output Formats* (*Chesnay*)
>>>>>>>>>>>> 
>>>>>>>>>>>> *Storm Compatibility Layer* (*Mathias*)
>>>>>>>>>>>> 
>>>>>>>>>>>> *Scala shell* (*Till*)
>>>>>>>>>>>> 
>>>>>>>>>>>> *Startup Shell Scripts* (Ufuk)
>>>>>>>>>>>> 
>>>>>>>>>>>> *Flink Build System, Maven Files* (*Robert*)
>>>>>>>>>>>> 
>>>>>>>>>>>> *Documentation* (Ufuk)
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> Please let us know what you think about this proposal.
>>>>>>>>>>>> Happy discussing!
>>>>>>>>>>>> 
>>>>>>>>>>>> Greetings,
>>>>>>>>>>>> Stephan
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>> 
>> 
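
For illustration only: the tagging conventions in the quoted proposal (pull request titles like "[FLINK-XXX] [component] Title", re-tag comments like "==> component client", and "+maintainer runtime" mentions on the mailing list) could be checked mechanically. Below is a minimal Python sketch under those assumptions. The function names and the exact regular expressions are hypothetical and are not part of the proposal or of any existing Flink tooling.

```python
import re

# Patterns derived from the conventions quoted in the proposal.
# The exact whitespace and casing rules are assumptions.
TITLE_RE = re.compile(r"^\[(FLINK-\d+)\]\s*\[([^\]]+)\]\s*(.+)$")
RETAG_RE = re.compile(r"^==>\s*component\s+(\S.*)$", re.MULTILINE)
MENTION_RE = re.compile(r"\+maintainer\s+(\w[\w-]*)")


def component_of(title, comments=()):
    """Return the component a pull request is tagged with, or None.

    The title tag is the starting point; any later "==> component X"
    comment overrides it, so the last re-tag wins.
    """
    match = TITLE_RE.match(title.strip())
    component = match.group(2).strip().lower() if match else None
    for comment in comments:
        for retag in RETAG_RE.findall(comment):
            component = retag.strip().lower()
    return component


def mentioned_components(mail_body):
    """Return the sorted components named via "+maintainer <component>"."""
    return sorted({name.lower() for name in MENTION_RE.findall(mail_body)})


if __name__ == "__main__":
    print(component_of("[FLINK-1234] [runtime] Fix buffer leak"))
    print(component_of("[FLINK-1234] [runtime] Fix buffer leak",
                       ["Looks good.", "==> component client"]))
    print(mentioned_components("Can someone take a look? +maintainer runtime"))
```

A maintainer could use something like `mentioned_components` as the basis for the mail filter the proposal mentions, while `component_of` would let a script list open pull requests per component until real labels are available.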
