A document for malhar contribution guidelines has been prepared and submitted in a pull request
https://github.com/apache/apex-site/pull/44

Thanks

On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni <[email protected]> wrote:

> I wanted to close the loop on this discussion. In general everyone seemed
> to be favorable to this idea with no serious objections. Folks had good
> suggestions like documenting capabilities of operators, come up well
> defined criteria for graduation of operators and what those criteria may be
> and what to do with existing operators that may not yet be mature or
> unused.
>
> I am going to summarize the key points that resulted from the discussion
> and would like to proceed with them.
>
> - Operators that do not yet provide the key platform capabilities to
>   make an operator useful across different applications such as reusability,
>   partitioning static or dynamic, idempotency, exactly once will still be
>   accepted as long as they are functionally correct, have unit tests and will
>   go into a separate module.
> - Contrib module was suggested as a place where new contributions go
>   in that don't yet have all the platform capabilities and are not yet
>   mature. If there are no other suggestions we will go with this one.
> - It was suggested the operators documentation list those platform
>   capabilities it currently provides from the list above. I will document a
>   structure for this in the contribution guidelines.
> - Folks wanted to know what would be the criteria to graduate an
>   operator to the big leagues :). I will kick-off a separate thread for it as
>   I think it requires its own discussion and hopefully we can come up with a
>   set of guidelines for it.
> - David brought up state of some of the existing operators and their
>   retirement and the layout of operators in Malhar in general and how it
>   causes problems with development. I will ask him to lead the discussion on
>   that.
>
> Thanks
>
> On Fri, May 27, 2016 at 7:47 PM, David Yan <[email protected]> wrote:
>
>> The two ideas are not conflicting, but rather complementing.
>>
>> On the contrary, putting a new process for people trying to contribute
>> while NOT addressing the old unused subpar operators in the repository is
>> what is conflicting.
>>
>> Keep in mind that when people try to contribute, they always look at the
>> existing operators already in the repository as examples and likely a model
>> for their new operators.
>>
>> David
>>
>>
>> On Fri, May 27, 2016 at 4:05 PM, Amol Kekre <[email protected]> wrote:
>>
>> > Yes there are two conflicting threads now. The original thread was to open
>> > up a way for contributors to submit code in a dir (contrib?) as long as
>> > license part of taken care of.
>> >
>> > On the thread of removing non-used operators -> How do we know what is
>> > being used?
>> >
>> > Thks,
>> > Amol
>> >
>> >
>> > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde <[email protected]>
>> > wrote:
>> >
>> > > +1 for removing the not-used operators.
>> > >
>> > > So we are creating a process for operator writers who don't want to
>> > > understand the platform, yet wants to contribute? How big is that set?
>> > > If we tell the app-user, here is the code which has not passed all the
>> > > checklist, will they be ready to use that in production?
>> > >
>> > > This thread has 2 conflicting forces, reduce the operators and make it easy
>> > > to add more operators.
>> > >
>> > >
>> > >
>> > > On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni <[email protected]>
>> > > wrote:
>> > >
>> > > > On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta <[email protected]>
>> > > > wrote:
>> > > >
>> > > > > Pramod,
>> > > > >
>> > > > > By that logic I would say let's put all partitionable operators into one
>> > > > > folder, non-partitionable operators in another and so on...
>> > > > >
>> > > >
>> > > > Remember the original goal of making it easier for new members to
>> > > > contribute and managing those contributions to maturity. It is not a
>> > > > functional level separation.
>> > > >
>> > > >
>> > > > > When I look at hadoop code I see these annotations being used at class
>> > > > > level and not at package/folder level.
>> > > >
>> > > >
>> > > > I had a typo in my email, I meant to say "think of this like a folder..."
>> > > > as an analogy and not literally.
>> > > >
>> > > > Thanks
>> > > >
>> > > >
>> > > > > Thanks
>> > > > >
>> > > > > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni <[email protected]>
>> > > > > wrote:
>> > > > >
>> > > > > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <[email protected]>
>> > > > > > wrote:
>> > > > > >
>> > > > > > > Can same goal not be achieved by
>> > > > > > > using org.apache.hadoop.classification.InterfaceStability.Evolving /
>> > > > > > > org.apache.hadoop.classification.InterfaceStability.Unstable annotation?
>> > > > > > >
>> > > > > >
>> > > > > > I think it is important to localize the additions in one place so that it
>> > > > > > becomes clearer to users about the maturity level of these, easier for
>> > > > > > developers to track them towards the path to maturity and also provides a
>> > > > > > clearer directive for committers and contributors on acceptance of new
>> > > > > > submissions. Relying on the annotations alone makes them spread all over
>> > > > > > the place and adds an additional layer of difficulty in identification not
>> > > > > > just for users but also for developers who want to find such operators and
>> > > > > > improve them. This of this like a folder level annotation where everything
>> > > > > > under this folder is unstable or evolving.
>> > > > > >
>> > > > > > Thanks
>> > > > > >
>> > > > > >
>> > > > > > >
>> > > > > > > On Fri, May 27, 2016 at 12:35 PM, David Yan <[email protected]>
>> > > > > > > wrote:
>> > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > > > Malhar in its current state, has way too many operators that fall in the
>> > > > > > > > > > > "non-production quality" category. We should make it obvious to users that
>> > > > > > > > > > > which operators are up to par, and which operators are not, and maybe even
>> > > > > > > > > > > remove those that are likely not ever used in a real use case.
>> > > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > I am ambivalent about revisiting older operators and doing this exercise as
>> > > > > > > > > > this can cause unnecessary tensions. My original intent is for
>> > > > > > > > > > contributions going forward.
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > IMO it is important to address this as well. Operators outside the play
>> > > > > > > > > area should be of well known quality.
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > I think this is important, and I don't anticipate much tension if we
>> > > > > > > > establish clear criteria.
>> > > > > > > > It's not helpful if we let the old subpar operators stay and put up the
>> > > > > > > > bars for new operators.
>> > > > > > > >
>> > > > > > > > David
