+1

On Tue, Jul 12, 2016 at 11:53 AM, David Yan <da...@datatorrent.com> wrote:
> Hi all,
>
> I would like to renew the discussion of retiring operators in Malhar.
>
> As stated before, the reason why we would like to retire operators in
> Malhar is because some of them were written a long time ago before Apache
> incubation, and they do not pertain to real use cases, are not up to par in
> code quality, have no potential for improvement, and are probably completely
> unused by anybody.
>
> We do not want contributors to use them as a model of their contribution,
> or users to use them thinking they are of quality, and then hit a wall.
> Both scenarios are not beneficial to the reputation of Apex.
>
> The initial 3 packages that we would like to target are *lib/algo*,
> *lib/math*, and *lib/streamquery*.
>
> I'm adding this thread to the users list. Please speak up if you are using
> any operator in these 3 packages. We would like to hear from you.
>
> These are the options I can think of for retiring those operators:
>
> 1) Completely remove them from the malhar repository.
> 2) Move them from malhar-library into a separate artifact called
>    malhar-misc
> 3) Mark them deprecated and add to their javadoc that they are no longer
>    supported
>
> Note that 2 and 3 are not mutually exclusive. Any thoughts?
>
> David
>
> On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni <pra...@datatorrent.com>
> wrote:
>
>> I wanted to close the loop on this discussion. In general everyone seemed
>> to be favorable to this idea with no serious objections. Folks had good
>> suggestions like documenting capabilities of operators, come up with
>> well-defined criteria for graduation of operators and what those criteria
>> may be and what to do with existing operators that may not yet be mature
>> or unused.
>>
>> I am going to summarize the key points that resulted from the discussion
>> and would like to proceed with them.
>>
>> - Operators that do not yet provide the key platform capabilities to
>>   make an operator useful across different applications such as
>>   reusability, partitioning static or dynamic, idempotency, exactly once
>>   will still be accepted as long as they are functionally correct, have
>>   unit tests and will go into a separate module.
>> - Contrib module was suggested as a place where new contributions go in
>>   that don't yet have all the platform capabilities and are not yet
>>   mature. If there are no other suggestions we will go with this one.
>> - It was suggested the operators documentation list those platform
>>   capabilities it currently provides from the list above. I will
>>   document a structure for this in the contribution guidelines.
>> - Folks wanted to know what would be the criteria to graduate an
>>   operator to the big leagues :). I will kick-off a separate thread for
>>   it as I think it requires its own discussion and hopefully we can come
>>   up with a set of guidelines for it.
>> - David brought up state of some of the existing operators and their
>>   retirement and the layout of operators in Malhar in general and how it
>>   causes problems with development. I will ask him to lead the
>>   discussion on that.
>>
>> Thanks
>>
>> On Fri, May 27, 2016 at 7:47 PM, David Yan <da...@datatorrent.com> wrote:
>>
>> > The two ideas are not conflicting, but rather complementing.
>> >
>> > On the contrary, putting a new process for people trying to contribute
>> > while NOT addressing the old unused subpar operators in the repository is
>> > what is conflicting.
>> >
>> > Keep in mind that when people try to contribute, they always look at the
>> > existing operators already in the repository as examples and likely a model
>> > for their new operators.
>> >
>> > David
>> >
>> >
>> > On Fri, May 27, 2016 at 4:05 PM, Amol Kekre <a...@datatorrent.com> wrote:
>> >
>> > > Yes there are two conflicting threads now.
>> > > The original thread was to open up a way for contributors to submit
>> > > code in a dir (contrib?) as long as license part is taken care of.
>> > >
>> > > On the thread of removing non-used operators -> How do we know what is
>> > > being used?
>> > >
>> > > Thks,
>> > > Amol
>> > >
>> > >
>> > > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde <sand...@datatorrent.com>
>> > > wrote:
>> > >
>> > > > +1 for removing the not-used operators.
>> > > >
>> > > > So we are creating a process for operator writers who don't want to
>> > > > understand the platform, yet wants to contribute? How big is that set?
>> > > > If we tell the app-user, here is the code which has not passed all the
>> > > > checklist, will they be ready to use that in production?
>> > > >
>> > > > This thread has 2 conflicting forces, reduce the operators and make it easy
>> > > > to add more operators.
>> > > >
>> > > >
>> > > >
>> > > > On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni <pra...@datatorrent.com>
>> > > > wrote:
>> > > >
>> > > > > On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta <gaurav.gopi...@gmail.com>
>> > > > > wrote:
>> > > > >
>> > > > > > Pramod,
>> > > > > >
>> > > > > > By that logic I would say let's put all partitionable operators into one
>> > > > > > folder, non-partitionable operators in another and so on...
>> > > > > >
>> > > > >
>> > > > > Remember the original goal of making it easier for new members to
>> > > > > contribute and managing those contributions to maturity. It is not a
>> > > > > functional level separation.
>> > > > >
>> > > > >
>> > > > > > When I look at hadoop code I see these annotations being used at class
>> > > > > > level and not at package/folder level.
>> > > > >
>> > > > >
>> > > > > I had a typo in my email, I meant to say "think of this like a folder..."
>> > > > > as an analogy and not literally.
>> > > > >
>> > > > > Thanks
>> > > > >
>> > > > >
>> > > > > > Thanks
>> > > > > >
>> > > > > > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni <pra...@datatorrent.com>
>> > > > > > wrote:
>> > > > > >
>> > > > > > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <gaurav.gopi...@gmail.com>
>> > > > > > > wrote:
>> > > > > > >
>> > > > > > > > Can same goal not be achieved by
>> > > > > > > > using org.apache.hadoop.classification.InterfaceStability.Evolving /
>> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Unstable annotation?
>> > > > > > > >
>> > > > > > >
>> > > > > > > I think it is important to localize the additions in one place so that it
>> > > > > > > becomes clearer to users about the maturity level of these, easier for
>> > > > > > > developers to track them towards the path to maturity and also provides a
>> > > > > > > clearer directive for committers and contributors on acceptance of new
>> > > > > > > submissions. Relying on the annotations alone makes them spread all over
>> > > > > > > the place and adds an additional layer of difficulty in identification not
>> > > > > > > just for users but also for developers who want to find such operators and
>> > > > > > > improve them. This of this like a folder level annotation where everything
>> > > > > > > under this folder is unstable or evolving.
>> > > > > > >
>> > > > > > > Thanks
>> > > > > > >
>> > > > > > >
>> > > > > > > >
>> > > > > > > > On Fri, May 27, 2016 at 12:35 PM, David Yan <da...@datatorrent.com>
>> > > > > > > > wrote:
>> > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > > Malhar in its current state, has way too many operators that fall in
>> > > > > > > > > > > > the "non-production quality" category. We should make it obvious to
>> > > > > > > > > > > > users that which operators are up to par, and which operators are
>> > > > > > > > > > > > not, and maybe even remove those that are likely not ever used in a
>> > > > > > > > > > > > real use case.
>> > > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > > > I am ambivalent about revisiting older operators and doing this
>> > > > > > > > > > > exercise as this can cause unnecessary tensions. My original intent
>> > > > > > > > > > > is for contributions going forward.
>> > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > > IMO it is important to address this as well. Operators outside the play
>> > > > > > > > > > area should be of well known quality.
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > I think this is important, and I don't anticipate much tension if we
>> > > > > > > > > establish clear criteria.
>> > > > > > > > > It's not helpful if we let the old subpar operators stay and put up the
>> > > > > > > > > bars for new operators.
>> > > > > > > > >
>> > > > > > > > > David
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>