Thomas, thanks for the suggestions and the comments in the document. I will take another look at the operators that I had shortlisted in the document to keep. Within that subset, would it be OK to leave the ones that don't have a large state problem in place for the time being, until we have replacement operators implemented with the new windowing and state management? After the cleanup, I can also help with the development of those replacement operators.
Thanks

On Tue, Aug 9, 2016 at 11:21 AM, Thomas Weise <[email protected]> wrote:

> There are a bunch of operators that don't have proper state management and
> also don't support generic windowing (event time etc.). I would suggest to
> move those out or deprecate them.
>
> The new windowing and state management support along with the appropriate
> aggregators is going to make them obsolete.
>
> Thomas
>
> On Tue, Aug 9, 2016 at 10:47 AM, Lakshmi Velineni <[email protected]> wrote:
>
>> Hi,
>>
>> Friendly Reminder :
>>
>> I created a shared google sheet and tracked the various details of
>> operators. The sheet contains information about operators under lib/algo,
>> lib/math & lib/streamquery. Link is
>> https://docs.google.com/a/datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_CaWpXt3GDccM/edit?usp=sharing .
>> I also added recommendation for each operator . Please take a look and
>> provide comments as if any.
>>
>> Thanks
>> Lakshmi Prasanna
>>
>> On Tue, Aug 9, 2016 at 10:07 AM, Pramod Immaneni <[email protected]> wrote:
>>
>>> Added comments, also recommend having the misc folder for the remaining
>>> operators in contrib according to proposed guidelines
>>>
>>> https://github.com/apache/apex-site/pull/44
>>>
>>> On Mon, Aug 1, 2016 at 12:22 AM, Lakshmi Velineni <[email protected]> wrote:
>>>
>>> > Hi
>>> >
>>> > I also added recommendation for lib/math operators to the same document as
>>> > a separate sheet. Please have a look.
>>> >
>>> > Thanks
>>> > Lakshmi Prasanna
>>> >
>>> > On Fri, Jul 29, 2016 at 7:19 PM, Lakshmi Velineni <[email protected]> wrote:
>>> >
>>> >> Hi,
>>> >>
>>> >> I also added recommendation for each operator . Please take a look.
>>> >>
>>> >> thanks
>>> >>
>>> >> On Thu, Jul 28, 2016 at 11:03 AM, Lakshmi Velineni <[email protected]> wrote:
>>> >>
>>> >>> Hi,
>>> >>>
>>> >>> I created a shared google sheet and tracked the various details of
>>> >>> operators.
>>> >>> Currently, the sheet contains information about operators under
>>> >>> lib/algo only. Link is
>>> >>> https://docs.google.com/a/datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_CaWpXt3GDccM/edit?usp=sharing .
>>> >>> Will update the sheet soon with lib/math too.
>>> >>>
>>> >>> Thanks
>>> >>> Lakshmi Prasanna
>>> >>>
>>> >>> On Tue, Jul 12, 2016 at 2:36 PM, David Yan <[email protected]> wrote:
>>> >>>
>>> >>>> Hi Lakshmi,
>>> >>>>
>>> >>>> Thanks for volunteering.
>>> >>>>
>>> >>>> I think Pramod's suggestion of putting the operators into 3 buckets and
>>> >>>> Siyuan's suggestion of starting a shared Google Sheet that tracks
>>> >>>> individual operators are both good, with the exception that lib/streamquery
>>> >>>> is one unit and we probably do not need to look at individual operators
>>> >>>> under it.
>>> >>>>
>>> >>>> If we don't have any objection in the community, let's start the
>>> >>>> process.
>>> >>>>
>>> >>>> David
>>> >>>>
>>> >>>> On Tue, Jul 12, 2016 at 2:24 PM, Lakshmi Velineni <[email protected]> wrote:
>>> >>>>
>>> >>>>> I am interested to work on this.
>>> >>>>>
>>> >>>>> Regards,
>>> >>>>> Lakshmi prasanna
>>> >>>>>
>>> >>>>> On Tue, Jul 12, 2016 at 1:55 PM, [email protected] <[email protected]> wrote:
>>> >>>>>
>>> >>>>> > Why not have a shared google sheet with a list of operators and options
>>> >>>>> > that we want to do with it.
>>> >>>>> > I think it's case by case.
>>> >>>>> > But retire unused or obsolete operators is important and we should do it
>>> >>>>> > sooner rather than later.
>>> >>>>> >
>>> >>>>> > Regards,
>>> >>>>> > Siyuan
>>> >>>>> >
>>> >>>>> > On Tue, Jul 12, 2016 at 1:09 PM, Amol Kekre <[email protected]> wrote:
>>> >>>>> >
>>> >>>>> >> My vote is to do 2&3
>>> >>>>> >>
>>> >>>>> >> Thks
>>> >>>>> >> Amol
>>> >>>>> >>
>>> >>>>> >> On Tue, Jul 12, 2016 at 12:14 PM, Kottapalli, Venkatesh <[email protected]> wrote:
>>> >>>>> >>
>>> >>>>> >>> +1 for deprecating the packages listed below.
>>> >>>>> >>>
>>> >>>>> >>> -----Original Message-----
>>> >>>>> >>> From: [email protected] [mailto:[email protected]]
>>> >>>>> >>> Sent: Tuesday, July 12, 2016 12:01 PM
>>> >>>>> >>>
>>> >>>>> >>> +1
>>> >>>>> >>>
>>> >>>>> >>> On Tue, Jul 12, 2016 at 11:53 AM, David Yan <[email protected]> wrote:
>>> >>>>> >>>
>>> >>>>> >>> > Hi all,
>>> >>>>> >>> >
>>> >>>>> >>> > I would like to renew the discussion of retiring operators in Malhar.
>>> >>>>> >>> >
>>> >>>>> >>> > As stated before, the reason why we would like to retire operators in
>>> >>>>> >>> > Malhar is because some of them were written a long time ago before
>>> >>>>> >>> > Apache incubation, and they do not pertain to real use cases, are not
>>> >>>>> >>> > up to par in code quality, have no potential for improvement, and
>>> >>>>> >>> > probably completely unused by anybody.
>>> >>>>> >>> >
>>> >>>>> >>> > We do not want contributors to use them as a model of their
>>> >>>>> >>> > contribution, or users to use them thinking they are of quality, and
>>> >>>>> >>> > then hit a wall.
>>> >>>>> >>> > Both scenarios are not beneficial to the reputation of Apex.
>>> >>>>> >>> >
>>> >>>>> >>> > The initial 3 packages that we would like to target are *lib/algo*,
>>> >>>>> >>> > *lib/math*, and *lib/streamquery*.
>>> >>>>> >>> >
>>> >>>>> >>> > I'm adding this thread to the users list.
>>> >>>>> >>> > Please speak up if you are
>>> >>>>> >>> > using any operator in these 3 packages. We would like to hear from you.
>>> >>>>> >>> >
>>> >>>>> >>> > These are the options I can think of for retiring those operators:
>>> >>>>> >>> >
>>> >>>>> >>> > 1) Completely remove them from the malhar repository.
>>> >>>>> >>> > 2) Move them from malhar-library into a separate artifact called
>>> >>>>> >>> > malhar-misc
>>> >>>>> >>> > 3) Mark them deprecated and add to their javadoc that they are no
>>> >>>>> >>> > longer supported
>>> >>>>> >>> >
>>> >>>>> >>> > Note that 2 and 3 are not mutually exclusive. Any thoughts?
>>> >>>>> >>> >
>>> >>>>> >>> > David
>>> >>>>> >>> >
>>> >>>>> >>> > On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni <[email protected]> wrote:
>>> >>>>> >>> >
>>> >>>>> >>> >> I wanted to close the loop on this discussion. In general everyone
>>> >>>>> >>> >> seemed to be favorable to this idea with no serious objections. Folks
>>> >>>>> >>> >> had good suggestions like documenting capabilities of operators, come
>>> >>>>> >>> >> up well defined criteria for graduation of operators and what those
>>> >>>>> >>> >> criteria may be and what to do with existing operators that may not
>>> >>>>> >>> >> yet be mature or unused.
>>> >>>>> >>> >>
>>> >>>>> >>> >> I am going to summarize the key points that resulted from the
>>> >>>>> >>> >> discussion and would like to proceed with them.
>>> >>>>> >>> >>
>>> >>>>> >>> >> - Operators that do not yet provide the key platform capabilities to
>>> >>>>> >>> >>   make an operator useful across different applications such as
>>> >>>>> >>> >>   reusability, partitioning static or dynamic, idempotency, exactly once
>>> >>>>> >>> >>   will still be accepted as long as they are functionally correct, have
>>> >>>>> >>> >>   unit tests and will go into a separate module.
>>> >>>>> >>> >> - Contrib module was suggested as a place where new contributions go in
>>> >>>>> >>> >>   that don't yet have all the platform capabilities and are not yet
>>> >>>>> >>> >>   mature. If there are no other suggestions we will go with this one.
>>> >>>>> >>> >> - It was suggested the operators documentation list those platform
>>> >>>>> >>> >>   capabilities it currently provides from the list above. I will
>>> >>>>> >>> >>   document a structure for this in the contribution guidelines.
>>> >>>>> >>> >> - Folks wanted to know what would be the criteria to graduate an
>>> >>>>> >>> >>   operator to the big leagues :). I will kick-off a separate thread
>>> >>>>> >>> >>   for it as I think it requires its own discussion and hopefully we can
>>> >>>>> >>> >>   come up with a set of guidelines for it.
>>> >>>>> >>> >> - David brought up state of some of the existing operators and their
>>> >>>>> >>> >>   retirement and the layout of operators in Malhar in general and how it
>>> >>>>> >>> >>   causes problems with development. I will ask him to lead the
>>> >>>>> >>> >>   discussion on that.
>>> >>>>> >>> >>
>>> >>>>> >>> >> Thanks
>>> >>>>> >>> >>
>>> >>>>> >>> >> On Fri, May 27, 2016 at 7:47 PM, David Yan <[email protected]> wrote:
>>> >>>>> >>> >>
>>> >>>>> >>> >> > The two ideas are not conflicting, but rather complementing.
>>> >>>>> >>> >> >
>>> >>>>> >>> >> > On the contrary, putting a new process for people trying to
>>> >>>>> >>> >> > contribute while NOT addressing the old unused subpar operators in
>>> >>>>> >>> >> > the repository is what is conflicting.
>>> >>>>> >>> >> >
>>> >>>>> >>> >> > Keep in mind that when people try to contribute, they always look
>>> >>>>> >>> >> > at the existing operators already in the repository as examples and
>>> >>>>> >>> >> > likely a model for their new operators.
>>> >>>>> >>> >> >
>>> >>>>> >>> >> > David
>>> >>>>> >>> >> >
>>> >>>>> >>> >> > On Fri, May 27, 2016 at 4:05 PM, Amol Kekre <[email protected]> wrote:
>>> >>>>> >>> >> >
>>> >>>>> >>> >> > > Yes there are two conflicting threads now. The original thread
>>> >>>>> >>> >> > > was to open up a way for contributors to submit code in a dir
>>> >>>>> >>> >> > > (contrib?) as long as license part of taken care of.
>>> >>>>> >>> >> > >
>>> >>>>> >>> >> > > On the thread of removing non-used operators -> How do we know
>>> >>>>> >>> >> > > what is being used?
>>> >>>>> >>> >> > >
>>> >>>>> >>> >> > > Thks,
>>> >>>>> >>> >> > > Amol
>>> >>>>> >>> >> > >
>>> >>>>> >>> >> > > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde <[email protected]> wrote:
>>> >>>>> >>> >> > >
>>> >>>>> >>> >> > > > +1 for removing the not-used operators.
>>> >>>>> >>> >> > > >
>>> >>>>> >>> >> > > > So we are creating a process for operator writers who don't
>>> >>>>> >>> >> > > > want to understand the platform, yet wants to contribute? How
>>> >>>>> >>> >> > > > big is that set?
>>> >>>>> >>> >> > > > If we tell the app-user, here is the code which has not passed
>>> >>>>> >>> >> > > > all the checklist, will they be ready to use that in production?
>>> >>>>> >>> >> > > >
>>> >>>>> >>> >> > > > This thread has 2 conflicting forces, reduce the operators and
>>> >>>>> >>> >> > > > make it easy to add more operators.
>>> >>>>> >>> >> > > >
>>> >>>>> >>> >> > > > On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni <[email protected]> wrote:
>>> >>>>> >>> >> > > >
>>> >>>>> >>> >> > > > > On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta <[email protected]> wrote:
>>> >>>>> >>> >> > > > >
>>> >>>>> >>> >> > > > > > Pramod,
>>> >>>>> >>> >> > > > > >
>>> >>>>> >>> >> > > > > > By that logic I would say let's put all partitionable operators
>>> >>>>> >>> >> > > > > > into one folder, non-partitionable operators in another and so on...
>>> >>>>> >>> >> > > > > >
>>> >>>>> >>> >> > > > >
>>> >>>>> >>> >> > > > > Remember the original goal of making it easier for new
>>> >>>>> >>> >> > > > > members to contribute and managing those contributions to
>>> >>>>> >>> >> > > > > maturity. It is not a functional level separation.
>>> >>>>> >>> >> > > > >
>>> >>>>> >>> >> > > > > > When I look at hadoop code I see these annotations being
>>> >>>>> >>> >> > > > > > used at class level and not at package/folder level.
>>> >>>>> >>> >> > > > >
>>> >>>>> >>> >> > > > > I had a typo in my email, I meant to say "think of this like
>>> >>>>> >>> >> > > > > a folder..." as an analogy and not literally.
>>> >>>>> >>> >> > > > >
>>> >>>>> >>> >> > > > > Thanks
>>> >>>>> >>> >> > > > >
>>> >>>>> >>> >> > > > > > Thanks
>>> >>>>> >>> >> > > > > >
>>> >>>>> >>> >> > > > > > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni <[email protected]> wrote:
>>> >>>>> >>> >> > > > > >
>>> >>>>> >>> >> > > > > > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <[email protected]> wrote:
>>> >>>>> >>> >> > > > > > >
>>> >>>>> >>> >> > > > > > > > Can same goal not be achieved by using
>>> >>>>> >>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Evolving /
>>> >>>>> >>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Unstable
>>> >>>>> >>> >> > > > > > > > annotation?
>>> >>>>> >>> >> > > > > > > >
>>> >>>>> >>> >> > > > > > >
>>> >>>>> >>> >> > > > > > > I think it is important to localize the additions in one place so
>>> >>>>> >>> >> > > > > > > that it becomes clearer to users about the maturity level of
>>> >>>>> >>> >> > > > > > > these, easier for developers to track them towards the path to
>>> >>>>> >>> >> > > > > > > maturity and also provides a clearer directive for committers and
>>> >>>>> >>> >> > > > > > > contributors on acceptance of new submissions. Relying on the
>>> >>>>> >>> >> > > > > > > annotations alone makes them spread all over the place and adds an
>>> >>>>> >>> >> > > > > > > additional layer of difficulty in identification not just for
>>> >>>>> >>> >> > > > > > > users but also for developers who want to find such operators and
>>> >>>>> >>> >> > > > > > > improve them. This of this like a folder level annotation where
>>> >>>>> >>> >> > > > > > > everything under this folder is unstable or evolving.
>>> >>>>> >>> >> > > > > > >
>>> >>>>> >>> >> > > > > > > Thanks
>>> >>>>> >>> >> > > > > > >
>>> >>>>> >>> >> > > > > > > >
>>> >>>>> >>> >> > > > > > > > On Fri, May 27, 2016 at 12:35 PM, David Yan <[email protected]> wrote:
>>> >>>>> >>> >> > > > > > > >
>>> >>>>> >>> >> > > > > > > > > > > > Malhar in its current state, has way too many operators that
>>> >>>>> >>> >> > > > > > > > > > > > fall in the "non-production quality" category. We should
>>> >>>>> >>> >> > > > > > > > > > > > make it obvious to users that which operators are up to par,
>>> >>>>> >>> >> > > > > > > > > > > > and which operators are not, and maybe even remove those
>>> >>>>> >>> >> > > > > > > > > > > > that are likely not ever used in a real use case.
>>> >>>>> >>> >> > > > > > > > > > > >
>>> >>>>> >>> >> > > > > > > > > > >
>>> >>>>> >>> >> > > > > > > > > > > I am ambivalent about revisiting older operators and doing
>>> >>>>> >>> >> > > > > > > > > > > this exercise as this can cause unnecessary tensions.
>>> >>>>> >>> >> > > > > > > > > > > My original intent is for
>>> >>>>> >>> >> > > > > > > > > > > contributions going forward.
>>> >>>>> >>> >> > > > > > > > > > >
>>> >>>>> >>> >> > > > > > > > > >
>>> >>>>> >>> >> > > > > > > > > > IMO it is important to address this as well. Operators
>>> >>>>> >>> >> > > > > > > > > > outside the play area should be of well known quality.
>>> >>>>> >>> >> > > > > > > > > >
>>> >>>>> >>> >> > > > > > > > >
>>> >>>>> >>> >> > > > > > > > > I think this is important, and I don't anticipate much tension
>>> >>>>> >>> >> > > > > > > > > if we establish clear criteria.
>>> >>>>> >>> >> > > > > > > > > It's not helpful if we let the old subpar operators stay and
>>> >>>>> >>> >> > > > > > > > > put up the bars for new operators.
>>> >>>>> >>> >> > > > > > > > >
>>> >>>>> >>> >> > > > > > > > > David
