Most users are not interested using all parsers and their dependencies in their project. They have a specific format, for which they will need a parser. When I have xml files, I don't need the dependencies for csv.
Thomas On Thu, Dec 24, 2015 at 12:22 PM, Isha Arkatkar <[email protected]> wrote: > Hi, > > Agree with Shubham's point here as well. The parser split in contrib and > library is probably harder to find and manage going forward. However, I > think increasing dependencies in library also has its downside. > > As mentioned in separate mail thread on splitting malhar into modules, can > we create a single module for parsers outside library and contrib? I can > close the current pull request and open a new one if that makes more sense. > > Thanks, > Isha > > > On Wed, Dec 23, 2015 at 11:00 PM, Shubham Pathak <[email protected]> > wrote: > > > Hi, > > > > Can we take a decision regarding whether all parsers need to be in one > > place ? ( either malhar lib / contrib ) > > > > In this PR https://github.com/apache/incubator-apex-malhar/pull/137 we > are > > moving XML and JSON parser from contrib to lib. > > I suggest we move CSV parsers as well . > > > > I understand Thomas's point of dependencies, but shouldn't we also think > > about users that use malhar that would wish to have all parsers under > > common bucket ? > > > > > > Thanks, > > Shubham > > > > > > On Fri, Dec 18, 2015 at 2:44 AM, Thomas Weise <[email protected]> > > wrote: > > > > > This has changed over time. Pieces that cannot be covered by unit tests > > > should still be kept out of library. > > > > > > In addition, we cannot keep on expanding the number of dependencies in > a > > > single monolithic module. Hence, we are going start to create smaller > > > modules with their dependencies and tests setup correctly. The first of > > > those will be Kafka, for which a PR is currently open. > > > > > > For lib, we can add more operators as long as they don't introduce new > > > dependencies. When building an app and using one operator from lib, > > > everything else comes with it. That's a problem for the application > > > assembly we are trying to address. > > > > > > Thomas > > > > > > > > > On Thu, Dec 17, 2015 at 1:06 PM, Sandesh Hegde < > [email protected]> > > > wrote: > > > > > > > Even my understanding was along the lines, "if an unit test can't be > > run > > > on > > > > a developer machine without installing dependencies ( RabbitMq/redis > > ec), > > > > it should go into contrib". > > > > > > > > Documenting the exact process as to what goes where helps. > > > > > > > > On Thu, Dec 17, 2015 at 12:16 PM Chinmay Kolhatkar < > > > > [email protected]> > > > > wrote: > > > > > > > > > Hi Isha, > > > > > > > > > > I think what Shubham meant to say is any operator which has a > > > dependency > > > > on > > > > > external entity (not library) should go in contrib. Others should > go > > in > > > > > library. > > > > > > > > > > For eg. DB related operators needs an instance/server of respect > > > database > > > > > to be running. Without that running server, the operator cannot > > satisfy > > > > the > > > > > functionality expected. Hence it goes into contrib. > > > > > > > > > > I think none of the parsers will have dependency on external > entity, > > > > hence > > > > > they should go in library and not contrib. > > > > > > > > > > - Chinmay. > > > > > On 18 Dec 2015 00:42, "Isha Arkatkar" <[email protected]> > wrote: > > > > > > > > > > > Hi Shubham, > > > > > > > > > > > > I think it is somewhat subjective what goes in Contrib Vs > > > > Library. > > > > > > Here is what I understood about general guideline: > > > > > > An operator would go in malhar-library if it does not > have > > > any > > > > > > other library dependency than what is already available. If > > operator > > > > > needs > > > > > > to include a dependency, it could be added to Contrib but not > > > library. > > > > > > > > > > > > If we add an external library dependency in Malhar-library, > > the > > > > size > > > > > > of the lib jar keeps on growing. So, if we refer malhar-lib in a > > > > project, > > > > > > the size of the total package would increase, even if we may not > > > > directly > > > > > > use all external libs. > > > > > > > > > > > > For this reason, in pull request #137 > > > > > > <https://github.com/apache/incubator-apex-malhar/pull/137> , I > > move > > > > only > > > > > > Json and XML parsers to malhar-library. Csv one had library > > reference > > > > to > > > > > > supercsv, so it is still in contrib. > > > > > > > > > > > > Please correct if I missed something. :) > > > > > > > > > > > > Thanks! > > > > > > Isha > > > > > > > > > > > > On Wed, Dec 16, 2015 at 11:27 PM, Shubham Pathak < > > > > > [email protected]> > > > > > > wrote: > > > > > > > > > > > > > Hi, > > > > > > > > > > > > > > What do we mean by "additional dependency" in case of deciding > > > > contrib > > > > > vs > > > > > > > lib ? > > > > > > > As i understand, "additional dependency" is when an operator > is > > > > > > > interacting with external technologies . For e.g kafka, MQ , > > HBase > > > > etc. > > > > > > > By this definition even CSV parsers need to be moved to > > malhar-lib. > > > > > > > > > > > > > > Thanks, > > > > > > > Shubham > > > > > > > > > > > > > > > > > > > > > On Thu, Dec 17, 2015 at 5:39 AM, Isha Arkatkar < > > > [email protected] > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > Hi Chandni, > > > > > > > > > > > > > > > > I have moved that as well to lib, as the parsers depended > on > > > > that. > > > > > > > > > > > > > > > > Thanks, > > > > > > > > Isha > > > > > > > > > > > > > > > > On Wed, Dec 16, 2015 at 1:01 PM, Chandni Singh < > > > > > > [email protected]> > > > > > > > > wrote: > > > > > > > > > > > > > > > > > There is a converter package under com.datatorrent.contrib > > > which > > > > > has > > > > > > a > > > > > > > > > Converter API. This belongs in library as well. > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > Chandni > > > > > > > > > > > > > > > > > > On Tue, Dec 15, 2015 at 1:29 PM, Chandni Singh < > > > > > > > [email protected]> > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > Isha, > > > > > > > > > > > > > > > > > > > > Thanks for moving this. When you move these files, please > > > place > > > > > > then > > > > > > > > > under > > > > > > > > > > a package which reflects its functionality. I don't see > the > > > > need > > > > > > for > > > > > > > > > > package called schema. > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > Chandni > > > > > > > > > > > > > > > > > > > > On Tue, Dec 15, 2015 at 12:31 PM, Isha Arkatkar < > > > > > > > [email protected]> > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > >> Hi, > > > > > > > > > >> > > > > > > > > > >> For csv parser there is an additional dependency. So, > > I'll > > > > > move > > > > > > > only > > > > > > > > > >> json > > > > > > > > > >> and xml to new location. > > > > > > > > > >> > > > > > > > > > >> Thanks, > > > > > > > > > >> Isha > > > > > > > > > >> > > > > > > > > > >> On Tue, Dec 15, 2015 at 11:42 AM, Thomas Weise < > > > > > > > > [email protected]> > > > > > > > > > >> wrote: > > > > > > > > > >> > > > > > > > > > >> > As long as the operators don't introduce additional > > > > > dependencies > > > > > > > > they > > > > > > > > > >> > should be in lib. > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > On Tue, Dec 15, 2015 at 9:34 AM, Shubham Pathak < > > > > > > > > > >> [email protected]> > > > > > > > > > >> > wrote: > > > > > > > > > >> > > > > > > > > > > >> > > Hi Chandni, > > > > > > > > > >> > > > > > > > > > > > >> > > I had written those operators. > > > > > > > > > >> > > Here is the jira for that > > > > > > > > > >> https://malhar.atlassian.net/browse/MLHR-1838 > > > > > > > > > >> > > You would find the entire discussion there. > > > > > > > > > >> > > > > > > > > > > > >> > > Why are all these operator under Malhar/contrib and > > not > > > > > > > Malhar/lib > > > > > > > > > >> > > When i was writing the code i saw AbstractCsvParser > in > > > > > > contriib > > > > > > > > and > > > > > > > > > >> hence > > > > > > > > > >> > > added there. > > > > > > > > > >> > > > > > > > > > > > >> > > Recently i got to know which operators must go in > > > contrib > > > > > and > > > > > > > what > > > > > > > > > >> must > > > > > > > > > >> > go > > > > > > > > > >> > > in lib. > > > > > > > > > >> > > By that definition, these operators must belong to > > lib. > > > > > > > > > >> > > > > > > > > > > > >> > > Thanks, > > > > > > > > > >> > > Shubham > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > On Tue, Dec 15, 2015 at 1:32 PM, Chandni Singh < > > > > > > > > > >> [email protected]> > > > > > > > > > >> > > wrote: > > > > > > > > > >> > > > > > > > > > > > >> > > > Hi, > > > > > > > > > >> > > > > > > > > > > > > >> > > > I just came across couple of formatter and parser > > > > > operators > > > > > > > > which > > > > > > > > > >> are > > > > > > > > > >> > > under > > > > > > > > > >> > > > Malhar/contrib/schema. > > > > > > > > > >> > > > > > > > > > > > > >> > > > I have couple of questions: > > > > > > > > > >> > > > 1. What does schema denote here? > > > > > > > > > >> > > > 2. Why formatter/parser which are functions are > > placed > > > > > under > > > > > > > > > schema > > > > > > > > > >> > > > package? > > > > > > > > > >> > > > 2. Why are all these operator under Malhar/contrib > > and > > > > not > > > > > > > > > >> Malhar/lib > > > > > > > > > >> > > > > > > > > > > > > >> > > > Thanks, > > > > > > > > > >> > > > Chandni > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
