We have to think about how people can find the operators and dependencies when bundling applications. The complaint I hear often is that folks can't find the operators they are looking for. We should be careful about how much more work this will add for the user, who will now have to search for and find all the dependencies.
Thanks

> On Oct 2, 2015, at 3:44 PM, David Yan <[email protected]> wrote:
>
> I actually don't think it makes sense any more to separate malhar-library
> and malhar-contrib after the breakup, especially since we are planning for
> a major release for these changes.
>
> People are often confused, myself included, which operators should be in
> malhar-library and which ones should be in contrib. Requiring a separate
> setup for unit test should not be a criteria because the user of the
> library couldn't care less whether the unit test requires extra setup. The
> factor of requiring extra dependencies isn't valid either because there're
> already dependencies of malhar-library now that apex does not have.
>
> We can retain them for backward compatibility purpose but going forward new
> app packages should only use the baby artifacts, without denoting whether
> it's contrib or not.
>
> David
>
> On Tue, Sep 29, 2015 at 12:19 AM, Andy Perlitch <[email protected]> wrote:
>
>> Hi all,
>>
>> This is a first cut at a plan to restructure malhar in a way that is more
>> portable and adherent to Maven's principles of modularity and dependency
>> management.
>>
>> Overview of Current Malhar Architecture
>> ---------------------------------------------------------------
>> The current malhar repo consists of several maven modules:
>>
>> * *malhar-library*
>>   operators which do not require additional transitive dependencies beyond
>>   what Apex and Hadoop require
>> * *malhar-contrib*
>>   operators requiring other maven dependencies
>> * *malhar-demos*
>>   demo applications
>> * *malhar-samples*
>>   sample code showing example usage of malhar operators
>> * *malhar-apps*
>>   apex applications (currently only logstream)
>>
>>
>> Proposed Changes
>> ---------------------------------------------------------------
>>
>> 1. *Scrub malhar-library for any operators needing additional dependencies*
>>    `malhar-library` is intended to consist of only operators without extra
>>    transitive dependencies. All operators should be checked for the necessity
>>    of extra dependencies.
>>
>> 2. *Move operators from malhar-demos and malhar-apps into contrib (or
>>    library if prudent)*
>>    There are various operators in both of these modules that are general
>>    enough to move into library or contrib.
>>
>> 3. *Create modules for all contrib subfolders*
>>    All folders under `contrib/src/main/com/datatorrent/contrib/` should be
>>    converted to modules of contrib and listed as such in `/contrib/pom.xml`.
>>    Additionally, each of these smaller contrib modules will have its own
>>    version and dependencies.
>>
>> 4. *Use the Shades Plugin to allow for backwards-compatible fully-qualified
>>    class names*
>>    This is made possible by shades class relocation
>>    <https://maven.apache.org/plugins/maven-shade-plugin/examples/class-relocation.html>
>>    feature. This might be a bit error prone as well as confusing to use for
>>    outside developers, but it must be done if these changes are to be made
>>    prior to a major release.
>>
>>
>> Let me know what you all think of this approach.
>>
>> Best,
>> Andy
>>
>>
>> On Tue, Sep 22, 2015 at 11:20 AM, Chetan Narsude <[email protected]> wrote:
>>
>>> +1
>>>
>>> On Tue, Sep 22, 2015 at 11:08 AM, Gaurav Gupta <[email protected]> wrote:
>>>
>>>> I agree with David.. Each artifact should have it's own version
>>>>
>>>> Thanks
>>>> -Gaurav
>>>>
>>>> On Tue, Sep 22, 2015 at 11:07 AM, David Yan <[email protected]> wrote:
>>>>
>>>>> I actually think that each baby artifact should have its own version,
>>>>> because each artifact has its own interface and its own life cycle,
>>>>> especially after we break up the giant library, applications will depend
>>>>> on the baby artifacts instead of the giant library. For example if there
>>>>> is no change in malhar-contrib-kafka (I think the name should actually be
>>>>> apex-malhar-kafka), we should not confuse users by bumping the version.
>>>>>
>>>>> David
>>>>>
>>>>> On Tue, Sep 22, 2015 at 9:03 AM, Andy Perlitch <[email protected]> wrote:
>>>>>
>>>>>> Tushar,
>>>>>>
>>>>>> I agree that all modules should inherit the version from the "parent
>>>>>> pom" of the malhar repo. I think the benefits outweigh the cost of
>>>>>> bumping versions of components that haven't actually changed. I'd love
>>>>>> to get others feedback on this as well.
>>>>>>
>>>>>> On another note, I plan on starting a spreadsheet/googledoc with the
>>>>>> possible groupings of operators into these modules. Stay tuned...
>>>>>>
>>>>>> -Andy
>>>>>>
>>>>>> On Mon, Sep 21, 2015 at 11:51 PM, Tushar Gosavi <[email protected]> wrote:
>>>>>>
>>>>>>> +1 for the general idea
>>>>>>>
>>>>>>> Does these independent modules going to have independent versions? For
>>>>>>> example, if there is no change in kafka operator between malhar 3.0 and
>>>>>>> malhar 4.0, will we increment version of malhar-contrib-kafka to 4.0. I
>>>>>>> have learned from my previous project that, It is easier to manage
>>>>>>> versions if we make all modules at same version level for a release,
>>>>>>> even if there is no change in a particular module.
>>>>>>>
>>>>>>> - Tushar.
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Sep 18, 2015 at 12:18 AM, Timothy Farkas <[email protected]> wrote:
>>>>>>>
>>>>>>>> I agree Andy's solution is better, but just for the sake of argument
>>>>>>>> profiles can be inherited from a parent pom, so if the maven archetype
>>>>>>>> defines a new project with a parent pom with the correct profiles
>>>>>>>> defined, then the desired profiles can be activated in the pom of the
>>>>>>>> new project. It is no more complicated than adding additional
>>>>>>>> dependencies to your project.
>>>>>>>>
>>>>>>>> On Thu, Sep 17, 2015 at 10:32 AM, Sandesh Hegde <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Currently all the dependencies in Malhar-Contrib are marked as
>>>>>>>>> optional. So users have to already modify the existing POM to use it
>>>>>>>>> in their project. So restructuring should be fine.
>>>>>>>>>
>>>>>>>>> On Thu, Sep 17, 2015 at 11:29 AM Chetan Narsude <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> The profiles are excellent when you are developing malhar-contrib.
>>>>>>>>>> Profiles do not work when you are using malhar-contrib. The problem
>>>>>>>>>> Andy is trying to solve is the later. If there is an elegant
>>>>>>>>>> solution which I am missing using profiles, please correct me.
>>>>>>>>>>
>>>>>>>>>> The way Andy suggested is the way many successful projects do it.
>>>>>>>>>> Look at Netty as an example.
>>>>>>>>>>
>>>>>>>>>> +1 for that.
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Chetan
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Thu, Sep 17, 2015 at 11:22 AM, Timothy Farkas <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> I think restructuring the project in that way would be the
>>>>>>>>>>> technically correct thing to do, but if people are unwilling to
>>>>>>>>>>> accept the change in project structure you could achieve something
>>>>>>>>>>> similar by using maven profiles. With profiles the project
>>>>>>>>>>> structure would remain as is. Profiles could be added to the
>>>>>>>>>>> malhar pom, and a profile would define the dependencies needed for
>>>>>>>>>>> different types of operators. For example the hbase profile would
>>>>>>>>>>> define the dependencies for the hbase operator. Then any project
>>>>>>>>>>> using a malhar library would just activate the correct profile in
>>>>>>>>>>> it's pom, and the correct dependencies would be pulled in.
>>>>>>>>>>>
>>>>>>>>>>> http://maven.apache.org/guides/introduction/introduction-to-profiles.html
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Sep 17, 2015 at 10:01 AM, Andy Perlitch <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>
>>>>>>>>>>>> I am currently assigned to MLHR-1843
>>>>>>>>>>>> <https://malhar.atlassian.net/browse/MLHR-1843>, which
>>>>>>>>>>>> essentially aims to expose smaller, more consumable maven
>>>>>>>>>>>> artifacts that would do away with the need to manually include
>>>>>>>>>>>> necessary dependencies based on the operators in use.
>>>>>>>>>>>>
>>>>>>>>>>>> As an example, say I am building an app package that needs Kafka
>>>>>>>>>>>> input and output operators, but I don't want all the other
>>>>>>>>>>>> transitive dependencies that come via malhar-contrib. Currently I
>>>>>>>>>>>> would need to specify malhar-contrib as a dependency, and add an
>>>>>>>>>>>> exclusions block in my app package pom:
>>>>>>>>>>>>
>>>>>>>>>>>> <dependency>
>>>>>>>>>>>>   <groupId>com.datatorrent</groupId>
>>>>>>>>>>>>   <artifactId>malhar-contrib</artifactId>
>>>>>>>>>>>>   <version>3.0.0</version>
>>>>>>>>>>>>   <!-- so none of malhar-contrib's deps are included -->
>>>>>>>>>>>>   <exclusions>
>>>>>>>>>>>>     <exclusion>
>>>>>>>>>>>>       <groupId>*</groupId>
>>>>>>>>>>>>       <artifactId>*</artifactId>
>>>>>>>>>>>>     </exclusion>
>>>>>>>>>>>>   </exclusions>
>>>>>>>>>>>> </dependency>
>>>>>>>>>>>>
>>>>>>>>>>>> Then, I would have to include the kafka library explicitly as a
>>>>>>>>>>>> dependency:
>>>>>>>>>>>>
>>>>>>>>>>>> <dependency>
>>>>>>>>>>>>   <groupId>org.apache.kafka</groupId>
>>>>>>>>>>>>   <artifactId>kafka_2.10</artifactId>
>>>>>>>>>>>>   <version>0.8.1.1</version>
>>>>>>>>>>>> </dependency>
>>>>>>>>>>>>
>>>>>>>>>>>> Wouldn't it be nice if I could just put this in my pom?:
>>>>>>>>>>>>
>>>>>>>>>>>> <dependency>
>>>>>>>>>>>>   <groupId>com.datatorrent</groupId>
>>>>>>>>>>>>   <artifactId>malhar-contrib-kafka</artifactId>
>>>>>>>>>>>>   <version>3.0.0</version>
>>>>>>>>>>>> </dependency>
>>>>>>>>>>>>
>>>>>>>>>>>> In order to make this possible, we will need to organize the
>>>>>>>>>>>> malhar project into more granular modules (artifacts).
>>>>>>>>>>>> Specifically, the malhar-contrib artifact would essentially just
>>>>>>>>>>>> be a pom that specifies each smaller module as a dependency:
>>>>>>>>>>>>
>>>>>>>>>>>> <!-- in malhar-contrib's pom.xml: -->
>>>>>>>>>>>>
>>>>>>>>>>>> <modules>
>>>>>>>>>>>>   <module>kafka</module>
>>>>>>>>>>>>   <module>twitter</module>
>>>>>>>>>>>>   <module>redis</module>
>>>>>>>>>>>>   <!-- other smaller modules -->
>>>>>>>>>>>> </modules>
>>>>>>>>>>>>
>>>>>>>>>>>> <dependency>
>>>>>>>>>>>>   <groupId>com.datatorrent</groupId>
>>>>>>>>>>>>   <artifactId>malhar-contrib-kafka</artifactId>
>>>>>>>>>>>>   <version>3.0.0</version>
>>>>>>>>>>>> </dependency>
>>>>>>>>>>>>
>>>>>>>>>>>> <dependency>
>>>>>>>>>>>>   <groupId>com.datatorrent</groupId>
>>>>>>>>>>>>   <artifactId>malhar-contrib-twitter</artifactId>
>>>>>>>>>>>>   <version>3.0.0</version>
>>>>>>>>>>>> </dependency>
>>>>>>>>>>>>
>>>>>>>>>>>> <dependency>
>>>>>>>>>>>>   <groupId>com.datatorrent</groupId>
>>>>>>>>>>>>   <artifactId>malhar-contrib-redis</artifactId>
>>>>>>>>>>>>   <version>3.0.0</version>
>>>>>>>>>>>> </dependency>
>>>>>>>>>>>>
>>>>>>>>>>>> With these changes, there may be a risk of breaking backwards
>>>>>>>>>>>> compatibility, however I think the gain in usability of malhar
>>>>>>>>>>>> merits the effort to make this work.
>>>>>>>>>>>>
>>>>>>>>>>>> I am still relatively new to maven, so I would love to get some
>>>>>>>>>>>> feedback from other devs about this!
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> Andy Perlitch
>>>>>>>>>>>> Software Engineer
>>>>>>>>>>>> DataTorrent Inc
>>>>>>>>>>>> (408)829-9319
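
For anyone following along, the dependency-only malhar-contrib pom described in the original message above could be sketched roughly as below. This is only an illustration under assumptions: the three artifacts and the 3.0.0 version are copied from the examples in the thread, and the `<modules>` list is assumed to live in a separate parent pom, which is a hypothetical arrangement and not something the thread has decided.

```xml
<!-- Hypothetical malhar-contrib/pom.xml; a sketch, not the project's actual pom -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>

  <groupId>com.datatorrent</groupId>
  <artifactId>malhar-contrib</artifactId>
  <version>3.0.0</version>
  <!-- pom packaging: this artifact carries no classes, only dependencies -->
  <packaging>pom</packaging>

  <!-- individual <dependency> elements have to sit inside <dependencies> -->
  <dependencies>
    <dependency>
      <groupId>com.datatorrent</groupId>
      <artifactId>malhar-contrib-kafka</artifactId>
      <version>3.0.0</version>
    </dependency>
    <dependency>
      <groupId>com.datatorrent</groupId>
      <artifactId>malhar-contrib-twitter</artifactId>
      <version>3.0.0</version>
    </dependency>
    <dependency>
      <groupId>com.datatorrent</groupId>
      <artifactId>malhar-contrib-redis</artifactId>
      <version>3.0.0</version>
    </dependency>
    <!-- other smaller modules -->
  </dependencies>
</project>
```

Under this arrangement, an app package that still wants everything would declare malhar-contrib with `<type>pom</type>`, while one that only needs Kafka would depend on malhar-contrib-kafka alone.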
