+1
> On Sep 29, 2015, at 4:46 AM, Thomas Weise <[email protected]> wrote:
>
> I actually think that these changes should be made as part of a major
> release (Malhar only, not engine), along with other changes to convert to
> Apache package names, purge deprecated operators, etc.
>
> Since Apex core and Malhar will be decoupled going forward, such a major
> release can be done without affecting existing users. Since the package
> names change, both major versions can also be used together in the same
> application: no forced upgrade, and the ability to selectively pick new
> operators.
>
> Thoughts?
>
>> On Tue, Sep 29, 2015 at 3:19 AM, Andy Perlitch <[email protected]> wrote:
>>
>> Hi all,
>>
>> This is a first cut at a plan to restructure malhar in a way that is more
>> portable and adherent to Maven's principles of modularity and dependency
>> management.
>>
>> Overview of Current Malhar Architecture
>> ---------------------------------------------------------------
>> The current malhar repo consists of several maven modules:
>>
>> * *malhar-library*
>>   operators which do not require additional transitive dependencies beyond
>>   what Apex and Hadoop require
>> * *malhar-contrib*
>>   operators requiring other maven dependencies
>> * *malhar-demos*
>>   demo applications
>> * *malhar-samples*
>>   sample code showing example usage of malhar operators
>> * *malhar-apps*
>>   apex applications (currently only logstream)
>>
>>
>> Proposed Changes
>> ---------------------------------------------------------------
>>
>> 1. *Scrub malhar-library for any operators needing additional dependencies*
>> `malhar-library` is intended to consist only of operators without extra
>> transitive dependencies. Every operator should be checked for whether it
>> needs extra dependencies.
>>
>> 2. *Move operators from malhar-demos and malhar-apps into contrib (or
>> library if prudent)*
>> There are various operators in both of these modules that are general
>> enough to move into library or contrib.
>>
>> 3. *Create modules for all contrib subfolders*
>> All folders under `contrib/src/main/com/datatorrent/contrib/` should be
>> converted to modules of contrib and listed as such in `/contrib/pom.xml`.
>> Additionally, each of these smaller contrib modules will have its own
>> version and dependencies.
>>
>> 4. *Use the Shade Plugin to allow for backwards-compatible fully-qualified
>> class names*
>> This is made possible by the Shade Plugin's class relocation
>> <https://maven.apache.org/plugins/maven-shade-plugin/examples/class-relocation.html>
>> feature. This might be a bit error-prone as well as confusing to use for
>> outside developers, but it must be done if these changes are to be made
>> prior to a major release.
>>
>>
>> Let me know what you all think of this approach.
>>
>> Best,
>> Andy
>>
>>
>> On Tue, Sep 22, 2015 at 11:20 AM, Chetan Narsude <[email protected]> wrote:
>>
>>> +1
>>>
>>> On Tue, Sep 22, 2015 at 11:08 AM, Gaurav Gupta <[email protected]> wrote:
>>>
>>>> I agree with David. Each artifact should have its own version.
>>>>
>>>> Thanks
>>>> -Gaurav
>>>>
>>>>> On Tue, Sep 22, 2015 at 11:07 AM, David Yan <[email protected]> wrote:
>>>>
>>>>> I actually think that each baby artifact should have its own version,
>>>>> because each artifact has its own interface and its own life cycle.
>>>>> Especially after we break up the giant library, applications will depend
>>>>> on the baby artifacts instead of the giant library. For example, if there
>>>>> is no change in malhar-contrib-kafka (I think the name should actually be
>>>>> apex-malhar-kafka), we should not confuse users by bumping the version.
>>>>>
>>>>> David
>>>>>
>>>>> On Tue, Sep 22, 2015 at 9:03 AM, Andy Perlitch <[email protected]> wrote:
>>>>>
>>>>>> Tushar,
>>>>>>
>>>>>> I agree that all modules should inherit the version from the "parent pom"
>>>>>> of the malhar repo.
>>>>>> I think the benefits outweigh the cost of bumping
>>>>>> versions of components that haven't actually changed. I'd love to get
>>>>>> others' feedback on this as well.
>>>>>>
>>>>>> On another note, I plan on starting a spreadsheet/googledoc with the
>>>>>> possible groupings of operators into these modules. Stay tuned...
>>>>>>
>>>>>> -Andy
>>>>>>
>>>>>> On Mon, Sep 21, 2015 at 11:51 PM, Tushar Gosavi <[email protected]> wrote:
>>>>>>
>>>>>>> +1 for the general idea
>>>>>>>
>>>>>>> Are these independent modules going to have independent versions? For
>>>>>>> example, if there is no change in the kafka operator between malhar 3.0
>>>>>>> and malhar 4.0, will we increment the version of malhar-contrib-kafka to
>>>>>>> 4.0? I have learned from my previous project that it is easier to manage
>>>>>>> versions if we keep all modules at the same version level for a release,
>>>>>>> even if there is no change in a particular module.
>>>>>>>
>>>>>>> - Tushar.
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Sep 18, 2015 at 12:18 AM, Timothy Farkas <[email protected]> wrote:
>>>>>>>
>>>>>>>> I agree Andy's solution is better, but just for the sake of argument,
>>>>>>>> profiles can be inherited from a parent pom, so if the maven archetype
>>>>>>>> defines a new project with a parent pom with the correct profiles
>>>>>>>> defined, then the desired profiles can be activated in the pom of the
>>>>>>>> new project. It is no more complicated than adding additional
>>>>>>>> dependencies to your project.
>>>>>>>>
>>>>>>>> On Thu, Sep 17, 2015 at 10:32 AM, Sandesh Hegde <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Currently all the dependencies in Malhar-Contrib are marked as optional,
>>>>>>>>> so users already have to modify the existing POM to use it in their
>>>>>>>>> project. So restructuring should be fine.
>>>>>>>>>
>>>>>>>>> On Thu, Sep 17, 2015 at 11:29 AM Chetan Narsude <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> The profiles are excellent when you are developing malhar-contrib.
>>>>>>>>>> Profiles do not work when you are using malhar-contrib. The problem Andy
>>>>>>>>>> is trying to solve is the latter. If there is an elegant solution using
>>>>>>>>>> profiles which I am missing, please correct me.
>>>>>>>>>>
>>>>>>>>>> The way Andy suggested is the way many successful projects do it. Look at
>>>>>>>>>> Netty as an example.
>>>>>>>>>>
>>>>>>>>>> +1 for that.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Chetan
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Thu, Sep 17, 2015 at 11:22 AM, Timothy Farkas <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> I think restructuring the project in that way would be the technically
>>>>>>>>>>> correct thing to do, but if people are unwilling to accept the change in
>>>>>>>>>>> project structure you could achieve something similar by using maven
>>>>>>>>>>> profiles. With profiles the project structure would remain as is.
>>>>>>>>>>> Profiles could be added to the malhar pom, and a profile would define
>>>>>>>>>>> the dependencies needed for different types of operators. For example,
>>>>>>>>>>> the hbase profile would define the dependencies for the hbase operator.
>>>>>>>>>>> Then any project using a malhar library would just activate the correct
>>>>>>>>>>> profile in its pom, and the correct dependencies would be pulled in.
>>>>>>>>>>>
>>>>>>>>>>> http://maven.apache.org/guides/introduction/introduction-to-profiles.html
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Sep 17, 2015 at 10:01 AM, Andy Perlitch <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>
>>>>>>>>>>>> I am currently assigned to MLHR-1843
>>>>>>>>>>>> <https://malhar.atlassian.net/browse/MLHR-1843>, which essentially aims
>>>>>>>>>>>> to expose smaller, more consumable maven artifacts that would do away
>>>>>>>>>>>> with the need to manually include necessary dependencies based on the
>>>>>>>>>>>> operators in use.
>>>>>>>>>>>>
>>>>>>>>>>>> As an example, say I am building an app package that needs Kafka input
>>>>>>>>>>>> and output operators, but I don't want all the other transitive
>>>>>>>>>>>> dependencies that come via malhar-contrib. Currently I would need to
>>>>>>>>>>>> specify malhar-contrib as a dependency, and add an exclusions block in
>>>>>>>>>>>> my app package pom:
>>>>>>>>>>>>
>>>>>>>>>>>> <dependency>
>>>>>>>>>>>>   <groupId>com.datatorrent</groupId>
>>>>>>>>>>>>   <artifactId>malhar-contrib</artifactId>
>>>>>>>>>>>>   <version>3.0.0</version>
>>>>>>>>>>>>   <!-- so none of malhar-contrib's deps are included -->
>>>>>>>>>>>>   <exclusions>
>>>>>>>>>>>>     <exclusion>
>>>>>>>>>>>>       <groupId>*</groupId>
>>>>>>>>>>>>       <artifactId>*</artifactId>
>>>>>>>>>>>>     </exclusion>
>>>>>>>>>>>>   </exclusions>
>>>>>>>>>>>> </dependency>
>>>>>>>>>>>>
>>>>>>>>>>>> Then, I would have to include the kafka library explicitly as a
>>>>>>>>>>>> dependency:
>>>>>>>>>>>>
>>>>>>>>>>>> <dependency>
>>>>>>>>>>>>   <groupId>org.apache.kafka</groupId>
>>>>>>>>>>>>   <artifactId>kafka_2.10</artifactId>
>>>>>>>>>>>>   <version>0.8.1.1</version>
>>>>>>>>>>>> </dependency>
>>>>>>>>>>>>
>>>>>>>>>>>> Wouldn't it be nice if I could just put this in my pom?
>>>>>>>>>>>>
>>>>>>>>>>>> <dependency>
>>>>>>>>>>>>   <groupId>com.datatorrent</groupId>
>>>>>>>>>>>>   <artifactId>malhar-contrib-kafka</artifactId>
>>>>>>>>>>>>   <version>3.0.0</version>
>>>>>>>>>>>> </dependency>
>>>>>>>>>>>>
>>>>>>>>>>>> In order to make this possible, we will need to organize the malhar
>>>>>>>>>>>> project into more granular modules (artifacts). Specifically, the
>>>>>>>>>>>> malhar-contrib artifact would essentially just be a pom that specifies
>>>>>>>>>>>> each smaller module as a dependency:
>>>>>>>>>>>>
>>>>>>>>>>>> <!-- in malhar-contrib's pom.xml: -->
>>>>>>>>>>>>
>>>>>>>>>>>> <modules>
>>>>>>>>>>>>   <module>kafka</module>
>>>>>>>>>>>>   <module>twitter</module>
>>>>>>>>>>>>   <module>redis</module>
>>>>>>>>>>>>   <!-- other smaller modules -->
>>>>>>>>>>>> </modules>
>>>>>>>>>>>>
>>>>>>>>>>>> <dependency>
>>>>>>>>>>>>   <groupId>com.datatorrent</groupId>
>>>>>>>>>>>>   <artifactId>malhar-contrib-kafka</artifactId>
>>>>>>>>>>>>   <version>3.0.0</version>
>>>>>>>>>>>> </dependency>
>>>>>>>>>>>>
>>>>>>>>>>>> <dependency>
>>>>>>>>>>>>   <groupId>com.datatorrent</groupId>
>>>>>>>>>>>>   <artifactId>malhar-contrib-twitter</artifactId>
>>>>>>>>>>>>   <version>3.0.0</version>
>>>>>>>>>>>> </dependency>
>>>>>>>>>>>>
>>>>>>>>>>>> <dependency>
>>>>>>>>>>>>   <groupId>com.datatorrent</groupId>
>>>>>>>>>>>>   <artifactId>malhar-contrib-redis</artifactId>
>>>>>>>>>>>>   <version>3.0.0</version>
>>>>>>>>>>>> </dependency>
>>>>>>>>>>>>
>>>>>>>>>>>> With these changes, there may be a risk of breaking backwards
>>>>>>>>>>>> compatibility; however, I think the gain in usability of malhar merits
>>>>>>>>>>>> the effort to make this work.
>>>>>>>>>>>>
>>>>>>>>>>>> I am still relatively new to maven, so I would love to get some
>>>>>>>>>>>> feedback from other devs about this!
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> Andy Perlitch
>>>>>>>>>>>> Software Engineer
>>>>>>>>>>>> DataTorrent Inc
>>>>>>>>>>>> (408)829-9319
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Regards,
>>>>>> Andy Perlitch
>>>>>> Software Engineer
>>>>>> DataTorrent Inc
>>>>>> (408)829-9319
>>
>>
>> --
>> Regards,
>> Andy Perlitch
>> Software Engineer
>> DataTorrent Inc
>> (408)829-9319
>>
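
For reference, item 4 of Andy's proposal (class relocation with the Shade Plugin) could look roughly like the sketch below. This is only an illustration under assumptions, not a worked-out plan: the `org.apache.apex` groupId, the new package name `org.apache.apex.malhar.kafka`, the old package name `com.datatorrent.contrib.kafka` and the plugin version are placeholders that nobody in this thread has settled on. The idea is that a small compatibility module depends on the renamed artifact and repackages its classes under the old fully-qualified names.

```xml
<!-- Sketch of a hypothetical compatibility module's build section.
     All package names and coordinates below are assumptions for illustration. -->
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <!-- assumed version; use whichever release the project standardizes on -->
      <version>2.4.3</version>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
          <configuration>
            <!-- only pull the renamed kafka module into the shaded jar -->
            <artifactSet>
              <includes>
                <include>org.apache.apex:apex-malhar-kafka</include>
              </includes>
            </artifactSet>
            <relocations>
              <relocation>
                <!-- hypothetical new package name -->
                <pattern>org.apache.apex.malhar.kafka</pattern>
                <!-- hypothetical old package name that existing apps still import -->
                <shadedPattern>com.datatorrent.contrib.kafka</shadedPattern>
              </relocation>
            </relocations>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```

If the restructuring waits for a major release as Thomas suggests, a block like this would probably not be needed at all, since both major versions could sit side by side under different package names.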
