Let's restart the discussion of this topic. We'd like to break Malhar into modules so that we have separate artifacts for Kafka, Cassandra, HBase, etc., instead of just malhar-contrib and malhar-library. That way, users pull in only the dependencies they need, automatically, without today's ugly business of optional dependencies and exclusions.
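
To make this concrete, here is a rough sketch of what an app package's pom might contain after the breakup. The artifact name malhar-kafka and the version are placeholders for illustration only; nothing here is decided, and the groupId is simply the one used in the examples earlier in this thread:

```xml
<!-- Hypothetical: "malhar-kafka" and its version are illustrative placeholders -->
<dependencies>
  <dependency>
    <groupId>com.datatorrent</groupId>
    <artifactId>malhar-kafka</artifactId>
    <version>3.0.0</version>
  </dependency>
</dependencies>
```

The Kafka client libraries would then arrive as ordinary transitive dependencies of that artifact, so the app pom would need neither an exclusions block nor an explicit Kafka dependency.
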
Also, I propose adding the third-party version to the artifact name, for
example:

    malhar-kafka-0.8
    malhar-kafka-0.9

so that we can support multiple versions of Kafka simultaneously.

Thoughts?

David

On Fri, Oct 2, 2015 at 4:40 PM, David Yan <[email protected]> wrote:

> All Malhar operators are listed as part of the apidoc here:
> https://www.datatorrent.com/docs/apidocs/index.html
> Developers should be able to find the operators they need there.
>
> But it's referenced from
> https://www.datatorrent.com/product-documentation/ as "Platform API
> Reference", so users may have trouble finding it.
>
> We should probably have separate javadoc pages for Apex Core and Apex
> Malhar, and also add the links to this page:
> http://apex.apache.org/docs.html
>
> David
>
> On Fri, Oct 2, 2015 at 4:28 PM, Pramod Immaneni <[email protected]>
> wrote:
>
>> We have to think about how people can find the operators and
>> dependencies when bundling their applications. The complaint I hear
>> often is that folks can't find the operators they are looking for. We
>> should be careful about how much more work this will add for the user,
>> who now has to search for and find all the dependencies.
>>
>> Thanks
>>
>> > On Oct 2, 2015, at 3:44 PM, David Yan <[email protected]> wrote:
>> >
>> > I actually don't think it makes sense any more to separate
>> > malhar-library and malhar-contrib after the breakup, especially since
>> > we are planning a major release for these changes.
>> >
>> > People are often confused, myself included, about which operators
>> > should be in malhar-library and which should be in contrib. Requiring
>> > a separate setup for unit tests should not be a criterion, because the
>> > user of the library couldn't care less whether the unit test requires
>> > extra setup. The factor of requiring extra dependencies isn't valid
>> > either, because malhar-library already has dependencies that apex does
>> > not have.
>> >
>> > We can retain them for backward compatibility purposes, but going
>> > forward new app packages should only use the baby artifacts, without
>> > denoting whether it's contrib or not.
>> >
>> > David
>> >
>> > On Tue, Sep 29, 2015 at 12:19 AM, Andy Perlitch <[email protected]>
>> > wrote:
>> >
>> >> Hi all,
>> >>
>> >> This is a first cut at a plan to restructure malhar in a way that is
>> >> more portable and adherent to Maven's principles of modularity and
>> >> dependency management.
>> >>
>> >> Overview of Current Malhar Architecture
>> >> ---------------------------------------------------------------
>> >> The current malhar repo consists of several maven modules:
>> >>
>> >> * *malhar-library*
>> >>   operators which do not require additional transitive dependencies
>> >>   beyond what Apex and Hadoop require
>> >> * *malhar-contrib*
>> >>   operators requiring other maven dependencies
>> >> * *malhar-demos*
>> >>   demo applications
>> >> * *malhar-samples*
>> >>   sample code showing example usage of malhar operators
>> >> * *malhar-apps*
>> >>   apex applications (currently only logstream)
>> >>
>> >>
>> >> Proposed Changes
>> >> ---------------------------------------------------------------
>> >>
>> >> 1. *Scrub malhar-library for any operators needing additional
>> >>    dependencies*
>> >>    `malhar-library` is intended to consist of only operators without
>> >>    extra transitive dependencies. All operators should be checked for
>> >>    the necessity of extra dependencies.
>> >>
>> >> 2. *Move operators from malhar-demos and malhar-apps into contrib (or
>> >>    library if prudent)*
>> >>    There are various operators in both of these modules that are
>> >>    general enough to move into library or contrib.
>> >>
>> >> 3. *Create modules for all contrib subfolders*
>> >>    All folders under `contrib/src/main/com/datatorrent/contrib/`
>> >>    should be converted to modules of contrib and listed as such in
>> >>    `/contrib/pom.xml`.
>> >>    Additionally, each of these smaller contrib modules will have its
>> >>    own version and dependencies.
>> >>
>> >> 4. *Use the Shade Plugin to allow for backwards-compatible
>> >>    fully-qualified class names*
>> >>    This is made possible by the Shade Plugin's class relocation
>> >>    feature:
>> >>    https://maven.apache.org/plugins/maven-shade-plugin/examples/class-relocation.html
>> >>    This might be a bit error-prone, as well as confusing for outside
>> >>    developers to use, but it must be done if these changes are to be
>> >>    made prior to a major release.
>> >>
>> >>
>> >> Let me know what you all think of this approach.
>> >>
>> >> Best,
>> >> Andy
>> >>
>> >>
>> >> On Tue, Sep 22, 2015 at 11:20 AM, Chetan Narsude
>> >> <[email protected]> wrote:
>> >>
>> >>> +1
>> >>>
>> >>> On Tue, Sep 22, 2015 at 11:08 AM, Gaurav Gupta
>> >>> <[email protected]> wrote:
>> >>>
>> >>>> I agree with David. Each artifact should have its own version.
>> >>>>
>> >>>> Thanks
>> >>>> -Gaurav
>> >>>>
>> >>>> On Tue, Sep 22, 2015 at 11:07 AM, David Yan <[email protected]>
>> >>>> wrote:
>> >>>>
>> >>>>> I actually think that each baby artifact should have its own
>> >>>>> version, because each artifact has its own interface and its own
>> >>>>> life cycle. Especially after we break up the giant library,
>> >>>>> applications will depend on the baby artifacts instead of the
>> >>>>> giant library. For example, if there is no change in
>> >>>>> malhar-contrib-kafka (I think the name should actually be
>> >>>>> apex-malhar-kafka), we should not confuse users by bumping the
>> >>>>> version.
>> >>>>>
>> >>>>> David
>> >>>>>
>> >>>>> On Tue, Sep 22, 2015 at 9:03 AM, Andy Perlitch
>> >>>>> <[email protected]> wrote:
>> >>>>>
>> >>>>>> Tushar,
>> >>>>>>
>> >>>>>> I agree that all modules should inherit the version from the
>> >>>>>> "parent pom" of the malhar repo.
>> >>>>>> I think the benefits outweigh the cost of bumping versions of
>> >>>>>> components that haven't actually changed. I'd love to get
>> >>>>>> others' feedback on this as well.
>> >>>>>>
>> >>>>>> On another note, I plan on starting a spreadsheet/googledoc with
>> >>>>>> the possible groupings of operators into these modules. Stay
>> >>>>>> tuned...
>> >>>>>>
>> >>>>>> -Andy
>> >>>>>>
>> >>>>>> On Mon, Sep 21, 2015 at 11:51 PM, Tushar Gosavi
>> >>>>>> <[email protected]> wrote:
>> >>>>>>
>> >>>>>>> +1 for the general idea
>> >>>>>>>
>> >>>>>>> Are these independent modules going to have independent
>> >>>>>>> versions? For example, if there is no change in the kafka
>> >>>>>>> operator between malhar 3.0 and malhar 4.0, will we increment
>> >>>>>>> the version of malhar-contrib-kafka to 4.0? I have learned from
>> >>>>>>> my previous project that it is easier to manage versions if we
>> >>>>>>> keep all modules at the same version level for a release, even
>> >>>>>>> if there is no change in a particular module.
>> >>>>>>>
>> >>>>>>> - Tushar.
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> On Fri, Sep 18, 2015 at 12:18 AM, Timothy Farkas
>> >>>>>>> <[email protected]> wrote:
>> >>>>>>>
>> >>>>>>>> I agree Andy's solution is better, but just for the sake of
>> >>>>>>>> argument, profiles can be inherited from a parent pom. So if
>> >>>>>>>> the maven archetype defines a new project with a parent pom
>> >>>>>>>> with the correct profiles defined, then the desired profiles
>> >>>>>>>> can be activated in the pom of the new project. It is no more
>> >>>>>>>> complicated than adding additional dependencies to your
>> >>>>>>>> project.
>> >>>>>>>>
>> >>>>>>>> On Thu, Sep 17, 2015 at 10:32 AM, Sandesh Hegde
>> >>>>>>>> <[email protected]> wrote:
>> >>>>>>>>
>> >>>>>>>>> Currently all the dependencies in Malhar-Contrib are marked as
>> >>>>>>>>> optional, so users already have to modify the existing POM to
>> >>>>>>>>> use it in their project. So restructuring should be fine.
>> >>>>>>>>>
>> >>>>>>>>> On Thu, Sep 17, 2015 at 11:29 AM Chetan Narsude
>> >>>>>>>>> <[email protected]> wrote:
>> >>>>>>>>>
>> >>>>>>>>>> Profiles are excellent when you are developing
>> >>>>>>>>>> malhar-contrib. Profiles do not work when you are using
>> >>>>>>>>>> malhar-contrib. The problem Andy is trying to solve is the
>> >>>>>>>>>> latter. If there is an elegant solution using profiles that I
>> >>>>>>>>>> am missing, please correct me.
>> >>>>>>>>>>
>> >>>>>>>>>> The way Andy suggested is the way many successful projects do
>> >>>>>>>>>> it. Look at Netty as an example.
>> >>>>>>>>>>
>> >>>>>>>>>> +1 for that.
>> >>>>>>>>>>
>> >>>>>>>>>> --
>> >>>>>>>>>> Chetan
>> >>>>>>>>>>
>> >>>>>>>>>> On Thu, Sep 17, 2015 at 11:22 AM, Timothy Farkas
>> >>>>>>>>>> <[email protected]> wrote:
>> >>>>>>>>>>
>> >>>>>>>>>>> I think restructuring the project in that way would be the
>> >>>>>>>>>>> technically correct thing to do, but if people are unwilling
>> >>>>>>>>>>> to accept the change in project structure, you could achieve
>> >>>>>>>>>>> something similar by using maven profiles. With profiles the
>> >>>>>>>>>>> project structure would remain as is.
>> >>>>>>>>>>> Profiles could be added to the malhar pom, and a profile
>> >>>>>>>>>>> would define the dependencies needed for different types of
>> >>>>>>>>>>> operators. For example, the hbase profile would define the
>> >>>>>>>>>>> dependencies for the hbase operator. Then any project using
>> >>>>>>>>>>> a malhar library would just activate the correct profile in
>> >>>>>>>>>>> its pom, and the correct dependencies would be pulled in.
>> >>>>>>>>>>>
>> >>>>>>>>>>> http://maven.apache.org/guides/introduction/introduction-to-profiles.html
>> >>>>>>>>>>>
>> >>>>>>>>>>> On Thu, Sep 17, 2015 at 10:01 AM, Andy Perlitch
>> >>>>>>>>>>> <[email protected]> wrote:
>> >>>>>>>>>>>
>> >>>>>>>>>>>> Hi everyone,
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> I am currently assigned to MLHR-1843
>> >>>>>>>>>>>> <https://malhar.atlassian.net/browse/MLHR-1843>, which
>> >>>>>>>>>>>> essentially aims to expose smaller, more consumable maven
>> >>>>>>>>>>>> artifacts that would do away with the need to manually
>> >>>>>>>>>>>> include necessary dependencies based on the operators in
>> >>>>>>>>>>>> use.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> As an example, say I am building an app package that needs
>> >>>>>>>>>>>> Kafka input and output operators, but I don't want all the
>> >>>>>>>>>>>> other transitive dependencies that come via malhar-contrib.
>> >>>>>>>>>>>> Currently I would need to specify malhar-contrib as a
>> >>>>>>>>>>>> dependency, and add an exclusions block in my app package
>> >>>>>>>>>>>> pom:
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> <dependency>
>> >>>>>>>>>>>>   <groupId>com.datatorrent</groupId>
>> >>>>>>>>>>>>   <artifactId>malhar-contrib</artifactId>
>> >>>>>>>>>>>>   <version>3.0.0</version>
>> >>>>>>>>>>>>   <!-- so none of malhar-contrib's deps are included -->
>> >>>>>>>>>>>>   <exclusions>
>> >>>>>>>>>>>>     <exclusion>
>> >>>>>>>>>>>>       <groupId>*</groupId>
>> >>>>>>>>>>>>       <artifactId>*</artifactId>
>> >>>>>>>>>>>>     </exclusion>
>> >>>>>>>>>>>>   </exclusions>
>> >>>>>>>>>>>> </dependency>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Then, I would have to include the kafka library explicitly
>> >>>>>>>>>>>> as a dependency:
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> <dependency>
>> >>>>>>>>>>>>   <groupId>org.apache.kafka</groupId>
>> >>>>>>>>>>>>   <artifactId>kafka_2.10</artifactId>
>> >>>>>>>>>>>>   <version>0.8.1.1</version>
>> >>>>>>>>>>>> </dependency>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Wouldn't it be nice if I could just put this in my pom?
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> <dependency>
>> >>>>>>>>>>>>   <groupId>com.datatorrent</groupId>
>> >>>>>>>>>>>>   <artifactId>malhar-contrib-kafka</artifactId>
>> >>>>>>>>>>>>   <version>3.0.0</version>
>> >>>>>>>>>>>> </dependency>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> In order to make this possible, we will need to organize
>> >>>>>>>>>>>> the malhar project into more granular modules (artifacts).
>> >>>>>>>>>>>> Specifically, the malhar-contrib artifact would essentially
>> >>>>>>>>>>>> just be a pom that specifies each smaller module as a
>> >>>>>>>>>>>> dependency:
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> <!-- in malhar-contrib's pom.xml: -->
>> >>>>>>>>>>>> <modules>
>> >>>>>>>>>>>>   <module>kafka</module>
>> >>>>>>>>>>>>   <module>twitter</module>
>> >>>>>>>>>>>>   <module>redis</module>
>> >>>>>>>>>>>>   <!-- other smaller modules -->
>> >>>>>>>>>>>> </modules>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> <dependencies>
>> >>>>>>>>>>>>   <dependency>
>> >>>>>>>>>>>>     <groupId>com.datatorrent</groupId>
>> >>>>>>>>>>>>     <artifactId>malhar-contrib-kafka</artifactId>
>> >>>>>>>>>>>>     <version>3.0.0</version>
>> >>>>>>>>>>>>   </dependency>
>> >>>>>>>>>>>>   <dependency>
>> >>>>>>>>>>>>     <groupId>com.datatorrent</groupId>
>> >>>>>>>>>>>>     <artifactId>malhar-contrib-twitter</artifactId>
>> >>>>>>>>>>>>     <version>3.0.0</version>
>> >>>>>>>>>>>>   </dependency>
>> >>>>>>>>>>>>   <dependency>
>> >>>>>>>>>>>>     <groupId>com.datatorrent</groupId>
>> >>>>>>>>>>>>     <artifactId>malhar-contrib-redis</artifactId>
>> >>>>>>>>>>>>     <version>3.0.0</version>
>> >>>>>>>>>>>>   </dependency>
>> >>>>>>>>>>>> </dependencies>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> With these changes, there may be a risk of breaking
>> >>>>>>>>>>>> backwards compatibility; however, I think the gain in
>> >>>>>>>>>>>> usability of malhar merits the effort to make this work.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> I am still relatively new to maven, so I would love to get
>> >>>>>>>>>>>> some feedback from other devs about this!
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> --
>> >>>>>>>>>>>> Regards,
>> >>>>>>>>>>>> Andy Perlitch
>> >>>>>>>>>>>> Software Engineer
>> >>>>>>>>>>>> DataTorrent Inc
>> >>>>>>>>>>>> (408)829-9319
