If we standardize on an approach, I think it should be the shade plugin, because it can properly merge META-INF content such as services files (required for HDFS, HBase, etc.). The assembly plugin can't.
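For reference, a minimal shade-plugin configuration along those lines might look like the sketch below (the plugin version is illustrative for this era; the key piece is ServicesResourceTransformer, which merges META-INF/services entries from all jars instead of letting one overwrite another, which is what ServiceLoader-based lookups like Hadoop's FileSystem loading depend on):

    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <version>2.4.1</version>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
          <configuration>
            <transformers>
              <!-- Merge META-INF/services files from all shaded jars
                   rather than keeping only the last one encountered. -->
              <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
            </transformers>
          </configuration>
        </execution>
      </executions>
    </plugin>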
If we specify a Hadoop version and allow all the dependencies to be pulled in, we would just be shifting the complexity: users would then have to exclude the bundled dependencies in addition to specifying the version they want (a sketch of the user-side POM this implies appears after the quoted message below). I tested the Kafka spout and the HDFS bolt from an unmodified flux-examples jar (built from 0.10.0-beta1) and they worked without a hitch.

I understand the frustration, but I'm not sure much can be done about it. We can't control the dependencies of 3rd-party projects, and with Hadoop in particular there are many dependencies that can change from version to version and lead to conflicts.

-Taylor

> On Jul 30, 2015, at 10:35 AM, Sriharsha Chintalapani <[email protected]> wrote:
>
> Hi All,
> Currently the way we publish Storm connector jars into Maven repositories is to publish storm-kafka, storm-hive, storm-hbase, etc. without any of their dependencies included.
> The expectation is that the user will include their own versions of the HDFS and Kafka dependencies along with storm-hdfs or storm-kafka and package everything with the topology as an uber jar.
> IMO this is the most painful step in deploying/building a topology, as observed here: https://issues.apache.org/jira/browse/STORM-967 . I think we need to standardize on either the assembly or the shade plugin.
> Also, why don't we publish the connectors with all their dependencies included, so that a user only needs to declare a dependency on storm-hdfs in their pom.xml and it will bring in all the other dependencies it needs?
> Any ideas on improving this?
>
> Thanks,
> Harsha
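To illustrate the exclusion dance referenced above: with the connector published as-is, a user who needs a different Hadoop version than the one the connector pulls in transitively ends up with a POM roughly like this (all version numbers here are illustrative):

    <!-- Pull in the connector, but exclude the Hadoop client it
         brings in transitively... -->
    <dependency>
      <groupId>org.apache.storm</groupId>
      <artifactId>storm-hdfs</artifactId>
      <version>0.10.0-beta1</version>
      <exclusions>
        <exclusion>
          <groupId>org.apache.hadoop</groupId>
          <artifactId>hadoop-client</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
    <!-- ...and declare the version that matches the target cluster. -->
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>2.6.0</version>
    </dependency>

If the connector jars instead bundled all of their dependencies, that exclusions block would have to grow to cover everything bundled in, which is the complexity shift described above.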
