If we standardize on an approach, I think it should be the shade plugin, because 
it properly merges META-INF content such as the service registrations that HDFS, 
HBase, etc. require. The assembly plugin doesn’t.
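
For reference, a minimal sketch of the shade configuration I have in mind is 
below. The version number is only an example; the ServicesResourceTransformer is 
the piece that merges META-INF/services entries from all jars instead of keeping 
only the first copy it encounters:

    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <version>2.4.1</version>
      <configuration>
        <createDependencyReducedPom>true</createDependencyReducedPom>
      </configuration>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
          <configuration>
            <transformers>
              <!-- merge META-INF/services entries (e.g. FileSystem
                   registrations) rather than letting one jar's copy
                   clobber another's -->
              <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
            </transformers>
          </configuration>
        </execution>
      </executions>
    </plugin>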

If we specify a Hadoop version and allow all the dependencies to be pulled in, 
we would just be shifting the complexity since users would then have to exclude 
the bundled dependencies in addition to specifying the version they want.
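
To illustrate (the artifact names here are just an example), a user who wanted a 
Hadoop version other than whatever we bundled would end up with something like 
this in their topology pom.xml, which is no less work than declaring the Hadoop 
dependencies directly as we ask them to do today:

    <dependency>
      <groupId>org.apache.storm</groupId>
      <artifactId>storm-hdfs</artifactId>
      <version>${storm.version}</version>
      <exclusions>
        <!-- drop the bundled Hadoop client so the version below wins -->
        <exclusion>
          <groupId>org.apache.hadoop</groupId>
          <artifactId>hadoop-client</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>${hadoop.version}</version>
    </dependency>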

I tested using the Kafka spout and HDFS bolt from an unmodified flux-examples 
jar (from 0.10.0-beta1) and it worked without a hitch.

I understand the frustration, but I’m not sure much can be done about it. We 
can’t control the dependencies of third-party projects, and Hadoop in particular 
has many dependencies that change from version to version and lead to conflicts.

-Taylor

> On Jul 30, 2015, at 10:35 AM, Sriharsha Chintalapani 
> <[email protected]> wrote:
> 
> Hi All,
>              Currently the way we publish Storm connector jars into Maven 
> repositories is to publish storm-kafka, storm-hive, and storm-hbase without 
> any of their dependencies included.
> The expectation is that users will include their own versions of the HDFS and 
> Kafka dependencies alongside storm-hdfs or storm-kafka and package everything 
> with the topology as an uber jar.
> IMO this is the most painful step in deploying/building a topology, as 
> observed here: https://issues.apache.org/jira/browse/STORM-967 . I think we 
> need to standardize on either the assembly or the shade plugin.
> Also, why don’t we publish the connectors with all of their dependencies 
> included, so that users only need to declare a dependency on storm-hdfs in 
> their pom.xml and it will pull in everything else it needs?
> Any ideas on improving this?
> 
> Thanks,
> Harsha
> 
