Re: Clean up dependencies in streaming connectors

Stephan Ewen Mon, 29 Sep 2014 09:51:17 -0700

Shipping the connectors with the job jars would thin out the dependencies,
but would make it more cumbersome to assemble a job jar.


On Mon, Sep 29, 2014 at 6:47 PM, Gyula Fora <[email protected]> wrote:

> Thanks, I will look into this and try to figure it out. As you can see, I
> am not a Maven pro :)
>
> On 29 Sep 2014, at 18:44, Stephan Ewen <[email protected]> wrote:
>
> > You may be able to solve this with careful exclusions.
> >
> > It seems kafka is monolithic, having no separation between connector and
> > engine. If you know for example that zookeeper is not required by the
> > connector (you have to be sure), you can exclude it as a dependency. We
> > have done this for Hadoop1, where we only use the HDFS client
> > functionality.
> >
> > On Mon, Sep 29, 2014 at 6:40 PM, Gyula Fóra <[email protected]> wrote:
> >
> >> Yes, you are right, kafka and flume are the heavy ones.
> >>
> >> We always have the choice to take them out of the package and maybe have
> >> a separate repo for all the different connectors and only keep the 1-2
> >> most important ones. I don't think there's much else to do because we
> >> don't use the packages you mentioned, but they get pulled in by the kafka
> >> and flume dependencies.
> >>
> >>
> >>
> >>
> >> On Mon, Sep 29, 2014 at 6:24 PM, Stephan Ewen <[email protected]> wrote:
> >>
> >>> The streaming connectors currently pull in a massive number of
> >>> dependencies.
> >>>
> >>> For example, we transitively get the scala compiler/reflection/etc and
> >>> ZooKeeper.
> >>>
> >>> A lot of stuff comes with flume and kafka. Are those required to make
> >>> the connectors work? Otherwise, it might be good to exclude them, to
> >>> prevent conflicts for users that actually depend on those components.
> >>>
> >>
>
>
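
For readers unfamiliar with the "careful exclusions" suggested in the thread, a minimal sketch of what such an exclusion might look like in a Maven pom.xml follows. The Kafka coordinates and the kafka.version property are assumptions chosen for illustration and are not taken from the project's actual build. Whether ZooKeeper can really be dropped is exactly the point the thread says has to be verified first.

```xml
<!-- Illustrative pom.xml fragment. The coordinates and the kafka.version
     property are assumptions, not taken from the project's build. -->
<dependency>
  <groupId>org.apache.kafka</groupId>
  <artifactId>kafka_2.10</artifactId>
  <version>${kafka.version}</version>
  <exclusions>
    <!-- Only safe if the connector never needs ZooKeeper at runtime;
         this must be confirmed before excluding it. -->
    <exclusion>
      <groupId>org.apache.zookeeper</groupId>
      <artifactId>zookeeper</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

Running `mvn dependency:tree` before and after such a change shows whether the excluded artifact has actually disappeared from the transitive dependencies.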
