Yeah, good point. I think that in the future we'll move to having smaller "plugin" projects for various input sources (e.g. spark-kafka, spark-rabbitmq).
Matei

On Oct 15, 2013, at 9:50 AM, Ryan Weald <r...@weald.com> wrote:

> Thanks for the response. It is interesting that you must manually specify the Kafka dependency, given that it is required for even the most basic streaming job to compile, regardless of whether you have any intention of using the Kafka input source. It might be worth considering a way to exclude the entire Kafka input source from the default Spark Streaming dependency if the goal is to keep size down and you don't want to confuse new adopters who aren't using Kafka as part of their tech stack.
>
> -Ryan
>
>
> On Sat, Oct 12, 2013 at 10:52 AM, Matei Zaharia <matei.zaha...@gmail.com> wrote:
> Hi Ryan,
>
> Spark Streaming ships with a special version of the Kafka 0.7.2 client that we ported to Scala 2.9, and you need to add that as a JAR explicitly in your project. The JAR is in streaming/lib/org/apache/kafka/kafka/0.7.2-spark/kafka-0.7.2-spark.jar under Spark. The streaming/lib directory is also designed to act as a Maven repo -- see streaming/pom.xml for how to reference it from Maven.
>
> Matei
>
> On Oct 11, 2013, at 3:14 PM, Ryan Weald <r...@weald.com> wrote:
>
>> When pulling the latest (0.8.0-incubating) dependencies from Maven, I get a compile error when I try to write a basic streaming job. The error produces a large stack trace, which you can find here (https://gist.github.com/rweald/6942840) and which seems to point to a missing Kafka dependency.
>>
>> Has anyone else had this problem, and is there a documented solution?
>>
>> Cheers
>>
>> -Ryan Weald
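For readers hitting the same error: a minimal sketch of what the pom.xml fragment Matei describes might look like, assuming the dependency coordinates implied by the JAR path streaming/lib/org/apache/kafka/kafka/0.7.2-spark/kafka-0.7.2-spark.jar (groupId org.apache.kafka, artifactId kafka, version 0.7.2-spark) and a hypothetical SPARK_HOME environment variable pointing at your Spark checkout. Check streaming/pom.xml in your Spark distribution for the authoritative version.

```xml
<!-- Sketch only: coordinates inferred from the JAR path above;
     SPARK_HOME is an assumed environment variable, not a Spark convention. -->
<repositories>
  <repository>
    <!-- Treat Spark's bundled streaming/lib directory as a local Maven repo -->
    <id>spark-streaming-lib</id>
    <url>file://${env.SPARK_HOME}/streaming/lib</url>
  </repository>
</repositories>

<dependencies>
  <dependency>
    <!-- The Scala 2.9 port of the Kafka 0.7.2 client shipped with Spark Streaming -->
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka</artifactId>
    <version>0.7.2-spark</version>
  </dependency>
</dependencies>
```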