Yeah, good point. I think that in the future we'll move to having smaller 
"plugin" projects for various input sources (e.g. spark-kafka, spark-rabbitmq).

Matei

On Oct 15, 2013, at 9:50 AM, Ryan Weald <r...@weald.com> wrote:

> Thanks for the response. It is interesting that you must manually specify the 
> Kafka dependency, given that it is required for even the most basic streaming 
> job to compile, regardless of whether you intend to use the Kafka input 
> source. It might be worth providing a way to exclude the entire Kafka input 
> source from the default Spark Streaming dependency, both to keep size down 
> and to avoid confusing new adopters who don't have Kafka in their tech stack. 
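> 
> For anyone hitting this in the meantime, an exclusion along these lines 
> might work in sbt (an untested sketch; the artifact coordinates are assumed 
> from the 0.8.0-incubating release): 
> 
>   // Hypothetical build.sbt line: skip the Kafka transitive dependency 
>   // when the Kafka input source isn't used. 
>   libraryDependencies += "org.apache.spark" %% "spark-streaming" % 
>     "0.8.0-incubating" exclude("org.apache.kafka", "kafka") 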
> 
> -Ryan
> 
> 
> On Sat, Oct 12, 2013 at 10:52 AM, Matei Zaharia <matei.zaha...@gmail.com> 
> wrote:
> Hi Ryan,
> 
> Spark Streaming ships with a special version of the Kafka 0.7.2 client that 
> we ported to Scala 2.9, and you need to add that JAR explicitly to your 
> project. The JAR is in 
> streaming/lib/org/apache/kafka/kafka/0.7.2-spark/kafka-0.7.2-spark.jar under 
> Spark. The streaming/lib directory is also designed to act as a Maven repo -- 
> see streaming/pom.xml for how to reference it from Maven.
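> 
> In sbt, for example, referencing that repo might look roughly like this (a 
> minimal sketch; the Spark checkout path is a placeholder): 
> 
>   // Hypothetical build.sbt snippet: treat streaming/lib as a local 
>   // Maven repo and pull the ported Kafka client from it. 
>   resolvers += "Spark Streaming Lib" at "file:///path/to/spark/streaming/lib" 
>   libraryDependencies += "org.apache.kafka" % "kafka" % "0.7.2-spark" 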
> 
> Matei
> 
> On Oct 11, 2013, at 3:14 PM, Ryan Weald <r...@weald.com> wrote:
> 
>> When pulling the latest (0.8.0-incubating) dependencies from Maven, I get a 
>> compile error when I try to write a basic streaming job. The error produces 
>> a large stack trace, which you can find here 
>> (https://gist.github.com/rweald/6942840); it seems to point to a missing 
>> Kafka dependency. 
>> 
>> Has anyone else had this problem and is there a documented solution? 
>> 
>> Cheers
>> 
>> -Ryan Weald
> 
> 
