Re: Kafka stream fails: java.lang.NoClassDefFound com/yammer/metrics/core/Gauge

2015-05-14 Thread Tathagata Das
It would be good if you can tell what I should add to the documentation to make it easier to understand. I can update the docs for 1.4.0 release. On Tue, May 12, 2015 at 9:52 AM, Lee McFadden splee...@gmail.com wrote: Thanks for explaining Sean and Cody, this makes sense now. I'd like to help

Re: Kafka stream fails: java.lang.NoClassDefFound com/yammer/metrics/core/Gauge

2015-05-12 Thread Sean Owen
The question is really whether all the third-party integrations should be built into Spark's main assembly. I think reasonable people could disagree, but I think the current state (not built in) is reasonable. It means you have to bring the integration with you. That is, no, third-party queue

Re: Kafka stream fails: java.lang.NoClassDefFound com/yammer/metrics/core/Gauge

2015-05-12 Thread Lee McFadden
Thanks for explaining Sean and Cody, this makes sense now. I'd like to help improve this documentation so other python users don't run into the same thing, so I'll look into that today. On Tue, May 12, 2015 at 9:44 AM Cody Koeninger c...@koeninger.org wrote: One of the packages just contains

Re: Kafka stream fails: java.lang.NoClassDefFound com/yammer/metrics/core/Gauge

2015-05-12 Thread Sean Owen
I think Java-land users will understand to look for an assembly jar in general, but it's not as obvious outside the Java ecosystem. Assembly = this thing, plus all its transitive dependencies. No, there is nothing wrong with Kafka at all. You need to bring everything it needs for it to work at

Re: Kafka stream fails: java.lang.NoClassDefFound com/yammer/metrics/core/Gauge

2015-05-12 Thread Lee McFadden
Thanks again for all the help folks. I can confirm that simply switching to `--packages org.apache.spark:spark-streaming-kafka-assembly_2.10:1.3.1` makes everything work as intended. I'm not sure what the difference is between the two packages honestly, or why one should be used over the other,

Re: Kafka stream fails: java.lang.NoClassDefFound com/yammer/metrics/core/Gauge

2015-05-12 Thread Sean Owen
Yeah, fair point about Python. spark-streaming-kafka should not contain third-party dependencies. However there's nothing stopping the build from producing an assembly jar from these modules. I think there is an assembly target already though? On Tue, May 12, 2015 at 3:37 PM, Lee McFadden

Re: Kafka stream fails: java.lang.NoClassDefFound com/yammer/metrics/core/Gauge

2015-05-12 Thread Ted Yu
Currently external/kafka/pom.xml doesn't cite yammer metrics as dependency. $ ls -l ~/.m2/repository/com/yammer/metrics/metrics-core/2.2.0/metrics-core-2.2.0.jar -rw-r--r-- 1 tyu staff 82123 Dec 17 2013 /Users/tyu/.m2/repository/com/yammer/metrics/metrics-core/2.2.0/metrics-core-2.2.0.jar

Re: Kafka stream fails: java.lang.NoClassDefFound com/yammer/metrics/core/Gauge

2015-05-12 Thread Ted Yu
bq. it is already in the assembly Yes. Verified: $ jar tvf ~/Downloads/spark-streaming-kafka-assembly_2.10-1.3.1.jar | grep yammer | grep Gauge 1329 Sat Apr 11 04:25:50 PDT 2015 com/yammer/metrics/core/Gauge.class On Tue, May 12, 2015 at 8:05 AM, Sean Owen so...@cloudera.com wrote: It

Re: Kafka stream fails: java.lang.NoClassDefFound com/yammer/metrics/core/Gauge

2015-05-11 Thread Ted Yu
You can use '--jars ' option of spark-submit to ship metrics-core jar. Cheers On Mon, May 11, 2015 at 2:04 PM, Lee McFadden splee...@gmail.com wrote: Thanks Ted, The issue is that I'm using packages (see spark-submit definition) and I do not know how to add com.yammer.metrics:metrics-core

Re: Kafka stream fails: java.lang.NoClassDefFound com/yammer/metrics/core/Gauge

2015-05-11 Thread Ted Yu
com.yammer.metrics.core.Gauge is in metrics-core jar e.g., in master branch: [INFO] | \- org.apache.kafka:kafka_2.10:jar:0.8.1.1:compile [INFO] | +- com.yammer.metrics:metrics-core:jar:2.2.0:compile Please make sure metrics-core jar is on the classpath. On Mon, May 11, 2015 at 1:32 PM, Lee

Re: Kafka stream fails: java.lang.NoClassDefFound com/yammer/metrics/core/Gauge

2015-05-11 Thread Lee McFadden
Thanks Ted, The issue is that I'm using packages (see spark-submit definition) and I do not know how to add com.yammer.metrics:metrics-core to my classpath so Spark can see it. Should metrics-core not be part of the org.apache.spark:spark-streaming-kafka_2.10:1.3.1 package so it can work

Kafka stream fails: java.lang.NoClassDefFound com/yammer/metrics/core/Gauge

2015-05-11 Thread Lee McFadden
Hi, We've been having some issues getting spark streaming running correctly using a Kafka stream, and we've been going around in circles trying to resolve this dependency. Details of our environment and the error below, if anyone can help resolve this it would be much appreciated. Submit

Re: Kafka stream fails: java.lang.NoClassDefFound com/yammer/metrics/core/Gauge

2015-05-11 Thread Sean Owen
Ah yes, the Kafka + streaming code isn't in the assembly, is it? you'd have to provide it and all its dependencies with your app. You could also build this into your own app jar. Tools like Maven will add in the transitive dependencies. On Mon, May 11, 2015 at 10:04 PM, Lee McFadden

Re: Kafka stream fails: java.lang.NoClassDefFound com/yammer/metrics/core/Gauge

2015-05-11 Thread Lee McFadden
Ted, many thanks. I'm not used to Java dependencies so this was a real head-scratcher for me. Downloading the two metrics packages from the maven repository (metrics-core, metrics-annotation) and supplying it on the spark-submit command line worked. My final spark-submit for a python project

Re: Kafka stream fails: java.lang.NoClassDefFound com/yammer/metrics/core/Gauge

2015-05-11 Thread Lee McFadden
I opened a ticket on this (without posting here first - bad etiquette, apologies) which was closed as 'fixed'. https://issues.apache.org/jira/browse/SPARK-7538 I don't believe that because I have my script running means this is fixed, I think it is still an issue. I downloaded the spark source,