Re: Classpath hell and Elasticsearch 2.3.2...

Nick Pentreath Thu, 02 Jun 2016 16:00:00 -0700

Which Scala version is Spark built against? I'd guess it's 2.10 since
you're using spark-1.6, and you're using the 2.11 jar for es-hadoop.



On Thu, 2 Jun 2016 at 15:50 Kevin Burton <bur...@spinn3r.com> wrote:

> Thanks.
>
> I'm trying to run it in a standalone cluster with an existing / large 100
> node ES install.
>
> I'm using the standard 1.6.1 -2.6 distribution with
> elasticsearch-hadoop-2.3.2...
>
> I *think* I'm only supposed to use the
> elasticsearch-spark_2.11-2.3.2.jar with it...
>
> but now I get the following exception:
>
>
> java.lang.NoSuchMethodError:
> scala.Predef$.ArrowAssoc(Ljava/lang/Object;)Ljava/lang/Object;
> at org.elasticsearch.spark.rdd.EsSpark$.saveToEs(EsSpark.scala:52)
> at
> org.elasticsearch.spark.package$SparkRDDFunctions.saveToEs(package.scala:37)
> at
> $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:40)
> at
> $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:45)
> at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:47)
> at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:49)
> at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:51)
> at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:53)
> at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:55)
> at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:57)
> at $iwC$$iwC$$iwC$$iwC.<init>(<console>:59)
> at $iwC$$iwC$$iwC.<init>(<console>:61)
> at $iwC$$iwC.<init>(<console>:63)
> at $iwC.<init>(<console>:65)
> at <init>(<console>:67)
> at .<init>(<console>:71)
> at .<clinit>(<console>)
> at .<init>(<console>:7)
> at .<clinit>(<console>)
> at $print(<console>)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at
> org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
> at
> org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1346)
> at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
> at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
> at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
> at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
> at
> org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
> at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:875)
> at
> org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
> at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
> at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657)
> at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665)
> at org.apache.spark.repl.SparkILoop.org
> $apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:670)
> at
> org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:997)
> at
> org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
> at
> org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
> at
> scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
> at org.apache.spark.repl.SparkILoop.org
> $apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
> at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
> at org.apache.spark.repl.Main$.main(Main.scala:31)
> at org.apache.spark.repl.Main.main(Main.scala)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at
> org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
> at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>
>
> On Thu, Jun 2, 2016 at 3:45 PM, Nick Pentreath <nick.pentre...@gmail.com>
> wrote:
>
>> Hey there
>>
>> When I used es-hadoop, I just pulled in the dependency into my pom.xml,
>> with spark as a "provided" dependency, and built a fat jar with assembly.
>>
>> Then with spark-submit use the --jars option to include your assembly jar
>> (IIRC I sometimes also needed to use --driver-classpath too, but perhaps
>> not with recent Spark versions).
>>
>>
>>
>> On Thu, 2 Jun 2016 at 15:34 Kevin Burton <bur...@spinn3r.com> wrote:
>>
>>> I'm trying to get spark 1.6.1 to work with 2.3.2... needless to say it's
>>> not super easy.
>>>
>>> I wish there was an easier way to get this stuff to work.. Last time I
>>> tried to use spark more I was having similar problems with classpath setup
>>> and Cassandra.
>>>
>>> Seems a huge opportunity to make this easier for new developers.  This
>>> stuff isn't rocket science but it can (needlessly) waste a ton of time.
>>>
>>> ... anyway... I'm have since figured out I have to specific *specific*
>>> jars from the elasticsearch-hadoop distribution and use those.
>>>
>>> Right now I'm using :
>>>
>>>
>>> SPARK_CLASSPATH=/usr/share/elasticsearch-hadoop/lib/elasticsearch-hadoop-2.3.2.jar:/usr/share/elasticsearch-hadoop/lib/elasticsearch-spark_2.11-2.3.2.jar:/usr/share/elasticsearch-hadoop/lib/elasticsearch-hadoop-mr-2.3.2.jar:/usr/share/apache-spark/lib/*
>>>
>>> ... but I"m getting:
>>>
>>> java.lang.NoClassDefFoundError: Could not initialize class
>>> org.elasticsearch.hadoop.util.Version
>>> at
>>> org.elasticsearch.hadoop.rest.RestService.createWriter(RestService.java:376)
>>> at org.elasticsearch.spark.rdd.EsRDDWriter.write(EsRDDWriter.scala:40)
>>> at
>>> org.elasticsearch.spark.rdd.EsSpark$$anonfun$saveToEs$1.apply(EsSpark.scala:67)
>>> at
>>> org.elasticsearch.spark.rdd.EsSpark$$anonfun$saveToEs$1.apply(EsSpark.scala:67)
>>> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>>> at org.apache.spark.scheduler.Task.run(Task.scala:89)
>>> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
>>> at
>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>> at
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>>
>>> ... but I think its caused by this:
>>>
>>> 16/06/03 00:26:48 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID
>>> 0, localhost): java.lang.Error: Multiple ES-Hadoop versions detected in the
>>> classpath; please use only one
>>>
>>> jar:file:/usr/share/elasticsearch-hadoop/lib/elasticsearch-hadoop-2.3.2.jar
>>>
>>> jar:file:/usr/share/elasticsearch-hadoop/lib/elasticsearch-spark_2.11-2.3.2.jar
>>>
>>> jar:file:/usr/share/elasticsearch-hadoop/lib/elasticsearch-hadoop-mr-2.3.2.jar
>>>
>>> at org.elasticsearch.hadoop.util.Version.<clinit>(Version.java:73)
>>> at
>>> org.elasticsearch.hadoop.rest.RestService.createWriter(RestService.java:376)
>>> at org.elasticsearch.spark.rdd.EsRDDWriter.write(EsRDDWriter.scala:40)
>>> at
>>> org.elasticsearch.spark.rdd.EsSpark$$anonfun$saveToEs$1.apply(EsSpark.scala:67)
>>> at
>>> org.elasticsearch.spark.rdd.EsSpark$$anonfun$saveToEs$1.apply(EsSpark.scala:67)
>>> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>>> at org.apache.spark.scheduler.Task.run(Task.scala:89)
>>> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
>>> at
>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>> at
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>> at java.lang.Thread.run(Thread.java:745)
>>>
>>> .. still tracking this down but was wondering if there is someting
>>> obvious I'm dong wrong.  I'm going to take out
>>> elasticsearch-hadoop-2.3.2.jar and try again.
>>>
>>> Lots of trial and error here :-/
>>>
>>> Kevin
>>>
>>> --
>>>
>>> We’re hiring if you know of any awesome Java Devops or Linux Operations
>>> Engineers!
>>>
>>> Founder/CEO Spinn3r.com
>>> Location: *San Francisco, CA*
>>> blog: http://burtonator.wordpress.com
>>> … or check out my Google+ profile
>>> <https://plus.google.com/102718274791889610666/posts>
>>>
>>>
>
>
> --
>
> We’re hiring if you know of any awesome Java Devops or Linux Operations
> Engineers!
>
> Founder/CEO Spinn3r.com
> Location: *San Francisco, CA*
> blog: http://burtonator.wordpress.com
> … or check out my Google+ profile
> <https://plus.google.com/102718274791889610666/posts>
>
>

Re: Classpath hell and Elasticsearch 2.3.2...

Reply via email to