Re: Re: Classpath hell and Elasticsearch 2.3.2...

2016-06-03 Thread Costin Leau

Hi,

Sorry to hear about your troubles. Not sure whether you are aware of the
ES-Hadoop docs [1]. I've raised an issue [2] to better clarify the usage of
the elasticsearch-hadoop vs. elasticsearch-spark jars.

Apologies for the delayed response. For ES-Hadoop questions/issues, it's best
to use the dedicated forum, namely
https://discuss.elastic.co/c/elasticsearch-and-hadoop (see [3]).

Hope this helps,

[1] https://www.elastic.co/guide/en/elasticsearch/hadoop/2.3/spark.html
[2] https://github.com/elastic/elasticsearch-hadoop/issues/780
[3] 
https://www.elastic.co/guide/en/elasticsearch/hadoop/master/troubleshooting.html#help
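
The short version of the jar question, echoing the "Multiple ES-Hadoop versions
detected in the classpath; please use only one" error quoted further down this
digest: depend on exactly one ES-Hadoop artifact. A sketch of the two
alternatives in sbt syntax (coordinates are believed right for 2.3.2, but
verify against Maven Central):

// Either the all-in-one jar (which bundles the Spark support)...
libraryDependencies += "org.elasticsearch" % "elasticsearch-hadoop" % "2.3.2"

// ...or only the Spark integration, with the suffix matching the Scala
// version your Spark build uses:
libraryDependencies += "org.elasticsearch" % "elasticsearch-spark_2.10" % "2.3.2"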


On 6/3/16 2:06 AM, Kevin Burton wrote:

[quoted thread trimmed; Kevin's message and the rest of the exchange appear in
full further down this digest]

Re: Classpath hell and Elasticsearch 2.3.2...

2016-06-02 Thread Chris Fregly
I recently powered through this Spark + Elasticsearch integration as well.

You can see this + many other Spark integrations with the PANCAKE STACK
here: https://github.com/fluxcapacitor/pipeline

All configs can be found here:
https://github.com/fluxcapacitor/pipeline/tree/master/config

In particular, the Stanford CoreNLP + Spark ML Pipeline integration was the
most difficult, but we finally got it working with some hard-coding and
finger-crossing!


On Thu, Jun 2, 2016 at 4:09 PM, Nick Pentreath wrote:

> [quoted thread trimmed; Nick's message and the rest of the exchange appear
> in full further down this digest]

Re: Classpath hell and Elasticsearch 2.3.2...

2016-06-02 Thread Nick Pentreath
Fair enough.

However, if you take a look at the deployment guide (
http://spark.apache.org/docs/latest/submitting-applications.html#bundling-your-applications-dependencies)
you will see that the generally advised approach is to package your app
dependencies into a fat JAR and submit (possibly using the --jars option
too). This also means you specify the Scala and other library versions in
your project pom.xml or sbt file, avoiding having to manually decide which
artefact to include on your classpath  :)

On Thu, 2 Jun 2016 at 16:06 Kevin Burton wrote:

> [quoted thread trimmed; the full messages appear further down this digest]

Re: Classpath hell and Elasticsearch 2.3.2...

2016-06-02 Thread Kevin Burton
Yeah.. thanks Nick. Figured that out since your last email... I deleted the
2.10 by accident but then put 2+2 together.

Got it working now.

Still sticking to my story that it's somewhat complicated to set up :)

Kevin

On Thu, Jun 2, 2016 at 3:59 PM, Nick Pentreath wrote:

> [quoted thread trimmed; Nick's message appears in full further down this
> digest]

Re: Classpath hell and Elasticsearch 2.3.2...

2016-06-02 Thread Nick Pentreath
Which Scala version is Spark built against? I'd guess it's 2.10 since
you're using spark-1.6, and you're using the 2.11 jar for es-hadoop.
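
A quick way to check from the REPL (scala.util.Properties is part of the Scala
standard library, so this works in any spark-shell):

scala> scala.util.Properties.versionString
// prints something like "version 2.10.5" on a stock Spark 1.6 build

Then pick the es-hadoop jar whose _2.10/_2.11 suffix matches.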


On Thu, 2 Jun 2016 at 15:50 Kevin Burton wrote:

> [quoted thread trimmed; Kevin's message appears in full further down this
> digest]

Re: Classpath hell and Elasticsearch 2.3.2...

2016-06-02 Thread Kevin Burton
Thanks.

I'm trying to run it in a standalone cluster with an existing, large
100-node ES install.

I'm using the standard 1.6.1-2.6 distribution with
elasticsearch-hadoop-2.3.2...

I *think* I'm only supposed to use the
elasticsearch-spark_2.11-2.3.2.jar with it...

but now I get the following exception:


java.lang.NoSuchMethodError: scala.Predef$.ArrowAssoc(Ljava/lang/Object;)Ljava/lang/Object;
at org.elasticsearch.spark.rdd.EsSpark$.saveToEs(EsSpark.scala:52)
at org.elasticsearch.spark.package$SparkRDDFunctions.saveToEs(package.scala:37)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:40)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:45)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:47)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:49)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:51)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:53)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:55)
at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:57)
at $iwC$$iwC$$iwC$$iwC.<init>(<console>:59)
at $iwC$$iwC$$iwC.<init>(<console>:61)
at $iwC$$iwC.<init>(<console>:63)
at $iwC.<init>(<console>:65)
at <init>(<console>:67)
at .<init>(<console>:71)
at .<clinit>(<console>)
at .<init>(<console>:7)
at .<clinit>(<console>)
at $print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1346)
at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:875)
at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657)
at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665)
at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:670)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:997)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
at org.apache.spark.repl.Main$.main(Main.scala:31)
at org.apache.spark.repl.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
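
For context, a minimal snippet of the kind that reaches EsSpark.saveToEs (the
import and method are the documented es-hadoop Spark API; the index name and
data below are invented for illustration):

import org.elasticsearch.spark._   // adds saveToEs to RDDs (SparkRDDFunctions)

val docs = sc.makeRDD(Seq(Map("title" -> "hello"), Map("title" -> "world")))
docs.saveToEs("myindex/docs")      // throws the NoSuchMethodError above when
                                   // the jar's Scala suffix (_2.11) doesn't
                                   // match the Scala version Spark was built
                                   // with (2.10)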


On Thu, Jun 2, 2016 at 3:45 PM, Nick Pentreath wrote:

> [quoted thread trimmed; Nick's message appears in full further down this
> digest]

Re: Classpath hell and Elasticsearch 2.3.2...

2016-06-02 Thread Nick Pentreath
Hey there

When I used es-hadoop, I just pulled the dependency into my pom.xml,
with Spark as a "provided" dependency, and built a fat jar with assembly.

Then, with spark-submit, use the --jars option to include your assembly jar
(IIRC I sometimes also needed --driver-classpath too, but perhaps
not with recent Spark versions).
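
For illustration, a minimal sbt equivalent of that setup (Nick describes Maven;
the scalaVersion, app name, and assembly jar path below are assumptions):

// build.sbt: Spark is already on the cluster, so mark it "provided" to keep it
// out of the fat jar; %% appends the _2.10/_2.11 suffix matching scalaVersion.
scalaVersion := "2.10.6"  // must match the Scala version Spark was built with

libraryDependencies ++= Seq(
  "org.apache.spark"  %% "spark-core"          % "1.6.1" % "provided",
  "org.elasticsearch" %% "elasticsearch-spark" % "2.3.2"
)

// Then, with the sbt-assembly plugin: run `sbt assembly` and, per Nick's note,
//   spark-submit --jars target/scala-2.10/myapp-assembly-0.1.jar ...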



On Thu, 2 Jun 2016 at 15:34 Kevin Burton wrote:

> [quoted message trimmed; Kevin's original message appears in full below]


Classpath hell and Elasticsearch 2.3.2...

2016-06-02 Thread Kevin Burton
I'm trying to get Spark 1.6.1 to work with Elasticsearch 2.3.2... needless to
say it's not super easy.

I wish there was an easier way to get this stuff to work.. Last time I
tried to use Spark more, I was having similar problems with classpath setup
and Cassandra.

Seems like a huge opportunity to make this easier for new developers. This
stuff isn't rocket science, but it can (needlessly) waste a ton of time.

... anyway... I have since figured out that I have to pick *specific* jars
from the elasticsearch-hadoop distribution and use those.

Right now I'm using:

SPARK_CLASSPATH=/usr/share/elasticsearch-hadoop/lib/elasticsearch-hadoop-2.3.2.jar:/usr/share/elasticsearch-hadoop/lib/elasticsearch-spark_2.11-2.3.2.jar:/usr/share/elasticsearch-hadoop/lib/elasticsearch-hadoop-mr-2.3.2.jar:/usr/share/apache-spark/lib/*

... but I'm getting:

java.lang.NoClassDefFoundError: Could not initialize class org.elasticsearch.hadoop.util.Version
at org.elasticsearch.hadoop.rest.RestService.createWriter(RestService.java:376)
at org.elasticsearch.spark.rdd.EsRDDWriter.write(EsRDDWriter.scala:40)
at org.elasticsearch.spark.rdd.EsSpark$$anonfun$saveToEs$1.apply(EsSpark.scala:67)
at org.elasticsearch.spark.rdd.EsSpark$$anonfun$saveToEs$1.apply(EsSpark.scala:67)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

... but I think it's caused by this:

16/06/03 00:26:48 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0,
localhost): java.lang.Error: Multiple ES-Hadoop versions detected in the
classpath; please use only one
jar:file:/usr/share/elasticsearch-hadoop/lib/elasticsearch-hadoop-2.3.2.jar
jar:file:/usr/share/elasticsearch-hadoop/lib/elasticsearch-spark_2.11-2.3.2.jar
jar:file:/usr/share/elasticsearch-hadoop/lib/elasticsearch-hadoop-mr-2.3.2.jar

at org.elasticsearch.hadoop.util.Version.<clinit>(Version.java:73)
at org.elasticsearch.hadoop.rest.RestService.createWriter(RestService.java:376)
at org.elasticsearch.spark.rdd.EsRDDWriter.write(EsRDDWriter.scala:40)
at org.elasticsearch.spark.rdd.EsSpark$$anonfun$saveToEs$1.apply(EsSpark.scala:67)
at org.elasticsearch.spark.rdd.EsSpark$$anonfun$saveToEs$1.apply(EsSpark.scala:67)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

.. still tracking this down, but was wondering if there is something obvious
I'm doing wrong. I'm going to take out elasticsearch-hadoop-2.3.2.jar and
try again.
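
The error text spells out the fix, for what it's worth: keep a single ES-Hadoop
jar on the classpath, and (per the replies above) the one whose Scala suffix
matches the Scala version Spark was built with. Something like:

SPARK_CLASSPATH=/usr/share/elasticsearch-hadoop/lib/elasticsearch-spark_2.10-2.3.2.jar:/usr/share/apache-spark/lib/*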

Lots of trial and error here :-/

Kevin

-- 

We’re hiring if you know of any awesome Java Devops or Linux Operations
Engineers!

Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile