Re: java.lang.NoSuchMethodError and yarn-client mode

2015-09-09 Thread Ted Yu
Have you checked the contents of __app__.jar?
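
A rough sketch of one way to do that check from the JVM, assuming you have a local copy of the jar. The class name JarContents and the helper entriesWithPrefix are made up for illustration; the prefix org/apache/avro/ is just the package path taken from the stack trace in this thread. It only uses java.util.jar from the standard library.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper, not part of Spark or Avro.
final class JarContents {

    /** Returns the names of all jar entries whose path starts with the given prefix. */
    static List<String> entriesWithPrefix(String jarPath, String prefix) throws IOException {
        List<String> matches = new ArrayList<>();
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (name.startsWith(prefix)) {
                    matches.add(name);
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to the jar to inspect (for example a copy of __app__.jar)
        // args[1]: entry prefix to look for (for example org/apache/avro/)
        if (args.length != 2) {
            System.err.println("usage: JarContents <jar> <prefix>");
            return;
        }
        for (String name : entriesWithPrefix(args[0], args[1])) {
            System.out.println(name);
        }
    }
}
```

If Avro classes show up inside the jar, the application jar is carrying its own copy of the library; which copy wins at run time still depends on classpath order.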





Re: java.lang.NoSuchMethodError and yarn-client mode

2015-09-09 Thread Tom Seddon
Thanks for your reply Aniket.

OK, I've done this and I'm still confused. Output from running locally
shows:

file:/home/tom/spark-avro/target/scala-2.10/simpleapp.jar
file:/home/tom/spark-1.4.0-bin-hadoop2.4/conf/
file:/home/tom/spark-1.4.0-bin-hadoop2.4/lib/spark-assembly-1.4.0-hadoop2.4.0.jar
file:/home/tom/spark-1.4.0-bin-hadoop2.4/lib/datanucleus-core-3.2.10.jar
file:/home/tom/spark-1.4.0-bin-hadoop2.4/lib/datanucleus-api-jdo-3.2.6.jar
file:/home/tom/spark-1.4.0-bin-hadoop2.4/lib/datanucleus-rdbms-3.2.9.jar
file:/usr/lib/jvm/java-7-oracle/jre/lib/ext/sunjce_provider.jar
file:/usr/lib/jvm/java-7-oracle/jre/lib/ext/zipfs.jar
file:/usr/lib/jvm/java-7-oracle/jre/lib/ext/localedata.jar
file:/usr/lib/jvm/java-7-oracle/jre/lib/ext/dnsns.jar
file:/usr/lib/jvm/java-7-oracle/jre/lib/ext/sunec.jar
file:/usr/lib/jvm/java-7-oracle/jre/lib/ext/sunpkcs11.jar
saving text file...
done!

In yarn-client mode:

file:/home/hadoop/simpleapp.jar
file:/usr/lib/hadoop/hadoop-auth-2.6.0-amzn-0.jar
...
*file:/usr/lib/hadoop-mapreduce/avro-1.7.4.jar*
...

And in yarn-cluster mode:
file:/mnt/yarn/usercache/hadoop/appcache/application_1441787021820_0004/container_1441787021820_0004_01_01/__app__.jar
...
*file:/usr/lib/hadoop/lib/avro-1.7.4.jar*
...
saving text file...
done!

In yarn-cluster mode it doesn't appear to have sight of the fat jar
(simpleapp) and it can see avro-1.7.4, yet it runs fine!
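
Since a classpath listing alone does not say which copy of a class was actually loaded, a further check could look like the sketch below. It asks the JVM where a class came from and whether it has the method named in the error. ClassOrigin, locationOf and hasPublicMethod are hypothetical names; the class name and the parameter types (String, String[]) are read off the descriptor parse(Ljava/lang/String;[Ljava/lang/String;) in the stack trace. The main method assumes Avro is on the classpath and would ideally be called from inside SimpleApp.main so that it uses the same class loader.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper, not part of Spark or Avro.
final class ClassOrigin {

    /** Returns the location a class was loaded from, or a placeholder if the JVM does not report one. */
    static String locationOf(Class<?> cls) {
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(bootstrap or unknown)";
        }
        return source.getLocation().toString();
    }

    /** Returns true if the class exposes a public method with this name and these parameter types. */
    static boolean hasPublicMethod(Class<?> cls, String name, Class<?>... parameterTypes) {
        try {
            cls.getMethod(name, parameterTypes);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Class name taken from the NoSuchMethodError in this thread.
        Class<?> parser = Class.forName("org.apache.avro.Schema$Parser");
        System.out.println("loaded from: " + locationOf(parser));
        System.out.println("has parse(String, String...): "
                + hasPublicMethod(parser, "parse", String.class, String[].class));
    }
}
```

Running this in each mode should show whether the Schema$Parser in use comes from the fat jar or from one of the avro-1.7.4 jars listed above.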

Thanks,

Tom




Re: java.lang.NoSuchMethodError and yarn-client mode

2015-09-09 Thread Aniket Bhatnagar
Hi Tom

There has to be a difference between the classpaths in yarn-client and
yarn-cluster mode. A good starting point would be to print the classpath
as the first thing in SimpleApp.main. It should give clues as to why it
works in yarn-cluster mode.
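
A minimal sketch of what printing the classpath could look like, in plain Java so it can be called from SimpleApp.main. ClasspathDump and classpathEntries are made-up names. It walks the class loader chain starting from the thread context class loader and prints the URLs of every URLClassLoader it finds; on newer JVMs where the application loader is not a URLClassLoader it falls back to the java.class.path system property (that fallback is my assumption, not something from this thread).

```java
import java.io.File;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Spark.
final class ClasspathDump {

    /** Collects classpath entries visible through the current class loader chain. */
    static List<String> classpathEntries() {
        List<String> entries = new ArrayList<>();
        ClassLoader start = Thread.currentThread().getContextClassLoader();
        if (start == null) {
            start = ClassLoader.getSystemClassLoader();
        }
        boolean sawUrlLoader = false;
        // Walk from the most specific loader up through its parents.
        for (ClassLoader cl = start; cl != null; cl = cl.getParent()) {
            if (cl instanceof URLClassLoader) {
                sawUrlLoader = true;
                for (URL url : ((URLClassLoader) cl).getURLs()) {
                    entries.add(url.toString());
                }
            }
        }
        if (!sawUrlLoader) {
            // No URLClassLoader in the chain: fall back to the system property.
            String cp = System.getProperty("java.class.path", "");
            for (String entry : cp.split(File.pathSeparator)) {
                if (!entry.isEmpty()) {
                    entries.add(entry);
                }
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        for (String entry : classpathEntries()) {
            System.out.println(entry);
        }
    }
}
```

The order of the printed entries matters: when two jars contain the same class, the one that appears earlier for a given loader is normally the one that gets used.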

Thanks,
Aniket



java.lang.NoSuchMethodError and yarn-client mode

2015-09-09 Thread Tom Seddon
Hi,

I have a problem getting a fairly simple app working that makes use of
the native Avro libraries. The app runs fine on my local machine and in
yarn-cluster mode, but when I try to run it in yarn-client mode on EMR I
get the error below. I'm aware this is a version problem, as EMR runs an
earlier version of Avro, and I am trying to use avro-1.7.7.

What's confusing me a great deal is the fact that this runs fine in
yarn-cluster mode.

What is it about yarn-cluster mode that means the application has access to
the correct version of the avro library?  I need to run in yarn-client mode
as I will be caching data to the driver machine in between batches.  I
think in yarn-cluster mode the driver can run on any machine in the cluster
so this would not work.

Grateful for any advice as I'm really stuck on this.  AWS support are
trying but they don't seem to know why this is happening either!

Just to note, I'm aware of the Databricks spark-avro project and have used
it. This is an investigation to see if I can use RDDs instead of dataframes.

java.lang.NoSuchMethodError: org.apache.avro.Schema$Parser.parse(Ljava/lang/String;[Ljava/lang/String;)Lorg/apache/avro/Schema;
    at ophan.thrift.event.Event.<clinit>(Event.java:10)
    at SimpleApp$.main(SimpleApp.scala:25)
    at SimpleApp.main(SimpleApp.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:665)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:170)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:193)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:112)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Thanks,

Tom