Re: Protobuf version in mvn vs sbt

2014-12-06 Thread spark.dubovsky.jakub
Hi,

  I have created the assembly with the additional hadoop-2.3 profile, and the
submit is smooth now.

  Thank you for the quick reply!

  Now I can move on to another, possibly dev-related problem. I have posted it
to the user mailing list under "Including data nucleus tools".

  Jakub


-- Original message --
From: DB Tsai dbt...@dbtsai.com
To: Marcelo Vanzin van...@cloudera.com
Date: 5 Dec 2014 22:31:13
Subject: Re: Protobuf version in mvn vs sbt

As Marcelo said, CDH5.3 is based on hadoop 2.3, so please try

./make-distribution.sh -Pyarn -Phive -Phadoop-2.3 -Dhadoop.version=2.3.0-cdh5.1.3 -DskipTests

See the details of how to set the profiles at
https://spark.apache.org/docs/latest/building-with-maven.html

Sincerely,

DB Tsai
---
My Blog: https://www.dbtsai.com
LinkedIn: https://www.linkedin.com/in/dbtsai


On Fri, Dec 5, 2014 at 12:54 PM, Marcelo Vanzin van...@cloudera.com wrote:
 [...]

Protobuf version in mvn vs sbt

2014-12-05 Thread spark.dubovsky.jakub
Hi devs,

  I have been playing with your amazing Spark here in Prague for some time,
and I have stumbled on something I would like to ask about. I create assembly
jars from source and then use them to run simple jobs on our 2.3.0-cdh5.1.3
cluster using YARN; see [1] for an example of my usage. I originally used sbt
to create assemblies, as in [2], which runs just fine. Then, after reading the
Maven-preferred discussions here on the dev list, I found the
make-distribution.sh script in the root of the codebase and wanted to give it
a try. I used it to create an assembly with both [3] and [4].

  But I am not able to use the assemblies created by make-distribution,
because they refuse to be submitted to the cluster. Here is what happens:
- run [3] or [4]
- recompile the app against the new assembly
- submit the job using the new assembly with a command like [1]
- the submit fails; the important parts of the stack trace are in [5]

  My guess is that it is due to the wrong version of protobuf being included
in the assembly jar. My questions are:
- Can you confirm this hypothesis?
- What is the difference between the sbt and mvn ways of creating the
assembly? I mean, sbt works and mvn does not...
- What additional option do I need to pass to make-distribution to make it
work?

  Any help or explanation here would be appreciated.

  Jakub
--
[1] ./bin/spark-submit --num-executors 200 --master yarn-cluster --conf spark.yarn.jar=assembly/target/scala-2.10/spark-assembly-1.2.1-SNAPSHOT-hadoop2.3.0-cdh5.1.3.jar --class org.apache.spark.mllib.CreateGuidDomainDictionary root-0.1.jar ${args}

[2] ./sbt/sbt -Dhadoop.version=2.3.0-cdh5.1.3 -Pyarn -Phive assembly/assembly

[3] ./make-distribution.sh -Dhadoop.version=2.3.0-cdh5.1.3 -Pyarn -Phive -DskipTests

[4] ./make-distribution.sh -Dyarn.version=2.3.0 -Dhadoop.version=2.3.0-cdh5.1.3 -Pyarn -Phive -DskipTests

[5] Exception in thread "main" org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.reflect.InvocationTargetException
    at org.apache.hadoop.yarn.factories.impl.pb.RpcClientFactoryPBImpl.getClient(RpcClientFactoryPBImpl.java:79)
    at org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC.getProxy(HadoopYarnProtoRPC.java:48)
    at org.apache.hadoop.yarn.client.RMProxy$1.run(RMProxy.java:134)
...
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at org.apache.hadoop.yarn.factories.impl.pb.RpcClientFactoryPBImpl.getClient(RpcClientFactoryPBImpl.java:76)
... 27 more
Caused by: java.lang.VerifyError: class org.apache.hadoop.yarn.proto.YarnServiceProtos$SubmitApplicationRequestProto overrides final method getUnknownFields.()Lcom/google/protobuf/UnknownFieldSet;
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
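
One way to test the protobuf hypothesis before rebuilding (a sketch, assuming
unzip is available and that the assembly keeps the usual Maven pom.properties
entries for its shaded dependencies):

# Print the protobuf version bundled in the assembly. Hadoop 2.2+ needs
# protobuf 2.5.x; a 2.4.x runtime here would explain the VerifyError in [5],
# since YARN's generated classes override a method that is still final in
# protobuf 2.4.
unzip -p assembly/target/scala-2.10/spark-assembly-1.2.1-SNAPSHOT-hadoop2.3.0-cdh5.1.3.jar \
    META-INF/maven/com.google.protobuf/protobuf-java/pom.properties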



Re: Protobuf version in mvn vs sbt

2014-12-05 Thread Marcelo Vanzin
When building against Hadoop 2.x, you need to enable the appropriate
profile, aside from just specifying the version. e.g. -Phadoop-2.3
for Hadoop 2.3.

On Fri, Dec 5, 2014 at 12:51 PM, spark.dubovsky.ja...@seznam.cz wrote:
 [...]




-- 
Marcelo




Re: Protobuf version in mvn vs sbt

2014-12-05 Thread DB Tsai
As Marcelo said, CDH5.3 is based on hadoop 2.3, so please try

./make-distribution.sh -Pyarn -Phive -Phadoop-2.3 -Dhadoop.version=2.3.0-cdh5.1.3 -DskipTests

See the details of how to set the profiles at
https://spark.apache.org/docs/latest/building-with-maven.html
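
For reference, make-distribution.sh essentially forwards these flags to the
Maven build, so the command above should be roughly equivalent to the
following (a sketch; the exact goals it runs may differ between Spark
versions):

mvn clean package -Pyarn -Phive -Phadoop-2.3 -Dhadoop.version=2.3.0-cdh5.1.3 -DskipTests

The reason -Phadoop-2.3 matters is that the hadoop-2.x profiles also switch
profile-driven defaults such as the bundled protobuf version, while
-Dhadoop.version alone only overrides the Hadoop dependency version.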

Sincerely,

DB Tsai
---
My Blog: https://www.dbtsai.com
LinkedIn: https://www.linkedin.com/in/dbtsai


On Fri, Dec 5, 2014 at 12:54 PM, Marcelo Vanzin van...@cloudera.com wrote:
 When building against Hadoop 2.x, you need to enable the appropriate
 profile, aside from just specifying the version. e.g. -Phadoop-2.3
 for Hadoop 2.3.

On Fri, Dec 5, 2014 at 12:51 PM, spark.dubovsky.ja...@seznam.cz wrote:
 [...]



Re: Protobuf version in mvn vs sbt

2014-12-05 Thread Sean Owen
(Nit: CDH *5.1.x*, including 5.1.3, is derived from Hadoop 2.3.x. 5.3
is based on 2.5.x)

On Fri, Dec 5, 2014 at 3:29 PM, DB Tsai dbt...@dbtsai.com wrote:
 As Marcelo said, CDH5.3 is based on hadoop 2.3, so please try




Re: Protobuf version in mvn vs sbt

2014-12-05 Thread DB Tsai
Oh, I meant to say that the cdh5.1.3 used by Jakub's company is based on 2.3.
You can see it from the first part of Cloudera's version number:
2.3.0-cdh5.1.3.
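
If in doubt, the base version can also be read directly off a cluster node; a
minimal check, assuming the hadoop CLI is on the PATH:

hadoop version
# the first line prints e.g. "Hadoop 2.3.0-cdh5.1.3"; the part before
# "-cdh" is the upstream Hadoop base version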


Sincerely,

DB Tsai
---
My Blog: https://www.dbtsai.com
LinkedIn: https://www.linkedin.com/in/dbtsai

On Fri, Dec 5, 2014 at 1:38 PM, Sean Owen so...@cloudera.com wrote:

 (Nit: CDH *5.1.x*, including 5.1.3, is derived from Hadoop 2.3.x. 5.3
 is based on 2.5.x)

 On Fri, Dec 5, 2014 at 3:29 PM, DB Tsai dbt...@dbtsai.com wrote:
  As Marcelo said, CDH5.3 is based on hadoop 2.3, so please try