Re: --jars works in "yarn-client" but not "yarn-cluster" mode, why?

2015-05-20 Thread Fengyun RAO
Thank you so much, Marcelo!

It WORKS!

2015-05-21 2:05 GMT+08:00 Marcelo Vanzin :

> Hello,
>
> Sorry for the delay. The issue you're running into is because most HBase
> classes are in the system class path, while jars added with "--jars" are
> only visible to the application class loader created by Spark. So classes
> in the system class path cannot see them.
>
> You can work around this by setting "--driver-class-path
> /opt/.../htrace-core-3.1.0-incubating.jar" and "--conf
> spark.executor.extraClassPath=
> /opt/.../htrace-core-3.1.0-incubating.jar" in your spark-submit command
> line. (You can also add those configs to your spark-defaults.conf to avoid
> having to type them all the time; and don't forget to include any other
> jars that might be needed.)

Re: --jars works in "yarn-client" but not "yarn-cluster" mode, why?

2015-05-20 Thread Marcelo Vanzin
Hello,

Sorry for the delay. The issue you're running into is because most HBase
classes are in the system class path, while jars added with "--jars" are
only visible to the application class loader created by Spark. So classes
in the system class path cannot see them.
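
Here is a tiny standalone Java sketch of that delegation rule (the jar
path below is hypothetical, and this is plain JVM class-loader behavior,
not Spark code): a parent loader never sees classes that only a child
loader can load.

    import java.net.URL;
    import java.net.URLClassLoader;

    public class LoaderDemo {
        public static void main(String[] args) throws Exception {
            ClassLoader system = ClassLoader.getSystemClassLoader();
            // A child loader with one extra jar, roughly analogous to the
            // application class loader Spark builds for "--jars" entries.
            URLClassLoader child = new URLClassLoader(
                new URL[] { new URL("file:/tmp/htrace-core-3.1.0-incubating.jar") },
                system);
            // The child resolves the class (it delegates up first, then
            // checks its own URLs)...
            System.out.println(child.loadClass("org.apache.htrace.Trace"));
            // ...but the system loader throws ClassNotFoundException:
            // delegation goes child -> parent, never parent -> child.
            System.out.println(system.loadClass("org.apache.htrace.Trace"));
        }
    }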

You can work around this by setting "--driver-class-path
/opt/.../htrace-core-3.1.0-incubating.jar" and "--conf
spark.executor.extraClassPath=
/opt/.../htrace-core-3.1.0-incubating.jar" in your spark-submit command
line. (You can also add those configs to your spark-defaults.conf to avoid
having to type them all the time; and don't forget to include any other
jars that might be needed.)
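
For example, putting the pieces together with the class name, jar and
arguments from the original question further down in this thread, the
yarn-cluster submission would look something like:

spark-submit --class xxx.xxx.MyApp --master yarn-cluster \
  --num-executors 10 --executor-memory 10g \
  --driver-class-path /opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.1.0-incubating.jar \
  --conf spark.executor.extraClassPath=/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.1.0-incubating.jar \
  my-app.jar /input /output

Or, equivalently, in spark-defaults.conf ("spark.driver.extraClassPath" is
the config-file counterpart of the "--driver-class-path" flag):

spark.driver.extraClassPath   /opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.1.0-incubating.jar
spark.executor.extraClassPath /opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.1.0-incubating.jar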



Re: --jars works in "yarn-client" but not "yarn-cluster" mode, why?

2015-05-18 Thread Fengyun RAO
Thanks, Marcelo!


Below is the full log,


SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in
[jar:file:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/jars/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in
[jar:file:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/jars/avro-tools-1.7.6-cdh5.4.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
15/05/19 14:08:58 INFO yarn.ApplicationMaster: Registered signal
handlers for [TERM, HUP, INT]
15/05/19 14:08:59 INFO yarn.ApplicationMaster: ApplicationAttemptId:
appattempt_1432015548391_0003_01
15/05/19 14:09:00 INFO spark.SecurityManager: Changing view acls to:
nobody,raofengyun
15/05/19 14:09:00 INFO spark.SecurityManager: Changing modify acls to:
nobody,raofengyun
15/05/19 14:09:00 INFO spark.SecurityManager: SecurityManager:
authentication disabled; ui acls disabled; users with view
permissions: Set(nobody, raofengyun); users with modify permissions:
Set(nobody, raofengyun)
15/05/19 14:09:00 INFO yarn.ApplicationMaster: Starting the user
application in a separate Thread
15/05/19 14:09:00 INFO yarn.ApplicationMaster: Waiting for spark
context initialization
15/05/19 14:09:00 INFO yarn.ApplicationMaster: Waiting for spark
context initialization ...
15/05/19 14:09:00 INFO spark.SparkContext: Running Spark version 1.3.0
15/05/19 14:09:00 INFO spark.SecurityManager: Changing view acls to:
nobody,raofengyun
15/05/19 14:09:00 INFO spark.SecurityManager: Changing modify acls to:
nobody,raofengyun
15/05/19 14:09:00 INFO spark.SecurityManager: SecurityManager:
authentication disabled; ui acls disabled; users with view
permissions: Set(nobody, raofengyun); users with modify permissions:
Set(nobody, raofengyun)
15/05/19 14:09:01 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/05/19 14:09:01 INFO Remoting: Starting remoting
15/05/19 14:09:01 INFO Remoting: Remoting started; listening on
addresses :[akka.tcp://sparkDriver@gs-server-v-127:7191]
15/05/19 14:09:01 INFO Remoting: Remoting now listens on addresses:
[akka.tcp://sparkDriver@gs-server-v-127:7191]
15/05/19 14:09:01 INFO util.Utils: Successfully started service
'sparkDriver' on port 7191.
15/05/19 14:09:01 INFO spark.SparkEnv: Registering MapOutputTracker
15/05/19 14:09:01 INFO spark.SparkEnv: Registering BlockManagerMaster
15/05/19 14:09:01 INFO storage.DiskBlockManager: Created local
directory at 
/data1/cdh/yarn/nm/usercache/raofengyun/appcache/application_1432015548391_0003/blockmgr-3250910b-693e-46ff-b057-26d552fd8abd
15/05/19 14:09:01 INFO storage.MemoryStore: MemoryStore started with
capacity 259.7 MB
15/05/19 14:09:01 INFO spark.HttpFileServer: HTTP File server
directory is 
/data1/cdh/yarn/nm/usercache/raofengyun/appcache/application_1432015548391_0003/httpd-5bc614bc-d8b1-473d-a807-4d9252eb679d
15/05/19 14:09:01 INFO spark.HttpServer: Starting HTTP Server
15/05/19 14:09:01 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/05/19 14:09:01 INFO server.AbstractConnector: Started
SocketConnector@0.0.0.0:9349
15/05/19 14:09:01 INFO util.Utils: Successfully started service 'HTTP
file server' on port 9349.
15/05/19 14:09:01 INFO spark.SparkEnv: Registering OutputCommitCoordinator
15/05/19 14:09:01 INFO ui.JettyUtils: Adding filter:
org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
15/05/19 14:09:01 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/05/19 14:09:01 INFO server.AbstractConnector: Started
SelectChannelConnector@0.0.0.0:63023
15/05/19 14:09:01 INFO util.Utils: Successfully started service
'SparkUI' on port 63023.
15/05/19 14:09:01 INFO ui.SparkUI: Started SparkUI at
http://gs-server-v-127:63023
15/05/19 14:09:02 INFO cluster.YarnClusterScheduler: Created
YarnClusterScheduler
15/05/19 14:09:02 INFO netty.NettyBlockTransferService: Server created on 33526
15/05/19 14:09:02 INFO storage.BlockManagerMaster: Trying to register
BlockManager
15/05/19 14:09:02 INFO storage.BlockManagerMasterActor: Registering
block manager gs-server-v-127:33526 with 259.7 MB RAM,
BlockManagerId(<driver>, gs-server-v-127, 33526)
15/05/19 14:09:02 INFO storage.BlockManagerMaster: Registered BlockManager
15/05/19 14:09:02 INFO scheduler.EventLoggingListener: Logging events
to 
hdfs://gs-server-v-127:8020/user/spark/applicationHistory/application_1432015548391_0003
15/05/19 14:09:02 INFO yarn.ApplicationMaster: Listen to driver:
akka.tcp://sparkDriver@gs-server-v-127:7191/user/YarnScheduler
15/05/19 14:09:02 INFO cluster.YarnClusterSchedulerBackend:
ApplicationMaster registered as
Actor[akka://sparkDriver/user/YarnAM#1902752386]
15/05/19 14:09:02 INFO client.RMProxy: Connecting to ResourceManager
at gs-server-v-127/10.200.200.56:8030
15/05/19 14:09:02 INFO yarn.YarnRMClient: Registering the ApplicationMaster
15/05/19 14:09:03 INFO yarn.YarnAllocator: Will request 2 executor
containers, each with 1 cores and 4480 MB memory including

Re: --jars works in "yarn-client" but not "yarn-cluster" mode, why?

2015-05-17 Thread Fengyun RAO
Thanks, Wilfred.

The problem is, the jar
"/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.1.0-incubating.jar"
is on every node in the cluster, since we installed CDH 5.4.

Thus, whether we run in client or cluster mode, the driver has access to
the jar.

What's more, the driver does not depend on the jar; it is the executor
that throws the "ClassNotFoundException".


2015-05-18 6:53 GMT+08:00 Wilfred Spiegelenburg :

> When you run the driver in the cluster, the application really runs on
> the cluster and the client goes away. If the driver does not have access
> to the jars, i.e. if they are not available somewhere on the cluster, this
> will happen.
> If you run the driver on the client, the driver has access to the jars
> there. Unless you have copied the jars onto the cluster, it will not work.
> That is what SPARK-5377 is all about.
>
> Wilfred

Re: --jars works in "yarn-client" but not "yarn-cluster" mode, why?

2015-05-14 Thread Fengyun RAO
Thanks, Wilfred.

In our program, the "htrace-core-3.1.0-incubating.jar" dependency is only
required in the executor, not in the driver, while in both "yarn-client"
and "yarn-cluster" modes the executor runs in the cluster.

And clearly, in "yarn-cluster" mode the jar IS in
"spark.yarn.secondary.jars", but it still throws a ClassNotFoundException.

2015-05-14 18:52 GMT+08:00 Wilfred Spiegelenburg <
wspiegelenb...@cloudera.com>:

> In yarn-cluster mode the driver runs in the cluster and not locally in the
> spark-submit JVM. This changes what is available on your classpath. It
> looks like you are running into a situation similar to the one described
> in SPARK-5377.
>
> Wilfred
>
> --
> Wilfred Spiegelenburg
> Backline Customer Operations Engineer
> YARN/MapReduce/Spark
>
> http://www.cloudera.com
> --
> http://five.sentenc.es


Re: --jars works in "yarn-client" but not "yarn-cluster" mode, why?

2015-05-13 Thread Fengyun RAO
I looked into the "Environment" page in both modes.

yarn-client:
spark.jars
local:/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.1.0-incubating.jar,file:/home/xxx/my-app.jar
yarn-cluster:
spark.yarn.secondary.jars
local:/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.1.0-incubating.jar
I wonder why htrace exists in "spark.yarn.secondary.jars" but is still not
found by the URLClassLoader.

I tried both the "local:" and "file:" schemes for the jar; still the same error.


2015-05-14 11:37 GMT+08:00 Fengyun RAO :

> Hadoop version: CDH 5.4.
>
> We need to connect to HBase, and thus need the extra
> "/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.1.0-incubating.jar"
> dependency.
>
> It works in yarn-client mode:
> "spark-submit --class xxx.xxx.MyApp --master yarn-client --num-executors
> 10 --executor-memory 10g --jars
> /opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.1.0-incubating.jar
> my-app.jar /input /output"
>
> However, if we change "yarn-client" to "yarn-cluster", it throws a
> ClassNotFoundException (even though the class exists in
> htrace-core-3.1.0-incubating.jar):
>
> Caused by: java.lang.NoClassDefFoundError: org/apache/htrace/Trace
>   at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:218)
>   at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:481)
>   at 
> org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65)
>   at 
> org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:86)
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.retrieveClusterId(ConnectionManager.java:850)
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.<init>(ConnectionManager.java:635)
>   ... 21 more
> Caused by: java.lang.ClassNotFoundException: org.apache.htrace.Trace
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>
>
> Why doesn't --jars work in yarn-cluster mode? How do we add an extra
> dependency in "yarn-cluster" mode?