RE: How to support dependency jars and files on HDFS in standalone cluster mode?

2015-06-14 Thread Dong Lei
https://issues.apache.org/jira/browse/SPARK-8369 created, and I'm working on a PR. Thanks, Dong Lei From: Cheng Lian [mailto:lian.cs@gmail.com] Sent: Friday, June 12, 2015 7:03 PM To: Dong Lei Cc: Dianfei (Keith) Han; dev@spark.apache.org Subject: Re: How to support dependency jars and files

Re: How to support dependency jars and files on HDFS in standalone cluster mode?

2015-06-12 Thread Cheng Lian
Would you mind filing a JIRA for this? Thanks! Cheng On 6/11/15 2:40 PM, Dong Lei wrote: I think in standalone cluster mode, Spark is supposed to: 1. Download jars and files to the driver 2. Set the driver's classpath 3. Have the driver set up an HTTP file server to distribute these files 4. Worker

Re: How to support dependency jars and files on HDFS in standalone cluster mode?

2015-06-11 Thread Cheng Lian
Oh sorry, I mistook --jars for --files. Yeah, for jars we need to add them to the classpath, which is different from regular files. Cheng On 6/11/15 2:18 PM, Dong Lei wrote: Thanks Cheng, if I do not use --jars, how can I tell Spark to search for the jars (and files) on HDFS? Do you mean the

RE: How to support dependency jars and files on HDFS in standalone cluster mode?

2015-06-11 Thread Dong Lei
I think in standalone cluster mode, Spark is supposed to: 1. Download jars and files to the driver 2. Set the driver's classpath 3. Have the driver set up an HTTP file server to distribute these files 4. Have the workers download from the driver and set up their classpaths. Right? But somehow, the first
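The four steps above hinge on Spark telling remote dependencies apart from local ones by URI scheme. As an illustrative sketch only (the function name is made up for this example, not Spark's actual code), a scheme check in this spirit would decide whether a --jars entry must first be fetched to the driver (step 1) before the driver's HTTP file server can hand it to the workers (steps 3-4):

```shell
# Sketch: classify a dependency URI the way step 1 would need to.
# Remote schemes must be downloaded to the driver first; bare paths
# and file:// URIs are assumed to already be on the driver machine.
is_remote_uri() {
  case "$1" in
    hdfs://*|http://*|https://*|ftp://*) echo remote ;;
    *) echo local ;;
  esac
}

is_remote_uri "hdfs://namenode/libs/1.jar"   # prints: remote
is_remote_uri "/opt/libs/app.jar"            # prints: local
```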

RE: How to support dependency jars and files on HDFS in standalone cluster mode?

2015-06-11 Thread Dong Lei
Thanks Cheng, if I do not use --jars, how can I tell Spark to search for the jars (and files) on HDFS? Do you mean the driver will not need to set up an HTTP file server for this scenario and the workers will fetch the jars and files from HDFS? Thanks, Dong Lei From: Cheng Lian

Re: How to support dependency jars and files on HDFS in standalone cluster mode?

2015-06-10 Thread Cheng Lian
Since the jars are already on HDFS, you can access them directly in your Spark application without using --jars. Cheng On 6/11/15 11:04 AM, Dong Lei wrote: Hi spark-dev: I cannot use an HDFS location for the "--jars" or "--files" option when doing a spark-submit in a standalone cluster
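Until hdfs:// locations work with --jars in this mode, one workaround consistent with the thread is to copy the dependencies out of HDFS onto the submitting machine and pass local paths instead (--jars takes a comma-separated list). The directory and jar names below are placeholders, and the hdfs dfs / spark-submit invocations are commented out because they need a live cluster:

```shell
# Hypothetical workaround: fetch the jars from HDFS to a local
# directory, then hand spark-submit local paths instead of hdfs:// URIs.
JARS="hdfs://ip/1.jar hdfs://ip/2.jar"     # placeholder paths
LOCAL_DIR=/tmp/spark-deps
mkdir -p "$LOCAL_DIR"
LOCAL_JARS=""
for j in $JARS; do
  name=$(basename "$j")
  # hdfs dfs -get "$j" "$LOCAL_DIR/$name"   # needs an HDFS client
  LOCAL_JARS="$LOCAL_JARS${LOCAL_JARS:+,}$LOCAL_DIR/$name"
done
echo "$LOCAL_JARS"   # prints: /tmp/spark-deps/1.jar,/tmp/spark-deps/2.jar
# spark-submit --jars "$LOCAL_JARS" app.jar  # needs a running master
```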

How to support dependency jars and files on HDFS in standalone cluster mode?

2015-06-10 Thread Dong Lei
Hi spark-dev: I cannot use an HDFS location for the --jars or --files option when doing a spark-submit in standalone cluster mode. For example: spark-submit ... --jars hdfs://ip/1.jar hdfs://ip/app.jar (standalone cluster mode) will not download 1.jar to the driver's
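For reference, the failing invocation can be reconstructed roughly as below; the master URL and --deploy-mode flag are assumptions filled in for illustration (the original mail elides them with "..."), and the command is printed rather than executed since it needs a running standalone master:

```shell
# Reconstruction of the reported failure case. Host names and the
# master URL are placeholders, not taken from the original mail.
CMD='spark-submit --master spark://master:7077 --deploy-mode cluster --jars hdfs://ip/1.jar hdfs://ip/app.jar'
echo "$CMD"
```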