You should try adding your NameNode host and port in the URL.

On Mon, Mar 27, 2017 at 11:03 AM, Saisai Shao <sai.sai.s...@gmail.com> wrote:
> It's quite obvious your HDFS URL is not complete; please look at the
> exception: your HDFS URI doesn't have a host or port. Normally a short
> name would be fine if HDFS were your default FS.
>
> I think the problem is that you're running on HDI, where the default FS
> is wasb. So a short name without host:port leads to an error. This looks
> like an HDI-specific issue; you'd better ask HDI.
>
> Exception in thread "main" java.io.IOException: Incomplete HDFS URI, no
> host: hdfs:///hdp/apps/2.6.0.0-403/spark2/spark2-hdp-yarn-archive.tar.gz
>     at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:154)
>     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2791)
>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:99)
>     at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2825)
>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2807)
>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:386)
>     at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
>
> On Fri, Mar 24, 2017 at 9:18 PM, Yong Zhang <java8...@hotmail.com> wrote:
>
>> Of course it is possible.
>>
>> You can always set any configuration in your application through the
>> API, instead of passing it in through the CLI:
>>
>>     val sparkConf = new SparkConf()
>>       .setAppName(properties.get("appName"))
>>       .setMaster(properties.get("master"))
>>       .set(xxx, properties.get("xxx"))
>>
>> Your error is an environment problem.
>>
>> Yong
>> ------------------------------
>> *From:* Roy <rp...@njit.edu>
>> *Sent:* Friday, March 24, 2017 7:38 AM
>> *To:* user
>> *Subject:* spark-submit config via file
>>
>> Hi,
>>
>> I am trying to deploy a Spark job with spark-submit, which takes a
>> bunch of parameters, like this:
>>
>>     spark-submit --class StreamingEventWriterDriver --master yarn \
>>       --deploy-mode cluster --executor-memory 3072m --executor-cores 4 \
>>       --files streaming.conf spark_streaming_2.11-assembly-1.0-SNAPSHOT.jar \
>>       -conf "streaming.conf"
>>
>> I was looking for a way to put all these flags in a file to pass to
>> spark-submit, to make my spark-submit command simpler, like this:
>>
>>     spark-submit --class StreamingEventWriterDriver --master yarn \
>>       --deploy-mode cluster --properties-file properties.conf \
>>       --files streaming.conf spark_streaming_2.11-assembly-1.0-SNAPSHOT.jar \
>>       -conf "streaming.conf"
>>
>> properties.conf has the following contents:
>>
>>     spark.executor.memory 3072m
>>     spark.executor.cores 4
>>
>> But I am getting the following error:
>>
>> 17/03/24 11:36:26 INFO Client: Use hdfs cache file as spark.yarn.archive for HDP, hdfsCacheFile:hdfs:///hdp/apps/2.6.0.0-403/spark2/spark2-hdp-yarn-archive.tar.gz
>> 17/03/24 11:36:26 WARN AzureFileSystemThreadPoolExecutor: Disabling threads for Delete operation as thread count 0 is <= 1
>> 17/03/24 11:36:26 INFO AzureFileSystemThreadPoolExecutor: Time taken for Delete operation is: 1 ms with threads: 0
>> 17/03/24 11:36:27 INFO Client: Deleted staging directory wasb://a...@abc.blob.core.windows.net/user/sshuser/.sparkStaging/application_1488402758319_0492
>> Exception in thread "main" java.io.IOException: Incomplete HDFS URI, no host: hdfs:///hdp/apps/2.6.0.0-403/spark2/spark2-hdp-yarn-archive.tar.gz
>>     at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:154)
>>     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2791)
>>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:99)
>>     at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2825)
>>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2807)
>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:386)
>>     at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
>>     at org.apache.spark.deploy.yarn.Client.copyFileToRemote(Client.scala:364)
>>     at org.apache.spark.deploy.yarn.Client.org$apache$spark$deploy$yarn$Client$$distribute$1(Client.scala:480)
>>     at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:552)
>>     at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:881)
>>     at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:170)
>>     at org.apache.spark.deploy.yarn.Client.run(Client.scala:1218)
>>     at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1277)
>>     at org.apache.spark.deploy.yarn.Client.main(Client.scala)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>     at java.lang.reflect.Method.invoke(Method.java:498)
>>     at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:745)
>>     at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
>>     at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
>>     at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
>>     at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>> 17/03/24 11:36:27 INFO MetricsSystemImpl: Stopping azure-file-system metrics system...
>>
>> Does anyone know if this is even possible?
>>
>> Thanks...
>>
>> Roy

--
*Regards*
*Sandeep Nemuri*
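
[Editor's note] The suggested fix above — a fully qualified URI for the archive — can be sketched as a properties-file (or spark-defaults.conf) entry. The NameNode host and port below are placeholders, not values from the thread; substitute your cluster's NameNode address, or the HA nameservice name if the cluster uses NameNode HA:

```
spark.yarn.archive hdfs://<namenode-host>:8020/hdp/apps/2.6.0.0-403/spark2/spark2-hdp-yarn-archive.tar.gz
```

With the authority present, FileSystem.get no longer throws "Incomplete HDFS URI, no host" when the default FS is wasb rather than HDFS.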
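
[Editor's note] On the original question — yes, --properties-file is the intended mechanism. spark-submit reads that file with java.util.Properties, which accepts whitespace as the key/value separator, so the `spark.executor.memory 3072m` format Roy used parses fine. A minimal Scala sketch of that parsing behavior (loadSparkProps is an illustrative helper, not a Spark API):

```scala
import java.io.StringReader
import java.util.Properties

// java.util.Properties treats the first unescaped '=', ':', or whitespace
// as the key/value separator, so "key value" lines work as-is.
def loadSparkProps(text: String): Properties = {
  val props = new Properties()
  props.load(new StringReader(text))
  props
}

val props = loadSparkProps(
  """spark.executor.memory 3072m
    |spark.executor.cores 4
    |""".stripMargin)

println(props.getProperty("spark.executor.memory")) // prints 3072m
println(props.getProperty("spark.executor.cores"))  // prints 4
```

Note the values are plain strings; spark-submit applies them exactly as if they had been passed with repeated --conf flags, with explicit --conf flags taking precedence over the file.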