It is Hadoop 2.4.0 with Spark 1.3.0. I found that the problem only happens if there are multiple nodes; if the cluster has only one node, it works fine. For example, if the cluster has a spark-master on machine A and a spark-worker on machine B, the problem happens. If both spark-master and spark-worker are on machine A, there is no problem. I do not use HDFS. I am just saving the RDD to a Windows shared folder:

rdd.saveAsObjectFile("file:///T:/lab4-win02/IndexRoot01/tobacco-07/myrdd.obj")

with the T: drive mapped to \\10.196.119.230\myshare
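For reference, a minimal sketch of the save plus a read-back check (assumes a live SparkContext named sc; the sample data is made up for illustration):

    // Sketch of the save described above, plus a read-back to verify
    // that every partition was written. Assumes a SparkContext `sc`.
    val rdd = sc.parallelize(Seq("doc1", "doc2"))
    rdd.saveAsObjectFile("file:///T:/lab4-win02/IndexRoot01/tobacco-07/myrdd.obj")
    val back = sc.objectFile[String]("file:///T:/lab4-win02/IndexRoot01/tobacco-07/myrdd.obj")
    println(back.count())  // should print 2 if the write succeeded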
Ningjun

From: Ted Yu [mailto:yuzhih...@gmail.com]
Sent: Friday, May 22, 2015 5:02 PM
To: Wang, Ningjun (LNG-NPV)
Cc: user@spark.apache.org
Subject: Re: spark on Windows 2008 failed to save RDD to windows shared folder

The stack trace is related to HDFS. Can you tell us which Hadoop release you are using? Is this a secure cluster?

Thanks

On Fri, May 22, 2015 at 1:55 PM, Wang, Ningjun (LNG-NPV) <ningjun.w...@lexisnexis.com> wrote:

I use a Spark standalone cluster on Windows 2008. I keep getting the following error when trying to save an RDD to a Windows shared folder:

rdd.saveAsObjectFile("file:///T:/lab4-win02/IndexRoot01/tobacco-07/myrdd.obj")

15/05/22 16:49:05 ERROR Executor: Exception in task 0.0 in stage 12.0 (TID 12)
java.io.IOException: Mkdirs failed to create file:/T:/lab4-win02/IndexRoot01/tobacco-07/tmp/docs-150522204904805.op/_temporary/0/_temporary/attempt_201505221649_0012_m_000000_12
        at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:438)
        at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:424)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:906)
        at org.apache.hadoop.io.SequenceFile$Writer.<init>(SequenceFile.java:1071)
        at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:270)
        at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:527)
        at org.apache.hadoop.mapred.SequenceFileOutputFormat.getRecordWriter(SequenceFileOutputFormat.java:63)
        at org.apache.spark.SparkHadoopWriter.open(SparkHadoopWriter.scala:90)
        at org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:1068)
        at org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:1059)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
        at org.apache.spark.scheduler.Task.run(Task.scala:64)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

The T: drive is mapped to a Windows shared folder, e.g. T: -> \\10.196.119.230\myshare. The ID running Spark does have write permission to this folder. It works most of the time but fails sometimes. Can anybody tell me what the problem is here? Please advise. Thanks.
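For anyone hitting the same "Mkdirs failed" error, a small diagnostic sketch (hypothetical, not from this thread; assumes a live SparkContext `sc` and the path used above) that asks each executor to create a directory under T: and report its hostname, to show which machine cannot write there:

    // Hypothetical probe: have each of 4 tasks attempt mkdirs under the
    // shared folder and report (hostname, result). `ok` is false when the
    // directory could not be created (or already exists), which mirrors
    // the "Mkdirs failed" condition in the stack trace above.
    import java.io.File
    import java.net.InetAddress

    val results = sc.parallelize(1 to 4, 4).map { i =>
      val host = InetAddress.getLocalHost.getHostName
      val ok = new File(s"T:/lab4-win02/IndexRoot01/probe-$host-$i").mkdirs()
      (host, ok)
    }.collect()
    results.foreach { case (host, ok) => println(s"$host mkdirs ok = $ok") }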