Re: Spark error:- Not a valid DFS File name
Check what you have at SimpleMktDataFlow.scala:106

~Pratik

On Fri, Oct 23, 2015 at 11:47 AM kali.tumm...@gmail.com <kali.tumm...@gmail.com> wrote:

> Full Error:-
>
> at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:195)
> at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:104)
> at org.apache.hadoop.hdfs.DistributedFileSystem$16.doCall(DistributedFileSystem.java:831)
> at org.apache.hadoop.hdfs.DistributedFileSystem$16.doCall(DistributedFileSystem.java:827)
> at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:827)
> at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:820)
> at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1817)
> at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.setupJob(FileOutputCommitter.java:305)
> at org.apache.hadoop.mapred.FileOutputCommitter.setupJob(FileOutputCommitter.java:131)
> at org.apache.spark.SparkHadoopWriter.preSetup(SparkHadoopWriter.scala:64)
> at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopDataset(PairRDDFunctions.scala:1046)
> at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:941)
> at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:850)
> at org.apache.spark.rdd.RDD.saveAsTextFile(RDD.scala:1164)
> at com.citi.ocean.spark.SimpleMktDataFlow$.main(SimpleMktDataFlow.scala:106)
> at com.citi.ocean.spark.SimpleMktDataFlow.main(SimpleMktDataFlow.scala)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:427)
>
> --
> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Saprk-error-Not-a-valid-DFS-File-name-tp25186p25188.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
Re: Spark error:- Not a valid DFS File name
Full Error:-

at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:195)
at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:104)
at org.apache.hadoop.hdfs.DistributedFileSystem$16.doCall(DistributedFileSystem.java:831)
at org.apache.hadoop.hdfs.DistributedFileSystem$16.doCall(DistributedFileSystem.java:827)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:827)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:820)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1817)
at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.setupJob(FileOutputCommitter.java:305)
at org.apache.hadoop.mapred.FileOutputCommitter.setupJob(FileOutputCommitter.java:131)
at org.apache.spark.SparkHadoopWriter.preSetup(SparkHadoopWriter.scala:64)
at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopDataset(PairRDDFunctions.scala:1046)
at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:941)
at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:850)
at org.apache.spark.rdd.RDD.saveAsTextFile(RDD.scala:1164)
at com.citi.ocean.spark.SimpleMktDataFlow$.main(SimpleMktDataFlow.scala:106)
at com.citi.ocean.spark.SimpleMktDataFlow.main(SimpleMktDataFlow.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:427)
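A note on reading this trace: the exception is raised in DistributedFileSystem.getPathName while FileOutputCommitter.setupJob is creating a directory on behalf of saveAsTextFile, so the name being rejected appears to come from the output location rather than from the input file. The pathname reported in the original post, /user/myid/-u/12:51/_temporary/0, contains the component 12:51. HDFS is understood here to refuse path components that contain a colon or that are "." or ".."; treat that rule set as an assumption and check it against your Hadoop version. The sketch below is a rough pre-flight check that could be run on the output path in the bash loop before each job is submitted. The function name is_valid_hdfs_path is hypothetical and is not part of Hadoop or Spark.

```shell
# Hypothetical helper (not part of Hadoop or Spark): succeeds if the path
# part of an output location looks acceptable to HDFS, fails otherwise.
# Assumed rules: the path is absolute, no component is "." or "..",
# and no component contains ":".
is_valid_hdfs_path() {
  # The path must be absolute.
  case "$1" in
    /*) ;;
    *) return 1 ;;
  esac

  # Wrap the path in slashes so every component is delimited on both sides,
  # then reject "." and ".." components and any colon.
  case "/$1/" in
    */./*|*/../*) return 1 ;;
    *:*) return 1 ;;
  esac

  return 0
}

# The output directory from the original report is rejected because of "12:51".
if ! is_valid_hdfs_path "/user/myid/-u/12:51"; then
  echo "rejected: /user/myid/-u/12:51"
fi
```

In a loop over input files, a failed check is a cheap signal that the output argument was assembled incorrectly, for example from unquoted command output, before a YARN application is launched for it.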
Re: Spark error:- Not a valid DFS File name
I faced a similar issue. The actual problem was not in the file name. We run Spark on YARN, and the real error showed up in the application logs, which can be fetched with:

$ yarn logs -applicationId <application ID>

Read the output from the beginning to find the actual error.

~Pratik

On Fri, Oct 23, 2015 at 11:40 AM kali.tumm...@gmail.com <kali.tumm...@gmail.com> wrote:

> Hi All,
>
> I got this weird error when I tried to run Spark in YARN-CLUSTER mode. I have
> 33 files and I am looping Spark in bash, one file at a time. Most of them
> worked OK, except for a few files.
>
> Is the error below an HDFS error or a Spark error?
>
> Exception in thread "Driver" java.lang.IllegalArgumentException: Pathname /user/myid/-u/12:51/_temporary/0 from hdfs://dev/user/myid/-u/12:51/_temporary/0 is not a valid DFS filename.
>
> This is the file name I passed to Spark. Does the file name cause the issue?
>
> hdfs://dev/data/20151019/sipmktdata.ColorDataArchive.UTD.P4_M-P.v5.2015-09-18.txt.20150918
>
> Thanks
> Sri
>
> --
> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Saprk-error-Not-a-valid-DFS-File-name-tp25186.html
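The advice to read the YARN logs from the beginning can be partly automated. The following is a minimal sketch, assuming a grep that supports -m (GNU and BSD grep both do) and assuming the real error is logged on a line containing "Exception" or "ERROR", which may not hold for every job. The name first_error and the variable app_id are hypothetical.

```shell
# Hypothetical helper: print the first line on standard input that mentions
# an exception or error, prefixed with its line number in the log.
first_error() {
  grep -n -m 1 -E 'Exception|ERROR'
}

# Intended use, with app_id set to the ID of the failed application:
#   yarn logs -applicationId "$app_id" | first_error
```

The line number makes it easy to open the full log at that point and read the surrounding context, since the first match is only a starting point and not necessarily the root cause.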