Re: ...FileNotFoundException: Path is not a file: - error on accessing HDFS with sc.wholeTextFiles

2014-12-15 Thread Karen Murphy
Thanks Akhil. In line with your suggestion, I used the following two commands to flatten the directory structure: find . -type f -iname '*' -exec mv '{}' . \; find . -type d -exec rm -rf '{}' \; Kind regards, Karen On 12/12/14 13:25, Akhil Das wrote: I'm not quite sure whether Spark
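
For readers who would rather do the flattening from a script, here is a minimal Python sketch of the same idea as the two find commands above, applied to a local directory tree. It is not what was run in this thread. The function name flatten and the renaming rule for duplicate file names are assumptions made for illustration. Note that mv alone would overwrite a file in the top-level directory if a file of the same name exists in a subdirectory, which is why the sketch renames on collision.

```python
import os
import shutil
import tempfile


def flatten(root):
    """Move every file found in subdirectories of root into root itself,
    then remove the emptied subdirectories. Returns the new file paths."""
    moved = []
    # Walk bottom-up so that each directory is empty by the time we remove it.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        if dirpath == root:
            continue  # files already at the top level stay where they are
        for name in filenames:
            src = os.path.join(dirpath, name)
            target = os.path.join(root, name)
            if os.path.exists(target):
                # Hypothetical collision rule: derive the new name from the
                # relative path, e.g. css/index.html becomes css_index.html.
                # A second-level collision is not handled in this sketch.
                rel = os.path.relpath(src, root)
                target = os.path.join(root, rel.replace(os.sep, "_"))
            shutil.move(src, target)
            moved.append(target)
        os.rmdir(dirpath)
    return moved


if __name__ == "__main__":
    # Small demonstration on a throwaway directory with made-up file names.
    demo = tempfile.mkdtemp()
    os.makedirs(os.path.join(demo, "css"))
    open(os.path.join(demo, "css", "style.css"), "w").close()
    print(flatten(demo))
```

The sketch works on the local filesystem only, so it would be run before the files are copied into HDFS.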

...FileNotFoundException: Path is not a file: - error on accessing HDFS with sc.wholeTextFiles

2014-12-12 Thread Karen Murphy
When I try to load a text file from an HDFS path using sc.wholeTextFiles(hdfs://localhost:54310/graphx/anywebsite.com/anywebsite.com/) I get the following error: java.io.FileNotFoundException: Path is not a file: /graphx/anywebsite.com/anywebsite.com/css (full stack trace at bottom of

Re: ...FileNotFoundException: Path is not a file: - error on accessing HDFS with sc.wholeTextFiles

2014-12-12 Thread Akhil Das
I'm not quite sure whether Spark will go inside subdirectories and pick up files from them. You could do something like the following to bring all files into one directory. find . -iname '*' -exec mv '{}' . \; Thanks Best Regards On Fri, Dec 12, 2014 at 6:34 PM, Karen Murphy k.l.mur...@qub.ac.uk
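
An alternative to moving files, sketched here under stated assumptions rather than taken from the thread, is to hand Spark only the regular files and leave the directory layout alone. The Python helper below parses text in the format printed by hdfs dfs -ls -R and keeps the entries whose permission string starts with "-" (regular files), skipping directories such as the css entry from the error message. The helper name, the sample listing, and the file names in it are hypothetical. Whether sc.wholeTextFiles accepts the resulting comma-separated list in a given Spark version is an assumption to verify.

```python
def file_paths_from_listing(listing):
    """Return the paths of regular files from `hdfs dfs -ls -R` style text."""
    paths = []
    for line in listing.splitlines():
        # Expected fields: permissions, replication, owner, group, size,
        # date, time, path. The path is kept whole even if it has spaces.
        fields = line.split(None, 7)
        if len(fields) < 8:
            continue  # blank lines or summary lines such as "Found 3 items"
        if fields[0].startswith("-"):  # "-" marks a file, "d" a directory
            paths.append(fields[7])
    return paths


# Hypothetical listing for illustration; owner, sizes and file names are made up.
SAMPLE_LISTING = """\
drwxr-xr-x   - hduser supergroup          0 2014-12-12 13:00 /graphx/anywebsite.com/anywebsite.com/css
-rw-r--r--   1 hduser supergroup        512 2014-12-12 13:00 /graphx/anywebsite.com/anywebsite.com/css/style.css
-rw-r--r--   1 hduser supergroup       2048 2014-12-12 13:00 /graphx/anywebsite.com/anywebsite.com/index.html
"""

files = file_paths_from_listing(SAMPLE_LISTING)
# Assumption: the joined string is then passed to sc.wholeTextFiles(...).
joined = ",".join("hdfs://localhost:54310" + p for p in files)
```

The listing text itself would come from running the HDFS shell command against the directory in question and capturing its output.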