I don't know why I can't see my emails right after I send them to the group ...
Anyway,
I'm sorting a SequenceFile using its Sorter on my local filesystem. The
input file size is 1937690478 bytes.
But after 14 minutes of sorting, I get:
TEST SORTING ..
java.io.FileNotFoundException: File does not exist: /usr/mark/tmp/mapred/local/SortedOutput.0
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:457)
    at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:676)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1417)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1353)
    at org.apache.hadoop.io.SequenceFile$Sorter.cloneFileAttributes(SequenceFile.java:2663)
    at org.apache.hadoop.io.SequenceFile$Sorter.mergePass(SequenceFile.java:2712)
    at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2285)
    at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:2324)
    at CrossPartitionSimilarity.TestSorter(CrossPartitionSimilarity.java:164)
    at CrossPartitionSimilarity.main(CrossPartitionSimilarity.java:47)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Yet the file is still there:
wc -c SortedOutput.0 ---> 1918661230 ../tmp/mapred/local/SortedOutput.0
In case the cause is lack of space: I checked, and it can hold up to 209 GB.
So my question is: are there restrictions in some JVM configuration that I
should take care of?
Thank you,
Maha