Hello,

I have 8 datanodes, each with a storage capacity of only 3GB. I am running word count on a 1GB text file.
Initially, df -h shows 2.8GB free after the HDFS write. When shuffling starts, it keeps consuming the disk space of only one node; I think it is the reducer. Finally, df -h shows 2MB. Why can't it just use the disk space of all 4 reducers?

Thanks & Regards,
Abdul Navaz
Research Assistant
University of Houston Main Campus, Houston TX
Ph: 281-685-0388

From: Rohith Sharma K S <rohithsharm...@huawei.com>
Reply-To: <user@hadoop.apache.org>
Date: Monday, October 6, 2014 at 5:52 AM
To: "user@hadoop.apache.org" <user@hadoop.apache.org>
Subject: RE: Reduce fails always

Hi,

How much data is the wordcount job processing? What is the disk space ("df -h") available on the node where it always fails?

> The point I didn't understand is why it uses only one datanode's disk space.
>> Containers for reduce tasks can be allocated on any node. I think one of the machines in your cluster is very low on disk space, so whichever task runs on that particular node fails.

Thanks & Regards,
Rohith Sharma K S

From: Abdul Navaz [mailto:navaz....@gmail.com]
Sent: 06 October 2014 08:21
To: user@hadoop.apache.org
Subject: Reduce fails always

Hi All,

I am running a sample word count job on a 9-node cluster and I am getting the below error message.
hadoop jar chiu-wordcount2.jar WordCount /user/hduser/getty/file1.txt /user/hduser/getty/out10 -D mapred.reduce.tasks=2

14/10/05 18:08:45 INFO mapred.JobClient:  map 99% reduce 26%
14/10/05 18:08:48 INFO mapred.JobClient:  map 99% reduce 28%
14/10/05 18:08:51 INFO mapred.JobClient:  map 100% reduce 28%
14/10/05 18:08:57 INFO mapred.JobClient:  map 98% reduce 0%
14/10/05 18:08:58 INFO mapred.JobClient: Task Id : attempt_201410051754_0003_r_000000_0, Status : FAILED
FSError: java.io.IOException: No space left on device
14/10/05 18:08:59 WARN mapred.JobClient: Error reading task output http://pcvm1-10.utahddc.geniracks.net:50060/tasklog?plaintext=true&attemptid=attempt_201410051754_0003_r_000000_0&filter=stdout
14/10/05 18:08:59 WARN mapred.JobClient: Error reading task output http://pcvm1-10.utahddc.geniracks.net:50060/tasklog?plaintext=true&attemptid=attempt_201410051754_0003_r_000000_0&filter=stderr
14/10/05 18:08:59 INFO mapred.JobClient: Task Id : attempt_201410051754_0003_m_000015_0, Status : FAILED
FSError: java.io.IOException: No space left on device
14/10/05 18:09:02 INFO mapred.JobClient:  map 99% reduce 0%
14/10/05 18:09:07 INFO mapred.JobClient:  map 99% reduce 1%

I can see it uses all the disk space on one of the datanodes when shuffling starts. As soon as the disk space on that node becomes nil, it throws this error and the job aborts. The point I didn't understand is why it uses only one datanode's disk space. I changed the number of reducers to 4, but it still uses only one datanode's disk and throws the above error. How can I fix this issue?

Thanks & Regards,
Navaz
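One detail worth checking in the command above: when a driver class uses ToolRunner/GenericOptionsParser (the stock WordCount example does, though whether chiu-wordcount2.jar's does is an assumption), generic options such as -D are only recognized when they appear *before* the application arguments. Placed after the input/output paths, -D mapred.reduce.tasks=2 may be silently ignored, which would explain why raising the reducer count to 4 changed nothing. A sketch of the reordered invocation:

```shell
# Generic options (-D, -conf, -fs, ...) must precede application arguments
# for GenericOptionsParser to pick them up (assumes the driver uses ToolRunner).
hadoop jar chiu-wordcount2.jar WordCount \
  -D mapred.reduce.tasks=4 \
  /user/hduser/getty/file1.txt /user/hduser/getty/out10
```

Even with more reducers, each reducer still shuffles its whole partition onto the local disk (mapred.local.dir) of the single node it runs on, so with only ~2.8GB free per node a nearly-full node can still fail with "No space left on device".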