Hi everyone, I'm running Hadoop 0.18.0 with 1 NameNode and 4 DataNodes. When I write a file bigger than the maximal free space of any single DataNode, the job often fails.
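For reference, here is roughly how I write the file (a minimal sketch; the class name, output path, and target size below are placeholders, not my exact code):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch of the write; path and size are placeholders, not my real values.
public class WriteHugeFile {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();     // picks up hadoop-site.xml
    FileSystem fs = FileSystem.get(conf);         // connects to the NameNode

    Path out = new Path("/user/tien/hugefile");   // placeholder path
    long targetBytes = 40L * 1024 * 1024 * 1024;  // e.g. 40 GB, more than any one DataNode has free
    byte[] buf = new byte[64 * 1024];             // dummy data

    FSDataOutputStream stream = fs.create(out);
    try {
      long written = 0;
      while (written < targetBytes) {
        stream.write(buf);
        written += buf.length;
      }
    } finally {
      stream.close();
    }
  }
}

(While this runs, the per-DataNode usage can be watched with "bin/hadoop dfsadmin -report".)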
I've seen that the file is mostly written to only one node (e.g. N1); if that node doesn't have enough space, Hadoop deletes the chunks already written to N1, retries on another node (e.g. N2), and so on. The job fails once the maximum number of retries is reached. (I'm not running "start-balancer.sh" or anything similar to balance the cluster in this test.) Sometimes the job succeeds after Hadoop has actually spread the file across the DataNodes. It seems wasteful that Hadoop writes (and deletes) the whole huge file again and again instead of spreading it from the start.

So my question is: how does the write algorithm work, and where can I find documentation on it?

Any help is appreciated, thanks a lot.

Tien Duc Dinh
