Hi everyone, I'm running Hadoop 0.18.0 with 1 NameNode and 4 DataNodes. When I write a file bigger than the maximal free space of any single DataNode, the job often fails.
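For reference, here is roughly how I write the file (a minimal sketch; the class name, output path, and target size below are placeholders, not my exact code):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch of the write; path and size are placeholders, not my real values.
public class WriteHugeFile {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();     // picks up hadoop-site.xml
    FileSystem fs = FileSystem.get(conf);         // connects to the NameNode

    Path out = new Path("/user/tien/hugefile");   // placeholder path
    long targetBytes = 40L * 1024 * 1024 * 1024;  // e.g. 40 GB, more than any one DataNode has free
    byte[] buf = new byte[64 * 1024];             // dummy data

    FSDataOutputStream stream = fs.create(out);
    try {
      long written = 0;
      while (written < targetBytes) {
        stream.write(buf);
        written += buf.length;
      }
    } finally {
      stream.close();
    }
  }
}

(While this runs, the per-DataNode usage can be watched with "bin/hadoop dfsadmin -report".)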
I've seen that the file is mostly written to only one node (e.g. N1); if that node doesn't have enough space, Hadoop deletes the chunks already written to N1, retries on another node (e.g. N2), and so on. The job fails once the maximum number of retries is reached. (I'm not running "start-balancer.sh" or anything similar to balance the cluster in this test.) Sometimes the job succeeds after Hadoop has actually spread the file across the DataNodes. It seems wasteful that Hadoop writes (and deletes) the whole huge file again and again instead of spreading it from the start.

So my question is: how does the write algorithm work, and where can I find documentation on it?

Any help is appreciated, thanks a lot.

Tien Duc Dinh
