RE: Limit the space used by hadoop on a slave node

2008-01-08 Thread Hairong Kuang
Has anybody tried 15.0? Please check https://issues.apache.org/jira/browse/HADOOP-1463. Hairong

RE: Limit the space used by hadoop on a slave node

2008-01-08 Thread Joydeep Sen Sarma
At least up until 14.4, these options are broken. See https://issues.apache.org/jira/browse/HADOOP-2549 (there's a trivial patch - but I am still testing).

RE: Limit the space used by hadoop on a slave node

2008-01-08 Thread Hairong Kuang
Most of the time dfs and map/reduce share disks. Keep in mind that the du options cannot control how much space map/reduce tasks take. Sometimes we get out-of-disk-space problems because data-intensive map/reduce tasks take a lot of disk space. Hairong

Re: Limit the space used by hadoop on a slave node

2008-01-08 Thread Ted Dunning
My problem was caused purely by copying files to HDFS using [hadoop dfs -put]. No map-reduce activity was going on at the time (and all of the jobs I had around that time were counting jobs that had very powerful reduction in data volumes due to combiner functions).
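For context, the copy Ted describes is an ordinary client-side put; a minimal sketch of the command (the paths are illustrative, not from the original message):

  bin/hadoop dfs -put /local/data/part-00000 /user/hadoop/input/part-00000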

Re: Limit the space used by hadoop on a slave node

2008-01-08 Thread Ted Dunning
And I have both but have had disk full problems. I can't be sure right now whether this occurred under 14.4 or 15.1, but I think it was 15.1. In any case, new file creation from a non-datanode host is definitely not well balanced and will lead to disk full conditions if you have dramatically

RE: Limit the space used by hadoop on a slave node

2008-01-08 Thread Koji Noguchi
We use dfs.datanode.du.pct for 0.14 and dfs.datanode.du.reserved for 0.15. The change was made in the Jira Hairong mentioned: https://issues.apache.org/jira/browse/HADOOP-1463 Koji
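For reference, a minimal hadoop-site.xml sketch of the two settings Koji names (the values are illustrative only, not from the thread; which property applies depends on the release):

  <!-- 0.14.x: DFS may use at most this fraction of each datanode volume -->
  <property>
    <name>dfs.datanode.du.pct</name>
    <value>0.90</value>
  </property>

  <!-- 0.15.x: bytes per volume reserved for non-DFS use (here ~10 GB) -->
  <property>
    <name>dfs.datanode.du.reserved</name>
    <value>10737418240</value>
  </property>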

RE: Limit the space used by hadoop on a slave node

2008-01-08 Thread Hairong Kuang
I agree that block distribution does not deal with heterogeneous clusters well. Basically, block replication does not favor less-utilized datanodes. After 0.16 is released, you may periodically run the balancer to redistribute blocks with the command bin/start-balancer.sh. I checked the datanode
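As a usage sketch of the balancer Hairong describes (assuming the 0.16 scripts; the -threshold argument, if supported in that release, is the allowed per-datanode deviation from average utilization, in percent):

  bin/start-balancer.sh -threshold 10   # stop it later with bin/stop-balancer.sh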

RE: Limit the space used by hadoop on a slave node

2008-01-08 Thread Joydeep Sen Sarma
Can you please check the problem description in https://issues.apache.org/jira/browse/HADOOP-2549? I am not sure whether the bug you referred to fixes the problem. The issue is that the getNextVolume() API in the dfs code is getting called with an argument of 0 (for blocksize). As a result

Re: Limit the space used by hadoop on a slave node

2008-01-08 Thread Ted Dunning
I don't have the specific data you request, but I can give you a general outline of the dev cluster in question. I have 4 nodes that are general-use. These have about 1TB of storage each, but this is largely used by other processes. These nodes usually have 50-500GB free. I have 8 nodes that

RE: Limit the space used by hadoop on a slave node

2008-01-08 Thread Hairong Kuang
Joydeep, thanks for pointing out the problem. The cause of the block size being 0 is that the block size is not passed as a parameter in the block transfer protocol. So when a Block object is initialized, we set its block size to zero, which leads to a parameter of zero when getNextVolume is called. I will put
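To make the failure mode concrete, here is a simplified Java sketch in the spirit of the round-robin volume selection being discussed; it is not the actual Hadoop FSDataset code, just an illustration of why a blockSize argument of 0 defeats the free-space check:

  // Simplified illustration only -- not the actual Hadoop source.
  import java.io.IOException;
  import java.util.List;

  class RoundRobinVolumeChooser {
      interface Volume {
          long getAvailable();   // free bytes remaining on this volume
      }

      private final List<Volume> volumes;
      private int curVolume = 0;

      RoundRobinVolumeChooser(List<Volume> volumes) {
          this.volumes = volumes;
      }

      // Pick the next volume that can hold a block of blockSize bytes.
      // If callers pass blockSize == 0, the check below always succeeds,
      // so blocks keep landing on volumes that are effectively full --
      // the behavior described in HADOOP-2549.
      synchronized Volume getNextVolume(long blockSize) throws IOException {
          int startVolume = curVolume;
          while (true) {
              Volume volume = volumes.get(curVolume);
              curVolume = (curVolume + 1) % volumes.size();
              if (volume.getAvailable() >= blockSize) {
                  return volume;
              }
              if (curVolume == startVolume) {
                  throw new IOException("Insufficient space for an extra block");
              }
          }
      }
  }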