I think you should first confirm your balancer is still running. Did you change the threshold of the HDFS balancer? Maybe it is too large?
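If it has stopped, you can restart it by hand and pass the threshold explicitly. A rough sketch, assuming the stock scripts under $HADOOP_HOME/bin (adjust paths for your install):

    # restart the balancer daemon with a 5% threshold (the default is 10)
    $HADOOP_HOME/bin/start-balancer.sh -threshold 5

    # or run it in the foreground to watch its progress
    $HADOOP_HOME/bin/hadoop balancer -threshold 5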
The balancer will stop working when it meets one of five conditions:

1. The datanodes are balanced (obviously not your case);
2. No more blocks can be moved (all blocks on the unbalanced nodes are busy or were recently used);
3. No block has been moved in 20 minutes, over 5 consecutive attempts;
4. Another balancer is already running;
5. An I/O exception occurs.

The default threshold is 10% of each datanode's capacity: for 1TB that is 100GB, for 3TB it is 300GB, and for 60GB it is 6GB.

Hope this helps.

On Thu, Feb 12, 2009 at 10:06 AM, Brian Bockelman <bbock...@cse.unl.edu> wrote:
>
> On Feb 12, 2009, at 2:54 AM, Deepak wrote:
>
>> Hi,
>>
>> We're running a Hadoop cluster on 4 nodes; our primary purpose is to
>> provide a distributed storage solution for internal applications here
>> at TellyTopia Inc.
>>
>> Our cluster consists of non-identical nodes (one with 1TB, another two
>> with 3TB, and one more with 60GB). While copying data onto HDFS we
>> noticed that the node with 60GB of storage ran out of disk space, and
>> even the balancer couldn't balance because the cluster was stopped.
>> Now my questions are:
>>
>> 1. Is Hadoop suitable for non-identical cluster nodes?
>
> Yes. Our cluster has between 60GB and 40TB on our nodes. The majority
> have around 3TB.
>
>> 2. Is there any way to automatically balance the nodes?
>
> We have a cron script which automatically starts the Balancer. It's
> dirty, but it works.
>
>> 3. Why does the Hadoop cluster stop when one node runs out of disk?
>
> That's not normal. Trust me, if that were always true, we'd be
> perpetually screwed :)
>
> There might be some other underlying error you're missing...
>
> Brian
>
>> Any further inputs are appreciated!
>>
>> Cheers,
>> Deepak
>> TellyTopia Inc.

--
Chen He
RCF CSE Dept.
University of Nebraska-Lincoln
US
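P.S. Regarding Brian's cron suggestion: a single crontab entry is enough. A rough sketch (the install path is an assumption, point it at your own):

    # kick off the balancer nightly at 2am with the default 10% threshold
    0 2 * * * /opt/hadoop/bin/start-balancer.sh -threshold 10

The balancer refuses to start if another instance is already running (condition 4 above), so scheduling it repeatedly is safe.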