Thanks Brian and Chen!

I finally sorted out why the cluster was being stopped after running out
of space. It's because of a master failure due to disk space.

Regarding the automatic balancer, I guess in our case the rate of copying
is faster than the balancer's rate. We found that the balancer does start
but can't do its job.

Anyway, thanks for your help! It helped me sort out some things.

Cheers,
Deepak

On Thu, Feb 12, 2009 at 5:32 PM, He Chen <air...@gmail.com> wrote:
> I think you should confirm your balancer is still running. Did you change
> the threshold of the HDFS balancer? Maybe it is too large?
>
> The balancer will stop working when it meets any of these 5 conditions:
>
> 1. Datanodes are balanced (obviously this is not your case);
> 2. No more blocks can be moved (all blocks on unbalanced nodes are busy or
> recently used);
> 3. No block has been moved in 20 minutes and 5 consecutive attempts;
> 4. Another balancer is working;
> 5. An I/O exception occurs.
>
>
> The default threshold is 10% for each datanode: for 1TB it is 100GB, for
> 3TB it is 300GB, and for 60GB it is 6GB.
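
To make the numbers above concrete, here is a rough sketch of the threshold arithmetic. It assumes the threshold is applied as a percentage of each node's own capacity and that 1TB = 1000GB, as in the figures above. The function name allowed_deviation_gb is made up for illustration and is not part of Hadoop.

```python
def allowed_deviation_gb(capacity_gb, threshold_pct=10):
    """Space (in GB) that threshold_pct percent of a node's capacity represents."""
    return capacity_gb * threshold_pct / 100


# Node sizes from our cluster, in GB (assuming 1TB = 1000GB).
nodes_gb = {"1TB node": 1000, "3TB node": 3000, "60GB node": 60}

for name, capacity in nodes_gb.items():
    print(name, allowed_deviation_gb(capacity), "GB")
```

On the 60GB node the margin is only 6GB, so a fast copy can fill it long before the larger nodes get anywhere near their own limits.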
>
> Hope this helps.
>
>
> On Thu, Feb 12, 2009 at 10:06 AM, Brian Bockelman <bbock...@cse.unl.edu> wrote:
>
>>
>> On Feb 12, 2009, at 2:54 AM, Deepak wrote:
>>
>>> Hi,
>>>
>>> We're running a Hadoop cluster on 4 nodes. Our primary purpose is
>>> to provide a distributed storage solution for internal applications
>>> here at TellyTopia Inc.
>>>
>>> Our cluster consists of non-identical nodes (one with 1TB, two with
>>> 3TB, and one with 60GB). While copying data to HDFS, we noticed that
>>> the node with 60GB of storage ran out of disk space, and even the
>>> balancer couldn't rebalance because the cluster had stopped. Now my
>>> questions are:
>>>
>>> 1. Is Hadoop suitable for clusters with non-identical nodes?
>>>
>>
>> Yes.  Our cluster has between 60GB and 40TB on our nodes.  The majority
>> have around 3TB.
>>
>>
>>> 2. Is there any way to balance the nodes automatically?
>>>
>>
>> We have a cron script which automatically starts the Balancer.  It's dirty,
>> but it works.
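
For anyone else trying this, below is a minimal sketch of a wrapper that cron could call. It is not Brian's actual script. It assumes the `hadoop` launcher is on the PATH and that the `balancer` subcommand accepts `-threshold`; the function names and the hadoop_bin parameter are placeholders of mine.

```python
import subprocess


def balancer_command(threshold_pct=10, hadoop_bin="hadoop"):
    """Build the argument list for a single balancer run.

    hadoop_bin is an assumed path to the Hadoop launcher script.
    """
    return [hadoop_bin, "balancer", "-threshold", str(threshold_pct)]


def run_balancer(threshold_pct=10, hadoop_bin="hadoop"):
    """Run the balancer once and return its exit code."""
    return subprocess.call(balancer_command(threshold_pct, hadoop_bin))
```

A cron entry would call run_balancer() on whatever schedule suits the cluster. Since the balancer stops when another balancer is already working (condition 4 in Chen's list), overlapping runs should be harmless.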
>>
>>
>>> 3. Why does the Hadoop cluster stop when one node runs out of disk?
>>>
>>>
>> That's not normal.  Trust me, if that was always true, we'd be perpetually
>> screwed :)
>>
>> There might be some other underlying error you're missing...
>>
>> Brian
>>
>>
>>> Any further input is appreciated!
>>>
>>> Cheers,
>>> Deepak
>>> TellyTopia Inc.
>>>
>>
>>
>
>
> --
> Chen He
> RCF CSE Dept.
> University of Nebraska-Lincoln
> US
>



-- 
Deepak
TellyTopia Inc.
