On Feb 15, 2009, at 3:21 AM, Deepak wrote:

Thanks Brian and Chen!

I finally sorted out why the cluster was being stopped after running out
of space. It's because of a master failure due to disk space.

Regarding the automatic balancer, I guess in our case the rate of copying is
faster than the balancer's rate; we found the balancer does start but couldn't
perform its job.

There are parameters you can set which control how quickly the balancer is allowed to move blocks about.

Nevertheless, you shouldn't rely on it to work for anything performance critical -- you'll probably want to ensure there's enough space around to do your work in the short-term.
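(For reference: in Hadoop releases of this vintage, the main knob is dfs.balance.bandwidthPerSec in hdfs-site.xml, which caps how much bandwidth each datanode may spend on rebalancing. The value below is only an illustration, not a recommendation.)

```xml
<!-- hdfs-site.xml: per-datanode cap on balancer bandwidth, in bytes per second.
     The default is 1048576 (1 MB/s); 10485760 (10 MB/s) here is just an example. -->
<property>
  <name>dfs.balance.bandwidthPerSec</name>
  <value>10485760</value>
</property>
```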

Brian



Anyway, thanks for your help! It helped me sort out some things.

Cheers,
Deepak

On Thu, Feb 12, 2009 at 5:32 PM, He Chen <air...@gmail.com> wrote:
I think you should confirm your balancer is still running. Did you change
the threshold of the HDFS balancer? Maybe it is too large?

The balancer will stop working when it meets one of these 5 conditions:

1. The datanodes are balanced (obviously you are not this kind);
2. No more blocks can be moved (all blocks on the unbalanced nodes are busy or
recently used);
3. No more blocks have been moved in 20 minutes, over 5 consecutive attempts;
4. Another balancer is working;
5. An I/O exception occurs.


The default threshold is 10% for each datanode: for 1TB that is 100GB, for 3TB
it is 300GB, and for 60GB it is 6GB.
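To make the threshold arithmetic concrete, here is a small sketch (the node sizes are the ones from this thread; the start-balancer.sh invocation at the end is the usual way to pick a tighter threshold, shown here with a hypothetical 5%):

```shell
# 10% default threshold expressed in GB for the node sizes in this thread
for size_gb in 1000 3000 60; do
  echo "${size_gb}GB node -> $((size_gb / 10))GB threshold"
done

# To rerun the balancer with a tighter threshold (e.g. 5%), something like:
#   $HADOOP_HOME/bin/start-balancer.sh -threshold 5
```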

Hope this helps.


On Thu, Feb 12, 2009 at 10:06 AM, Brian Bockelman <bbock...@cse.unl.edu> wrote:


On Feb 12, 2009, at 2:54 AM, Deepak wrote:

Hi,

We're running a Hadoop cluster on 4 nodes; our primary purpose is to
provide a distributed storage solution for internal applications here
at TellyTopia Inc.

Our cluster consists of non-identical nodes (one with 1TB, another two
with 3TB, and one more with 60GB). While copying data onto HDFS, we
noticed that the node with 60GB of storage ran out of disk space, and
even the balancer couldn't balance because the cluster was stopped. Now
my questions are:

1. Is Hadoop suitable for non-identical cluster nodes?


Yes. Our cluster has between 60GB and 40TB on our nodes. The majority
have around 3TB.


2. Is there any way to automatically balance the nodes?


We have a cron script which automatically starts the Balancer. It's dirty,
but it works.
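A minimal version of such a cron job might look like this (the schedule, paths, and log file are assumptions, not Brian's actual script; if a balancer is already running, the new instance simply exits — the "another balancer is working" case):

```
# crontab entry (hypothetical): start the HDFS balancer nightly at 02:00
# with the default 10% threshold, logging its output.
0 2 * * * /opt/hadoop/bin/start-balancer.sh -threshold 10 >> /var/log/hadoop-balancer.log 2>&1
```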


3. Why does the Hadoop cluster stop when one node runs out of disk?


That's not normal. Trust me, if that was always true, we'd be perpetually
screwed :)

There might be some other underlying error you're missing...

Brian


Any further input is appreciated!

Cheers,
Deepak
TellyTopia Inc.





--
Chen He
RCF CSE Dept.
University of Nebraska-Lincoln
US




--
Deepak
TellyTopia Inc.
