On Feb 12, 2009, at 2:54 AM, Deepak wrote:
Hi, we're running a Hadoop cluster on 4 nodes; our primary purpose is to provide a distributed storage solution for internal applications here at TellyTopia Inc. Our cluster consists of non-identical nodes (one with 1TB, two with 3TB, and one with 60GB). While copying data onto HDFS, we noticed that the node with 60GB of storage ran out of disk space, and even the balancer couldn't balance because the cluster had stopped. Now my questions are: 1. Is Hadoop suitable for non-identical cluster nodes?
Yes. Our cluster has between 60GB and 40TB on our nodes. The majority have around 3TB.
2. Is there any way to automatically balance the nodes?
We have a cron script which automatically starts the Balancer. It's dirty, but it works.
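A setup like the one Brian describes might look roughly like this. This is only a sketch, not Brian's actual script: the install path, schedule, threshold, and lock-file name are all assumptions.

```shell
#!/bin/sh
# run-balancer.sh -- hypothetical cron-driven HDFS balancer wrapper.
# Assumptions (not from the original post): Hadoop lives under
# /usr/local/hadoop, and a 10% utilization threshold is acceptable.

HADOOP_HOME=/usr/local/hadoop
LOCKFILE=/tmp/hdfs-balancer.lock

# Skip this run if a previous balancer invocation is still going.
if [ -e "$LOCKFILE" ]; then
    echo "Balancer already running; exiting." >&2
    exit 0
fi
touch "$LOCKFILE"

# Move blocks until every DataNode is within 10 percentage points
# of the cluster's average utilization; the balancer exits on its
# own once no further block moves are needed.
"$HADOOP_HOME/bin/hadoop" balancer -threshold 10

rm -f "$LOCKFILE"
```

A crontab entry (here, nightly at 02:00) would then drive it:

```shell
# m h dom mon dow  command
0 2 * * * /usr/local/hadoop/bin/run-balancer.sh >> /var/log/hdfs-balancer.log 2>&1
```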
3. Why does the Hadoop cluster stop when one node runs out of disk?
That's not normal. Trust me, if that was always true, we'd be perpetually screwed :)
There might be some other underlying error you're missing... Brian
Any further inputs are appreciated! Cheers, Deepak TellyTopia Inc.