How large a cluster? How large is each datanode? How much disk is devoted to HBase?
How does your HDFS data arrive? From one or a few machines in the cluster? From outside the cluster?

On Thu, Mar 17, 2011 at 12:13 PM, Stuart Smith <stu24m...@yahoo.com> wrote:
> Parts of this may end up on the HBase list, but I thought I'd start here.
> My basic problem is:
>
> My cluster is getting full enough that having one datanode go down does
> put a bit of pressure on the system (when balanced, every DN is more than
> half full).
>
> I write (and delete) pretty actively to HBase, and some to HDFS directly.
>
> The cluster keeps drifting dangerously out of balance.
>
> I run the balancer daily, but:
>
> - I've seen reports that you shouldn't rebalance with regionservers
> running, yet I don't really have a choice. Without HBase, my system is
> pretty much down. If it gets out of balance, it will also come down.
>
> Does anybody here have any idea how badly running the balancer on a
> heavily active system messes things up (for HDFS/HBase, if anyone knows)?
>
> - Possibly somewhat related: I'm seeing more "failed to move block"
> errors in my balancer logs. It got to the point where I wasn't seeing any
> effective rebalancing occur. I've turned off access to the cluster and
> rebalanced (one node was down to 10% free space; a couple of others went
> up to 50% or more). I'm back down to around 20-40% free space on each node
> (as reported by the HDFS web interface).
>
> How effective is the balancer on an active cluster? Is there any way to
> make its life easier, so it can stay in balance with daily runs?
>
> I'm not sure why the one node ends up being so heavily favored, either.
> The favoritism even seems to survive taking the node down and bringing it
> back up. If I can't find the resources to upgrade, I might try that again,
> but I'm less than hopeful about it.
>
> Any ideas? Or do I just need better hardware? Not sure if that's an
> option, though.
>
> Take care,
> -stu
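For reference, a scheduled daily balancer run is usually just a wrapper around the
balancer CLI with a utilization threshold. The sketch below is only illustrative
and assumes a Hadoop 0.20-era setup: the 10% threshold, the wrapper itself, and the
dfs.balance.bandwidthPerSec note are assumptions on my part, not anything taken
from this thread.

    # Minimal sketch of a daily balancer run (e.g. invoked from cron),
    # assuming the "hadoop balancer" CLI of that era is on the PATH.
    # The 10% threshold is illustrative, not a recommendation.
    import subprocess
    import sys

    def run_balancer(threshold_pct: int = 10) -> int:
        """Run the HDFS balancer until every datanode's utilization is
        within threshold_pct of the cluster average; return its exit code."""
        return subprocess.call(
            ["hadoop", "balancer", "-threshold", str(threshold_pct)]
        )

    if __name__ == "__main__":
        # dfs.balance.bandwidthPerSec in hdfs-site.xml (property name for
        # this Hadoop era, as far as I recall) caps how fast each datanode
        # may move blocks for the balancer; its low default is one reason
        # daily runs can fail to keep up on a busy cluster.
        sys.exit(run_balancer())

Raising that per-datanode bandwidth cap is the usual first knob to try when daily
runs can't keep the cluster in balance, but the exact property name and default
depend on your Hadoop version.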