Matt,
Since you are already using ZooKeeper, could you keep a hosts file in
ZooKeeper somewhere, update it with a strategy similar to the one used for
implementing locks, so that a new slave reads and updates the latest version
"atomically", and use Twitcher to trigger updates on each host?
http://github.com/twitter/twitcher
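
To make the "atomic" update idea concrete, here is a rough sketch in Python of
the read / modify / conditional-write loop. It is only an illustration under
some assumptions: VersionedNode and add_host are made-up names, and the node
is an in-memory stand-in for a znode, not a real ZooKeeper client. Against a
real ensemble the conditional write would be ZooKeeper's setData(path, data,
version), which fails with a bad-version error if someone else wrote first.

```python
import threading


class BadVersionError(Exception):
    """Raised when a write names a version that is no longer current."""


class VersionedNode:
    """Hypothetical in-memory stand-in for the znode holding the hosts file."""

    def __init__(self, data=""):
        self._lock = threading.Lock()
        self._data = data
        self._version = 0

    def get(self):
        # Return the contents together with the version they were read at.
        with self._lock:
            return self._data, self._version

    def set(self, data, expected_version):
        # Conditional write: succeeds only if nobody wrote since our read.
        with self._lock:
            if expected_version != self._version:
                raise BadVersionError(expected_version)
            self._data = data
            self._version += 1
            return self._version


def add_host(node, ip, hostname, max_retries=10):
    """Append 'ip hostname' unless already present; retry on conflicts."""
    entry = f"{ip} {hostname}"
    for _ in range(max_retries):
        data, version = node.get()
        lines = data.splitlines()
        if entry in lines:
            return version  # already registered, nothing to write
        new_data = "\n".join(lines + [entry]) + "\n"
        try:
            return node.set(new_data, version)
        except BadVersionError:
            continue  # another host updated the file first; re-read and retry
    raise RuntimeError("gave up after repeated version conflicts")
```

Each new host would call add_host() once at startup, and every other host
would rewrite its local hosts file when the watch (or Twitcher) fires. Because
a losing writer re-reads before retrying, two hosts launched at the same time
cannot overwrite each other's entries.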
Best regards,
- Andy
> From: Matt Corgan <[email protected]>
> Subject: Re: region doesn't split after 32+ GB
> To: "user" <[email protected]>
> Date: Wednesday, September 29, 2010, 11:30 AM
> Thanks for your help again Stack... sorry, I don't have the logs. Will do a
> better job of saving them. By the way, this time the insert job maintained
> about 22k rows/sec all night without any pauses, and even though it was
> sequential insertion, it did a nice job of rotating the active region around
> the cluster.
>
> As for the hostnames, there are no problems in .89, and nothing is onerous
> by any means... we are just trying to come to some level of familiarity
> before putting any real data into hbase.
>
> EC2/RightScale make it very easy to add/remove regionservers to the cluster
> with the click of a button, which is the reason that the hosts file can
> change more often than you'd want to modify it manually. We're going to go
> the route of having each newly added regionserver append its name to the
> hosts file of every other server in our EC2 account (~30 servers). The only
> downsides I see there are that it doesn't scale very elegantly, and that it
> gets complicated if you want to launch multiple regionservers or new clients
> at the same time.
>
> For the sake of brainstorming, maybe it's possible to have the master always
> broadcast IP addresses and have all communication done via IP. This may be
> more robust anyway. Then the first time a new regionserver or client gets an
> unfamiliar IP address, it can try to figure out the hostname (the same way
> the master currently does this), and cache it somewhere. The hostname could
> be added alongside the IP address or replace it in the logs for convenience.
>
> Thanks again,
> Matt