Jonathan, I think I got it :)

Do you know the upper bound for regions per node for 0.18 ? As I understand, 1000 regions is still OK but 4000 is not.
How can I estimate the amount of memory and number of xceivers for 0.18 if I know the key and value size ?

M.

On Tue, Feb 3, 2009 at 10:48 PM, Jonathan Gray <[email protected]> wrote:
> Yes, you can of course add more disk space so you do not need as many nodes
> to support the dataset.
>
> However, the ability for a regionserver to scale to 4000 regions is largely
> unknown (and almost certainly impossible with 0.19 release). With a dataset
> that large you might increase region size from 256MB up to 1GB or so and
> then you'd be back to 1000 regions per node.
>
> Andrew Purtell has done the most experimentation with respect to scaling an
> individual regionserver, but you'll need to do some of your own
> experimentation to see how that would work with your setup.
>
> Amount of memory and number of xceivers will depend on your dataset and the
> version you're running on. Memory usage is largely tied to index sizes (not
> including writes/memcache and any caching) and currently those are directly
> related to key and value size. Xceivers will hopefully change dramatically
> with HADOOP-3856 but this will likely be an issue with scaling the number of
> regions on a single RS until it's fixed in the Datanode.
>
> JG
>
>> -----Original Message-----
>> From: Michael Dagaev [mailto:[email protected]]
>> Sent: Tuesday, February 03, 2009 12:11 PM
>> To: [email protected]
>> Subject: Re: Hbase cluster configuration
>>
>> Thank you, Jonathan. I should have done the math :)
>>
>> > You would need ~40 nodes just to support 3X replication on HDFS. With about
>> > 250GB per node, you would have around 1000 regions per node.
>>
>> Ok. Can I add just more disk space to the existing nodes
>> instead of adding nodes to the cluster ?
>>
>> For instance, if I want 10 nodes rather than 40, I will add 1TB per node.
>> Thus, I will have 4000 regions per node and I will have to increase
>> the number of xceivers.
>> Should I add more memory to the nodes as well ?
>>
>> > With 7.5GB of memory on each node, if you can give 3-4GB to the
>> > RegionServer, you should be able to handle that number of regions and have
>> > sufficient memory for indexes and some caching.
>>
>> How much memory do I need to handle 1000 regions ?
>>
>> > With 0.19.0 hadoop and hbase, you'll be hitting xceiver issues for sure,
>>
>> How many xceivers should I have
>>
>> > but this should be
>> > resolved for the 0.20 release, at which point I am confident we could handle
>> > that load.
>>
>> Thank you for your cooperation,
>> M.
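
For reference, the region counts quoted in this thread follow from dividing the data held per node by the maximum region size. Below is a rough sketch of that arithmetic in Python. It assumes every region fills up to the configured maximum size and that the per-node figure is the data the regionserver actually serves. The function name and the 1GB = 1024MB convention are assumptions made for illustration, not anything taken from HBase.

```python
def regions_per_node(data_per_node_gb: int, region_size_mb: int) -> int:
    """Estimate how many regions one node hosts (hypothetical helper)."""
    data_mb = data_per_node_gb * 1024  # assumes 1GB = 1024MB
    # Ceiling division: a partly filled region still counts as a region.
    return int(-(-data_mb // region_size_mb))


print(regions_per_node(250, 256))    # 40 nodes, ~250GB each, 256MB regions
print(regions_per_node(1024, 256))   # 10 nodes, ~1TB each, 256MB regions
print(regions_per_node(1024, 1024))  # 10 nodes, ~1TB each, 1GB regions
```

This prints 1000, 4096 and 1024, which matches the "around 1000" and "4000" figures above and shows how moving from 256MB to 1GB regions brings the 10-node layout back to roughly 1000 regions per node. It says nothing about memory or xceivers, which depend on index sizes and on the version, as Jonathan notes.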
