Andrew, thanks. It looks like I should think about the scalability of a single region server. I will probably think it over and ask more questions on the list later.
M.

On Wed, Feb 4, 2009 at 1:03 AM, Andrew Purtell <[email protected]> wrote:
> Hi Michael,
>
> I have found that trial and error is necessary now. There are no
> clear formulas. How large the system can scale depends entirely
> on the distributions of various aspects of your data set and on
> the application specific load.
>
> Cluster start up is the most demanding time. If you have an
> inadequate number of xceivers available in the data node, you
> will see regions fail to deploy at start up with transient
> errors recorded in the master log regarding missing blocks.
> This is an indication you need to increase data node resources.
> I keep a tail of the master log up in a window when the cluster
> is starting up. Add a grep on "ERROR" if you just want to catch
> exceptional conditions. When you see this, increase the number
> of configured xceivers by a factor of two and restart.
>
> You should also increase the number of handlers in the data
> node configuration to the number of nodes in your cluster.
>
> Hope this helps,
>
> - Andy
>
>> From: Michael Dagaev
>> Do you know the upper bound for regions per node for 0.18?
>> As I understand, 1000 regions is still OK but 4000 is not.
>> How can I estimate the amount of memory and number of
>> xceivers for 0.18 if I know the key and value size ?
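
For anyone following along, the data node settings Andy refers to are normally adjusted in hdfs-site.xml. A minimal sketch, assuming the Hadoop 0.18/0.19-era property names (note the xceiver key is historically misspelled in Hadoop, and the values shown here are only illustrative -- the right numbers depend entirely on your cluster):

    <!-- hdfs-site.xml on each data node -->
    <property>
      <name>dfs.datanode.max.xcievers</name>
      <!-- double this and restart if regions fail to deploy at startup -->
      <value>2048</value>
    </property>
    <property>
      <name>dfs.datanode.handler.count</name>
      <!-- roughly the number of nodes in the cluster, per Andy's advice -->
      <value>10</value>
    </property>

Watching the master log during startup, as described above, can be as simple as something like the following (the log path and file name depend on your install and user):

    tail -f $HBASE_HOME/logs/hbase-*-master-*.log | grep ERROR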
