Again, cluster sizing has much more to do with the read/write load than with 
the size of the dataset (total capacity aside).

To give you an idea of how widely it varies, I had a client who put several 
hundred GB of data onto a single-node HBase setup.  I've also seen clusters 
of 20-100 nodes holding only tens of GB (very high concurrent write load).  
Recently I've been playing with a 100-node cluster with about 20TB of data on 
it (before replication).

Each of these clusters had a very different load profile, and node count is 
not the only important metric.  That one-node cluster had a pair of 2TB disks, 
while these 100-node clusters are packed with twelve 1TB disks per node.
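
For the 6-regionserver setup in the original question quoted below, the 
region-count side of this can be sketched with some back-of-the-envelope 
arithmetic.  This is only a rough illustration, not a sizing rule: it assumes 
every region is filled to its max size, and the 64MB memstore flush size is an 
assumed value picked for illustration (check hbase.hregion.memstore.flush.size 
for your version).  The helper names are made up for this sketch.

```python
import math


def region_count(data_mb: float, max_region_mb: float) -> int:
    """Approximate region count if every region is filled to its max size."""
    return math.ceil(data_mb / max_region_mb)


def worst_case_memstore_mb(regions_per_server: float, flush_mb: float) -> float:
    """Memory needed if every region on a server holds a full memstore at once."""
    return regions_per_server * flush_mb


# Numbers from the original question in this thread.
servers = 6
regions = 2700
region_mb = 256

# Assumption for illustration only, not taken from the thread.
assumed_flush_mb = 64

# Upper bound on the dataset: treats all 2700 regions as completely full.
data_mb = regions * region_mb

for max_region_mb in (region_mb, 1024):
    total = region_count(data_mb, max_region_mb)
    per_server = total / servers
    memstore = worst_case_memstore_mb(per_server, assumed_flush_mb)
    print(
        f"max region size {max_region_mb} MB: {total} regions, "
        f"{per_server:.1f} per server, "
        f"worst-case memstore {memstore:.0f} MB per server"
    )
```

The worst-case memstore figure only matters when writes really do hit every 
region at once, which is the situation the key design discussed below decides.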

JG

> -----Original Message-----
> From: Jinsong Hu [mailto:jinsong...@hotmail.com]
> Sent: Wednesday, September 01, 2010 11:22 AM
> To: user@hbase.apache.org
> Subject: Re: how many regions a regionserver can support
> 
> I ran a test on a 6-regionserver cluster with a key design that spreads the
> incoming data across all regions.  After pumping in data for 3-4 days, about
> 3 TB in total, one of the regionservers shut down because of a channel IO
> error.  On a 3-regionserver cluster with the same key design, the
> regionservers shut down after only 45G of data had been inserted.
> 
> I notice that if the key is designed so that writes go to only a small
> portion of the regions, and that portion is spread approximately evenly
> among all regionservers, then the HDFS size becomes the limit on the total
> number of regions that can be supported, and I don't run into this IO issue.
> 
> Can anybody share actual examples of HBase data size and cluster size?
> 
> Jimmy.
> 
> --------------------------------------------------
> From: "Jonathan Gray" <jg...@facebook.com>
> Sent: Friday, August 27, 2010 10:55 AM
> To: <user@hbase.apache.org>
> Subject: RE: how many regions a regionserver can support
> 
> > There is no fixed limit; it has much more to do with the read/write load
> > than the actual dataset size.
> >
> > HBase is usually fine having very densely packed RegionServers, if much of
> > the data is rarely accessed.  If you have extremely high numbers of
> > regions per server and you are writing to all of these regions, or even
> > reading from all of them, you could have issues.  Though storage capacity
> > needs to be considered, capacity planning often has much more to do with
> > how much memory you need to support the read/write load you expect.  Reads
> > matter mostly from a performance POV, but for writes there are some
> > important considerations related to the number of regions per server (and
> > thus data density and determining your max region size).
> >
> > In any case, you should probably increase your max size to 1GB or so and
> > can go higher if necessary.
> >
> > JG
> >
> >> -----Original Message-----
> >> From: Jinsong Hu [mailto:jinsong...@hotmail.com]
> >> Sent: Friday, August 27, 2010 10:03 AM
> >> To: user@hbase.apache.org
> >> Subject: how many regions a regionserver can support
> >>
> >> Hi there,
> >>    Does anybody know how many regions a regionserver can support?  I have
> >> regionservers with 8G RAM, 1.5T disk and a 4-core CPU.
> >> I searched http://www.facebook.com/note.php?note_id=142473677002 and they
> >> say Google's target is 100 regions of 200M for each regionserver.
> >>   In my case, I have 2700 regions spread across 6 regionservers.  Each
> >> region is set to the default size of 256M, and it seems it is still running
> >> fine.  I am running CDH3.  I just wonder what the upper limit is so that I
> >> can do capacity planning.  Does anybody know this?
> >>
> >> Jimmy.
> >
> >
