A single row can go way beyond the max region size / split size.  HBase will
never split a region once it holds only a single row, even if that region is
larger than the split size.

Also, if you're using large values, you should have region sizes much larger 
than the default.  It's common to run with 1-2GB regions in many cases.
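
As a rough sketch of what raising those settings might look like, the snippet below renders an hbase-site.xml fragment. The property names hbase.hregion.max.filesize (split size) and hbase.hregion.memstore.flush.size (flush size) are the usual HBase settings for this; the 2GB and 256MB values and the property_xml helper are only illustrative assumptions, not recommendations for any particular cluster.

```python
MB = 1024 * 1024
GB = 1024 * MB


def property_xml(name, value_bytes):
    """Render one <property> element for hbase-site.xml (hypothetical helper)."""
    return (
        "<property>\n"
        f"  <name>{name}</name>\n"
        f"  <value>{value_bytes}</value>\n"
        "</property>"
    )


# Illustrative values only (assumptions): 2GB regions and a 256MB flush size.
settings = {
    "hbase.hregion.max.filesize": 2 * GB,
    "hbase.hregion.memstore.flush.size": 256 * MB,
}

# Join the rendered properties into one fragment to paste inside <configuration>.
fragment = "\n".join(property_xml(name, value) for name, value in settings.items())
print(fragment)
```

Both values are in bytes, so it is worth double-checking the arithmetic before pasting the output into a real config, and testing with your own cell sizes.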

What you may have seen are recommendations that if your cell values are 
approaching the default block size on HDFS (64MB), you should consider putting 
the data directly into HDFS rather than HBase.

JG

> -----Original Message-----
> From: William Kang [mailto:weliam.cl...@gmail.com]
> Sent: Tuesday, September 07, 2010 7:36 PM
> To: user@hbase.apache.org; apurt...@apache.org
> Subject: Re: Limits on HBase
> 
> Hi,
> Thanks for your reply. How about the row size? I read that a row should not
> be larger than the HDFS file on the region server, which is 256M by default.
> Is that right? Many thanks.
> 
> 
> William
> 
> On Tue, Sep 7, 2010 at 2:22 PM, Andrew Purtell <apurt...@apache.org>
> wrote:
> 
> > In addition to what Jon said please be aware that if compression is
> > specified in the table schema, it happens at the store file level --
> > compression happens after write I/O, before read I/O, so if you transmit a
> > 100MB object that compresses to 30MB, the performance impact is that of
> > 100MB, not 30MB.
> >
> > I also try not to go above 50MB as largest cell size, for the same reason.
> > I have tried storing objects larger than 100MB but this can cause out of
> > memory issues on busy regionservers no matter the size of the heap. When/if
> > HBase RPC can send large objects in smaller chunks, this will be less of an
> > issue.
> >
> > Best regards,
> >
> >    - Andy
> >
> > Why is this email five sentences or less?
> > http://five.sentenc.es/
> >
> >
> > --- On Mon, 9/6/10, Jonathan Gray <jg...@facebook.com> wrote:
> >
> > > From: Jonathan Gray <jg...@facebook.com>
> > > Subject: RE: Limits on HBase
> > > To: "user@hbase.apache.org" <user@hbase.apache.org>
> > > Date: Monday, September 6, 2010, 4:10 PM
> > > I'm not sure what you mean by
> > > "optimized cell size" or whether you're just asking about
> > > practical limits?
> > >
> > > HBase is generally used with cells in the range of tens of
> > > bytes to hundreds of kilobytes.  However, I have used
> > > it with cells that are several megabytes, up to about
> > > 50MB.  Up at that level, I have seen some weird
> > > performance issues.
> > >
> > > The most important thing is to be sure to tweak all of your
> > > settings.  If you have 20MB cells, you need to be sure
> > > to increase the flush size beyond 64MB and the split size
> > > beyond 256MB.  You also need enough memory to support
> > > all this large object allocation.
> > >
> > > And of course, test test test.  That's the easiest way
> > > to see if what you want to do will work :)
> > >
> > > When you run into problems, e-mail the list.
> > >
> > > As far as row size is concerned, the only issue is that a
> > > row can never span multiple regions so a given row can only
> > > be in one region and thus be hosted on one server at a
> > > time.
> > >
> > > JG
> > >
> > > > -----Original Message-----
> > > > From: William Kang [mailto:weliam.cl...@gmail.com]
> > > > Sent: Monday, September 06, 2010 1:57 PM
> > > > To: hbase-user
> > > > Subject: Limits on HBase
> > > >
> > > > Hi folks,
> > > > I know this question may have been asked many times, but I am wondering
> > > > if there is any update on the optimized cell size (in megabytes) and row
> > > > size (in megabytes)? Many thanks.
> > > >
> > > >
> > > > William
> > >
> >
> >
> >
> >
> >
