We would prefer not to do this.  It's important that we have all of the
historical data without any loss.  But thanks for the suggestion.

- Amit

On Wed, Nov 16, 2011 at 4:30 PM, Matt Corgan <[email protected]> wrote:

> You can call put.setWriteToWAL(false) to skip the write-ahead logging, which
> slows puts down significantly.  But you will lose data if a regionserver
> crashes with data in its memstore.
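
For anyone reading this in the archive: here is a minimal sketch of what Matt is
describing, assuming the 0.90-era Java client API that was current at the time.
The table, family, qualifier, and row key below are placeholders, not anything
from this thread.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class NoWalPut {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "metrics");   // placeholder table name
        Put put = new Put(Bytes.toBytes("some-row-key"));
        put.add(Bytes.toBytes("cf"), Bytes.toBytes("col"), Bytes.toBytes("value"));
        // Skip the WAL for this Put: faster, but anything still in the memstore
        // is lost if the regionserver crashes before the next flush.
        put.setWriteToWAL(false);
        table.put(put);
        table.close();
      }
    }

The trade-off is exactly the one Matt states: throughput in exchange for
durability.
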
>
>
> On Wed, Nov 16, 2011 at 4:09 PM, Amit Jain <[email protected]> wrote:
>
> > Hi Stack,
> >
> > Thanks for the feedback.  Comments inline ...
> >
> > On Wed, Nov 16, 2011 at 3:35 PM, Stack <[email protected]> wrote:
> >
> > > On Wed, Nov 16, 2011 at 3:26 PM, Amit Jain <[email protected]> wrote:
> > > > Hi Lars,
> > > >
> > > > The keys are arriving in random order.  The HBase monitoring page shows
> > > > evenly distributed load across all of the region servers.
> > >
> > > What kind of ops rates are you seeing?  Are they running nice and
> > > smooth across all servers?  No stuttering?  What do your regionserver
> > > logs look like?
> > >
> > > Are you pre-splitting your table, or just letting HBase run and do the
> > > splits itself?
> > >
> >
> > As far as I can tell, the operations look smooth across all servers.  We're
> > not doing any pre-splitting, just letting HBase do the splits.
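
In case pre-splitting ever becomes attractive here, a rough sketch of what it
looks like at table-creation time with the Java admin API of that era follows.
The table name, column family, and split points are placeholders; real split
points would have to match the actual key distribution.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.client.HBaseAdmin;
    import org.apache.hadoop.hbase.util.Bytes;

    public class CreatePresplitTable {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);

        HTableDescriptor desc = new HTableDescriptor("metrics");   // placeholder name
        desc.addFamily(new HColumnDescriptor("cf"));                // placeholder family

        // Placeholder split points, assuming keys spread roughly evenly over the
        // key space; the table starts with splits.length + 1 regions.
        byte[][] splits = new byte[][] {
            Bytes.toBytes("2"), Bytes.toBytes("4"),
            Bytes.toBytes("6"), Bytes.toBytes("8")
        };
        admin.createTable(desc, splits);
      }
    }

Starting with more than one region spreads the initial write load instead of
funnelling everything through a single region until the first splits happen.
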
> >
> >
> > > >  I didn't see anything weird in the gc logs, no mention of any
> > > > failures.  I'm a little unclear about what the optimal values for the
> > > > following properties should be:
> > > >
> > > > hbase.hstore.compactionThreshold
> > >
> > > The default is 3.  Look in the regionserver logs and see how many files
> > > you have on average per region column family (you could also look in the
> > > filesystem).  Are we constantly rewriting them?  If the load is mostly
> > > write-only, you might raise this to put off compactions until more files
> > > are around (but check the regionserver logs; at a high write rate we may
> > > already be having trouble keeping up with the default threshold).
> > >
> >
> > Well, it looks like half of the regions are in the 25-32 file range and the
> > other half just have 1 or 2 files.  This was when we ran it with a
> > compactionThreshold of 15.
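
For reference, that setting lives in hbase-site.xml on the regionservers (and
needs a regionserver restart to take effect).  The value below just mirrors the
experiment described above; it is not a recommendation.

    <!-- hbase-site.xml on each regionserver -->
    <property>
      <name>hbase.hstore.compactionThreshold</name>
      <!-- default is 3; 15 matches the experiment mentioned above -->
      <value>15</value>
    </property>
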
> >
> > How can I tell by looking at the region server logs if we're seeing a
> > "high write rate"?  We've got 48 clients sending load, 12 region servers
> > total.  We're pushing the system pretty hard.
> >
> >
> > > > hbase.hstore.blockingStoreFiles
> > > >
> > >
> > > The higher this is, the bigger the price you'll pay if a server crashes,
> > > because it sets the upper bound on how many WAL logs we need to split for
> > > the server before its regions come back online.  I'd say leave it at the
> > > default for now.
> > >
> >
> > Ok, we'll leave it default for now.
> >
> >
> > > > Is there some rule of thumb that I can use to determine good values for
> > > > these properties?
> > > >
> > >
> > > You've checked out this section of the book:
> > > http://hbase.apache.org/book.html#performance
> > >
> > > Are you filling the machines?  Are they burning CPU, or are they
> > > I/O-bound?  If not, perhaps open the front gate wider by upping the
> > > number of concurrent handlers.
> > >
> >
> > I have read through that section of the HBase book.  There is plenty of CPU
> > available.  How do I up the number of concurrent handlers?  Increase
> > hbase.regionserver.handler.count?
> >
> > - Amit
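
For what it's worth, hbase.regionserver.handler.count is the property usually
meant by "handlers" here, and it also goes in hbase-site.xml on the
regionservers (restart required).  The value below is only illustrative, not a
recommendation for this workload.

    <!-- hbase-site.xml on each regionserver -->
    <property>
      <name>hbase.regionserver.handler.count</name>
      <!-- the era's default was fairly low; raise cautiously and watch memory,
           since each active handler can hold a request's payload in memory -->
      <value>50</value>
    </property>
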
> >
>
