Hi all,

Thanks for all the help, I think I got it. In addition to everyone's advice, I also found a useful post regarding stability and performance: http://kisalay.com/2012/04/09/hbase-configurations/
That led me to configure a smaller memstore flush size of 128 MB with a block
multiplier of 4, but with more StoreFiles allowed (20) and about 30 handlers.
My write job looks fluid now (no long pauses) and the heap is managed well.
The only thing I still don't get is the small flushes (less than 100 MB and
sometimes less than 10 MB) I still see sometimes...

Hope this digging helps someone in the future ;)

Thanks,
Amit.

On Sun, Jun 17, 2012 at 10:03 AM, Harsh J <[email protected]> wrote:
> Hi,
>
> No, in that case my comment can be considered incorrect. The HLog
> shouldn't fill up very fast - and your problem does sound memory bound
> now (upper/lower watermark hits).
>
> On Sun, Jun 17, 2012 at 11:49 AM, Infolinks <[email protected]> wrote:
> > Hi Harsh J,
> >
> > I'm not using WAL in my writes.
> > Is there still log rolling?
> >
> > On Jun 17, 2012, at 7:40, Harsh J <[email protected]> wrote:
> >
> >> Amit,
> >>
> >> Your values for HLog block size (hbase.regionserver.hlog.blocksize,
> >> default is the HDFS default block size - 64 MB unless you've raised
> >> it - which is too low unless you also have HLog compression) and the
> >> factor of max-hlogs-to-keep (hbase.regionserver.maxlogs, default 32
> >> files) can easily cause premature flushing, as it is another flush
> >> criterion. Given your write workload (which hits the WAL), this is
> >> definitely what you're hitting.
> >>
> >> On Sat, Jun 16, 2012 at 7:47 PM, Amit Sela <[email protected]> wrote:
> >>> Thanks Doug, I read the regions section of the book as you
> >>> recommended, but I still have some questions left.
> >>>
> >>> When running a massive write job, the regionserver log shows the
> >>> memsize that is flushed. The problem is that most of the time the
> >>> memsize is either much smaller than the configured
> >>> memstore.flush.size (resulting in more files being written, which
> >>> leads to frequent compactions) or bigger than memstore.flush.size *
> >>> memstore.block.multiplier (resulting in "Blocking updates for 'IPC
> >>> Server handler # on <port>'...").
> >>> In some cases I also see HBaseServer throwing a ClosedChannelException:
> >>> "WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler
> >>> <handler #> on <port #> caught: java.nio.channels.ClosedChannelException"
> >>>
> >>> I guess these problems are also the cause of the long (few minutes)
> >>> pauses and, in extreme cases, full GCs during the write jobs.
> >>>
> >>> Any ideas, anyone?
> >>>
> >>> In general, I did some digging and couldn't find much about the
> >>> write process in HBase from a "memory usage" point of view, besides
> >>> the configuration descriptions - maybe worth adding to the book.
> >>>
> >>> Thank you for all your help,
> >>>
> >>> Amit.
> >>>
> >>>
> >>> On Mon, Jun 11, 2012 at 3:22 PM, Doug Meil <[email protected]> wrote:
> >>>>
> >>>> Hi there-
> >>>>
> >>>> Your understanding is on track.
> >>>>
> >>>> You probably want to read this section...
> >>>>
> >>>> http://hbase.apache.org/book.html#regions.arch
> >>>>
> >>>> ... as it covers those topics in more detail.
> >>>>
> >>>>
> >>>> On 6/10/12 1:02 PM, "Amit Sela" <[email protected]> wrote:
> >>>>
> >>>>> Hi all,
> >>>>>
> >>>>> I'm trying to better understand what's going on in the region
> >>>>> server during writes to HBase.
> >>>>>
> >>>>> As I understand the process:
> >>>>>
> >>>>> 1. Data is written to the memstore.
> >>>>> 2. Once the memstore has reached hbase.hregion.memstore.flush.size,
> >>>>>    it is flushed and a new StoreFile is written.
> >>>>> 3. The number of StoreFiles increases until a compaction is
> >>>>>    triggered.
> >>>>>
> >>>>> To my understanding, the compaction is triggered after a compaction
> >>>>> check is done either by the CheckCompaction thread running in the
> >>>>> background or by the memstore flush itself.
> >>>>> The compaction triggered will be a minor compaction BUT it can be
> >>>>> promoted to a major compaction if it includes all store files.
> >>>>> When will it NOT include all store files? Say I set
> >>>>> compactionThreshold to 3; then when the 3rd (or 4th) flush is
> >>>>> executed, a compaction will be triggered and promoted to major
> >>>>> since it includes all store files.
> >>>>>
> >>>>> Is this right? Can anyone elaborate?
> >>>>
> >>>>
> >>>>
> >>
> >>
> >>
> >> --
> >> Harsh J
> >
>
> --
> Harsh J
>
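For anyone who finds this thread later: a rough hbase-site.xml sketch of the
settings described at the top of this message. The flush-size and
block-multiplier keys are the ones named in the thread;
hbase.hstore.blockingStoreFiles and hbase.regionserver.handler.count are only
assumed here to be the knobs behind "more StoreFiles (20)" and "about 30
handlers", so verify the property names against your HBase version.

  <!-- Sketch only: values as described in the thread, names to be verified. -->
  <property>
    <name>hbase.hregion.memstore.flush.size</name>
    <value>134217728</value> <!-- 128 MB per-memstore flush threshold -->
  </property>
  <property>
    <name>hbase.hregion.memstore.block.multiplier</name>
    <value>4</value> <!-- updates block once a memstore reaches 4 x flush size -->
  </property>
  <property>
    <name>hbase.hstore.blockingStoreFiles</name>
    <value>20</value> <!-- assumed mapping for "more StoreFiles (20)" -->
  </property>
  <property>
    <name>hbase.regionserver.handler.count</name>
    <value>30</value> <!-- assumed mapping for "about 30 handlers" -->
  </property>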

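For completeness, a sketch of the WAL-related properties Harsh mentions, with
the era's defaults. These only matter when writes go through the WAL: once
roughly maxlogs x blocksize of log data has accumulated (about 32 x 64 MB =
~2 GB with the defaults below), the region server force-flushes the regions
holding the oldest edits so old HLogs can be archived, which is another way to
end up with flushes much smaller than the configured flush size.

  <!-- Sketch only: defaults shown; not relevant when writes skip the WAL. -->
  <property>
    <name>hbase.regionserver.maxlogs</name>
    <value>32</value> <!-- max HLog files kept before regions are force-flushed -->
  </property>
  <property>
    <name>hbase.regionserver.hlog.blocksize</name>
    <value>67108864</value> <!-- 64 MB, the HDFS default block size -->
  </property>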