Hi all,

Thanks for all the help, I think I got it. In addition to everyone's advice, I also found a useful post regarding stability and performance: http://kisalay.com/2012/04/09/hbase-configurations/
That led me to configure a smaller memstore flush size of 128 MB with a block
multiplier of 4, but with more StoreFiles allowed (20) and about 30 handlers.
My write job looks fluid now (no long pauses) and the heap is managed well.
The only thing I still don't get is the small flushes (less than 100 MB and
sometimes less than 10 MB) I still see sometimes...

Hope this digging helps someone in the future ;)

Thanks,
Amit.

On Sun, Jun 17, 2012 at 10:03 AM, Harsh J <[email protected]> wrote:
> Hi,
>
> No, in that case my comment can be considered incorrect. The HLog
> shouldn't fill up very fast - and your problem does sound memory bound
> now (upper/lower watermark hits).
>
> On Sun, Jun 17, 2012 at 11:49 AM, Infolinks <[email protected]> wrote:
> > Hi Harsh J,
> >
> > I'm not using WAL in my writes.
> > Is there still log rolling?
> >
> > On Jun 17, 2012, at 7:40, Harsh J <[email protected]> wrote:
> >
> >> Amit,
> >>
> >> Your values for HLog block size (hbase.regionserver.hlog.blocksize,
> >> default is the HDFS default block size - 64 MB unless you've raised
> >> it - which is too low unless you also have HLog compression) and the
> >> factor of max-hlogs-to-keep (hbase.regionserver.maxlogs, default 32
> >> files) can easily cause premature flushing, as it is another flush
> >> criterion. Given your write workload (which hits the WAL), this is
> >> definitely what you're hitting.
> >>
> >> On Sat, Jun 16, 2012 at 7:47 PM, Amit Sela <[email protected]> wrote:
> >>> Thanks Doug, I read the regions section of the book as you
> >>> recommended, but I still have some questions left.
> >>>
> >>> When running a massive write job, the regionserver log shows the
> >>> memsize that is flushed. The problem is that most of the time the
> >>> memsize is either much smaller than the configured
> >>> memstore.flush.size (resulting in more files being written, which
> >>> leads to frequent compactions) or bigger than memstore.flush.size *
> >>> memstore.block.multiplier (resulting in "Blocking updates for 'IPC
> >>> Server handler # on <port>'...").
> >>> In some cases I also see HBaseServer throwing a ClosedChannelException:
> >>> "WARN org.apache.hadoop.ipc.HBaseServer: IPC Server handler
> >>> <handler #> on <port #> caught: java.nio.channels.ClosedChannelException"
> >>>
> >>> I guess these problems are also the cause of the long (few minutes)
> >>> pauses and, in extreme cases, full GCs during the write jobs.
> >>>
> >>> Any ideas, anyone?
> >>>
> >>> In general, I did some digging and couldn't find much about the
> >>> write process in HBase from a "memory usage" point of view, besides
> >>> the configuration descriptions - maybe worth adding to the book.
> >>>
> >>> Thank you for all your help,
> >>>
> >>> Amit.
> >>>
> >>>
> >>> On Mon, Jun 11, 2012 at 3:22 PM, Doug Meil <[email protected]> wrote:
> >>>>
> >>>> Hi there-
> >>>>
> >>>> Your understanding is on track.
> >>>>
> >>>> You probably want to read this section...
> >>>>
> >>>> http://hbase.apache.org/book.html#regions.arch
> >>>>
> >>>> ... as it covers those topics in more detail.
> >>>>
> >>>>
> >>>> On 6/10/12 1:02 PM, "Amit Sela" <[email protected]> wrote:
> >>>>
> >>>>> Hi all,
> >>>>>
> >>>>> I'm trying to better understand what's going on in the region
> >>>>> server during writes to HBase.
> >>>>>
> >>>>> As I understand the process:
> >>>>>
> >>>>> 1. Data is written to the memstore.
> >>>>> 2. Once the memstore has reached hbase.hregion.memstore.flush.size,
> >>>>>    it is flushed and a new StoreFile is written.
> >>>>> 3. The number of StoreFiles increases until a compaction is
> >>>>>    triggered.
> >>>>>
> >>>>> To my understanding, the compaction is triggered after a compaction
> >>>>> check is done either by the CheckCompaction thread running in the
> >>>>> background or by the memstore flush itself.
> >>>>> The compaction triggered will be a minor compaction BUT it can be
> >>>>> promoted to a major compaction if it includes all store files.
> >>>>> When will it NOT include all store files? Say I set
> >>>>> compactionThreshold to 3; then when the 3rd (or 4th) flush is
> >>>>> executed, a compaction will be triggered and promoted to major
> >>>>> since it includes all store files.
> >>>>>
> >>>>> Is this right? Can anyone elaborate?
> >>>>
> >>>>
> >>>>
> >>
> >>
> >>
> >> --
> >> Harsh J
> >
>
> --
> Harsh J
>
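For anyone who finds this thread later: a rough hbase-site.xml sketch of the
settings described at the top of this message. The flush-size and
block-multiplier keys are the ones named in the thread;
hbase.hstore.blockingStoreFiles and hbase.regionserver.handler.count are only
assumed here to be the knobs behind "more StoreFiles (20)" and "about 30
handlers", so verify the property names against your HBase version.

  <!-- Sketch only: values as described in the thread, names to be verified. -->
  <property>
    <name>hbase.hregion.memstore.flush.size</name>
    <value>134217728</value> <!-- 128 MB per-memstore flush threshold -->
  </property>
  <property>
    <name>hbase.hregion.memstore.block.multiplier</name>
    <value>4</value> <!-- updates block once a memstore reaches 4 x flush size -->
  </property>
  <property>
    <name>hbase.hstore.blockingStoreFiles</name>
    <value>20</value> <!-- assumed mapping for "more StoreFiles (20)" -->
  </property>
  <property>
    <name>hbase.regionserver.handler.count</name>
    <value>30</value> <!-- assumed mapping for "about 30 handlers" -->
  </property>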

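For completeness, a sketch of the WAL-related properties Harsh mentions, with
the era's defaults. These only matter when writes go through the WAL: once
roughly maxlogs x blocksize of log data has accumulated (about 32 x 64 MB =
~2 GB with the defaults below), the region server force-flushes the regions
holding the oldest edits so old HLogs can be archived, which is another way to
end up with flushes much smaller than the configured flush size.

  <!-- Sketch only: defaults shown; not relevant when writes skip the WAL. -->
  <property>
    <name>hbase.regionserver.maxlogs</name>
    <value>32</value> <!-- max HLog files kept before regions are force-flushed -->
  </property>
  <property>
    <name>hbase.regionserver.hlog.blocksize</name>
    <value>67108864</value> <!-- 64 MB, the HDFS default block size -->
  </property>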