I'm trying to increase the write throughput of our HBase cluster. We're
currently doing around 7500 messages per second per node, and I think we have
room for improvement, especially since the heap is underutilized and the
memstore size doesn't seem to fluctuate much between regular and peak
ingestion loads.

We mainly have one large table that we write most of the data to. The other
tables are mainly OpenTSDB and some relatively small summary tables. The
large table is read in batch once a day but is otherwise serving writes
99% of the time. It has 1 CF and gets flushed at around 128M fairly
regularly, as in the log line below.

{log}

2014-10-31 16:56:09,499 INFO org.apache.hadoop.hbase.regionserver.HRegion:
Finished memstore flush of ~128.2 M/134459888, currentsize=879.5 K/900640
for region
msg,00102014100515impression\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x002014100515040200049358\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x004138647301\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0002e5a329d2171149bcc1e83ed129312b\x00\x00\x00\x00,1413909604591.828e03c0475b699278256d4b5b9638a2.
in 640ms, sequenceid=16861176169, compaction requested=true

{log}
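
To check the flush sizes across a whole region server log rather than from a single line, something like the Python sketch below could be used. It assumes the flush message keeps the format shown in the log line above; `parse_flush` is a hypothetical helper, and the region name in the sample is shortened to a placeholder.

```python
import re

# Matches the "Finished memstore flush" message after whitespace is collapsed.
# The pattern is an assumption based on the single log line quoted above.
FLUSH_RE = re.compile(
    r"Finished memstore flush of ~[\d.]+ \w+/(?P<flushed>\d+), "
    r"currentsize=[\d.]+ \w+/(?P<remaining>\d+) for region .*? "
    r"in (?P<millis>\d+)ms"
)


def parse_flush(line):
    """Return flushed bytes, remaining bytes and duration, or None if no match."""
    normalized = " ".join(line.split())  # undo line wrapping from mail/log viewers
    match = FLUSH_RE.search(normalized)
    if match is None:
        return None
    return {
        "flushed_bytes": int(match.group("flushed")),
        "remaining_bytes": int(match.group("remaining")),
        "millis": int(match.group("millis")),
    }


# Sample line with a placeholder region name (the real one is much longer).
sample = (
    "2014-10-31 16:56:09,499 INFO org.apache.hadoop.hbase.regionserver.HRegion:\n"
    "Finished memstore flush of ~128.2 M/134459888, currentsize=879.5 K/900640\n"
    "for region\n"
    "msg,EXAMPLE_ROW,1413909604591.828e03c0475b699278256d4b5b9638a2.\n"
    "in 640ms, sequenceid=16861176169, compaction requested=true"
)

info = parse_flush(sample)
flushed_mib = round(info["flushed_bytes"] / 2**20, 1)
print(flushed_mib, info["millis"])
```

Running this over every flush line and looking at the distribution of `flushed_bytes` would show whether any flush ever gets past the 128M mark.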

Here's a pastebin of my hbase site : http://pastebin.com/fEctQ3im

What I've tried:
- Turned off major compactions; I'm handling these manually.
- Bumped up heap Xmx from 24G to 48G.
- Set hbase.hregion.memstore.flush.size = 512M.
- Left lowerLimit/upperLimit on the memstore at the defaults (0.38, 0.4),
since the global heap has enough space to accommodate the default percentages.
- Currently running HBase 0.98.1 on an 8-node cluster that's scaled up to
128GB RAM.
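
As a rough sanity check on why the default limits look sufficient, here is the arithmetic as a small Python sketch. It assumes the 48G heap and the 0.38/0.4 fractions listed above apply to the whole region server heap; the function names are hypothetical and nothing here reads the actual HBase configuration.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def memstore_budget(heap_bytes, upper_limit=0.4, lower_limit=0.38):
    """Global memstore bounds in bytes for a given heap size (assumed fractions)."""
    return {
        "upper_bytes": heap_bytes * upper_limit,
        "lower_bytes": heap_bytes * lower_limit,
    }


def full_memstores_that_fit(heap_bytes, flush_size_bytes, upper_limit=0.4):
    """How many memstores could fill to flush_size before hitting the upper bound."""
    return int(heap_bytes * upper_limit // flush_size_bytes)


heap = 48 * GIB
budget = memstore_budget(heap)

print("upper limit GiB:", round(budget["upper_bytes"] / GIB, 2))
print("lower limit GiB:", round(budget["lower_bytes"] / GIB, 2))
print("regions at 512M:", full_memstores_that_fit(heap, 512 * MIB))
print("regions at 128M:", full_memstores_that_fit(heap, 128 * MIB))
```

Under these assumptions the global memstore budget is roughly 19G, so a few dozen actively written regions could each reach 512M before the global limit would force flushes.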


There hasn't been any appreciable increase in write performance. Throughput
is still hovering around 7500 writes per second per node. The flushes still
seem to be happening at 128M (instead of the expected 512M).

I've attached a snapshot of the memstore size vs. flushQueueLen. The block
caches are utilizing the extra heap space but the memstore is not. The flush
queue lengths have increased, which leads me to believe that it's flushing
way too often without any increase in throughput.

Please let me know where I should dig further. That's a long email, thanks
for reading through :-)



Cheers,
-Gautam.
