140 regions per server with up to 16 GB max each. The data size per server is too large. It's > 2 TB per server (without compression)?
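(Back-of-envelope: 140 regions x 16 GB max region size is roughly 2.2 TB per server before compression.)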
450 Mbps (bits or bytes)? Per server that is 17 Mbit or 17 MByte. I do not think that 17 MBytes/s sustained per server is feasible, so it must be 17 Mbit/s (~2 MB/s). OK, to give you an idea that your problem is not the ingestion rate but data size, region count, splits and compactions: I just finished tests on Amazon where I was able to load 200M rows into a tiny 5-node (m1.xlarge) cluster in less than 3 hours at 20K inserts per second (not quite sustained, though). I was able to reach a sustained (no stalls) rate of over 5K inserts per second (> 1 MB per second, per node), but with no splits and no major compactions. That is the key.

I feel the root causes of your problems, in order of significance, are:

1. Large data size per node. This results in 18-minute major compactions, for example. HBase is not able to manage over 1 TB per node without custom compactions and managed region splitting (this is my assumption - I have never tried it personally). We have several systems in production and it is usually less than 200-300 GB of data per node.

2. Large number of active regions per region server. Are they all hot? 140 is OK when most of the regions are cold, not hot. Item 2 is related to item 1. I hope you have pre-split your tables properly?

If you want to keep 1-2 TB per server, consider custom compactions and managed region splitting. For time-series data, custom compactions make a lot of sense: you usually access data in a specific time range (like now() - 7d to now()), so you may want to compact data into files that do not overlap by timestamp. In that case all compactions stay minor and you never need to sort-merge all of a region's data files.

Best regards,
Vladimir Rodionov
Principal Platform Engineer
Carrier IQ, www.carrieriq.com
e-mail: [email protected]

________________________________________
From: lars hofhansl [[email protected]]
Sent: Sunday, January 26, 2014 9:32 AM
To: [email protected]
Subject: Re: Hbase tuning for heavy write cluster

First, you want the RegionServer to use the available memory for caching, etc. Every byte of unused RAM is wasted. I would make the heap slightly smaller than 32 GB so that the JVM can still use compressed OOPs, so I'd set it to 31 GB.

Lastly, 800 writes/s is still a bit low. How does the CPU usage look across the RegionServers? If CPU is high, you might want to make the memstores *smaller* (it is expensive to read/write from/to a SkipList). If you see bad IO and many store files (as might be the case following the discussion below), maybe you want to increase the memstores.

-- Lars

________________________________
From: Rohit Dev <[email protected]>
To: [email protected]
Sent: Sunday, January 26, 2014 3:35 AM
Subject: Re: Hbase tuning for heavy write cluster

Hi Vladimir,

Here is my cluster status:
Cluster size: 26
Server memory: 128 GB
Total writes per sec (data): 450 Mbps
Writes per sec (count) per server: avg ~800 writes/sec (some spikes up to 3000 writes/sec)
Max region size: 16 GB
Regions per server: ~140 (not sure if I would be able to merge some empty regions while the table is online)
We are running CDH 4.3.

Recently I changed the settings to:
Java heap size for Region Server: 32 GB
hbase.hregion.memstore.flush.size: 536870912
hbase.hstore.blockingStoreFiles: 30
hbase.hstore.compaction.max: 15
hbase.hregion.memstore.block.multiplier: 3
hbase.regionserver.maxlogs: 90 (is that too high for a 512 MB memstore flush size?)

I'm seeing weird stuff, like one region that has grown to 34 GB(!) and has 21 store files. MAX_FILESIZE for this table is only 16 GB. Could this be a problem?
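For reference, pre-splitting (which Vladimir asks about above) can be done through the 0.94-era Java client API that ships with CDH 4.3 along the lines of the rough, untested sketch below. The table name, the region count, and the split points on the first row-key byte (OpenTSDB row keys start with a fixed-width metric UID) are illustrative assumptions, not values taken from this thread; the same thing can also be done from the HBase shell by passing SPLITS to create.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class PreSplitTsdb {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);

    // Evenly spaced split points on the first key byte. OpenTSDB row keys
    // begin with a fixed-width metric UID, so this spreads metrics across
    // regions instead of funnelling all writes into one unsplit region.
    int numRegions = 64;                              // illustrative
    byte[][] splits = new byte[numRegions - 1][];
    for (int i = 1; i < numRegions; i++) {
      splits[i - 1] = new byte[] { (byte) (i * 256 / numRegions) };
    }

    HTableDescriptor desc = new HTableDescriptor("tsdb_presplit"); // illustrative name
    desc.addFamily(new HColumnDescriptor("t"));       // "t" is OpenTSDB's column family
    admin.createTable(desc, splits);
    admin.close();
  }
}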
On Sat, Jan 25, 2014 at 9:49 PM, Vladimir Rodionov <[email protected]> wrote:
> What is the load (ingestion) rate per server in your cluster?
>
> Best regards,
> Vladimir Rodionov
> Principal Platform Engineer
> Carrier IQ, www.carrieriq.com
> e-mail: [email protected]
>
> ________________________________________
> From: Rohit Dev [[email protected]]
> Sent: Saturday, January 25, 2014 6:09 PM
> To: [email protected]
> Subject: Re: Hbase tuning for heavy write cluster
>
> The compaction queue is ~600 on one of the region servers, while it is less than 5 on the others (26 nodes total). The compaction queue started going up after I increased the settings [1]. In general, one major compaction takes about 18 minutes.
>
> On the same region server I'm seeing these two log messages frequently:
>
> 2014-01-25 17:56:27,312 INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Too many hlogs: logs=167, maxlogs=32; forcing flush of 1 regions(s): 3788648752d1c53c1ec80fad72d3e1cc
>
> 2014-01-25 17:57:48,733 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 53 on 60020' on region tsdb,\x008WR\xE2+\x90\x00\x00\x02Qu\xF1\x00\x00(\x00\x97A\x00\x008M(7\x00\x00Bl\xE85,1390623438462.e6692a1f23b84494015d111954bf00db.: memstore size 1.5 G is >= than blocking 1.5 G size
>
> Any suggestions on what else I can do, or is it OK to ignore these messages?
>
> [1] The new settings are:
> - hbase.hregion.memstore.flush.size - 536870912
> - hbase.hstore.blockingStoreFiles - 30
> - hbase.hstore.compaction.max - 15
> - hbase.hregion.memstore.block.multiplier - 3
>
> On Sat, Jan 25, 2014 at 3:00 AM, Ted Yu <[email protected]> wrote:
>> Yes, it is normal.
>>
>> On Jan 25, 2014, at 2:12 AM, Rohit Dev <[email protected]> wrote:
>>
>>> I changed these settings:
>>> - hbase.hregion.memstore.flush.size - 536870912
>>> - hbase.hstore.blockingStoreFiles - 30
>>> - hbase.hstore.compaction.max - 15
>>> - hbase.hregion.memstore.block.multiplier - 3
>>>
>>> Things seem to be getting better now; I'm not seeing any of those annoying 'Blocking updates' messages. Except that, I'm seeing an increase in 'Compaction Queue' size on some servers.
>>>
>>> I noticed memstores are getting flushed, but some with 'compaction requested=true' [1]. Is this normal?
>>>
>>> [1]
>>> INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~512.0 M/536921056, currentsize=3.0 M/3194800 for region tsdb,\x008ZR\xE1t\xC0\x00\x00\x02\x01\xB0\xF9\x00\x00(\x00\x0B]\x00\x008M((\x00\x00Bk\x9F\x0B,1390598160292.7fb65e5fd5c4cfe08121e85b7354bae9. in 3422ms, sequenceid=18522872289, compaction requested=true
>>>
>>> On Fri, Jan 24, 2014 at 6:51 PM, Bryan Beaudreault <[email protected]> wrote:
>>>> Also, I think you can up hbase.hstore.blockingStoreFiles quite a bit higher. You could try something like 50. It will reduce read performance a bit, but it shouldn't be too bad, especially for something like OpenTSDB, I think. If you are going to up blockingStoreFiles, you're probably also going to want to up hbase.hstore.compaction.max.
>>>>
>>>> For my tsdb cluster, which is 8 i2.4xlarges in EC2, we have 90 regions for tsdb. We were also having issues with blocking, and I upped blockingStoreFiles to 35, compaction.max to 15, and memstore.block.multiplier to 3. We haven't had problems since. The memstore flush size for the tsdb table is 512 MB.
>>>>
>>>> Finally, a 64 GB heap may prove problematic, but it's worth a shot. I'd definitely recommend Java 7 with the G1 garbage collector, though. In general, Java has a hard time with heap sizes greater than 20-25 GB without some careful tuning.
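As a side note on the table-level values mentioned in the thread (Bryan's 512 MB memstore flush size for the tsdb table and the 16 GB MAX_FILESIZE), here is a minimal, untested sketch of how they could be set through the 0.94-era Java admin API. The disable/modify/enable sequence is the conservative route and takes the table offline briefly; in practice the same change is often made from the HBase shell with alter.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

public class TsdbTableSettings {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    byte[] table = Bytes.toBytes("tsdb");

    HTableDescriptor desc = admin.getTableDescriptor(table);
    desc.setMemStoreFlushSize(512L * 1024 * 1024);   // 512 MB table-level flush size
    desc.setMaxFileSize(16L * 1024 * 1024 * 1024);   // 16 GB max region file size

    // Conservative apply: disable, modify, re-enable (brief downtime).
    admin.disableTable(table);
    admin.modifyTable(table, desc);
    admin.enableTable(table);
    admin.close();
  }
}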
>>>> On Fri, Jan 24, 2014 at 9:44 PM, Bryan Beaudreault <[email protected]> wrote:
>>>>
>>>>> It seems from your ingestion rate that you are still blowing through HFiles too fast. You're going to want to up the MEMSTORE_FLUSHSIZE for the table from the default of 128 MB. If OpenTSDB is the only thing on this cluster, you can do the math pretty easily to find the maximum allowable value, based on your heap size and accounting for the 40% (default) used for the block cache.
>>>>>
>>>>> On Fri, Jan 24, 2014 at 9:38 PM, Rohit Dev <[email protected]> wrote:
>>>>>
>>>>>> Hi Kevin,
>>>>>>
>>>>>> We have about 160 regions per server with a 16 GB region size and 10 drives for HBase. I've looked at disk IO and that doesn't seem to be a problem (% utilization is < 2 across all disks).
>>>>>>
>>>>>> Any suggestion on what heap size I should allocate? Normally I allocate 16 GB.
>>>>>>
>>>>>> Also, I read that increasing hbase.hstore.blockingStoreFiles and hbase.hregion.memstore.block.multiplier is a good idea for a write-heavy cluster, but in my case it seems to be heading in the wrong direction.
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> On Fri, Jan 24, 2014 at 6:31 PM, Kevin O'dell <[email protected]> wrote:
>>>>>>> Rohit,
>>>>>>>
>>>>>>> A 64 GB heap is not ideal; you will run into some weird issues. How many regions are you running per server, how many drives are in each node, and are there any other settings you changed from the defaults?
>>>>>>>
>>>>>>> On Jan 24, 2014 6:22 PM, "Rohit Dev" <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> We are running OpenTSDB on a CDH 4.3 HBase cluster, with most of the default settings. The cluster is write-heavy and I'm trying to see what parameters I can tune to optimize write performance.
>>>>>>>>
>>>>>>>> # I get messages related to Memstore [1] and Slow Response [2] very often; is this an indication of any issue?
>>>>>>>>
>>>>>>>> I tried increasing some parameters on one node:
>>>>>>>> - hbase.hstore.blockingStoreFiles - from the default 7 to 15
>>>>>>>> - hbase.hregion.memstore.block.multiplier - from the default 2 to 8
>>>>>>>> - and heap size from 16 GB to 64 GB
>>>>>>>>
>>>>>>>> * The 'Compaction queue' went up to ~200 within 60 minutes after restarting the region server with the new parameters, and the log started to get even noisier.
>>>>>>>>
>>>>>>>> Can anyone please suggest whether I'm going in the right direction with these new settings, or whether there are other things I could monitor or change to make it better?
>>>>>>>>
>>>>>>>> Thank you!
>>>>>>>>
>>>>>>>> [1]
>>>>>>>> INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 19 on 60020' on region tsdb,\x008XR\xE0i\x90\x00\x00\x02Q\x7F\x1D\x00\x00(\x00\x0B]\x00\x008M(r\x00\x00Bl\xA7\x8C,1390556781703.0771bf90cab25c503d3400206417f6bf.: memstore size 256.3 M is >= than blocking 256 M size
>>>>>>>>
>>>>>>>> [2]
>>>>>>>> WARN org.apache.hadoop.ipc.HBaseServer: (responseTooSlow): {"processingtimems":17887,"call":"multi(org.apache.hadoop.hbase.client.MultiAction@586940ea), rpc version=1, client version=29, methodsFingerPrint=0","client":"192.168.10.10:54132","starttimems":1390587959182,"queuetimems":1498,"class":"HRegionServer","responsesize":0,"method":"multi"}
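A rough back-of-envelope for Bryan's "do the math" suggestion above, assuming the 0.94 default hbase.regionserver.global.memstore.upperLimit of 0.4 and the 31 GB heap Lars suggests:

  global memstore budget   ~= 0.4 x 31 GB            ~= 12.4 GB per region server
  hot regions to fill it   ~= 12.4 GB / 512 MB       ~= 24-25 regions
  budget if all 140 hot    ~= 12.4 GB / 140 regions  ~= 90 MB per region

In other words, once a few dozen regions are taking writes at a 512 MB flush size, the global memstore limit, not the per-region flush size, decides when flushes happen.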
