how to optimize for heavy writes scenario

Hef Fri, 17 Mar 2017 09:31:49 -0700

Hi group,
I'm using HBase to store a large amount of time series data; the use case
is heavier on writes than reads. My application tops out at 600k write
requests per second, and I can't tune it for higher TPS.


Hardware:
I have 6 Region Servers; each has 128 GB of memory, 12 HDDs, and 2 cores
with 24 threads.

Schema:
The schema for this time series data is similar to OpenTSDB's: the data
points of the same metric within an hour are stored in one row, so there
can be at most 3600 columns per row.
Each cell is about 70 bytes in size, including the row key, column
qualifier, column family, and value.
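
The hourly row layout described above can be sketched roughly as follows. This is only an illustration under assumptions: one data point per second (which matches the 3600-column maximum), epoch-second timestamps, and the second-within-the-hour used as the column qualifier. The class and method names are hypothetical and are not taken from the actual application or from OpenTSDB.

```java
// Hypothetical sketch of an OpenTSDB-like hourly row layout.
// Assumptions: timestamps are epoch seconds, one data point per second.
final class HourlyRowLayout {
    static final long SECONDS_PER_HOUR = 3600L;

    // Start of the hour containing the timestamp; all points of one
    // metric that share this value land in the same row.
    static long rowBaseTime(long epochSeconds) {
        return epochSeconds - Math.floorMod(epochSeconds, SECONDS_PER_HOUR);
    }

    // Offset within the hour (0..3599); used here as the column qualifier.
    static int columnOffset(long epochSeconds) {
        return (int) Math.floorMod(epochSeconds, SECONDS_PER_HOUR);
    }

    public static void main(String[] args) {
        long ts = 1489768309L; // example timestamp in epoch seconds
        System.out.println("row base time: " + rowBaseTime(ts));
        System.out.println("column offset: " + columnOffset(ts));
    }
}
```

With this layout, consecutive points of one metric keep hitting the same row for a full hour before moving on to the next one.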

HBase config:
CDH 5.6 HBase 1.0.0
100G memory for each RegionServer
hbase.hstore.compactionThreshold = 50
hbase.hstore.blockingStoreFiles = 100
hbase.hregion.majorcompaction = disabled
hbase.client.write.buffer = 20MB
hbase.regionserver.handler.count = 100
hbase.hregion.memstore.flush.size = 128MB


HBase Client:
Writes go through BufferedMutator, 100000 per batch.

Input volume:
The input data throughput from Kafka is more than 2 million per second.


My writer applications are distributed, but however far I scale them up,
the total write throughput does not exceed 600K/sec.
The servers show 20% CPU usage and 5.6 wa.
GC doesn't look good, though: it shows a lot of pauses of 10s+.

In my opinion, 1M/s of input data results in only 70 MB/s of write
throughput to the cluster, which is quite small for 6 region servers.
The performance should not be this bad.
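
The estimate above can be worked through as a small calculation. This is a rough sketch that assumes every write request carries exactly one 70-byte cell and that load is spread evenly over the 6 region servers; it counts raw cell bytes only, and 1 MB is taken as 10^6 bytes. The class and method names are made up for illustration.

```java
// Back-of-the-envelope write bandwidth from the figures in this post.
// Assumptions: one 70-byte cell per request, even spread over 6 servers.
final class WriteBandwidthEstimate {
    static final int CELL_BYTES = 70;
    static final int REGION_SERVERS = 6;

    // Raw cell bytes written to the whole cluster, in MB/s.
    static double clusterMBPerSec(long cellsPerSec) {
        return cellsPerSec * (double) CELL_BYTES / 1_000_000.0;
    }

    // Share of that bandwidth handled by each region server, in MB/s.
    static double perServerMBPerSec(long cellsPerSec) {
        return clusterMBPerSec(cellsPerSec) / REGION_SERVERS;
    }

    public static void main(String[] args) {
        // Observed ceiling, the 1M/s estimate, and the Kafka input rate.
        long[] rates = {600_000L, 1_000_000L, 2_000_000L};
        for (long rate : rates) {
            System.out.printf("%d cells/s: %.1f MB/s cluster, %.2f MB/s per server%n",
                    rate, clusterMBPerSec(rate), perServerMBPerSec(rate));
        }
    }
}
```

Under these assumptions the observed 600K/s ceiling corresponds to about 42 MB/s across the cluster, or 7 MB/s per region server, and the full 2M/s input would be about 140 MB/s.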

Does anybody have an idea why the performance stops at 600K/s?
Is there anything I need to tune to increase the HBase write throughput?
