Slow reads coinciding with higher compaction time avg time

Girish Joshi Sun, 01 Nov 2015 10:25:28 -0800

Hello

In my HBase cluster, I have observed the following consistently over
several days:


- There is a spike in the compaction time avg time metric. At the same time,
swap bytes in and swap bytes out are also higher.
- Around the same time, the FS PRead and FS Read latencies increase, as do
the latencies seen by clients doing random reads.
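
One way to confirm that the two spikes really line up is to compare the
exported series sample by sample. The following is only a rough sketch: it
assumes both metrics were sampled at the same timestamps and copied into
plain lists, and the function names, the 3x-median threshold and the sample
values are all made up for illustration (none of this is an HBase API).

```python
from statistics import median


def spike_flags(samples, factor=3.0):
    """Flag each sample that exceeds `factor` times the series median."""
    baseline = median(samples)
    return [value > factor * baseline for value in samples]


def coincidence_ratio(compaction_ms, read_latency_ms, factor=3.0):
    """Fraction of read-latency spikes that fall on a compaction-time spike."""
    if len(compaction_ms) != len(read_latency_ms):
        raise ValueError("series must be sampled at the same timestamps")
    compaction_spikes = spike_flags(compaction_ms, factor)
    latency_spikes = spike_flags(read_latency_ms, factor)
    total = sum(latency_spikes)
    if total == 0:
        return 0.0  # no latency spikes at all, nothing to correlate
    both = sum(1 for c, r in zip(compaction_spikes, latency_spikes) if c and r)
    return both / total


# Hypothetical values, one entry per sampling interval.
compaction = [200, 210, 190, 2500, 2600, 205]
latency = [40, 45, 50, 1200, 3000, 60]
ratio = coincidence_ratio(compaction, latency)
```

A ratio close to 1 would mean nearly every read-latency spike falls in an
interval where compaction time is also elevated; it does not by itself say
which one causes the other.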

My HBase cluster consists of 16 nodes and is set up with replication to
another cluster of 16 nodes. It has the following workload:

- There are around 4 tables which have a lot of write activity (around 500k
writes per second on the m1/m15 moving average). 2 of these tables have
atomic counter columns that keep track of some analytics data and are
incremented with every write.

- There are 2 tables which receive bulk-uploaded data periodically (around
once a day).

- We expect reads at around 100k per second, mainly from the tables which have
bulk-uploaded data and the one which has counter columns. The read
latencies (p99) spike up to around 1000-5000 ms when the above compaction
time avg time metric increases. At other times, they are below 100 ms.
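
For clarity on what I mean by p99: the sketch below computes it with the
nearest-rank method over the latency samples of one interval. This is an
illustration only; the function name and the sample values are hypothetical,
and the metrics system that produces the numbers above may use a different
estimator.

```python
def percentile_nearest_rank(samples, pct):
    """Return the pct-th percentile (integer pct, 1-100) by nearest rank."""
    if not samples:
        raise ValueError("no samples")
    if not 1 <= pct <= 100:
        raise ValueError("pct must be between 1 and 100")
    ordered = sorted(samples)
    # Ceiling of pct * n / 100, done in integers to avoid float rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[rank - 1]


# Hypothetical interval: 196 fast reads and 4 slow ones (milliseconds).
quiet = [50] * 200
spiky = [50] * 196 + [3000] * 4
p99_quiet = percentile_nearest_rank(quiet, 99)
p99_spiky = percentile_nearest_rank(spiky, 99)
```

With these made-up numbers, only 2% of the reads are slow, yet that is enough
to move p99 from 50 ms to 3000 ms.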

I have set hbase.hregion.majorcompaction to 0 on the region servers; I plan
to set it to 0 on the master nodes too, so that I can rule out time-triggered
major compactions as the problem. But I suspect that there are a lot of minor
compactions, and that these lead to major compactions, happening at the time
of the spikes.

*Any suggestions on how to avoid these read latency spikes and get better
read performance?*

Thanks,

Girish.
