Hi all,
We store serialised HyperLogLog objects in HBase through a coprocessor.
The size distribution of the stored rows is below:
Row size (bytes):
min = 4279.00
max = 770757.00
mean = 67340.24
stddev = 153968.88
median = 14453.00
75% <= 63178.00
95% <= 336917.20
98% <= 761028.00
99% <= 767500.36
99.9% <= 770757.00
count = 827
These values are updated 400 times every minute, and the regionserver
hosting this table responds slowly to get requests for other tables.
In my opinion, writing MOB/blob values should not hurt read performance
for other tables' regions on the same regionserver; it should only put
pressure on HLog fsync, flush and compaction. Is that right?
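For scale, here is a rough back-of-envelope estimate of the write volume, using only the numbers quoted in this post. It assumes the whole serialised value is rewritten on every update, and it ignores KeyValue overhead and HDFS replication, so treat it as a sketch rather than a measurement.

```python
# Back-of-envelope write volume from the figures quoted in this post.
# Assumption: every update rewrites the full serialised value.
MEAN_ROW_BYTES = 67340.24   # mean row size from the distribution above
MAX_ROW_BYTES = 770757.00   # max row size from the distribution above
UPDATES_PER_MINUTE = 400    # update rate stated above


def write_rate_mb_per_sec(row_bytes, updates_per_minute):
    """Return the write rate in MB/s (1 MB = 1024 * 1024 bytes)."""
    return row_bytes * updates_per_minute / 60 / (1024 * 1024)


mean_rate = write_rate_mb_per_sec(MEAN_ROW_BYTES, UPDATES_PER_MINUTE)
worst_rate = write_rate_mb_per_sec(MAX_ROW_BYTES, UPDATES_PER_MINUTE)

print(f"mean-size updates: {mean_rate:.2f} MB/s")
print(f"max-size updates:  {worst_rate:.2f} MB/s")
```

Under these assumptions the raw payload comes to roughly 0.43 MB/s at the mean row size, and about 4.9 MB/s if every update were the maximum size.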
Thanks
PS:
hdfs blocksize=128MB
memstore size=128MB
max hlog=64
low/high water=0.35/0.4
memstore percent=0.4
heap size=32GB
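
The settings above translate into the following approximate memstore budget. This is only a sketch: it assumes the low/high water marks are fractions of the regionserver heap and that "memstore size" is the per-region flush threshold, which is my reading of the values and not something confirmed here.

```python
# Rough memstore budget from the settings listed above.
# Assumptions: water marks are fractions of the heap, and 128MB is the
# per-region memstore flush threshold.
GIB = 1024 ** 3
MIB = 1024 ** 2

HEAP_BYTES = 32 * GIB               # heap size=32GB
LOW_WATER = 0.35                    # low water mark
HIGH_WATER = 0.40                   # high water mark
MEMSTORE_FLUSH_BYTES = 128 * MIB    # memstore size=128MB

low_gib = HEAP_BYTES * LOW_WATER / GIB
high_gib = HEAP_BYTES * HIGH_WATER / GIB

# Number of completely full memstores that fit under the high water mark.
full_memstores = int(HEAP_BYTES * HIGH_WATER // MEMSTORE_FLUSH_BYTES)

print(f"low water mark:  {low_gib:.1f} GiB")
print(f"high water mark: {high_gib:.1f} GiB")
print(f"full 128MB memstores under the high mark: {full_memstores}")
```

Read this way, the global memstore limit sits at about 12.8 GiB of heap, with forced flushing aiming to bring usage back to about 11.2 GiB.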