[DISCUSS]HBase2.1.0 is slower than HBase1.2.0

zheng wang Sun, 07 Jun 2020 13:20:20 -0700

Hi guys:


I ran some tests on my PC to find the cause of the slowdown that Jan Van Besien 
mentioned on the user channel.


#test env
OS : win10
JDK: 1.8
MEM: 8GB


#test data:
1 million rows with only one column family and one qualifier.


rowkey: rowkey-#index#
value: value-#index#


#test method:
Just use the client API to scan with the default config several times; no PE, no YCSB.


#test result(avg):
v1.2.0: 800ms
v2.1.0: 1050ms


So v2.1 is clearly slower than v1.2 in this test. After this, I collected some 
statistics on the regionserver and found that part of the reason is related to 
the size estimation.


The line below in StoreScanner.next() costs about 100 ms in v2.1. It was added 
in v2.0, see HBASE-17647.
"int cellSize = PrivateCellUtil.estimatedSerializedSizeOf(cell);"
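
To put the 100 ms in context, here is a rough breakdown of the averages above. It 
is only arithmetic on the reported numbers, assuming one cell per row (one 
column family, one qualifier, a single version); the class and method names are 
made up for illustration. Under that assumption the gap is 250 ms (about 31% 
slower), the size estimate accounts for about 40% of the gap, and it costs 
roughly 100 ns per cell.

```java
// Back-of-the-envelope breakdown of the averages reported in this mail.
// Assumption: one cell per row, so 1 million rows means 1 million cells.
final class ScanCostBreakdown {
    static final long CELLS = 1_000_000L;
    static final double V120_MS = 800.0;          // avg scan time, v1.2.0
    static final double V210_MS = 1050.0;         // avg scan time, v2.1.0
    static final double SIZE_ESTIMATE_MS = 100.0; // time spent in the size estimate

    /** Absolute gap between the two versions, in milliseconds. */
    static double gapMs() {
        return V210_MS - V120_MS;
    }

    /** Slowdown of v2.1.0 relative to v1.2.0, in percent. */
    static double slowdownPercent() {
        return gapMs() / V120_MS * 100.0;
    }

    /** Share of the gap attributed to the size estimate, in percent. */
    static double shareOfGapPercent() {
        return SIZE_ESTIMATE_MS / gapMs() * 100.0;
    }

    /** Average cost of the size estimate per cell, in nanoseconds. */
    static double nanosPerCell() {
        return SIZE_ESTIMATE_MS * 1_000_000.0 / CELLS;
    }

    public static void main(String[] args) {
        System.out.printf("gap: %.0f ms, slowdown: %.2f%%%n", gapMs(), slowdownPercent());
        System.out.printf("size estimate share of gap: %.1f%%%n", shareOfGapPercent());
        System.out.printf("size estimate per cell: %.0f ns%n", nanosPerCell());
    }
}
```

So the size estimate explains a good part of the regression, but not all of it.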


Should we support disabling the MaxResultSize limit (2 MB by default now), so 
that scans are more efficient when users know their data well and can limit 
results with setBatch and setLimit alone?
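
To make the idea concrete, here is a minimal sketch of what I mean. It is not 
HBase code: SizeLimitSketch, estimateSize and nextBatch are hypothetical names, 
the size formula is a stand-in, and treating maxResultSize <= 0 as "disabled" 
is only an assumed convention for this sketch. The point is that when the size 
limit is off, the per-cell estimate is never called and only the row limit 
bounds the batch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of a scanner batch loop. Not HBase code.
final class SizeLimitSketch {

    /** Stand-in for a per-cell serialized-size estimate (made-up formula). */
    static int estimateSize(String rowkey, String value) {
        return rowkey.length() + value.length();
    }

    /**
     * Collects row keys for one batch.
     * Each element of cells is {rowkey, value}.
     * maxResultSize <= 0 means the size limit is disabled (assumption of this
     * sketch): the estimate is skipped and only rowLimit bounds the batch.
     */
    static List<String> nextBatch(List<String[]> cells, int rowLimit, long maxResultSize) {
        List<String> batch = new ArrayList<>();
        long accumulated = 0;
        for (String[] cell : cells) {
            if (batch.size() >= rowLimit) {
                break; // row-count limit reached
            }
            batch.add(cell[0]);
            if (maxResultSize > 0) {
                // Only pay for the estimate when a size limit is in effect.
                accumulated += estimateSize(cell[0], cell[1]);
                if (accumulated >= maxResultSize) {
                    break; // size limit reached
                }
            }
        }
        return batch;
    }

    public static void main(String[] args) {
        List<String[]> cells = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            cells.add(new String[] {"rowkey-" + i, "value-" + i});
        }
        System.out.println("size-limited batch: " + nextBatch(cells, 100, 40L).size());
        System.out.println("row-limited batch:  " + nextBatch(cells, 5, 0L).size());
    }
}
```

Whether such a switch is safe for the server side (memory, RPC size) is exactly 
what I would like to discuss.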
