Thanks for the detailed analysis and update, zheng wang.

> The code line below in StoreScanner.next() cost about 100ms in v2.1, and it added from v2.0, see HBASE-17647.

So there is still some additional cost in 2.1, right? Do you have any other observations? Are we doing more cell compares in 2.x?
Anoop

On Mon, Jun 8, 2020 at 1:50 AM zheng wang <[email protected]> wrote:

> Hi guys:
>
> I did some test on my pc to find the reason as Jan Van Besien mentioned in
> user channel.
>
> #test env
> OS : win10
> JDK: 1.8
> MEM: 8GB
>
> #test data:
> 1 million rows with only one columnfamily and one qualifier.
>
> rowkey: rowkey-#index#
> value: value-#index#
>
> #test method:
> just use client api to scan with default config several times, no pe, no
> ycsb
>
> #test result(avg):
> v1.2.0: 800ms
> v2.1.0: 1050ms
>
> So, it is sure that v2.1 is slower than v1.2, after this, i did some
> statistics on regionserver.
> Then i find the partly reason is related to the size estimated.
>
> The code line below in StoreScanner.next() cost about 100ms in v2.1, and
> it added from v2.0, see HBASE-17647.
> "int cellSize = PrivateCellUtil.estimatedSerializedSizeOf(cell);"
>
> Should we support to disable the MaxResultSize limit(2MB by default now)
> to get more efficient if user exactly knows their data and could limit
> results only by setBatch and setLimit?
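
To make the proposal in the quoted message concrete, here is a minimal, self-contained sketch of the idea: a scan loop that only pays for a per-cell size estimate when a max-result-size limit is active. This is not the StoreScanner code. `SizeLimitSketch`, `SimpleCell`, `estimatedSize()`, `collect()` and `sampleCells()` are hypothetical names made up for illustration, the size formula is a crude stand-in rather than what PrivateCellUtil computes, and treating a non-positive limit as "disabled" is an assumption, not existing HBase behaviour.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class SizeLimitSketch {

    /** Hypothetical stand-in for a cell: row key, qualifier and value bytes only. */
    static final class SimpleCell {
        final byte[] row;
        final byte[] qualifier;
        final byte[] value;

        SimpleCell(String row, String qualifier, String value) {
            this.row = row.getBytes(StandardCharsets.UTF_8);
            this.qualifier = qualifier.getBytes(StandardCharsets.UTF_8);
            this.value = value.getBytes(StandardCharsets.UTF_8);
        }

        /** Crude size estimate for illustration; NOT the formula HBase uses. */
        int estimatedSize() {
            return row.length + qualifier.length + value.length;
        }
    }

    /**
     * Collects cells until either 'limit' cells have been gathered or the
     * accumulated estimated size reaches 'maxResultSize' bytes.
     * A non-positive maxResultSize disables the size check (assumption),
     * so the per-cell estimate is never computed in that case.
     * A non-positive limit means "no cell-count limit".
     */
    static List<SimpleCell> collect(List<SimpleCell> cells, long maxResultSize, int limit) {
        List<SimpleCell> out = new ArrayList<>();
        long totalSize = 0;
        for (SimpleCell cell : cells) {
            if (limit > 0 && out.size() >= limit) {
                break;
            }
            out.add(cell);
            if (maxResultSize > 0) {
                // Only pay for the estimate when the size limit is enabled.
                totalSize += cell.estimatedSize();
                if (totalSize >= maxResultSize) {
                    break;
                }
            }
        }
        return out;
    }

    /** Builds rows shaped like the test data: rowkey-#index# / value-#index#. */
    static List<SimpleCell> sampleCells(int n) {
        List<SimpleCell> cells = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            cells.add(new SimpleCell("rowkey-" + i, "q", "value-" + i));
        }
        return cells;
    }

    public static void main(String[] args) {
        List<SimpleCell> cells = sampleCells(1000);
        System.out.println("size-limited (100 bytes): " + collect(cells, 100, 0).size());
        System.out.println("count-limited (50 cells): " + collect(cells, 0, 50).size());
    }
}
```

In this sketch the count-limited call never touches `estimatedSize()`, which is the saving the proposal is after. Whether skipping the estimate is safe in the real scanner depends on what else relies on it, which the sketch does not model.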
