Hello HBase community,
We recently switched to HBase 2.2.6 and have noticed that scans are very slow.
When we scan a small amount of data (e.g. 100k or 200k) we do not encounter
any problems, but when the amount of data reaches 1 million the scans become
very slow.

For the scans we basically set startRow and endRow and apply different filters.
Several threads continuously request batches of 1000 rows. To get the 1000
rows we keep a counter while calling next(), and when it reaches 1000 we close
the scan with an InterruptedException. This did not give us any problems in
HBase 0.94, where we had good performance.
In HBase 2 we saw that there is a setLimit(int) option that tells the
regionserver how many rows the client wants. I also see that it is possible to
set a readType, which can be PREAD or STREAM.
- Do you think that setting this option can lead to better scan performance?
- What is the difference between PREAD and STREAM?
- In which cases does it make sense to use PREAD or STREAM?
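
To make the question concrete, here is a minimal sketch of the kind of scan we have in mind, assuming the hbase-client 2.x API. The class name, method names and the choice of PREAD are placeholders for illustration only, not our production code and not a recommendation. It needs the HBase client library on the classpath and a reachable cluster to run.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;

// Hypothetical helper class; names are illustrative only.
public class LimitedScanSketch {

    // Batch size described above: each thread wants 1000 rows.
    static final int BATCH_SIZE = 1000;

    // Builds a scan bounded by start/stop row that asks for at most
    // BATCH_SIZE rows, instead of counting rows on the client side.
    static Scan buildBatchScan(byte[] startRow, byte[] stopRow) {
        return new Scan()
            .withStartRow(startRow)            // inclusive by default
            .withStopRow(stopRow)              // exclusive by default
            .setLimit(BATCH_SIZE)              // scan ends after this many rows
            .setReadType(Scan.ReadType.PREAD); // or STREAM; which one is our question
        // Filters would be attached here with setFilter(...), as we do today.
    }

    // Reads one batch and closes the scanner through try-with-resources,
    // so no exception is needed to stop the scan.
    static List<Result> fetchBatch(Connection conn, TableName tableName,
                                   byte[] startRow, byte[] stopRow) throws IOException {
        List<Result> rows = new ArrayList<>(BATCH_SIZE);
        try (Table table = conn.getTable(tableName);
             ResultScanner scanner = table.getScanner(buildBatchScan(startRow, stopRow))) {
            for (Result result : scanner) {
                rows.add(result);
            }
        }
        return rows;
    }
}
```

With setLimit the loop simply ends once the limit is reached, which would replace our current counter-and-exception approach.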
We have already done some HBase server-side tuning, but we still cannot get
good scan performance. When we start working with large amounts of data, we
start to see a lot of server-side "responseTooSlow" warnings, like:

2021-10-28 16:45:00,854 WARN
[RpcServer.default.FPBQ.Fifo.handler=46,queue=1,port=16020]
ipc.RpcServer: (responseTooSlow):
{"call":"Scan(org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ScanRequest)",
"starttimems":"1635432272849","responsesize":"221799","method":"Scan",
"param":"scanner_id: 3011016724423115474 number_of_rows: 1000
close_scanner: false next_call_seq: 0 client_handles_partials: true
client_handles_heartbeats: tr\u003cTRUNCATED\u003e",
"processingtimems":28005,"client":"10.200.86.173:60806",
"queuetimems":0,"class":"HRegionServer",
"scandetails":"table: mn1_7491_hinvio region: mn1_7491_hinvio .....}

Thanks,
Hamado Dene
