Anoop Sam John created HBASE-25582:
--------------------------------------
Summary: Support setting scan ReadType to be STREAM at cluster
level
Key: HBASE-25582
URL: https://issues.apache.org/jira/browse/HBASE-25582
Project: HBase
Issue Type: Improvement
Reporter: Anoop Sam John
Assignee: Anoop Sam John
The cluster-level config 'hbase.storescanner.use.pread' sets the ReadType to
PREAD when it is not explicitly specified in the Scan object.
In the same way, we should be able to make scans use the STREAM type at the
cluster level (when it is not specified at the Scan object level).
We do not need any new configs. The existing config
'hbase.storescanner.pread.max.bytes' specifies when to switch the read type
to stream, and it defaults to 4 * HFile block size. If this value is configured
as <= 0, it means the user wants the switch to happen when the scanner is
created. With such handling we can support cluster-level STREAM scans.
That way, individual scans need not set the read type.
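As a rough illustration of the handling proposed above, here is a minimal,
self-contained sketch. It is not the actual HBase code: the class
ReadTypeSketch, the method initialReadType, and the local ReadType enum are
hypothetical stand-ins, the 64 KB block size is an assumed value, and the
choice to let 'hbase.storescanner.use.pread' take precedence over a
non-positive 'hbase.storescanner.pread.max.bytes' is an assumption that the
description does not spell out.

```java
// Hypothetical sketch of the proposed selection logic; not HBase source code.
final class ReadTypeSketch {

    // Local stand-in for the read types named in this issue.
    enum ReadType { DEFAULT, STREAM, PREAD }

    // Assumed HFile block size, used only to illustrate the default threshold.
    static final long ASSUMED_BLOCK_SIZE = 64L * 1024;

    // The description states the threshold defaults to 4 * HFile block size.
    static long defaultPreadMaxBytes(long blockSize) {
        return 4 * blockSize;
    }

    /**
     * Picks the read type a scanner would start with.
     *
     * @param requested     read type set on the scan (DEFAULT if unset)
     * @param usePread      value of 'hbase.storescanner.use.pread'
     * @param preadMaxBytes value of 'hbase.storescanner.pread.max.bytes'
     */
    static ReadType initialReadType(ReadType requested, boolean usePread, long preadMaxBytes) {
        if (requested != ReadType.DEFAULT) {
            return requested;            // an explicit per-scan setting always wins
        }
        if (usePread) {
            return ReadType.PREAD;       // assumption: use.pread is checked first
        }
        if (preadMaxBytes <= 0) {
            return ReadType.STREAM;      // proposed: switch at scanner creation
        }
        return ReadType.DEFAULT;         // otherwise switch later, at the threshold
    }

    public static void main(String[] args) {
        long threshold = defaultPreadMaxBytes(ASSUMED_BLOCK_SIZE);
        System.out.println(initialReadType(ReadType.DEFAULT, false, 0));         // STREAM
        System.out.println(initialReadType(ReadType.DEFAULT, false, threshold)); // DEFAULT
        System.out.println(initialReadType(ReadType.PREAD, false, 0));           // PREAD
    }
}
```

With this shape, a cluster operator would only set the existing threshold
config to 0 (or a negative value), and scans that leave the read type unset
would start as stream reads.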
The motivation is that on cloud-storage-based systems, stream reads might
perform better. We introduced the PRead-based scan based on tests on HDFS-based
storage. In my customer's case, Azure storage is in place and the WASB driver
is used. There is a read-ahead mechanism there (an entire block of a blob is
read in one REST call and buffered in the WASB driver), which helps a lot with
longer scans. Yes, with the config 'hbase.storescanner.pread.max.bytes' we can
make the switch happen early, but it is better to go the 1.x way, where the
scan starts with a stream read from the beginning.