Anoop Sam John created HBASE-25582:
--------------------------------------

             Summary: Support setting scan ReadType to be STREAM at cluster level
                 Key: HBASE-25582
                 URL: https://issues.apache.org/jira/browse/HBASE-25582
             Project: HBase
          Issue Type: Improvement
            Reporter: Anoop Sam John
            Assignee: Anoop Sam John


We have the cluster-level config 'hbase.storescanner.use.pread', which sets the 
ReadType to PREAD when it is not explicitly specified in the Scan object.
In the same way, we can provide a way to make scans use the STREAM type at 
cluster level (when not specified at the Scan object level).
We do not need any new configs for this.  We already have the config 
'hbase.storescanner.pread.max.bytes', which specifies when to switch the read 
type to stream; it defaults to 4 * HFile block size.  If one configures this 
value as <= 0, it means the user wants the switch to happen when the scanner 
is created.  With such handling we can support it, so individual scans need 
not set the read type.
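
A minimal, standalone sketch of the proposed decision follows.  It is not the 
actual StoreScanner code: the class ReadTypeSketch, the method resolve and its 
parameters are hypothetical names used only for illustration, and giving 
'hbase.storescanner.use.pread' precedence over the <= 0 check is an assumption, 
not something decided in this issue.

```java
// Hypothetical sketch; does not depend on HBase classes.
final class ReadTypeSketch {

    // Mirrors the three read types a Scan can carry.
    enum ReadType { DEFAULT, STREAM, PREAD }

    /**
     * Resolves the read type a scanner would start with.
     *
     * @param requested     read type set on the Scan (DEFAULT if unset)
     * @param usePread      value of 'hbase.storescanner.use.pread'
     * @param preadMaxBytes value of 'hbase.storescanner.pread.max.bytes'
     */
    static ReadType resolve(ReadType requested, boolean usePread, long preadMaxBytes) {
        // An explicit choice on the Scan object always wins.
        if (requested != ReadType.DEFAULT) {
            return requested;
        }
        // Existing cluster-level switch to pread (precedence is an assumption).
        if (usePread) {
            return ReadType.PREAD;
        }
        // Proposed handling: <= 0 means "switch to stream at scanner creation".
        if (preadMaxBytes <= 0) {
            return ReadType.STREAM;
        }
        // Otherwise keep the current behaviour: start with pread and switch
        // to stream once preadMaxBytes have been read.
        return ReadType.DEFAULT;
    }

    public static void main(String[] args) {
        System.out.println(resolve(ReadType.DEFAULT, false, 0L));
        System.out.println(resolve(ReadType.DEFAULT, false, 4L * 65536));
        System.out.println(resolve(ReadType.PREAD, false, 0L));
    }
}
```

In this sketch, a scan that sets its own read type is unaffected by the 
cluster-level value; only scans left at DEFAULT pick up STREAM.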

The motivation is that on cloud-storage-based systems, stream reads might 
perform better.  We introduced the PRead-based scan with tests on HDFS-based 
storage.  In my customer's case, Azure storage is in place and the WASB driver 
is used.  It has a read-ahead mechanism (it reads an entire block of a blob in 
one REST call) and buffers that in the WASB driver.  This helps a lot with 
longer scans.  Yes, with the config 'hbase.storescanner.pread.max.bytes' we can 
make the switch happen early, but it is better to go with the 1.x way, where 
the scan starts with a stream read.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
