[ 
https://issues.apache.org/jira/browse/HBASE-26239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17407061#comment-17407061
 ] 

Yuefeng Wang commented on HBASE-26239:
--------------------------------------

ClientScanTest.java is attached. I tested against the 'lineitem' table 
generated by TPC-H (at the 100 GB scale). Without the call 
"scan.setFilter(filterList);" on line 72, I get the correct result; with it, 
the result is incorrect. While debugging the read path, I found that the 
"next" method in StoreScanner.java enforces a PREAD read size limit, which 
causes the scan to terminate unexpectedly.
{code:java}
if (this.scanUsePread && this.scan.getReadType() == ReadType.DEFAULT
        && this.bytesRead > this.preadMaxBytes) {
    scannerContext.returnImmediately();
}
{code}
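
To make the concern concrete, here is a small self-contained toy model. It is 
not HBase code, and every name in it (EarlyReturnScanModel, ToyScanner, Batch, 
moreResults, the 64-byte budget) is hypothetical. It only illustrates the 
general hazard: if a scanner may return early once a byte budget is exceeded, 
and a filter has dropped every row read so far, the caller receives an empty 
batch that does not mean "end of scan". Whether any real HBase code path 
treats such a batch as the end is exactly the open question here; the sketch 
does not claim that it does.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (NOT HBase code) of a scanner that may return early with no rows. */
class EarlyReturnScanModel {

    /** One response from the toy scanner: matching rows plus a "more results" flag. */
    static final class Batch {
        final List<String> rows;
        final boolean moreResults;

        Batch(List<String> rows, boolean moreResults) {
            this.rows = rows;
            this.moreResults = moreResults;
        }
    }

    /** Toy scanner: keeps rows starting with "keep" and stops a call once a byte budget is exceeded. */
    static final class ToyScanner {
        final List<String> data;
        final long maxBytesPerCall; // hypothetical stand-in for a per-call read limit
        int pos = 0;

        ToyScanner(List<String> data, long maxBytesPerCall) {
            this.data = data;
            this.maxBytesPerCall = maxBytesPerCall;
        }

        Batch next(int limit) {
            List<String> out = new ArrayList<>();
            long bytesRead = 0;
            while (pos < data.size() && out.size() < limit) {
                String row = data.get(pos++);
                bytesRead += row.length();      // filtered-out rows still count as bytes read
                if (row.startsWith("keep")) {   // the "filter"
                    out.add(row);
                }
                if (bytesRead > maxBytesPerCall) {
                    break;                      // early return, possibly with an empty batch
                }
            }
            return new Batch(out, pos < data.size());
        }
    }

    /** Incorrect caller: treats an empty batch as the end of the scan. */
    static int countStoppingOnEmpty(ToyScanner scanner) {
        int count = 0;
        while (true) {
            Batch batch = scanner.next(100);
            if (batch.rows.isEmpty()) {
                return count;
            }
            count += batch.rows.size();
        }
    }

    /** Correct caller: continues until the scanner reports that nothing is left. */
    static int countUntilExhausted(ToyScanner scanner) {
        int count = 0;
        Batch batch;
        do {
            batch = scanner.next(100);
            count += batch.rows.size();
        } while (batch.moreResults);
        return count;
    }

    /** 50 rows the filter drops, followed by 5 rows it keeps. */
    static List<String> sampleData() {
        List<String> data = new ArrayList<>();
        for (int i = 0; i < 50; i++) data.add("drop-" + i);
        for (int i = 0; i < 5; i++) data.add("keep-" + i);
        return data;
    }

    public static void main(String[] args) {
        System.out.println(countStoppingOnEmpty(new ToyScanner(sampleData(), 64)));
        System.out.println(countUntilExhausted(new ToyScanner(sampleData(), 64)));
    }
}
```

In this model the first caller stops at the first empty batch and undercounts, 
while the second caller relies on the explicit flag and sees every matching 
row.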
 

 

> Return incorrect result for a scan rpc call when readType switch from pread 
> to stream
> -------------------------------------------------------------------------------------
>
>                 Key: HBASE-26239
>                 URL: https://issues.apache.org/jira/browse/HBASE-26239
>             Project: HBase
>          Issue Type: Bug
>          Components: scan, Scanners
>            Reporter: Yuefeng Wang
>            Priority: Major
>         Attachments: ClientScanTest.java, image-2021-08-31-10-16-37-417.png
>
>
> Scan's default readType is PREAD, and a long-running scan switches its 
> readType from PREAD to STREAM. For example, when I ask for the result of 
> 'count table', the call now returns immediately, before the RPC has read any 
> real data, because of this switch, so the returned result may be incorrect. 
> We can explicitly use STREAM instead of PREAD to avoid the switch, but I am 
> still confused: how can we tell whether the switch has happened? How can I 
> trust that my result is correct when the switch is allowed to occur?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
