[
https://issues.apache.org/jira/browse/HBASE-26239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17407061#comment-17407061
]
Yuefeng Wang commented on HBASE-26239:
--------------------------------------
ClientScanTest.java is attached. I focused on the 'lineitem' table generated by
TPC-H (at 100 GB scale). Without the "scan.setFilter(filterList);" call on line
72, I get the correct result; with it, the result is incorrect. I debugged the
read path and found a PREAD read size limit in the "next" method of
StoreScanner.java, which causes the scan task to terminate unexpectedly.
{code:java}
if (this.scanUsePread && this.scan.getReadType() == ReadType.DEFAULT
    && this.bytesRead > this.preadMaxBytes) {
  scannerContext.returnImmediately();
}
{code}
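To make the effect of that check easier to discuss, here is a minimal toy model of it. This is not HBase code: PreadLimitSketch, ToyScanner, Batch and countAll are hypothetical names, and the assumption that reads continue in stream mode after the early return is taken only from the issue title. It shows one "next" call stopping early once the byte budget is exceeded, leaving cells unread.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model, NOT HBase code: every class and method name here is hypothetical. */
public class PreadLimitSketch {

    /** Result of one next() call: the cell sizes returned and whether the call stopped early. */
    static final class Batch {
        final List<Integer> cells;
        final boolean returnedEarly;

        Batch(List<Integer> cells, boolean returnedEarly) {
            this.cells = cells;
            this.returnedEarly = returnedEarly;
        }
    }

    static final class ToyScanner {
        private final int[] cellSizes;     // size in bytes of each cell to scan
        private final long preadMaxBytes;  // byte budget while in pread mode
        private long bytesRead = 0;
        private int pos = 0;
        private boolean usePread = true;

        ToyScanner(int[] cellSizes, long preadMaxBytes) {
            this.cellSizes = cellSizes;
            this.preadMaxBytes = preadMaxBytes;
        }

        boolean hasMore() {
            return pos < cellSizes.length;
        }

        /** Reads up to batchLimit cells, returning early once the pread budget is exceeded. */
        Batch next(int batchLimit) {
            List<Integer> out = new ArrayList<>();
            while (pos < cellSizes.length && out.size() < batchLimit) {
                int size = cellSizes[pos++];
                out.add(size);
                bytesRead += size;
                // Same shape as the quoted condition: over budget while still in pread mode.
                if (usePread && bytesRead > preadMaxBytes) {
                    usePread = false; // assumption: later reads use stream mode
                    return new Batch(out, true);
                }
            }
            return new Batch(out, false);
        }
    }

    /** Counts the remaining cells by calling next() until the scanner is exhausted. */
    static int countAll(ToyScanner scanner, int batchLimit) {
        int count = 0;
        while (scanner.hasMore()) {
            count += scanner.next(batchLimit).cells.size();
        }
        return count;
    }

    public static void main(String[] args) {
        ToyScanner scanner = new ToyScanner(new int[] {40, 40, 40, 40, 40}, 100);
        Batch first = scanner.next(10);
        System.out.println(first.cells.size() + " cells, early=" + first.returnedEarly);
        System.out.println("remaining: " + countAll(scanner, 10));
    }
}
```

In this model, a caller that treats the first early batch as the complete answer undercounts, while one that keeps calling next() until hasMore() is false sees every cell. Whether that is what goes wrong in the real scan path is exactly the open question in this issue.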
> Return incorrect result for a scan rpc call when readType switch from pread
> to stream
> -------------------------------------------------------------------------------------
>
> Key: HBASE-26239
> URL: https://issues.apache.org/jira/browse/HBASE-26239
> Project: HBase
> Issue Type: Bug
> Components: scan, Scanners
> Reporter: Yuefeng Wang
> Priority: Major
> Attachments: ClientScanTest.java, image-2021-08-31-10-16-37-417.png
>
>
> A Scan with the default readType starts in PREAD mode, and the read type is
> switched from PREAD to STREAM during a long scan. For example, when I ask for
> the result of 'count table', the RPC now returns immediately, before it has
> read the real data, because of this switch, so the returned result may be
> incorrect. We can explicitly use STREAM instead of PREAD to avoid the switch,
> but I am still confused: how can we tell whether the switch has happened? How
> can I trust that my result is correct when the switch is allowed to occur?