[
https://issues.apache.org/jira/browse/HBASE-27402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17805193#comment-17805193
]
Duo Zhang commented on HBASE-27402:
-----------------------------------
Shall we open a discussion thread on the dev mailing list about this change? I
agree that avoiding data loss is more important. Maybe we could also introduce
a dirty hack so that Scan.getScanMetrics still works: when cloning the Scan
object, keep a reference to the original Scan object, and when setScanMetrics
is called, set the metrics on both?
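
To make the idea concrete, here is a rough sketch of the "set both" approach.
It is only an illustration under stated assumptions: SketchScan and
SketchMetrics are hypothetical stand-ins, not the real HBase classes, and
cloneForScanner is an invented name. The clone keeps a back-reference to the
object the user passed in, so scanner-internal state such as the mvcc read
point stays on the clone while the metrics remain visible through the original.

```java
import java.util.HashMap;
import java.util.Map;

class ScanCloneSketch {
    /** Hypothetical stand-in for ScanMetrics; not the HBase class. */
    static final class SketchMetrics {
        final Map<String, Long> counters = new HashMap<>();
    }

    /** Hypothetical stand-in for Scan, holding only the state relevant here. */
    static final class SketchScan {
        private SketchMetrics metrics;
        private long mvccReadPoint = -1L;   // -1 means "not set"
        private final SketchScan original;  // null unless this instance is a clone

        SketchScan() {
            this.original = null;
        }

        private SketchScan(SketchScan original) {
            this.original = original;
        }

        /** Assumed helper: the clone remembers the object it was created from. */
        SketchScan cloneForScanner() {
            return new SketchScan(this);
        }

        /** Scanner-internal state is written to this instance only. */
        void setMvccReadPoint(long readPoint) {
            this.mvccReadPoint = readPoint;
        }

        long getMvccReadPoint() {
            return mvccReadPoint;
        }

        /** The "set both" hack: mirror the metrics onto the original, if any. */
        void setScanMetrics(SketchMetrics m) {
            this.metrics = m;
            if (original != null) {
                original.metrics = m;
            }
        }

        SketchMetrics getScanMetrics() {
            return metrics;
        }
    }

    public static void main(String[] args) {
        SketchScan userScan = new SketchScan();
        SketchScan internal = userScan.cloneForScanner();

        internal.setMvccReadPoint(42L);      // does not touch userScan
        SketchMetrics m = new SketchMetrics();
        m.counters.put("rpcCalls", 3L);
        internal.setScanMetrics(m);          // visible through userScan as well

        System.out.println(userScan.getScanMetrics() == m);  // true
        System.out.println(userScan.getMvccReadPoint());     // -1
    }
}
```

The trade-off is that the original object is still mutated, but only in its
metrics field, so re-using it for a second scan would not carry over a stale
read point.
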
Thanks.
> Clone Scan in ClientScanner to avoid errors with Scan re-used
> -------------------------------------------------------------
>
> Key: HBASE-27402
> URL: https://issues.apache.org/jira/browse/HBASE-27402
> Project: HBase
> Issue Type: Improvement
> Reporter: Bryan Beaudreault
> Priority: Major
> Labels: patch-available
>
> This has come up before in https://issues.apache.org/jira/browse/HBASE-1774
> and https://issues.apache.org/jira/browse/HBASE-4891. The major pushback was
> around ScanMetrics, which relied on sharing a mutable Scan object.
> Since https://issues.apache.org/jira/browse/HBASE-17584, ScanMetrics are
> available on ResultScanner and the method on Scan was deprecated (removed in
> master).
> I think this issue became pretty urgent in
> https://issues.apache.org/jira/browse/HBASE-17167, when we started passing
> mvcc into the Scan object. If a user unknowingly reuses the Scan object, this
> can seem like data loss since the Scan will return none of the expected data.
> We recently hit this in our upgrade from hbase client 1.2 to 2.4.6, where
> use-cases that had worked in 1.2 suddenly started returning no results in
> 2.4.6. It's very hard to debug.
> I suggest that we now add the clone in master branch. For branch-2, I think
> we could put it behind a config param to preserve backwards compatibility of
> Scan.getScanMetrics. If the config param is enabled, scan cloning occurs and
> Scan.getScanMetrics will be inaccurate. Personally I think this is a far
> better scenario, because accurate results are more important than accurate
> metrics. But we can leave it to the user to decide, and provide a release note.
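
The branch-2 proposal above (clone only when a config param is enabled) could
look roughly like the sketch below. Everything in it is an assumption made for
illustration: the key name "sketch.client.scanner.clone.scan" is invented (the
issue does not name the parameter), the default of "false" is a guess at what
"preserve backwards compatibility" implies, and SketchScan is a hypothetical
stand-in rather than the HBase Scan class.

```java
import java.util.Properties;

class ConfigGatedCloneSketch {
    // Hypothetical config key; the issue does not name the actual parameter.
    static final String CLONE_KEY = "sketch.client.scanner.clone.scan";

    /** Hypothetical stand-in for Scan with a single piece of mutable state. */
    static final class SketchScan {
        long mvccReadPoint = -1L;  // -1 means "not set"

        SketchScan() {
        }

        SketchScan(SketchScan other) {
            this.mvccReadPoint = other.mvccReadPoint;
        }
    }

    /** Returns the object the scanner is allowed to mutate. */
    static SketchScan scanForScanner(SketchScan userScan, Properties conf) {
        // Assumed default: cloning is off unless explicitly enabled.
        boolean clone = Boolean.parseBoolean(conf.getProperty(CLONE_KEY, "false"));
        return clone ? new SketchScan(userScan) : userScan;
    }

    /** Simulates a scanner recording a read point on the scan it was given. */
    static void runScanner(SketchScan scan, long readPoint) {
        scan.mvccReadPoint = readPoint;
    }

    public static void main(String[] args) {
        // Cloning disabled: the user's object is mutated by the scanner.
        Properties legacy = new Properties();
        SketchScan reused = new SketchScan();
        runScanner(scanForScanner(reused, legacy), 100L);
        System.out.println(reused.mvccReadPoint);  // 100

        // Cloning enabled: the user's object is left untouched.
        Properties cloning = new Properties();
        cloning.setProperty(CLONE_KEY, "true");
        SketchScan fresh = new SketchScan();
        runScanner(scanForScanner(fresh, cloning), 100L);
        System.out.println(fresh.mvccReadPoint);   // -1
    }
}
```

With cloning enabled, any state the scanner writes (including metrics in this
simplified model) lands on the copy, which is why Scan.getScanMetrics on the
user's object would no longer be accurate in that mode.
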