[
https://issues.apache.org/jira/browse/PHOENIX-1146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14086377#comment-14086377
]
Lars Hofhansl commented on PHOENIX-1146:
----------------------------------------
I see. Thanks James.
It may not always be possible to detect this *before* the scan starts,
as regions can move while the scan is running.
We still need to look at the interrupt issues.
> Detect stale client region cache on server and retry scans in split regions
> ---------------------------------------------------------------------------
>
> Key: PHOENIX-1146
> URL: https://issues.apache.org/jira/browse/PHOENIX-1146
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 5.0.0, 3.1, 4.1
> Reporter: James Taylor
> Assignee: James Taylor
>
> HBase cannot recover correctly from an aggregate scan run on the coprocessor
> side (see HBASE-116670). This can lead to incorrect query results the first
> time a query is run after a split occurs (due to the region boundary cache
> being stale). Phoenix can work around this by:
> - detecting on the server before the scan starts that the region cache used by
> the client is out-of-date. This can be done up-front because the start/stop
> row of the scan should never span across a region boundary. In this case, a
> DoNotRetryIOException is thrown with some embedded information to cause a
> StaleRegionBoundaryCacheException to be thrown on the client.
> - catching this exception on the client (in ParallelIterators), refreshing
> the region boundary cache, and re-running the necessary scans based on the
> new region boundaries.
> - detecting if this happens more than N times to prevent any kind of
> excessive looping due to splits occurring over and over again.
> Phoenix has additional requirements above and beyond standard HBase clients,
> so even if HBase could recover from this situation, Phoenix would likely need
> this workaround to ensure that a scan does not span across region boundaries.
> This is required when the client is doing a merge sort on the results of the
> parallel scans, mainly in ORDER BY (including topN) and local indexing, and
> potentially GROUP BY if we move toward sorting the distinct groups on the
> server side.
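
The three steps in the description above can be sketched roughly as follows.
This is only an illustration under stated assumptions, not the Phoenix
implementation: the class StaleCacheRetrySketch, the helpers compareKeys,
scanWithinRegion and runWithRetry, and the local
StaleRegionBoundaryCacheException stand-in are all hypothetical names. An
empty byte array is assumed to mean an unbounded start or end key, following
the usual HBase convention.

```java
import java.util.function.Supplier;

public class StaleCacheRetrySketch {

    /** Hypothetical stand-in for the client-side exception named in the issue. */
    static class StaleRegionBoundaryCacheException extends RuntimeException {
        StaleRegionBoundaryCacheException(String message) { super(message); }
    }

    /** Unsigned lexicographic comparison of two row keys. */
    static int compareKeys(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) return diff;
        }
        return a.length - b.length;
    }

    /**
     * Server-side check (assumed form): the scan range [scanStart, scanStop)
     * must lie inside the region range [regionStart, regionEnd).
     * An empty array means "unbounded" on that side.
     */
    static boolean scanWithinRegion(byte[] scanStart, byte[] scanStop,
                                    byte[] regionStart, byte[] regionEnd) {
        boolean startOk = regionStart.length == 0
                || (scanStart.length != 0 && compareKeys(scanStart, regionStart) >= 0);
        boolean stopOk = regionEnd.length == 0
                || (scanStop.length != 0 && compareKeys(scanStop, regionEnd) <= 0);
        return startOk && stopOk;
    }

    /**
     * Client-side loop (assumed form): run the scans, and on a stale-cache
     * error refresh the boundary cache and try again, at most maxRetries times.
     */
    static <T> T runWithRetry(Supplier<T> runScans, Runnable refreshBoundaryCache,
                              int maxRetries) {
        int retries = 0;
        while (true) {
            try {
                return runScans.get();
            } catch (StaleRegionBoundaryCacheException e) {
                if (retries >= maxRetries) {
                    throw e; // give up: splits keep happening
                }
                retries++;
                refreshBoundaryCache.run();
            }
        }
    }
}
```

With maxRetries set to N, the scans run at most N + 1 times before the
exception is propagated to the caller, which bounds the looping mentioned in
the third step.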