[ https://issues.apache.org/jira/browse/HBASE-11667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14086370#comment-14086370 ]
Andrew Purtell commented on HBASE-11667:
----------------------------------------
bq. The client cannot track what the server is doing unless the server tells it
what it did (i.e. how far it got with its scan). I don't think we can recover
if there is no way to know which state to recover to. All the client can know
without help from the server (for this particular scan) is the start key of the
last region. I tried to use only that information, but it turns out that does
not work. Interesting problem :)
The problem with your patch was the client ended up handing duplicate rows to
the application layer because we removed a kludge. The client is in the
position of observing the Results it is passing up. I wonder if it is possible
to detect the 'skipFirst' condition and handle it without relying on the
current server response, since we see at least in this case a coprocessor can
cause the server to lie. (Of course, we can always declare that to be an
invalid thing to do.) I am not looking at code when saying this, just thinking
out loud.
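
To make the idea concrete, here is a rough sketch of what client-side detection could look like. It is not the actual {{ClientScanner}} code: the class, its fields, and its methods are all hypothetical, and rows are reduced to bare row keys. The assumption is that the client remembers the last row key it handed to the application and, after reopening the scanner, drops the first returned row only if it has that same key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not HBase code: client-side duplicate suppression
// based only on the rows the client has already passed to the application.
class SkipFirstSketch {
    private byte[] lastRowHandedUp; // last row key given to the application
    private boolean restarted;      // true right after the scanner was reopened

    // Call when the scanner has been reopened (e.g. after an NSRE).
    void markRestarted() {
        restarted = true;
    }

    // Returns the rows of this batch that the application has not seen yet.
    List<byte[]> filter(List<byte[]> batch) {
        List<byte[]> out = new ArrayList<>();
        for (byte[] row : batch) {
            boolean duplicate = restarted
                    && lastRowHandedUp != null
                    && Arrays.equals(row, lastRowHandedUp);
            restarted = false; // only the first row after a restart is checked
            if (duplicate) {
                continue;      // already handed up before the restart
            }
            out.add(row);
            lastRowHandedUp = row;
        }
        return out;
    }
}
```

This only works if the row keys the client sees are real ones. With a coprocessor returning a fake row key, the comparison is made against that fake key, which is the case that would need separate handling or would have to be declared invalid.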
> Comment ClientScanner logic for NSREs.
> --------------------------------------
>
> Key: HBASE-11667
> URL: https://issues.apache.org/jira/browse/HBASE-11667
> Project: HBase
> Issue Type: Improvement
> Reporter: Lars Hofhansl
> Assignee: Lars Hofhansl
> Priority: Minor
> Fix For: 0.99.0, 2.0.0, 0.94.23, 0.98.6
>
> Attachments: 11667-0.94.txt, 11667-doc-0.94.txt, 11667-trunk.txt,
> HBASE-11667-0.98.patch, IntegrationTestBigLinkedListWithRegionMovement.patch
>
>
> We ran into an issue with Phoenix where a RegionObserver coprocessor
> intercepts a scan and returns an aggregate (in this case a count) with a fake
> row key. It turns out this does not work when the {{ClientScanner}}
> encounters NSREs (NotServingRegionExceptions), as it uses the last key it saw
> to reset the scanner and try again (which in this case would be the fake key).
> This is arguably a rare case, and one could also argue that a region
> observer just shouldn't do this. Still, while looking at {{ClientScanner}}'s
> code I found this logic to be unnecessary.
> An NSRE occurs because we contacted a region server with a key that it no
> longer hosts. This is the start key, so it is always correct to retry with
> this same key. That simplifies the {{ClientScanner}} logic and also makes
> this sort of coprocessor possible.
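
The retry rule described in the issue can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch itself: {{RegionMovedException}} is a hypothetical stand-in for NotServingRegionException, and {{openWithRetry}} and its {{open}} callback are invented names. The point it shows is that the retry reuses the start key the request was sent with, not any row key taken from a server response.

```java
import java.util.function.Function;

// Hypothetical sketch, not HBase code: retry a scanner open with the same
// start key when the contacted server no longer hosts the region.
class RetrySameStartKeySketch {
    // Stand-in for HBase's NotServingRegionException (assumption).
    static class RegionMovedException extends RuntimeException {}

    // 'open' is a hypothetical callback that opens a scanner at startKey.
    static <T> T openWithRetry(String startKey, int maxAttempts,
                               Function<String, T> open) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be positive");
        }
        RegionMovedException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                // Always the key we asked for, never a key from a result.
                return open.apply(startKey);
            } catch (RegionMovedException e) {
                last = e; // region moved or split; try again with the same key
            }
        }
        throw last;
    }
}
```

A real client would presumably also refresh its cached region location between attempts; that step is left out here.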