[ 
https://issues.apache.org/jira/browse/HBASE-16132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15356481#comment-15356481
 ] 

Yu Li commented on HBASE-16132:
-------------------------------

Maybe we could add a unit test, following the sanity-test approach, in a 
separate JIRA to cover this case, and let the patch here go in first, since 
the logic error is straightforward and will cause real problems under heavy 
load. Thoughts? Thanks.

By the way, the change here is already deployed in our production 
environment and has run fine so far.

> Scan does not return all the result when regionserver is busy
> -------------------------------------------------------------
>
>                 Key: HBASE-16132
>                 URL: https://issues.apache.org/jira/browse/HBASE-16132
>             Project: HBase
>          Issue Type: Bug
>            Reporter: binlijin
>            Assignee: binlijin
>         Attachments: HBASE-16132.patch, HBASE-16132_v2.patch, 
> HBASE-16132_v3.patch, HBASE-16132_v3.patch, TestScanMissingData.java
>
>
> We have found a corner case: when a regionserver stays busy for a long 
> time, some scanners may return null even though they have not scanned all 
> the data.
> We found a case that ScannerCallableWithReplicas does not handle 
> correctly: when cs.poll times out without returning any result, the method 
> returns a null result, so the scan gets null and ends. 
>  {code}
>     try {
>       Future<Pair<Result[], ScannerCallable>> f = cs.poll(timeout, TimeUnit.MILLISECONDS);
>       if (f != null) {
>         Pair<Result[], ScannerCallable> r = f.get(timeout, TimeUnit.MILLISECONDS);
>         if (r != null && r.getSecond() != null) {
>           updateCurrentlyServingReplica(r.getSecond(), r.getFirst(), done, pool);
>         }
>         return r == null ? null : r.getFirst(); // great we got an answer
>       }
>     } catch (ExecutionException e) {
>       RpcRetryingCallerWithReadReplicas.throwEnrichedException(e, retries);
>     } catch (CancellationException e) {
>       throw new InterruptedIOException(e.getMessage());
>     } catch (InterruptedException e) {
>       throw new InterruptedIOException(e.getMessage());
>     } catch (TimeoutException e) {
>       throw new InterruptedIOException(e.getMessage());
>     } finally {
>       // We get there because we were interrupted or because one or more of the
>       // calls succeeded or failed. In all case, we stop all our tasks.
>       cs.cancelAll();
>     }
>     return null; // unreachable
>  {code}
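
To make the failure mode above concrete, here is a minimal, self-contained 
sketch using only java.util.concurrent. It is not the attached patch and 
does not use any HBase classes. The class name PollTimeoutSketch and the 
helper pollOrThrow are hypothetical names chosen for illustration. The 
sketch assumes that a timed-out poll() should be reported as an error 
instead of being turned into a null result, which a caller could mistake 
for the end of the scan.

```java
import java.io.IOException;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class PollTimeoutSketch {

  // Hypothetical helper: returns the first completed result, or throws if
  // no task completes within timeoutMs. A timeout never becomes a null return.
  public static <T> T pollOrThrow(CompletionService<T> cs, long timeoutMs)
      throws IOException, InterruptedException, ExecutionException {
    Future<T> f = cs.poll(timeoutMs, TimeUnit.MILLISECONDS);
    if (f == null) {
      // poll() timed out: nothing finished, so this is not "end of data".
      throw new IOException("No result within " + timeoutMs + " ms");
    }
    return f.get(); // the future is already complete, so this does not block
  }

  public static void main(String[] args) throws Exception {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    try {
      CompletionService<String> cs = new ExecutorCompletionService<>(pool);
      // Simulate a busy server: the task takes longer than the poll timeout.
      cs.submit(() -> { Thread.sleep(500); return "row"; });
      try {
        pollOrThrow(cs, 50);
        System.out.println("got a result");
      } catch (IOException e) {
        System.out.println("timeout surfaced: " + e.getMessage());
      }
    } finally {
      pool.shutdownNow(); // interrupt the still-sleeping task
    }
  }
}
```

With this shape the caller can retry or fail the scan on a timeout, and a 
null result stays reserved for whatever it already meant to the caller.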



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
