[ 
https://issues.apache.org/jira/browse/HBASE-15325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15168664#comment-15168664
 ] 

Jianwei Cui commented on HBASE-15325:
-------------------------------------

When a user sets batch on a scan, the client may also return a partial row to 
the application and hit this problem if the region moves. The reason is that the 
server decides whether a result is partial as follows:
{code}
  boolean partialResultFormed() {
    return scannerState == NextState.SIZE_LIMIT_REACHED_MID_ROW
        || scannerState == NextState.TIME_LIMIT_REACHED_MID_ROW;
  }
{code}
NextState.BATCH_LIMIT_REACHED is not treated as a partial result, so the 
ClientScanner does not get a partial result from the server and moves on to the 
next row when retrying:
{code}
          if (!this.lastResult.isPartial()) {
            if (scan.isReversed()) {
              scan.setStartRow(createClosestRowBefore(lastResult.getRow()));
            } else {
              // <=== a partial result from the batch-limit-reached case takes this
              //      branch, goes to the next row and misses the rest of the data
              scan.setStartRow(Bytes.add(lastResult.getRow(), new byte[1]));
            }
          } else {
            // we need rescan this row because we only load partial row before
            scan.setStartRow(lastResult.getRow());
          }
{code}
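To make the effect of that branch concrete, here is a minimal, self-contained sketch of the forward-scan case. It is a simplified model, not the HBase source: the class RetryStartRowSketch, the LastResult holder and the nextStartRow method are hypothetical names introduced only for illustration, and Arrays.copyOf stands in for Bytes.add(row, new byte[1]).

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Simplified model of the retry start-row choice; names are hypothetical. */
final class RetryStartRowSketch {

    /** Stand-in for the last Result the client saw before the region moved. */
    static final class LastResult {
        final byte[] row;
        final boolean partial;

        LastResult(byte[] row, boolean partial) {
            this.row = row;
            this.partial = partial;
        }
    }

    /** Forward scans only: picks the start row for the newly opened scanner. */
    static byte[] nextStartRow(LastResult last) {
        if (!last.partial) {
            // The row is believed to be complete, so start at the smallest key
            // after it: the row key with one trailing 0x00 byte appended.
            return Arrays.copyOf(last.row, last.row.length + 1);
        }
        // The row is known to be incomplete, so rescan the same row.
        return last.row.clone();
    }

    public static void main(String[] args) {
        byte[] row = "row1".getBytes(StandardCharsets.UTF_8);
        // A batch-limited result is not flagged partial, so the retry skips the row.
        System.out.println(Arrays.toString(nextStartRow(new LastResult(row, false))));
        // A result flagged partial restarts at the same row.
        System.out.println(Arrays.toString(nextStartRow(new LastResult(row, true))));
    }
}
```

In this model, a result that was cut off by the batch limit but carries partial == false produces a start row strictly after the current row, which is exactly how the remaining cells get skipped.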
I think that if the user sets batch on a scan, it means the user allows partial 
results? We could set scan.allowPartialResults to true in this situation, and the 
server should also treat NextState.BATCH_LIMIT_REACHED as a partial result. Then, 
with the patch applied, the ClientScanner will receive a partial result and retry 
the same row if the region has moved.
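
A rough sketch of the server-side half of this suggestion, under stated assumptions: the NextState enum below is a reduced stand-in containing only the states named in this comment plus an assumed MORE_VALUES, and the class and method names (BatchPartialSketch, partialResultFormedCurrent, partialResultFormedProposed) are hypothetical. It is not the actual patch, and it does not address whether the batch limit can be reached exactly at the end of a row.

```java
/** Simplified model of the partial-result check; not the HBase source. */
final class BatchPartialSketch {

    /** Reduced stand-in for the scanner's NextState enum. */
    enum NextState {
        MORE_VALUES,
        SIZE_LIMIT_REACHED_MID_ROW,
        TIME_LIMIT_REACHED_MID_ROW,
        BATCH_LIMIT_REACHED
    }

    /** Mirrors the check quoted above: the batch limit is not counted as partial. */
    static boolean partialResultFormedCurrent(NextState state) {
        return state == NextState.SIZE_LIMIT_REACHED_MID_ROW
            || state == NextState.TIME_LIMIT_REACHED_MID_ROW;
    }

    /** Suggested variant: a result cut off by the batch limit also counts as partial. */
    static boolean partialResultFormedProposed(NextState state) {
        return partialResultFormedCurrent(state)
            || state == NextState.BATCH_LIMIT_REACHED;
    }

    public static void main(String[] args) {
        // Print both verdicts for every modelled state.
        for (NextState state : NextState.values()) {
            System.out.println(state
                + " current=" + partialResultFormedCurrent(state)
                + " proposed=" + partialResultFormedProposed(state));
        }
    }
}
```

With the proposed variant, a batch-limited result would reach the client flagged as partial, so the retry logic would take the "rescan this row" branch instead of advancing to the next row.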

> ResultScanner allowing partial result will miss the rest of the row if the 
> region is moved between two rpc requests
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-15325
>                 URL: https://issues.apache.org/jira/browse/HBASE-15325
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.2.0, 1.1.3
>            Reporter: Phil Yang
>            Assignee: Phil Yang
>            Priority: Critical
>         Attachments: 15325-test.txt, HBASE-15325-v1.txt
>
>
> HBASE-11544 allows a scan rpc to return part of a row to reduce memory usage for 
> one rpc request. The client can also use setAllowPartial or setBatch to get several 
> cells of a row instead of the whole row.
> However, the state of the scanner is saved on the server, and we need it to get 
> the next part when a partial result has been returned before. If we move the region to 
> another RS, the client will get a NotServingRegionException and open a new 
> scanner on the new RS, which will be regarded as a new scan starting from the end of 
> this row. So the remaining cells of the row of the last result will be missing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
