[jira] [Commented] (HBASE-19818) Scan time limit not work if the filter always filter row key

Guanghao Zhang (JIRA) Wed, 24 Jan 2018 00:29:17 -0800

    [ https://issues.apache.org/jira/browse/HBASE-19818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16337123#comment-16337123 ]


Guanghao Zhang commented on HBASE-19818:
----------------------------------------

Opened HBASE-19855 to refactor the RegionScannerImpl.nextInternal method.

> Scan time limit not work if the filter always filter row key
> ------------------------------------------------------------
>
>                 Key: HBASE-19818
>                 URL: https://issues.apache.org/jira/browse/HBASE-19818
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 3.0.0, 2.0.0-beta-2
>            Reporter: Guanghao Zhang
>            Assignee: Guanghao Zhang
>            Priority: Major
>         Attachments: HBASE-19818.master.003.patch
>
>
> [https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java]
> nextInternal() method.
> {code:java}
> // Check if rowkey filter wants to exclude this row. If so, loop to next.
> // Technically, if we hit limits before on this row, we don't need this call.
> if (filterRowKey(current)) {
>   incrementCountOfRowsFilteredMetric(scannerContext);
>   // early check, see HBASE-16296
>   if (isFilterDoneInternal()) {
>     return scannerContext.setScannerState(NextState.NO_MORE_VALUES).hasMoreValues();
>   }
>   // Typically the count of rows scanned is incremented inside #populateResult. However,
>   // here we are filtering a row based purely on its row key, preventing us from calling
>   // #populateResult. Thus, perform the necessary increment here to rows scanned metric
>   incrementCountOfRowsScannedMetric(scannerContext);
>   boolean moreRows = nextRow(scannerContext, current);
>   if (!moreRows) {
>     return scannerContext.setScannerState(NextState.NO_MORE_VALUES).hasMoreValues();
>   }
>   results.clear();
>   continue;
> }
> // Ok, we are good, let's try to get some results from the main heap.
> populateResult(results, this.storeHeap, scannerContext, current);
> if (scannerContext.checkAnyLimitReached(LimitScope.BETWEEN_CELLS)) {
>   if (hasFilterRow) {
>     throw new IncompatibleFilterException(
>         "Filter whose hasFilterRow() returns true is incompatible with scans that must "
>             + " stop mid-row because of a limit. ScannerContext:" + scannerContext);
>   }
>   return true;
> }
> {code}
> If filterRowKey always returns true, the loop hits "continue" every time and 
> never reaches checkAnyLimitReached. For the batch/size limits it is OK to 
> skip the check, as we don't read anything. But for the time limit it is not 
> right: if the filter always filters out the row key, we will be stuck here 
> for a long time.
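
To make the failure mode concrete, here is a minimal standalone sketch of the loop shape described above. It is not HBase code and not necessarily what the attached patch does: TimeLimitSketch, Outcome, next, tickingClock and the checkTimeOnFilteredRows flag are all hypothetical names invented for illustration, and the clock is a fake counter so the behaviour is deterministic. It only shows that a time check placed after the "continue" branch is never reached when every row key is filtered, while a check inside that branch lets the call return early.

```java
import java.util.function.IntPredicate;
import java.util.function.LongSupplier;

// Hypothetical simulation of a scanner loop; none of these names exist in HBase.
class TimeLimitSketch {

    /** Result of one simulated next() call. */
    public static final class Outcome {
        public final int rowsVisited;
        public final boolean timeLimitReached;

        Outcome(int rowsVisited, boolean timeLimitReached) {
            this.rowsVisited = rowsVisited;
            this.timeLimitReached = timeLimitReached;
        }
    }

    /** Fake clock that advances by one tick every time it is read. */
    public static LongSupplier tickingClock() {
        long[] now = {0};
        return () -> ++now[0];
    }

    /**
     * Walks totalRows rows. A row whose key is rejected by filterRowKey is skipped
     * with "continue", mirroring the quoted branch. The time limit is checked in
     * that branch only when checkTimeOnFilteredRows is true (the assumed fix).
     */
    public static Outcome next(int totalRows, IntPredicate filterRowKey,
                               LongSupplier clock, long deadline,
                               boolean checkTimeOnFilteredRows) {
        int visited = 0;
        for (int row = 0; row < totalRows; row++) {
            visited++;
            if (filterRowKey.test(row)) {
                // Nothing was read, so batch/size limits cannot have changed,
                // but time keeps passing while we skip rows.
                if (checkTimeOnFilteredRows && clock.getAsLong() >= deadline) {
                    return new Outcome(visited, true);
                }
                continue;
            }
            // Row accepted: results would be populated here, then limits checked.
            if (clock.getAsLong() >= deadline) {
                return new Outcome(visited, true);
            }
        }
        return new Outcome(visited, false);
    }

    public static void main(String[] args) {
        IntPredicate rejectAll = row -> true; // filter that excludes every row key
        Outcome withoutCheck = next(1000, rejectAll, tickingClock(), 10, false);
        Outcome withCheck = next(1000, rejectAll, tickingClock(), 10, true);
        System.out.println("without check: rows=" + withoutCheck.rowsVisited
            + " timeLimitReached=" + withoutCheck.timeLimitReached);
        System.out.println("with check:    rows=" + withCheck.rowsVisited
            + " timeLimitReached=" + withCheck.timeLimitReached);
    }
}
```

With the flag off, the simulated call walks every row before returning, which is the "stuck for a long time" case; with it on, the call gives control back as soon as the fake deadline passes.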



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
