[ https://issues.apache.org/jira/browse/HBASE-19818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Guanghao Zhang updated HBASE-19818:
-----------------------------------
    Attachment:     (was: HBASE-19818.master.002.patch)

> Scan time limit not work if the filter always filter row key
> ------------------------------------------------------------
>
>                 Key: HBASE-19818
>                 URL: https://issues.apache.org/jira/browse/HBASE-19818
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 3.0.0, 2.0.0-beta-2
>            Reporter: Guanghao Zhang
>            Assignee: Guanghao Zhang
>            Priority: Major
>         Attachments: HBASE-19818.master.003.patch
>
>
> [https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java]
> nextInternal() method.
> {code:java}
> // Check if rowkey filter wants to exclude this row. If so, loop to next.
> // Technically, if we hit limits before on this row, we don't need this call.
> if (filterRowKey(current)) {
>   incrementCountOfRowsFilteredMetric(scannerContext);
>   // early check, see HBASE-16296
>   if (isFilterDoneInternal()) {
>     return scannerContext.setScannerState(NextState.NO_MORE_VALUES).hasMoreValues();
>   }
>   // Typically the count of rows scanned is incremented inside #populateResult. However,
>   // here we are filtering a row based purely on its row key, preventing us from calling
>   // #populateResult. Thus, perform the necessary increment here to rows scanned metric
>   incrementCountOfRowsScannedMetric(scannerContext);
>   boolean moreRows = nextRow(scannerContext, current);
>   if (!moreRows) {
>     return scannerContext.setScannerState(NextState.NO_MORE_VALUES).hasMoreValues();
>   }
>   results.clear();
>   continue;
> }
> // Ok, we are good, let's try to get some results from the main heap.
> populateResult(results, this.storeHeap, scannerContext, current);
> if (scannerContext.checkAnyLimitReached(LimitScope.BETWEEN_CELLS)) {
>   if (hasFilterRow) {
>     throw new IncompatibleFilterException(
>         "Filter whose hasFilterRow() returns true is incompatible with scans that must "
>             + " stop mid-row because of a limit. ScannerContext:" + scannerContext);
>   }
>   return true;
> }
> {code}
> If filterRowKey always returns true, the loop hits the continue statement on
> every row and never reaches checkAnyLimitReached. For the batch and size
> limits it is OK to skip the check, because nothing has been read. For the
> time limit it is not right: if the filter always filters out the row key,
> we will be stuck in this loop for a long time.
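
To make the failure mode concrete, here is a minimal, self-contained sketch of the idea: check the time limit inside the branch that skips a row on its key, so that a filter rejecting every row cannot keep the loop running past the deadline. This is not the attached patch and it does not use HBase classes. TimeLimitedScanSketch, Outcome, next, the injected clock, and the deadline parameter are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;
import java.util.function.Predicate;

/** Hypothetical stand-in for the scan loop; none of these names are HBase APIs. */
final class TimeLimitedScanSketch {

  /** Result of one call: rows returned, rows examined, and whether time ran out. */
  static final class Outcome {
    final List<String> rows;
    final int rowsExamined;
    final boolean timeLimitReached;

    Outcome(List<String> rows, int rowsExamined, boolean timeLimitReached) {
      this.rows = rows;
      this.rowsExamined = rowsExamined;
      this.timeLimitReached = timeLimitReached;
    }
  }

  /**
   * Walks rowKeys in order. filterRowKey returns true when a row must be excluded,
   * mirroring the convention in the quoted code. clock and deadline use the same
   * arbitrary time unit (an assumption of this sketch).
   */
  static Outcome next(List<String> rowKeys, Predicate<String> filterRowKey,
      LongSupplier clock, long deadline) {
    List<String> results = new ArrayList<>();
    int examined = 0;
    for (String current : rowKeys) {
      examined++;
      if (filterRowKey.test(current)) {
        // The row is skipped on its key alone. Nothing was read, so a batch or
        // size limit cannot have been hit, but time has still passed.
        if (clock.getAsLong() >= deadline) {
          return new Outcome(results, examined, true);
        }
        continue;
      }
      results.add(current);
      // Usual limit check after a row has produced a result.
      if (clock.getAsLong() >= deadline) {
        return new Outcome(results, examined, true);
      }
    }
    return new Outcome(results, examined, false);
  }

  public static void main(String[] args) {
    List<String> rowKeys = new ArrayList<>();
    for (int i = 0; i < 1000; i++) {
      rowKeys.add("row-" + i);
    }
    long[] ticks = {0};
    // Fake clock that advances by one tick on every read.
    Outcome out = next(rowKeys, key -> true, () -> ticks[0]++, 10);
    System.out.println("examined=" + out.rowsExamined
        + " timeLimitReached=" + out.timeLimitReached);
  }
}
```

With the check removed from the filtered branch, a filter that rejects every key would make this loop walk all rows regardless of the deadline, which is the behaviour described in the issue. The clock is injected only so the sketch can be exercised deterministically.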