[
https://issues.apache.org/jira/browse/HBASE-5980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14519448#comment-14519448
]
Anoop Sam John commented on HBASE-5980:
---------------------------------------
Nice work, Jonathan.
Some comments below. Mainly, I feel we can add some more tests. I am concerned
that we may be missing the count in some places. I may be wrong, but it would be
better to add tests that cover all return points of the scan loop in RegionScannerImpl.
incrementCountOfRowsScannedMetric(scannerContext);
We are doing this in nextRow(ScannerContext scannerContext, byte[] currentRow,
int offset, short length). This method basically skips all cells in a filtered
row and calls the CP hook. Can we do this increment at the call site instead? That
would make the code easier to understand (IMO).
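To illustrate the suggestion, here is a rough sketch of the shape I have in mind. Everything in it (CallSiteIncrementSketch, Context, the boolean[] input, the one-argument nextRow) is a hypothetical, simplified stand-in and not the real RegionScannerImpl or ScannerContext code. The point is only that nextRow() stays free of metric side effects and the scan loop does the counting where the row is abandoned.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical, simplified stand-ins; these are not the real HBase classes.
final class CallSiteIncrementSketch {

    /** Minimal stand-in for ScannerContext: it only carries the metric. */
    static final class Context {
        final AtomicLong rowsScanned = new AtomicLong();
    }

    /** Only moves past the current row; it does not touch the metric. */
    static int nextRow(int currentRow) {
        // The real method would skip the row's remaining cells and call the CP hook.
        return currentRow + 1;
    }

    /** Scan loop: each way of finishing a row increments the counter in plain sight. */
    static int scan(boolean[] rowIsFiltered, Context ctx) {
        int returned = 0;
        int row = 0;
        while (row < rowIsFiltered.length) {
            if (rowIsFiltered[row]) {
                ctx.rowsScanned.incrementAndGet(); // increment at the call site
                row = nextRow(row);
                continue;
            }
            ctx.rowsScanned.incrementAndGet(); // row fully read and returned
            returned++;
            row++;
        }
        return returned;
    }

    public static void main(String[] args) {
        Context ctx = new Context();
        int returned = scan(new boolean[] {false, true, false, true, true}, ctx);
        System.out.println("returned=" + returned + " scanned=" + ctx.rowsScanned.get());
    }
}
```

With the increment next to each exit from a row, a reader can check that no return point is missed without opening nextRow().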
Can we add test cases for the following?
- There are N rows to scan in total, of which M are filtered out
and the remaining are returned
- The case where the size limit is reached in between rows (the old batch case)
- We have tests for cases where row keys get filtered out, but other
cases are missing (like filterRow() etc.). We have ColumnPrefixFilter, but that
test does not seem to filter out any rows.
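As a rough illustration of the assertions I mean for the first and third cases, here is a toy model. It is not HBase test code: ScanMetricsTestSketch, Metrics, and the two predicates are hypothetical names, with one predicate standing in for filterRowKey() and the other for filterRow(). The size-limit case is not modeled here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical toy model of a filtered scan; not HBase test code.
final class ScanMetricsTestSketch {

    /** Counters the test would assert on. */
    static final class Metrics {
        long rowsScanned;
        long rowsFiltered;
        final List<String> returned = new ArrayList<>();
    }

    /**
     * rowKeyFilter models filterRowKey() (row rejected before its cells are read);
     * rowFilter models filterRow() (row rejected after its cells were read).
     * Both predicates return true when the row should be dropped.
     */
    static Metrics scan(List<String> rowKeys,
                        Predicate<String> rowKeyFilter,
                        Predicate<String> rowFilter) {
        Metrics m = new Metrics();
        for (String key : rowKeys) {
            m.rowsScanned++;
            if (rowKeyFilter.test(key) || rowFilter.test(key)) {
                m.rowsFiltered++;
                continue;
            }
            m.returned.add(key);
        }
        return m;
    }

    /** Throws if scanned != N, filtered != M, or returned != N - M. */
    static void check(Metrics m, int n, int filtered) {
        if (m.rowsScanned != n) throw new AssertionError("scanned=" + m.rowsScanned);
        if (m.rowsFiltered != filtered) throw new AssertionError("filtered=" + m.rowsFiltered);
        if (m.returned.size() != n - filtered) throw new AssertionError("returned=" + m.returned.size());
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            keys.add("row" + i);
        }
        // Case 1: rows dropped by row key (even-numbered rows).
        check(scan(keys, k -> (k.charAt(3) - '0') % 2 == 0, k -> false), 10, 5);
        // Case 2: rows dropped by the filterRow()-style check only.
        check(scan(keys, k -> false, k -> k.equals("row3") || k.equals("row7")), 10, 2);
        System.out.println("all checks passed");
    }
}
```

The real tests would of course run against a table and read the counters from the scan metrics, but the same three assertions (scanned, filtered, returned) should hold for each filtering path.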
> Scanner responses from RS should include metrics on rows/KVs filtered
> ---------------------------------------------------------------------
>
> Key: HBASE-5980
> URL: https://issues.apache.org/jira/browse/HBASE-5980
> Project: HBase
> Issue Type: Improvement
> Components: Client, metrics, regionserver
> Affects Versions: 0.95.2
> Reporter: Todd Lipcon
> Assignee: Jonathan Lawlor
> Priority: Minor
> Attachments: HBASE-5980-branch-1.patch, HBASE-5980-v1.patch,
> HBASE-5980-v2.patch, HBASE-5980-v2.patch
>
>
> Currently it's difficult to know, when issuing a filter, what percentage of
> rows were skipped by that filter. We should expose some basic counters back
> to the client scanner object. For example:
> - number of rows filtered by row key alone (filterRowKey())
> - number of times each filter response was returned by filterKeyValue() -
> corresponding to Filter.ReturnCode
> What would be slickest is if this could actually return a tree of counters
> for cases where FilterList or other combining filters are used. But a
> top-level is a good start.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)