[ 
https://issues.apache.org/jira/browse/HBASE-5980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14519448#comment-14519448
 ] 

Anoop Sam John commented on HBASE-5980:
---------------------------------------

Nice work Jonathan.
Some comments below. Mainly, I feel we can add some more tests. I am worried 
that we may be missing the count in some places. I may be wrong, but it would 
be better to add tests that cover all return points of the scan loop in 
RegionScannerImpl.

incrementCountOfRowsScannedMetric(scannerContext);
We are doing this in nextRow(ScannerContext scannerContext, byte[] currentRow, 
int offset, short length). This method basically skips all cells in a filtered 
row and calls the CP hook. Can we do this increment at the call site instead? 
That would make the code easier to understand (IMO).
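
To illustrate the suggestion, here is a minimal self-contained sketch. It is 
not the RegionScannerImpl code: ScanLoopSketch, Ctx, skipRow and scan are 
hypothetical stand-ins, and cells are modelled as plain "row/qualifier" 
strings. The only point is that the helper just skips cells, while the scan 
loop that calls it owns the metric update.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-in for the scan loop; none of these names are HBase APIs.
final class ScanLoopSketch {
    // Toy replacement for the metrics carried by a scanner context.
    static final class Ctx {
        long rowsScanned;
        long rowsFiltered;
    }

    // Cells are "row/qualifier" strings, sorted by row.
    static String rowOf(String cell) {
        return cell.substring(0, cell.indexOf('/'));
    }

    // Helper: only advances past the remaining cells of `row`.
    // It has no metric side effects.
    static int skipRow(List<String> cells, int pos, String row) {
        while (pos < cells.size() && rowOf(cells.get(pos)).equals(row)) {
            pos++;
        }
        return pos;
    }

    // Scan loop: the counters are updated here, at the call site,
    // so every path that finishes a row is visible in one place.
    static List<String> scan(List<String> cells,
                             Predicate<String> filterRowKey,
                             Ctx ctx) {
        List<String> returnedRows = new ArrayList<>();
        int pos = 0;
        while (pos < cells.size()) {
            String row = rowOf(cells.get(pos));
            boolean filtered = filterRowKey.test(row);
            pos = skipRow(cells, pos, row);
            ctx.rowsScanned++;
            if (filtered) {
                ctx.rowsFiltered++;
            } else {
                returnedRows.add(row);
            }
        }
        return returnedRows;
    }
}
```

With the increment in the loop, a reader can check each return point of the 
loop against the counters without opening the helper.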

Can we add test cases for the following?
        - There are N rows to scan in total, of which M are filtered out and 
the rest are returned
        - The size limit is reached in between rows (the old batch case)
        - We have tests for row keys getting filtered out, but other cases 
are missing (filterRow() etc.). We have ColumnPrefixFilter, but that does not 
seem to cover a case where any rows are filtered out.
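
The shape of such tests could look roughly like the sketch below. It runs 
against a toy in-memory model, not a real region or mini cluster: 
FilteredRowCountSketch, Metrics and scan are hypothetical names, and the 
row-level predicate only imitates a filterRow()-style decision made after the 
row's cells are known. The cell limit lets a chunk boundary fall partway 
through a row, which is one reading of the old batch case.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Toy model for the requested test cases; names are hypothetical, not HBase APIs.
final class FilteredRowCountSketch {
    static final class Metrics {
        long rowsScanned;
        long rowsFiltered;
    }

    // Scans `rows` (each row is a list of cells) and returns surviving cells
    // in chunks of at most `cellLimit`; a chunk may end in the middle of a row.
    static List<List<String>> scan(List<List<String>> rows,
                                   Predicate<List<String>> filterRow,
                                   int cellLimit,
                                   Metrics m) {
        List<List<String>> chunks = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (List<String> row : rows) {
            m.rowsScanned++;              // once per row, however it is chunked
            if (filterRow.test(row)) {    // whole-row decision
                m.rowsFiltered++;
                continue;
            }
            for (String cell : row) {
                current.add(cell);
                if (current.size() == cellLimit) {
                    chunks.add(current);
                    current = new ArrayList<>();
                }
            }
        }
        if (!current.isEmpty()) {
            chunks.add(current);
        }
        return chunks;
    }

    static void check(boolean ok, String what) {
        if (!ok) {
            throw new AssertionError(what);
        }
    }

    public static void main(String[] args) {
        // N = 4 rows; the predicate drops single-cell rows, so M = 2.
        List<List<String>> rows = Arrays.asList(
            Arrays.asList("a1", "a2", "a3"),
            Arrays.asList("b1"),
            Arrays.asList("c1", "c2"),
            Arrays.asList("d1"));
        Metrics m = new Metrics();
        List<List<String>> chunks = scan(rows, row -> row.size() == 1, 2, m);

        check(m.rowsScanned == 4, "every row is counted exactly once");
        check(m.rowsFiltered == 2, "M filtered rows are counted");
        // Five surviving cells with a limit of 2 give three chunks,
        // and the first chunk ends in the middle of row "a".
        check(chunks.size() == 3, "limit splits the result into chunks");
        check(chunks.get(0).equals(Arrays.asList("a1", "a2")), "first chunk");
    }
}
```

A real test would replace the toy scan with an actual scan and read the 
counters from the scan metrics, but the assertions would have the same form: 
the counts must not depend on where the limit cuts the result.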


> Scanner responses from RS should include metrics on rows/KVs filtered
> ---------------------------------------------------------------------
>
>                 Key: HBASE-5980
>                 URL: https://issues.apache.org/jira/browse/HBASE-5980
>             Project: HBase
>          Issue Type: Improvement
>          Components: Client, metrics, regionserver
>    Affects Versions: 0.95.2
>            Reporter: Todd Lipcon
>            Assignee: Jonathan Lawlor
>            Priority: Minor
>         Attachments: HBASE-5980-branch-1.patch, HBASE-5980-v1.patch, 
> HBASE-5980-v2.patch, HBASE-5980-v2.patch
>
>
> Currently it's difficult to know, when issuing a filter, what percentage of 
> rows were skipped by that filter. We should expose some basic counters back 
> to the client scanner object. For example:
> - number of rows filtered by row key alone (filterRowKey())
> - number of times each filter response was returned by filterKeyValue() - 
> corresponding to Filter.ReturnCode
> What would be slickest is if this could actually return a tree of counters 
> for cases where FilterList or other combining filters are used. But a 
> top-level is a good start.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
