[ https://issues.apache.org/jira/browse/KUDU-2162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190325#comment-16190325 ]

Will Berkeley commented on KUDU-2162:
-------------------------------------
[~twmarshall] I'd prefer to expose:

1. Number of blocks read from disk/cache and checked for matching rows
2. Number of rows passing all the filters (i.e. the number returned to the
   client)
3. Number of rows read from disk/cache but then filtered

Counter 2 exists today, but not at the scanner level; counter 3 is one you
asked for and is easy to implement. Counter 1 is the complement of the number
of blocks totally skipped and is easier to implement than counting skipped
blocks directly. Would it be good enough for Impala?
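
To make the relationship between the three counters concrete, here is a rough
sketch of how a consumer such as Impala might derive the originally requested
numbers from them. Everything in it is hypothetical: ScanFilterStats, its
field names, and the helper functions are illustrative and are not part of the
Kudu client API. It also assumes the caller knows the total number of candidate
blocks for the scan, which nothing in this thread guarantees, and that every
row read is either returned or filtered.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-scanner counters mirroring items 1-3 above.
// These names are illustrative only; they are not Kudu client API.
struct ScanFilterStats {
  int64_t blocks_checked = 0;  // (1) blocks read and checked for matching rows
  int64_t rows_returned = 0;   // (2) rows passing all filters
  int64_t rows_filtered = 0;   // (3) rows read but then filtered out
};

// Counter 1 is the complement of the blocks skipped entirely, so the skipped
// count can be recovered if the total is known. total_blocks is an assumed
// input, not something the proposal says will be exposed.
int64_t BlocksSkipped(const ScanFilterStats& s, int64_t total_blocks) {
  assert(s.blocks_checked >= 0 && s.blocks_checked <= total_blocks);
  return total_blocks - s.blocks_checked;
}

// Assumes every row read is either returned (2) or filtered (3).
int64_t RowsRead(const ScanFilterStats& s) {
  return s.rows_returned + s.rows_filtered;
}

// Fraction of the rows read that the filters removed; 0 if nothing was read.
double FilteredFraction(const ScanFilterStats& s) {
  const int64_t read = RowsRead(s);
  return read == 0 ? 0.0 : static_cast<double>(s.rows_filtered) / read;
}
```

A profile could then report BlocksSkipped() and FilteredFraction() next to the
raw counters, which would cover both numbers asked for in the issue
description if the total block count turns out to be available.
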
> Expose stats about scan filters
> -------------------------------
>
> Key: KUDU-2162
> URL: https://issues.apache.org/jira/browse/KUDU-2162
> Project: Kudu
> Issue Type: Improvement
> Components: client
> Reporter: Thomas Tauber-Marshall
>
> Impala is working on implementing runtime filters that get pushed down into
> Kudu using KuduScanner::AddConjunctPredicate().
> It would be useful for perf analysis and debugging to be able to include info
> in Impala's runtime profile about the effectiveness of the filters, e.g. the
> number of rows that are filtered.
> This would probably require at least two counters:
> - # of blocks that are entirely skipped
> - # of rows that are filtered from blocks that aren't entirely skipped