[ 
https://issues.apache.org/jira/browse/KUDU-2162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16190325#comment-16190325
 ] 

Will Berkeley commented on KUDU-2162:
-------------------------------------

[~twmarshall] I'd prefer to expose:

1. Number of blocks read from disk/cache and checked for matching rows
2. Number of rows passing all the filters (i.e. the number returned to the 
client)
3. Number of rows read from disk/cache but then filtered

Counter 2 exists today, but not at the scanner level; counter 3 is the one you
asked for and is easy to implement. Counter 1 is the complement of the number
of blocks that are entirely skipped, and it is easier to implement than a
skipped-block counter. Would it be good enough for Impala?
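
To make the relationship between the three counters concrete, here is a rough sketch of how a caller such as Impala might derive the originally requested numbers from them. Everything in it is hypothetical: `ScanFilterStats`, its field names, and the helper functions are made up for illustration and are not part of the Kudu client API, and `total_blocks` assumes the caller can learn the total block count from somewhere else.

```cpp
#include <cstdint>

// Hypothetical per-scanner counters mirroring items 1-3 above.
// These names are illustrative only; they are not Kudu client API.
struct ScanFilterStats {
  int64_t blocks_read = 0;    // (1) blocks read from disk/cache and checked
  int64_t rows_returned = 0;  // (2) rows passing all filters
  int64_t rows_filtered = 0;  // (3) rows read from disk/cache, then filtered
};

// Rows that were actually read and evaluated against the predicates.
inline int64_t RowsExamined(const ScanFilterStats& s) {
  return s.rows_returned + s.rows_filtered;
}

// Blocks skipped entirely, as the complement of counter 1.
// Assumes the caller knows the total number of blocks (an assumption).
inline int64_t BlocksSkipped(const ScanFilterStats& s, int64_t total_blocks) {
  return total_blocks - s.blocks_read;
}

// Fraction of examined rows that the filters rejected (0.0 if none examined).
inline double FilteredFraction(const ScanFilterStats& s) {
  const int64_t examined = RowsExamined(s);
  return examined == 0 ? 0.0
                       : static_cast<double>(s.rows_filtered) / examined;
}
```

With counter 1 exposed directly, the "blocks entirely skipped" figure only needs a subtraction on the consumer side, which is why the complement may be sufficient.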

> Expose stats about scan filters
> -------------------------------
>
>                 Key: KUDU-2162
>                 URL: https://issues.apache.org/jira/browse/KUDU-2162
>             Project: Kudu
>          Issue Type: Improvement
>          Components: client
>            Reporter: Thomas Tauber-Marshall
>
> Impala is working on implementing runtime filters that get pushed down into 
> Kudu using KuduScanner::AddConjunctPredicate().
> It would be useful for perf analysis and debugging to be able to include info 
> in Impala's runtime profile about the effectiveness of the filters, e.g. the 
> number of rows that are filtered.
> This would probably require at least two counters:
> - # of blocks that are entirely skipped
> - # of rows that are filtered from blocks that aren't entirely skipped



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
