Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/8986 )
Change subject: IMPALA-4252: [DOCS] Document min/max filters for Kudu tables
......................................................................


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8986/1/docs/topics/impala_runtime_filtering.xml
File docs/topics/impala_runtime_filtering.xml:

http://gerrit.cloudera.org:8080/#/c/8986/1/docs/topics/impala_runtime_filtering.xml@173
PS1, Line 173: :         For HD
> Done. Because this paragraph is followed by info that's only relevant for B

Actually, the partitioned/broadcast and local/global discussion applies to min-max filters as well.

I should also add that the long term plan is to have all filter types supported by all scan types, so no need to separate out min-max as being a really specifically Kudu thing (though of course it only applies to Kudu at the moment).


http://gerrit.cloudera.org:8080/#/c/8986/2/docs/topics/impala_runtime_filtering.xml
File docs/topics/impala_runtime_filtering.xml:

http://gerrit.cloudera.org:8080/#/c/8986/2/docs/topics/impala_runtime_filtering.xml@181
PS2, Line 181: a complete list of relevant values
This is the only part I see that doesn't make sense for min-max filters, as they're not a 'list of values', but then a bloom filter isn't a 'list of values' either.

Maybe rephrase it something like "A broadcast filter reflects the complete set of relevant values and can be immediately evaluated..." and "A partitioned filter reflects only the values processed by one host..."

or perhaps "contains" instead of reflects


http://gerrit.cloudera.org:8080/#/c/8986/2/docs/topics/impala_runtime_filtering.xml@203
PS2, Line 203: These filters are used by Kudu to scan a range of values
:         for join columns when identifying matching rows within a join query.
I find this sentence confusing, as Kudu isn't identifying the matching rows (Kudu doesn't even know we're doing a join, it's just scanning values for us).

Maybe say something like "These filters are passed to Kudu to reduce the number of rows returned to Impala when scanning the probe side of the join"


-- 
To view, visit http://gerrit.cloudera.org:8080/8986
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I15d8c952ab5b90e89fdd57640dfb4da882f7ecb2
Gerrit-Change-Number: 8986
Gerrit-PatchSet: 2
Gerrit-Owner: John Russell <[email protected]>
Gerrit-Reviewer: John Russell <[email protected]>
Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]>
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-Comment-Date: Thu, 11 Jan 2018 19:12:47 +0000
Gerrit-HasComments: Yes
