Henry Robinson has posted comments on this change. Change subject: IMPALA-3804: Push per-split filtering into scanners ......................................................................
Patch Set 2: (11 comments) http://gerrit.cloudera.org:8080/#/c/3561/2/be/src/exec/base-sequence-scanner.cc File be/src/exec/base-sequence-scanner.cc: PS2, Line 125: ) > nit: extraneous parens Done http://gerrit.cloudera.org:8080/#/c/3561/2/be/src/exec/hdfs-parquet-scanner.cc File be/src/exec/hdfs-parquet-scanner.cc: PS2, Line 1875: IssueInitialRanges() will mark all ranges *except* the footer as complete. > It does this only if there are no slots to be materialized. Done http://gerrit.cloudera.org:8080/#/c/3561/2/be/src/exec/hdfs-scan-node.cc File be/src/exec/hdfs-scan-node.cc: Line 244: #endif > it would be nice to not have this if-def in the code, or at least around so Done - rather than introduce a new method, I added a variable that is false by default, and overwritten to FLAGS_skip_file_runtime_filtering when debug mode is on. PS2, Line 1137: PartitionIsFiltered > nit: This sounded a little confusing to me on first glance. I didn't know i A filter removes things (think air filter, or python list comprehensions). But I made the change. PS2, Line 1158: passed_filter > nit: Same here, instead of saying '!passed_filter', 'filtered_out' would be That means moving the negation to line 1156, which I think is less explicit. If something passes a filter, it is not removed. Let me know if you think differently. http://gerrit.cloudera.org:8080/#/c/3561/2/be/src/exec/hdfs-scan-node.h File be/src/exec/hdfs-scan-node.h: PS2, Line 307: true > false Done http://gerrit.cloudera.org:8080/#/c/3561/2/be/src/exec/hdfs-scanner.cc File be/src/exec/hdfs-scanner.cc: Line 709: return true; > rather than doing all this close logic here, it seems more straight forward Done http://gerrit.cloudera.org:8080/#/c/3561/2/be/src/exec/hdfs-scanner.h File be/src/exec/hdfs-scanner.h: PS2, Line 429: artefact > never knew about this spelling It's British English, and a bit archaic. Removed the comment anyhow. PS2, Line 432: header > footer Done PS2, Line 433: process remaining splits in the *same* scanner and thread > Again, I'm not sure how true this is. Done PS2, Line 439: CLOSE_ONLY_THIS_RANGE, : CLOSE_NONE > There doesn't seem to be a distinction between the two. To me, they feel l I removed the policy type - but there is a difference between NONE (which means close no scan ranges here) and ONE (which means close your own scan range here, because you won't have the chance to do it later). -- To view, visit http://gerrit.cloudera.org:8080/3561 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9f92178f642695e0e9ef901373a5e9f2878a78ce Gerrit-PatchSet: 2 Gerrit-Project: Impala Gerrit-Branch: cdh5-trunk Gerrit-Owner: Henry Robinson <[email protected]> Gerrit-Reviewer: Dan Hecht <[email protected]> Gerrit-Reviewer: Henry Robinson <[email protected]> Gerrit-Reviewer: Sailesh Mukil <[email protected]> Gerrit-HasComments: Yes
