Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8684 )
Change subject: IMPALA-3804: Re-enable per-scan filtering for sequence-based scanners ...................................................................... Patch Set 1: (2 comments) The code looks correct to me, but I had some ideas about how a cleaner solution - let me know if they make sense or if I'm missing something. http://gerrit.cloudera.org:8080/#/c/8684/1/be/src/exec/base-sequence-scanner.cc File be/src/exec/base-sequence-scanner.cc: http://gerrit.cloudera.org:8080/#/c/8684/1/be/src/exec/base-sequence-scanner.cc@174 PS1, Line 174: if (!scan_node_->PartitionPassesFilters(context_->partition_descriptor()->id(), This is a bit inconsistent with the other file types because it checks once per 1024-row batch (GetNextInternal() is called many times per file) instead of once per scan range. Not the biggest deal but it adds another special case to this code (which unfortunately already has a lot of them). http://gerrit.cloudera.org:8080/#/c/8684/1/be/src/exec/hdfs-scan-node.cc File be/src/exec/hdfs-scan-node.cc: http://gerrit.cloudera.org:8080/#/c/8684/1/be/src/exec/hdfs-scan-node.cc@a497 PS1, Line 497: : After looking at the code I'm wondering if this idea would lead to a more elegant solution. If I understand correctly, the idea is that, instead of marking the scan range done by incrementing progress_, for sequence file header ranges, we should instead mark all the ranges in the file as done by calling RangeComplete() with all of the ranges in the file. I think then we could maybe remove the special casing for FileFormatIsSequenceBased() and instead check something like if (metadata->is_header_range()) Which seems to express the logic more directly. -- To view, visit http://gerrit.cloudera.org:8080/8684 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4b38c26bcbe67f83efcc65a1723d766626ae3d3e Gerrit-Change-Number: 8684 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy <[email protected]> Gerrit-Reviewer: Dan Hecht <[email protected]> Gerrit-Reviewer: Henry Robinson <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Comment-Date: Wed, 29 Nov 2017 17:20:15 +0000 Gerrit-HasComments: Yes
