Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8684 )

Change subject: IMPALA-3804: Re-enable per-scan filtering for sequence-based 
scanners
......................................................................


Patch Set 1:

(2 comments)

The code looks correct to me, but I had some ideas about how a cleaner solution 
- let me know if they make sense or if I'm missing something.

http://gerrit.cloudera.org:8080/#/c/8684/1/be/src/exec/base-sequence-scanner.cc
File be/src/exec/base-sequence-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/8684/1/be/src/exec/base-sequence-scanner.cc@174
PS1, Line 174:   if 
(!scan_node_->PartitionPassesFilters(context_->partition_descriptor()->id(),
This is a bit inconsistent with the other file types because it checks once per 
1024-row batch (GetNextInternal() is called many times per file) instead of 
once per scan range. Not the biggest deal but it adds another special case to 
this code (which unfortunately already has a lot of them).


http://gerrit.cloudera.org:8080/#/c/8684/1/be/src/exec/hdfs-scan-node.cc
File be/src/exec/hdfs-scan-node.cc:

http://gerrit.cloudera.org:8080/#/c/8684/1/be/src/exec/hdfs-scan-node.cc@a497
PS1, Line 497:
             :
After looking at the code I'm wondering if this idea would lead to a more 
elegant solution. If I understand correctly, the idea is that, instead of 
marking the scan range done by incrementing progress_, for sequence file header 
ranges, we should instead mark all the ranges in the file as done by calling 
RangeComplete() with all of the ranges in the file.

I think then we could maybe remove the special casing for 
FileFormatIsSequenceBased() and instead check something like

  if (metadata->is_header_range())

Which seems to express the logic more directly.



--
To view, visit http://gerrit.cloudera.org:8080/8684
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4b38c26bcbe67f83efcc65a1723d766626ae3d3e
Gerrit-Change-Number: 8684
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Henry Robinson <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Comment-Date: Wed, 29 Nov 2017 17:20:15 +0000
Gerrit-HasComments: Yes

Reply via email to