Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8170 )

Change subject: IMPALA-5789: Add always_false flag in bloom filter
......................................................................


Patch Set 4:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/8170/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/8170/4//COMMIT_MSG@15
PS4, Line 15:
Did you do any perf runs? It would be good to verify that the extra flag 
checking doesn't affect perf (I suspect it doesn't).

It would also be good to confirm that 
testdata/workloads/targeted-perf/queries/primitive_empty_build_join_1.test gets 
faster (I think it should!).


http://gerrit.cloudera.org:8080/#/c/8170/4/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/8170/4/be/src/exec/hdfs-parquet-scanner.cc@403
PS4, Line 403: std::
Shouldn't need std:: - it's imported in common/names.h


http://gerrit.cloudera.org:8080/#/c/8170/4/be/src/exec/hdfs-scanner.cc
File be/src/exec/hdfs-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/8170/4/be/src/exec/hdfs-scanner.cc@107
PS4, Line 107:     if (!BaseSequenceScanner::FileFormatIsSequenceBased(
Might be more readable to factor subexpression into variable e.g. 
is_sequence_based.


http://gerrit.cloudera.org:8080/#/c/8170/4/tests/custom_cluster/test_always_false_filter.py
File tests/custom_cluster/test_always_false_filter.py:

http://gerrit.cloudera.org:8080/#/c/8170/4/tests/custom_cluster/test_always_false_filter.py@53
PS4, Line 53:   seq_table_suffixes = ['_avro', '_rc', '_seq']
We'd normally create a test matrix based on these. Is the idea here to avoid 
restarting the cluster for each file format? If so it would be good to leave a 
comment so that readers understand why.


http://gerrit.cloudera.org:8080/#/c/8170/4/tests/custom_cluster/test_always_false_filter.py@82
PS4, Line 82:   def test_skip_file(self, cursor):
Does this need to be a custom cluster test? I.e. does it need a special 
minicluster to execute. It's best to make things query tests if possible since 
starting a cluster is slow and the tests aren't parallelisable.



--
To view, visit http://gerrit.cloudera.org:8080/8170
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If680240a3cd4583fc97c3192177d86d9567c4f8d
Gerrit-Change-Number: 8170
Gerrit-PatchSet: 4
Gerrit-Owner: Tianyi Wang <[email protected]>
Gerrit-Reviewer: Sailesh Mukil <[email protected]>
Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]>
Gerrit-Reviewer: Tianyi Wang <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Comment-Date: Fri, 13 Oct 2017 20:34:05 +0000
Gerrit-HasComments: Yes

Reply via email to