Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8170 )
Change subject: IMPALA-5789: Add always_false flag in bloom filter ...................................................................... Patch Set 4: (5 comments) http://gerrit.cloudera.org:8080/#/c/8170/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8170/4//COMMIT_MSG@15 PS4, Line 15: Did you do any perf runs? It would be good to verify that the extra flag checking doesn't affect perf (I suspect it doesn't). It would also be good to confirm that testdata/workloads/targeted-perf/queries/primitive_empty_build_join_1.test gets faster (I think it should!). http://gerrit.cloudera.org:8080/#/c/8170/4/be/src/exec/hdfs-parquet-scanner.cc File be/src/exec/hdfs-parquet-scanner.cc: http://gerrit.cloudera.org:8080/#/c/8170/4/be/src/exec/hdfs-parquet-scanner.cc@403 PS4, Line 403: std:: Shouldn't need std:: - it's imported in common/names.h http://gerrit.cloudera.org:8080/#/c/8170/4/be/src/exec/hdfs-scanner.cc File be/src/exec/hdfs-scanner.cc: http://gerrit.cloudera.org:8080/#/c/8170/4/be/src/exec/hdfs-scanner.cc@107 PS4, Line 107: if (!BaseSequenceScanner::FileFormatIsSequenceBased( Might be more readable to factor subexpression into variable e.g. is_sequence_based. http://gerrit.cloudera.org:8080/#/c/8170/4/tests/custom_cluster/test_always_false_filter.py File tests/custom_cluster/test_always_false_filter.py: http://gerrit.cloudera.org:8080/#/c/8170/4/tests/custom_cluster/test_always_false_filter.py@53 PS4, Line 53: seq_table_suffixes = ['_avro', '_rc', '_seq'] We'd normally create a test matrix based on these. Is the idea here to avoid restarting the cluster for each file format? If so it would be good to leave a comment so that readers understand why. http://gerrit.cloudera.org:8080/#/c/8170/4/tests/custom_cluster/test_always_false_filter.py@82 PS4, Line 82: def test_skip_file(self, cursor): Does this need to be a custom cluster test? I.e. does it need a special minicluster to execute. It's best to make things query tests if possible since starting a cluster is slow and the tests aren't parallelisable. -- To view, visit http://gerrit.cloudera.org:8080/8170 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If680240a3cd4583fc97c3192177d86d9567c4f8d Gerrit-Change-Number: 8170 Gerrit-PatchSet: 4 Gerrit-Owner: Tianyi Wang <[email protected]> Gerrit-Reviewer: Sailesh Mukil <[email protected]> Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]> Gerrit-Reviewer: Tianyi Wang <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Comment-Date: Fri, 13 Oct 2017 20:34:05 +0000 Gerrit-HasComments: Yes
