Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/11575 )
Change subject: IMPALA-6964: Track stats about column and page sizes in Parquet reader ...................................................................... Patch Set 14: (12 comments) http://gerrit.cloudera.org:8080/#/c/11575/14/be/src/exec/hdfs-scan-node-base.h File be/src/exec/hdfs-scan-node-base.h: http://gerrit.cloudera.org:8080/#/c/11575/14/be/src/exec/hdfs-scan-node-base.h@577 PS14, Line 577: std::unordered_map<SlotId, std::pair<std::atomic<int64_t>, std::atomic<int64_t>>> > Use our own atomic implementations in common/atomic.h Done http://gerrit.cloudera.org:8080/#/c/11575/14/be/src/exec/parquet/hdfs-parquet-scanner.h File be/src/exec/parquet/hdfs-parquet-scanner.h: http://gerrit.cloudera.org:8080/#/c/11575/14/be/src/exec/parquet/hdfs-parquet-scanner.h@654 PS14, Line 654: Update > nit: updates Done http://gerrit.cloudera.org:8080/#/c/11575/14/be/src/exec/parquet/hdfs-parquet-scanner.h@658 PS14, Line 658: Update > nit: updates Done http://gerrit.cloudera.org:8080/#/c/11575/14/tests/query_test/test_scanners.py File tests/query_test/test_scanners.py: http://gerrit.cloudera.org:8080/#/c/11575/14/tests/query_test/test_scanners.py@808 PS14, Line 808: def test_page_size_counters_uncompressed(self, vector): > I think you could also pull all (un)compressed tests into one method, e.g. Done http://gerrit.cloudera.org:8080/#/c/11575/14/tests/query_test/test_scanners.py@822 PS14, Line 822: assert summary['max'] == summary['min'] == summary['avg'] == summary['samples'] == 0 > maybe Moved into a helper method http://gerrit.cloudera.org:8080/#/c/11575/14/tests/query_test/test_scanners.py@832 PS14, Line 832: assert summary['max'] > 0 and summary['min'] > 0 and summary['avg'] > 0 and \ > maybe Done http://gerrit.cloudera.org:8080/#/c/11575/14/tests/query_test/test_scanners.py@839 PS14, Line 839: + > I suspect the + might not be needed here. Done http://gerrit.cloudera.org:8080/#/c/11575/14/tests/query_test/test_scanners.py@844 PS14, Line 844: get_bytes_summary_stats_counter(summary_name, runtime_profile) > nit: indent 4, or wrap after opening parentheses (see PEP8, I think it says Done http://gerrit.cloudera.org:8080/#/c/11575/14/tests/query_test/test_scanners.py@859 PS14, Line 859: "ParquetCompressedBytesReadPerColumn", runtime_profile) > nit: indent 4, maybe check everywhere else Done http://gerrit.cloudera.org:8080/#/c/11575/14/tests/util/parse_util.py File tests/util/parse_util.py: http://gerrit.cloudera.org:8080/#/c/11575/14/tests/util/parse_util.py@152 PS14, Line 152: [.]*[0-9]* > Then this only needs to match the number in (), right? so we don't need the Done http://gerrit.cloudera.org:8080/#/c/11575/14/tests/util/parse_util.py@152 PS14, Line 152: .*? > This matches non-greedily until the opening (, right? Done http://gerrit.cloudera.org:8080/#/c/11575/14/tests/util/parse_util.py@155 PS14, Line 155: * > Nit: here and elsewhere you likely mean "+" (1 or more) instead of "*" (zer Done -- To view, visit http://gerrit.cloudera.org:8080/11575 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I322f9b324b6828df28e5caf79529085c43d7c817 Gerrit-Change-Number: 11575 Gerrit-PatchSet: 14 Gerrit-Owner: Sahil Takiar <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Lars Volker <[email protected]> Gerrit-Reviewer: Sahil Takiar <[email protected]> Gerrit-Comment-Date: Thu, 03 Jan 2019 18:06:28 +0000 Gerrit-HasComments: Yes
