Lars Volker has posted comments on this change. Change subject: IMPALA-5061: Populate null_count in parquet::statistics ......................................................................
Patch Set 6: (6 comments) http://gerrit.cloudera.org:8080/#/c/7058/5/be/src/exec/hdfs-parquet-scanner.cc File be/src/exec/hdfs-parquet-scanner.cc: Line 961: Status err(Substitute("Corrupt Parquet file '$0': column '$1' " > Sorry about that! I won't do it henceforth. No worries :) http://gerrit.cloudera.org:8080/#/c/7058/6/be/src/exec/hdfs-parquet-scanner.cc File be/src/exec/hdfs-parquet-scanner.cc: Line 553: // We need min stats. Where did "to get" go? http://gerrit.cloudera.org:8080/#/c/7058/5/be/src/exec/hdfs-parquet-table-writer.cc File be/src/exec/hdfs-parquet-table-writer.cc: PS5, Line 197: page_stats_base_ > I am not sure, I noticed Line #688 does a DCHECK. Should I change it to a D I think changing this to a DCHECK is preferable, since it always has to be set. http://gerrit.cloudera.org:8080/#/c/7058/5/be/src/exec/parquet-column-stats.h File be/src/exec/parquet-column-stats.h: Line 162: void Update(const T& min_value, const T& max_value); > Done My idea was to keep this line, but change it to void Update(const T& v) { Update(v, v); } and remove the implementation from the .cc file. http://gerrit.cloudera.org:8080/#/c/7058/6/be/src/exec/parquet-column-stats.h File be/src/exec/parquet-column-stats.h: PS6, Line 60: nit: double space PS6, Line 99: appeneded nit: typo -- To view, visit http://gerrit.cloudera.org:8080/7058 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4c49a63af84c2234f0633be63206cb52eb7e8ebb Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pooja Nilangekar <[email protected]> Gerrit-Reviewer: Lars Volker <[email protected]> Gerrit-Reviewer: Pooja Nilangekar <[email protected]> Gerrit-HasComments: Yes
