Lars Volker has posted comments on this change. Change subject: IMPALA-5061: Populate null_count in parquet::statistics ......................................................................
Patch Set 5: (9 comments) http://gerrit.cloudera.org:8080/#/c/7058/5/be/src/exec/hdfs-parquet-scanner.cc File be/src/exec/hdfs-parquet-scanner.cc: Line 961: Status err(Substitute("Corrupt Parquet file '$0': column '$1' " If possible, please don't rebase changes while they are under review. Otherwise the rebase can clutter the view when comparing recent change sets. http://gerrit.cloudera.org:8080/#/c/7058/5/be/src/exec/hdfs-parquet-table-writer.cc File be/src/exec/hdfs-parquet-table-writer.cc: PS5, Line 197: page_stats_base_ can this ever be nullptr? http://gerrit.cloudera.org:8080/#/c/7058/5/be/src/exec/parquet-column-stats.cc File be/src/exec/parquet-column-stats.cc: Line 58: DCHECK(false) << "Unsupported statistics value field requested"; undo http://gerrit.cloudera.org:8080/#/c/7058/5/be/src/exec/parquet-column-stats.h File be/src/exec/parquet-column-stats.h: Line 60: /// Enum to select the statistics field to be read. Values do not nit: Please undo this change, too. Line 72: static bool ReadValueFromThrift(const parquet::ColumnChunk& col_chunk, Undo Line 100: void UpdateNullCount(int64_t count = 1) { null_count_ += count; } Update sounds like it will set it to a new value. Maybe "AddToNullCount" or "IncrementNullCount" would be better names. When calling this, IncrementNullCount(1) may be more explicit and thus more clear. Then you could drop the default value. Line 162: void Update(const T& v); You can remove the implementation of Update(const T& v) and instead put a single line "Update(v, v);" here. Line 163: void Update(const T& min_value, const T& max_value); Please add a comment to this method, too. http://gerrit.cloudera.org:8080/#/c/7058/5/be/src/exec/parquet-column-stats.inline.h File be/src/exec/parquet-column-stats.inline.h: Line 61: Update(cs->min_value_, cs->max_value_); nit: This could go on a single line now -- To view, visit http://gerrit.cloudera.org:8080/7058 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4c49a63af84c2234f0633be63206cb52eb7e8ebb Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Pooja Nilangekar <[email protected]> Gerrit-Reviewer: Lars Volker <[email protected]> Gerrit-Reviewer: Pooja Nilangekar <[email protected]> Gerrit-HasComments: Yes
