Hello Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/16771

to look at the new patch set (#4).

Change subject: IMPALA-10345: Impala hits DCHECK in parquet-column-stats.inline.h
......................................................................

IMPALA-10345: Impala hits DCHECK in parquet-column-stats.inline.h

During Parquet file writing, a DCHECK checks if row group stats have
copied the min/max string values into their internal buffers. This check
is at the finalization of each page. The copying of the string values
happened at the end of each row batch.

Thus, if a row batch spans over multiple pages then the min/max
string values don't get copied by the end of the page. Since the
memory is attached to the row batch this isn't really an error.

As a workaround this commit also copies the min/max string values
at the end of the page if they haven't been copied yet.

Testing
* Added e2e test

Change-Id: I4289bd743e951cc4c607d5a5ea75d27825a1c12b
---
M be/src/exec/parquet/hdfs-parquet-table-writer.cc
M testdata/workloads/functional-query/queries/QueryTest/parquet-page-index.test
M tests/query_test/test_parquet_stats.py
3 files changed, 23 insertions(+), 1 deletion(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/16771/4
--
To view, visit http://gerrit.cloudera.org:8080/16771
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4289bd743e951cc4c607d5a5ea75d27825a1c12b
Gerrit-Change-Number: 16771
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
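
For readers who do not have the patch open, the situation described in the commit message can be illustrated with a small toy model. This is only a sketch and not the actual Impala code: StringStats, PageWriter, CopyToInternalBuffers and every other name below are hypothetical, and the assert merely stands in for the DCHECK. It assumes min/max are tracked as views into row-batch memory until they are copied into owned buffers.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical stand-in for string column stats. min/max first point into
// memory owned by the row batch and are copied into owned buffers on request.
class StringStats {
 public:
  void Update(std::string_view v) {
    if (!has_value_ || v < min_) { min_ = v; min_copied_ = false; }
    if (!has_value_ || v > max_) { max_ = v; max_copied_ = false; }
    has_value_ = true;
  }

  // Copies any value that still references external memory.
  void CopyToInternalBuffers() {
    if (!min_copied_) { min_buf_.assign(min_); min_ = min_buf_; min_copied_ = true; }
    if (!max_copied_) { max_buf_.assign(max_); max_ = max_buf_; max_copied_ = true; }
  }

  bool IsCopied() const { return min_copied_ && max_copied_; }
  std::string_view min() const { return min_; }
  std::string_view max() const { return max_; }

 private:
  bool has_value_ = false;
  bool min_copied_ = true;
  bool max_copied_ = true;
  std::string_view min_, max_;
  std::string min_buf_, max_buf_;
};

// Hypothetical writer: a page is finalized after a fixed number of rows, so a
// single row batch can span several pages.
class PageWriter {
 public:
  explicit PageWriter(std::size_t rows_per_page) : rows_per_page_(rows_per_page) {}

  void AppendRow(std::string_view v) {
    stats.Update(v);
    if (++rows_in_page_ == rows_per_page_) FinalizePage();
  }

  // Copying at the end of the row batch was the only copy before the change.
  void EndRowBatch() { stats.CopyToInternalBuffers(); }

  StringStats stats;
  int pages_finalized = 0;

 private:
  void FinalizePage() {
    // The workaround: copy here too if the batch has not ended yet.
    if (!stats.IsCopied()) stats.CopyToInternalBuffers();
    assert(stats.IsCopied());  // stands in for the DCHECK at page finalization
    ++pages_finalized;
    rows_in_page_ = 0;
  }

  std::size_t rows_per_page_;
  std::size_t rows_in_page_ = 0;
};
```

In this model, removing the conditional copy in FinalizePage() makes the assert fire as soon as a page closes in the middle of a row batch, which mirrors the failure described above.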