Joe McDonnell has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/6181

Change subject: Fix parquet table writer dictionary leak
......................................................................

Fix parquet table writer dictionary leak

Currently, in HdfsTableSink, OutputPartitions are added to the
RuntimeState object pool to be freed at the end of the query. However,
for clustered inserts into a partitioned table, the OutputPartitions
are only used one at a time. They can be immediately freed once done
writing to that partition.

In addition, the HdfsParquetTableWriter's ColumnWriters are also
added to this object pool. These constitute a significant amount of
memory, as they contain the dictionaries for Parquet encoding.

This change makes HdfsParquetTableWriter's ColumnWriters use
unique_ptrs so that they are cleaned up when the
HdfsParquetTableWriter is deleted. It also explicitly cleans up the
OutputPartition rather than leaving it to the object pool.

Change-Id: I06e354086ad24071d4fbf823f25f5df23933688f
---
M be/src/exec/hdfs-parquet-table-writer.cc
M be/src/exec/hdfs-parquet-table-writer.h
M be/src/exec/hdfs-table-sink.cc
3 files changed, 14 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/81/6181/1
--
To view, visit http://gerrit.cloudera.org:8080/6181
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I06e354086ad24071d4fbf823f25f5df23933688f
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell <[email protected]>
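
Illustration of the ownership change described in the commit message.
This is a minimal, self-contained sketch and not the code from the
patch: ToyColumnWriter, ToyObjectPool, PooledTableWriter,
OwningTableWriter and the two PeakLive* helpers are hypothetical names
invented here, and the pool is assumed to release its objects only when
it is destroyed (standing in for "freed at the end of the query"). The
sketch counts how many column writers are alive while partitions are
written one at a time.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a column writer holding an encoding dictionary.
struct ToyColumnWriter {
  static int live;  // number of instances currently alive
  std::vector<int> dictionary;
  ToyColumnWriter() : dictionary(1024) { ++live; }
  ~ToyColumnWriter() { --live; }
};
int ToyColumnWriter::live = 0;

// Hypothetical pool: everything added stays alive until the pool is destroyed.
class ToyObjectPool {
 public:
  ToyColumnWriter* Add(std::unique_ptr<ToyColumnWriter> w) {
    objects_.push_back(std::move(w));
    return objects_.back().get();
  }

 private:
  std::vector<std::unique_ptr<ToyColumnWriter>> objects_;
};

// Before: column writers belong to the pool, so destroying the table
// writer releases none of them.
class PooledTableWriter {
 public:
  PooledTableWriter(ToyObjectPool* pool, int num_columns) {
    for (int i = 0; i < num_columns; ++i) {
      columns_.push_back(pool->Add(std::make_unique<ToyColumnWriter>()));
    }
  }

 private:
  std::vector<ToyColumnWriter*> columns_;  // non-owning
};

// After: the table writer owns its column writers through unique_ptr,
// so they are destroyed together with it.
class OwningTableWriter {
 public:
  explicit OwningTableWriter(int num_columns) {
    for (int i = 0; i < num_columns; ++i) {
      columns_.push_back(std::make_unique<ToyColumnWriter>());
    }
  }

 private:
  std::vector<std::unique_ptr<ToyColumnWriter>> columns_;
};

// Writes partitions one at a time; returns the peak count of live column writers.
int PeakLiveWithPool(int partitions, int columns) {
  assert(ToyColumnWriter::live == 0);
  ToyObjectPool pool;  // destroyed only when this function returns
  int peak = 0;
  for (int p = 0; p < partitions; ++p) {
    PooledTableWriter writer(&pool, columns);
    peak = std::max(peak, ToyColumnWriter::live);
  }
  return peak;
}

int PeakLiveWithUniquePtr(int partitions, int columns) {
  assert(ToyColumnWriter::live == 0);
  int peak = 0;
  for (int p = 0; p < partitions; ++p) {
    OwningTableWriter writer(columns);  // freed at the end of each iteration
    peak = std::max(peak, ToyColumnWriter::live);
  }
  return peak;
}
```

With 4 partitions of 3 columns each, the pooled variant accumulates all
12 column writers before releasing any, while the unique_ptr variant
never holds more than the 3 belonging to the partition being written.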
