Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/8034 )
Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder ...................................................................... Patch Set 6: (5 comments) http://gerrit.cloudera.org:8080/#/c/8034/6/be/src/exec/hdfs-parquet-scanner.h File be/src/exec/hdfs-parquet-scanner.h: http://gerrit.cloudera.org:8080/#/c/8034/6/be/src/exec/hdfs-parquet-scanner.h@485 PS6, Line 485: /// Tracks all the memory allocation used by dictionary in DictDecoder. : struct DictMemTrack { : std::unique_ptr<MemTracker> mem_tracker; : : DictMemTrack(MemTracker *parent_memtrack):mem_tracker(new MemTracker(-1, : "Decoder parquet dict", parent_memtrack, false)) { }; : : MemTracker* GetMemTracker() { : return mem_tracker.get(); : } : : ~DictMemTrack() { : mem_tracker->CloseAndUnregisterFromParent(); : } : }; I dont think we need to add a new struct for this, you can just call CloseAndUnregisterFromParent() on the dict_mem_tracker in HdfsParquetScanner::Close Similarly for the writer http://gerrit.cloudera.org:8080/#/c/8034/6/be/src/exec/hdfs-parquet-table-writer.cc File be/src/exec/hdfs-parquet-table-writer.cc: http://gerrit.cloudera.org:8080/#/c/8034/6/be/src/exec/hdfs-parquet-table-writer.cc@113 PS6, Line 113: // Bytes used for memory tracking by dictionary in DictEncoder is decremented. : virtual void Cleanup() { : if (dict_encoder_base_ != nullptr) { : dict_encoder_base_->Cleanup(); : } : } you can move this to the BaseColumnWriter::Close() method http://gerrit.cloudera.org:8080/#/c/8034/6/be/src/exec/hdfs-parquet-table-writer.cc@767 PS6, Line 767: // Below code will decrement the bytes used for memory tracking by dictionary in : // DictEncoder class for each ColumnWriter. : for (int i = 0; i < columns_.size(); i++) { : if (columns_[i] != nullptr) { : columns_[i]->Cleanup(); : } : } similarly you can move this to the HdfsParquetTableWriter::Close() method http://gerrit.cloudera.org:8080/#/c/8034/6/be/src/util/dict-encoding.h File be/src/util/dict-encoding.h: http://gerrit.cloudera.org:8080/#/c/8034/6/be/src/util/dict-encoding.h@138 PS6, Line 138: SizeofDict I think we can use a more precise name here, SizeOfDict might be confused with returning the number of elements in the Dictionary. Something on the lines of DictByteSize? http://gerrit.cloudera.org:8080/#/c/8034/6/be/src/util/dict-encoding.h@438 PS6, Line 438: bytes_alloc += sizeof(value); can you switch to what joe suggested earlier, that is, instead of bytes_alloc, using something like the function SizeofDict() that you wrote to update ConsumeBytes at the end. Refer Joe's first comment. -- To view, visit http://gerrit.cloudera.org:8080/8034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 Gerrit-Change-Number: 8034 Gerrit-PatchSet: 6 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Bikramjeet Vig <[email protected]> Gerrit-Reviewer: Joe McDonnell <[email protected]> Gerrit-Reviewer: Pranay Singh Gerrit-Comment-Date: Wed, 04 Oct 2017 22:29:59 +0000 Gerrit-HasComments: Yes
