Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/8034 )
Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder ...................................................................... Patch Set 3: (5 comments) http://gerrit.cloudera.org:8080/#/c/8034/3/be/src/exec/data-sink.h File be/src/exec/data-sink.h: http://gerrit.cloudera.org:8080/#/c/8034/3/be/src/exec/data-sink.h@126 PS3, Line 126: /// This memtracker tracks all the allocations done by parquet dictonary encoder. : boost::scoped_ptr<MemTracker> dict_mem_tracker_; Can we move this down to HdfsParquetTableWriter? http://gerrit.cloudera.org:8080/#/c/8034/3/be/src/exec/exec-node.h File be/src/exec/exec-node.h: http://gerrit.cloudera.org:8080/#/c/8034/3/be/src/exec/exec-node.h@325 PS3, Line 325: /// Tracks all the memory allocation by parquet dictonary decoders : boost::scoped_ptr<MemTracker> dict_mem_tracker_; Can this move down to HdfsParquetScanner? Other ExecNodes don't need this. http://gerrit.cloudera.org:8080/#/c/8034/3/be/src/util/dict-encoding.h File be/src/util/dict-encoding.h: http://gerrit.cloudera.org:8080/#/c/8034/3/be/src/util/dict-encoding.h@113 PS3, Line 113: Remove trailing spaces. This also applies to lines that don't have any code (there are a couple others in this file). These are easily visible in the Gerrit UI. If using emacs, the following will highlight trailing whitespace: (setq-default show-trailing-whitespace t) http://gerrit.cloudera.org:8080/#/c/8034/3/be/src/util/dict-encoding.h@164 PS3, Line 164: mtrackp For class fields, the convention is to end the field name with an underscore. Also, look at other code that uses MemTrackers and see what variable name that code uses. It is good to harmonize these things across the codebase. http://gerrit.cloudera.org:8080/#/c/8034/3/be/src/util/dict-encoding.h@412 PS3, Line 412: ConsumeBytes(sizeof(value)); The parquet DictionaryPageHeader contains a num_values field. Look where we call CreateDictionaryDecoder and you'll see that we reference to it on the next line. I think we should consider doing a single ConsumeBytes using num_values * size of type. -- To view, visit http://gerrit.cloudera.org:8080/8034 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299 Gerrit-Change-Number: 8034 Gerrit-PatchSet: 3 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Joe McDonnell <[email protected]> Gerrit-Reviewer: Pranay Singh Gerrit-Comment-Date: Thu, 28 Sep 2017 16:55:24 +0000 Gerrit-HasComments: Yes
