Joe McDonnell has posted comments on this change.

Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
......................................................................


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8034/2/be/src/exec/parquet-column-readers.cc
File be/src/exec/parquet-column-readers.cc:

PS2, Line 213: dict_decoder_(parent_->dictionary_pool_.get()),
We should think about having a separate MemPool for parts of the dictionary 
that will not be referenced by the RowBatch and can be freed at the end of the 
row group.

Since some datatypes are copied by value rather than pointing into the 
dictionary, this might be used for those as well.


http://gerrit.cloudera.org:8080/#/c/8034/2/be/src/util/dict-encoding.h
File be/src/util/dict-encoding.h:

PS2, Line 229:     memcpy(val_ptr, dict_[index], sizeof(T));
The memcpy is doing a dereference here and is not different from *val_ptr = 
*dict_[index];


PS2, Line 238:   std::vector<T*> dict_;
Using a T* means that GetValue() must do a dereference to obtain the value. 
This is a very performance sensitive codepath, so we need to avoid this extra 
dereference. This should maintain the value directly in the vector.


-- 
To view, visit http://gerrit.cloudera.org:8080/8034
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Pranay Singh
Gerrit-HasComments: Yes

Reply via email to