Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8034 )

Change subject: IMPALA-5522:Use tracked memory for DictDecoder and DictEncoder
......................................................................


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8034/3/be/src/util/dict-encoding.h
File be/src/util/dict-encoding.h:

http://gerrit.cloudera.org:8080/#/c/8034/3/be/src/util/dict-encoding.h@412
PS3, Line 412: ConsumeBytes(sizeof(value));
> The parquet DictionaryPageHeader contains a num_values field. Look where we
Using num_values * size of the type might not work when dealing with a
variable-sized type like string.

I would recommend keeping a count of the bytes used in a local variable and
then calling ConsumeBytes once when we exit the loop. This is because mtrackp
loops over all of its parent trackers to update memory usage every time
Consume(num_bytes) is called. Not a huge optimization, but I think it is worth
avoiding another loop inside the hot path.
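
A minimal sketch of the batching idea above, assuming a simplified tracker
that walks its parent chain on every Consume() call. SketchTracker,
ConsumePerValue and ConsumeBatched are hypothetical names made up for
illustration; they are not Impala's MemTracker or the code in this patch.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a hierarchical memory tracker (an assumption,
// not Impala's MemTracker). Each Consume() walks this tracker and all of
// its ancestors, so a single call costs O(depth of the tracker tree).
class SketchTracker {
 public:
  explicit SketchTracker(SketchTracker* parent = nullptr) : parent_(parent) {}

  void Consume(int64_t num_bytes) {
    for (SketchTracker* t = this; t != nullptr; t = t->parent_) {
      t->consumption_ += num_bytes;
      ++t->updates_;  // Counts how often this tracker was touched.
    }
  }

  int64_t consumption() const { return consumption_; }
  int64_t updates() const { return updates_; }

 private:
  SketchTracker* parent_;
  int64_t consumption_ = 0;
  int64_t updates_ = 0;
};

// Per-value accounting: one walk over the parent chain for every value.
inline void ConsumePerValue(const std::vector<std::string>& values,
                            SketchTracker* tracker) {
  assert(tracker != nullptr);
  for (const std::string& v : values) {
    tracker->Consume(static_cast<int64_t>(v.size()));
  }
}

// Batched accounting: sum the bytes in a local variable inside the loop,
// then walk the parent chain exactly once after the loop exits.
inline void ConsumeBatched(const std::vector<std::string>& values,
                           SketchTracker* tracker) {
  assert(tracker != nullptr);
  int64_t total_bytes = 0;
  for (const std::string& v : values) {
    total_bytes += static_cast<int64_t>(v.size());
  }
  tracker->Consume(total_bytes);
}
```

Both functions leave the same consumption on every tracker in the chain; the
batched version only reduces the number of parent-chain walks from one per
value to one per loop. Summing v.size() also sidesteps the fixed-size
assumption for variable-sized values such as strings.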


http://gerrit.cloudera.org:8080/#/c/8034/3/be/src/util/dict-test.cc
File be/src/util/dict-test.cc:

http://gerrit.cloudera.org:8080/#/c/8034/3/be/src/util/dict-test.cc@39
PS3, Line 39: tracker
Can you add test cases that verify that the encoder/decoder is tracking memory
correctly?
You can do this by using tracker.consumption() to get the number of bytes
consumed and comparing it to the expected size you calculate separately.
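
A rough sketch of the kind of check described above, under stated assumptions:
FakeTracker, ToyDictEncoder and ExpectedDictBytes are hypothetical stand-ins
written for illustration, not the classes in dict-encoding.h or dict-test.cc,
and the toy encoder charges only the raw value bytes (a real encoder may also
account for per-entry overhead).

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical tracker exposing only what the check needs.
struct FakeTracker {
  int64_t consumed = 0;
  void Consume(int64_t num_bytes) { consumed += num_bytes; }
  int64_t consumption() const { return consumed; }
};

// Hypothetical dictionary encoder: charges the tracker once for each new
// distinct value and returns the dictionary index of the value.
class ToyDictEncoder {
 public:
  explicit ToyDictEncoder(FakeTracker* tracker) : tracker_(tracker) {
    assert(tracker != nullptr);
  }

  int Put(const std::string& value) {
    auto it = index_.find(value);
    if (it != index_.end()) return it->second;  // Already tracked.
    const int id = static_cast<int>(index_.size());
    index_.emplace(value, id);
    tracker_->Consume(static_cast<int64_t>(value.size()));
    return id;
  }

 private:
  FakeTracker* tracker_;
  std::unordered_map<std::string, int> index_;
};

// Expected size, computed independently of the encoder: the sum of the
// byte lengths of the distinct input values.
inline int64_t ExpectedDictBytes(const std::vector<std::string>& values) {
  std::set<std::string> distinct(values.begin(), values.end());
  int64_t total = 0;
  for (const std::string& v : distinct) {
    total += static_cast<int64_t>(v.size());
  }
  return total;
}
```

A test would feed a known set of values through the encoder and then compare
tracker.consumption() against ExpectedDictBytes(values). Including duplicate
values in the input also checks that repeated entries are not counted twice.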



--
To view, visit http://gerrit.cloudera.org:8080/8034
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02a3b54f6c107d19b62ad9e1c49df94175964299
Gerrit-Change-Number: 8034
Gerrit-PatchSet: 3
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Bikramjeet Vig <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Pranay Singh
Gerrit-Comment-Date: Fri, 29 Sep 2017 01:16:40 +0000
Gerrit-HasComments: Yes
