Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-5776: Write partial tuple to the correct mempool
......................................................................


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7639/4/be/src/exec/hdfs-text-scanner.h
File be/src/exec/hdfs-text-scanner.h:

Line 194:   boost::scoped_ptr<MemPool> boundary_pool_;
> I still don't understand the memory lifetime for boundary_row_, boundary_co
Data is not always copied out of boundary_col. It can end up here: 
http://github.mtv.cloudera.com/CDH/Impala/blob/e5e444a89008a22f987c517ef1ecaa9f1693b060/be/src/exec/text-converter.inline.h#L71

This happens when reuse_data is true, which is always the case when codegen
is enabled. The tuple can then still point into the boundary pool, so it is
not safe to free that pool. The lifetime of the boundary pool is therefore
different from that of the partial tuple pool.

Maybe this can be addressed in a different patch?
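
To illustrate the lifetime concern, here is a minimal Python sketch. It is
only an analogy, not Impala code: parse_field and the bytearray standing in
for the boundary pool are hypothetical, and only the reuse_data flag name
comes from the discussion above. With reuse_data set, the parsed value
aliases the buffer instead of copying it, so clearing the buffer changes the
value afterwards.

```python
def parse_field(buffer, start, length, reuse_data):
    """Hypothetical stand-in for a text converter writing a string slot."""
    view = memoryview(buffer)[start:start + length]
    if reuse_data:
        # No copy: the result keeps pointing into the caller's buffer.
        return view
    # Copy: the result owns its bytes and outlives the buffer contents.
    return bytes(view)


# The bytearray plays the role of memory owned by the boundary pool.
boundary = bytearray(b"partial,row")
copied = parse_field(boundary, 0, 7, reuse_data=False)
aliased = parse_field(boundary, 0, 7, reuse_data=True)

# Simulate the pool being cleared and reused while values are still live.
for i in range(len(boundary)):
    boundary[i] = 0

copied_after = bytes(copied)    # still the original field bytes
aliased_after = bytes(aliased)  # now reflects the cleared buffer
```

In this sketch the copied value survives the clear and the aliased one does
not, which is the reason the two pools cannot share a lifetime.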


http://gerrit.cloudera.org:8080/#/c/7639/4/tests/query_test/test_scanners_fuzz.py
File tests/query_test/test_scanners_fuzz.py:

Line 217:     num_corruptions = rng.randint(0, int(math.log(len(data))))
> Do these changes reproduce the bug?
No, I don't think they reproduce the bug, but I think the changes are worth
making anyway.
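
For context on the line quoted above, here is a rough sketch of how a
log-scaled corruption count behaves. Only the num_corruptions expression is
taken from the test. The corrupt helper and the overwrite-a-random-byte
strategy are assumptions for illustration and may not match what the test
actually does.

```python
import math
import random


def corrupt(data, rng):
    """Return (fuzzed copy of data, number of overwrites attempted)."""
    out = bytearray(data)
    if len(out) < 2:
        # math.log(0) raises ValueError and log(1) == 0: nothing to corrupt.
        return out, 0
    # randint is inclusive on both ends, so zero corruptions is possible.
    num_corruptions = rng.randint(0, int(math.log(len(out))))
    for _ in range(num_corruptions):
        offset = rng.randint(0, len(out) - 1)
        # Assumed strategy: overwrite one byte with a random value.
        out[offset] = rng.randint(0, 255)
    return out, num_corruptions


rng = random.Random(0)  # fixed seed so a failing input can be replayed
original = bytes(1024)
fuzzed, n = corrupt(original, rng)
changed = sum(1 for a, b in zip(original, fuzzed) if a != b)
```

For a 1024-byte input the upper bound is int(ln 1024) = 6, so the number of
corruptions grows very slowly with file size. The count of bytes that
actually differ can be lower than n, since an overwrite may hit the same
offset twice or write the value that was already there.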


-- 
To view, visit http://gerrit.cloudera.org:8080/7639

Gerrit-MessageType: comment
Gerrit-Change-Id: I60ba5c113aefd17f697c1888fd46a237ef396540
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky
Gerrit-Reviewer: Alex Behm
Gerrit-Reviewer: Taras Bobrovytsky
Gerrit-Reviewer: Tim Armstrong
Gerrit-HasComments: Yes
