Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5776: Write partial tuple to the correct mempool
......................................................................

IMPALA-5776: Write partial tuple to the correct mempool

In the text scanner, we were writing the variable-length data of the
partial tuple to the data_buffer_pool_ mempool, which caused strange
behavior, such as incorrect results. If we are scanning compressed data,
the pool gets attached to the row batch at the end of a GetNext() call
and gets freed before the next GetNext() call. This is wrong because we
expect the data in the partial tuple to survive between GetNext() calls.
If we are scanning uncompressed data, data_buffer_pool_ never gets
cleared and grows over time until the scanner finishes reading the scan
range.

We fix the problem by writing the varlen partial tuple data to
boundary_pool, which is where the constant-length partial tuple data is
written. We also make sure that boundary_pool does not hold any tuple
data of returned batches by always deep copying it to output batches.

Testing:
- Ran some tests locally on an ASAN build.
- Updated test_scanners_fuzz.py to make slightly more significant
  changes to the data files. This change was helpful for finding issues
  while developing this patch.
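
The lifetime rule described above can be illustrated with a small, self-contained toy model. This is only a sketch under stated assumptions, not the code from the patch: ToyPool, ToyScanner, VarLenSlot, SavePartial(), EndOfGetNext() and EmitTuple() are hypothetical names invented for illustration and are not Impala's MemPool or HdfsTextScanner API. It shows partial-tuple bytes being kept in a pool that survives between GetNext() calls, then deep copied into the output batch so the boundary pool can be cleared safely.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a memory pool: owns buffers until Clear().
class ToyPool {
 public:
  // Copies len bytes into pool-owned memory and returns the copy.
  char* Copy(const char* src, std::size_t len) {
    std::unique_ptr<char[]> buf(new char[len]);
    if (len > 0) std::memcpy(buf.get(), src, len);
    char* p = buf.get();
    chunks_.push_back(std::move(buf));
    return p;
  }
  void Clear() { chunks_.clear(); }
  std::size_t num_chunks() const { return chunks_.size(); }

 private:
  std::vector<std::unique_ptr<char[]>> chunks_;
};

// Hypothetical variable-length slot: a pointer into some pool plus a length.
struct VarLenSlot {
  const char* ptr;
  std::size_t len;
};

// Hypothetical scanner holding the two pools named in the commit message.
class ToyScanner {
 public:
  // Saves the first part of a value that is split across two reads.
  // The bytes go to the boundary pool, which survives between calls.
  void SavePartial(const std::string& bytes) {
    partial_.ptr = boundary_pool_.Copy(bytes.data(), bytes.size());
    partial_.len = bytes.size();
  }

  // Models the end of a GetNext() call on compressed data: the data
  // buffer pool is handed off and its memory is no longer valid here.
  void EndOfGetNext() { data_buffer_pool_.Clear(); }

  // Completes the tuple and deep copies it into the output pool, so the
  // boundary pool holds no data of returned batches and can be cleared.
  VarLenSlot EmitTuple(const std::string& rest, ToyPool* out_pool) {
    assert(out_pool != nullptr);
    std::string full;
    if (partial_.ptr != nullptr) full.assign(partial_.ptr, partial_.len);
    full += rest;
    VarLenSlot out{out_pool->Copy(full.data(), full.size()), full.size()};
    boundary_pool_.Clear();
    partial_ = VarLenSlot{nullptr, 0};
    return out;
  }

  std::size_t boundary_chunks() const { return boundary_pool_.num_chunks(); }

 private:
  ToyPool data_buffer_pool_;  // cleared at the end of each GetNext() call
  ToyPool boundary_pool_;     // survives between GetNext() calls
  VarLenSlot partial_{nullptr, 0};
};
```

In this toy model, calling SavePartial("hel"), then EndOfGetNext(), then EmitTuple("lo", &out) yields a slot in the output pool that reads "hello". Had the partial bytes been stored in data_buffer_pool_ instead, the pointer would dangle after EndOfGetNext().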

Change-Id: I60ba5c113aefd17f697c1888fd46a237ef396540
Reviewed-on: http://gerrit.cloudera.org:8080/7639
Reviewed-by: Taras Bobrovytsky <[email protected]>
Tested-by: Impala Public Jenkins
---
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/hdfs-text-scanner.h
M tests/query_test/test_scanners_fuzz.py
3 files changed, 65 insertions(+), 51 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Taras Bobrovytsky: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/7639
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I60ba5c113aefd17f697c1888fd46a237ef396540
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Taras Bobrovytsky <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
