IMPALA-5815: right outer join returns invalid memory The bug is that OutputAllBuild() called BufferedTupleStream::GetNext() while 'out_batch' still referenced data from the current page of the stream. When iterating over an unpinned stream, GetNext() sets the 'needs_deep_copy' flag when it hits the end of a page, so that the caller has an opportunity to flush or deep copy the data. On the next call to GetNext(), that page may be deleted or unpinned.
The fix is to check whether the batch is at capacity before calling BTS::GetNext(). This issue was masked by not using the 'delete_on_read' mode of the stream, which would have freed the stream's buffers earlier and increased the odds of ASAN detecting the problem. Testing: Running TestTPCHJoinQueries.test_outer_joins() reliably reproduced this for me locally under ASAN. After the fix the problem does not reoccur. Change-Id: Ia14148499ddaec41c2e70fef5d53e5d06ea0538d Reviewed-on: http://gerrit.cloudera.org:8080/7772 Reviewed-by: Dan Hecht <[email protected]> Tested-by: Impala Public Jenkins Project: http://git-wip-us.apache.org/repos/asf/incubator-impala/repo Commit: http://git-wip-us.apache.org/repos/asf/incubator-impala/commit/3b1a1df7 Tree: http://git-wip-us.apache.org/repos/asf/incubator-impala/tree/3b1a1df7 Diff: http://git-wip-us.apache.org/repos/asf/incubator-impala/diff/3b1a1df7 Branch: refs/heads/master Commit: 3b1a1df7e30de7d21cc30eba8bc7d318ee063d5a Parents: 5f32312 Author: Tim Armstrong <[email protected]> Authored: Fri Aug 18 13:10:51 2017 -0700 Committer: Impala Public Jenkins <[email protected]> Committed: Wed Aug 23 06:17:17 2017 +0000 ---------------------------------------------------------------------- be/src/exec/partitioned-hash-join-node.cc | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) ---------------------------------------------------------------------- http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3b1a1df7/be/src/exec/partitioned-hash-join-node.cc ---------------------------------------------------------------------- diff --git a/be/src/exec/partitioned-hash-join-node.cc b/be/src/exec/partitioned-hash-join-node.cc index 2c656b0..13d660f 100644 --- a/be/src/exec/partitioned-hash-join-node.cc +++ b/be/src/exec/partitioned-hash-join-node.cc @@ -387,7 +387,7 @@ Status PartitionedHashJoinNode::PrepareSpilledPartitionForProbe( DCHECK(NeedToProcessUnmatchedBuildRows()); bool got_read_buffer = false; RETURN_IF_ERROR(input_partition_->build_partition()->build_rows()->PrepareForRead( - false, &got_read_buffer)); + true, &got_read_buffer)); if (!got_read_buffer) { return mem_tracker()->MemLimitExceeded( runtime_state_, Substitute(PREPARE_FOR_READ_FAILED_ERROR_MSG, id_)); @@ -682,6 +682,12 @@ Status PartitionedHashJoinNode::OutputAllBuild(RowBatch* out_batch) { if (output_unmatched_batch_iter_->AtEnd()) { output_unmatched_batch_->TransferResourceOwnership(out_batch); output_unmatched_batch_->Reset(); + // Need to flush any resources attached to 'out_batch' before calling + // build_rows()->GetNext(). E.g. if the previous call to GetNext() set the + // 'needs_deep_copy' flag, then calling GetNext() before we return the current + // batch leave the batch referencing invalid memory (see IMPALA-5815). + if (out_batch->AtCapacity()) break; + RETURN_IF_ERROR(output_build_partitions_.front()->build_rows()->GetNext( output_unmatched_batch_.get(), &eos)); output_unmatched_batch_iter_.reset(
