Tim Armstrong has uploaded a new patch set (#2).

Change subject: IMPALA-3105: rework handling of tuple buffer sizing in RowBatch
......................................................................

IMPALA-3105: rework handling of tuple buffer sizing in RowBatch

RowBatch::MaxTupleBufferSize() tried to estimate the maximum number of
rows that would fit in a batch based on the soft capacity memory limit
of batches. The logic was wrong because the memory capacity can be
exceeded, either because exec nodes do not check capacity, or because
the limit is checked after adding a row, not before.

Instead in this patch we achieve the same goal by setting the hard
RowBatch::capacity_ limit to a value that keeps the total fixed-length
data for a row batch below a cap (unless a single row would exceed that
cap, in which case it can't be avoided).

This avoids corner cases where the old MaxTupleBufferSize() calculation
may have led to buffer overruns and simplifies the logic.

Change-Id: Idfd9cd681875821c1c379d97586d3f4850aae622
---
M be/src/exec/data-source-scan-node.cc
M be/src/exec/hbase-scan-node.cc
M be/src/exec/hbase-scan-node.h
M be/src/exec/union-node.cc
M be/src/runtime/buffered-tuple-stream-test.cc
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
7 files changed, 53 insertions(+), 52 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/73/2473/2
--
To view, visit http://gerrit.cloudera.org:8080/2473
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idfd9cd681875821c1c379d97586d3f4850aae622
Gerrit-PatchSet: 2
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <[email protected]>
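
For readers skimming the commit message above, the capacity rule it describes can be sketched roughly as follows. This is an illustration only, not the code from row-batch.cc: the function name ComputeCappedCapacity, the parameter names, and the 8 MB cap value are all hypothetical, and the actual constant and implementation in the patch may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical cap on the total fixed-length data in one batch.
// The value is an assumption chosen for illustration.
constexpr int64_t kAssumedFixedLenCapBytes = 8 * 1024 * 1024;

// Sketch: pick a hard row capacity such that
//   capacity * row_size_bytes <= cap_bytes
// whenever at least one row fits under the cap. If a single row is
// larger than the cap, the batch still holds one row, so the cap is
// exceeded in that case only.
int ComputeCappedCapacity(int requested_capacity, int64_t row_size_bytes,
                          int64_t cap_bytes = kAssumedFixedLenCapBytes) {
  assert(requested_capacity > 0);
  assert(row_size_bytes >= 0);
  // Rows with no fixed-length data cannot exceed the cap.
  if (row_size_bytes == 0) return requested_capacity;
  // Number of whole rows that fit under the cap, but never fewer than one.
  const int64_t rows_under_cap =
      std::max<int64_t>(1, cap_bytes / row_size_bytes);
  return static_cast<int>(
      std::min<int64_t>(requested_capacity, rows_under_cap));
}
```

Because the bound is applied to the hard capacity up front, it holds even when a caller never checks the soft memory limit or checks it only after adding a row, which is the failure mode the commit message attributes to the old estimate.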
