Henry Robinson has uploaded a new patch set (#2).

Change subject: IMPALA-5773: Correctly account for memory used in data stream 
receiver queue
......................................................................

IMPALA-5773: Correctly account for memory used in data stream receiver queue

DataStreamRecvrs keep one or more queues of received batches to provide
some buffering. Each queue has a fixed capacity in bytes. The estimate
of a new RowBatch's contribution to that queue used the compressed size
of the TRowBatch it would be deserialized from. That underestimates the
memory actually used, since the batch is uncompressed after
deserialization.

* Add RowBatch::GetSerializedSize(const TRowBatch&) and
  RowBatch::GetDeserializedSize(const TRowBatch&).
* Fix the estimate to use the uncompressed size.
* Add a DataStreamReceiver child profile to the exchange node so that
  the peak memory used by the receiver can be monitored easily.
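
To illustrate the accounting change, here is a minimal, self-contained
sketch. It is not the code in this patch: SerializedBatch and
ReceiverQueue are hypothetical stand-ins for TRowBatch and the receiver's
queue, and the two free functions only mirror the intent of the new
RowBatch helpers. The point is that admission is decided on the
deserialized (uncompressed) size, not the size on the wire.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a serialized row batch (not the real TRowBatch).
struct SerializedBatch {
  int64_t payload_bytes;       // bytes on the wire (compressed if is_compressed)
  int64_t uncompressed_bytes;  // bytes of tuple data once decompressed
  bool is_compressed;
};

// Size of the batch as it arrives on the wire.
int64_t GetSerializedSize(const SerializedBatch& b) {
  return b.payload_bytes;
}

// Size the batch will occupy once it has been deserialized.
int64_t GetDeserializedSize(const SerializedBatch& b) {
  return b.is_compressed ? b.uncompressed_bytes : b.payload_bytes;
}

// Hypothetical byte-capacity queue; it tracks sizes only, not batch data.
class ReceiverQueue {
 public:
  explicit ReceiverQueue(int64_t capacity_bytes)
      : capacity_bytes_(capacity_bytes), buffered_bytes_(0) {
    assert(capacity_bytes > 0);
  }

  // Admit the batch only if its deserialized size fits in the remaining
  // capacity. Charging GetSerializedSize() here instead would let the
  // queue hold more memory than its nominal capacity.
  bool TryAdd(const SerializedBatch& b) {
    const int64_t size = GetDeserializedSize(b);
    if (buffered_bytes_ + size > capacity_bytes_) return false;
    buffered_bytes_ += size;
    return true;
  }

  int64_t buffered_bytes() const { return buffered_bytes_; }

 private:
  const int64_t capacity_bytes_;
  int64_t buffered_bytes_;
};
```

With a 1000-byte queue and a batch that is 100 bytes compressed but 600
bytes uncompressed, the first TryAdd() succeeds and the second is
rejected. Charging the compressed size would have admitted ten such
batches, or 6000 bytes of real memory.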

Confirmed that the following query:

select count(distinct concat(cast(l_comment as char(120)),
                             cast(l_comment as char(120)),
                             cast(l_comment as char(120)),
                             cast(l_comment as char(120)),
                             cast(l_comment as char(120)),
                             cast(l_comment as char(120))) from lineitem;

succeeds with a memory limit of 800MB. Before this patch it would fail
on a one-node cluster because the data stream receiver would buffer more
batches than the memory limit allowed.

Change-Id: I9e90f9596ee984438e3373af05e84d361702ca6a
---
M be/src/benchmarks/row-batch-serialize-benchmark.cc
M be/src/runtime/data-stream-mgr.cc
M be/src/runtime/data-stream-recvr.cc
M be/src/runtime/data-stream-recvr.h
M be/src/runtime/data-stream-sender.cc
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
7 files changed, 38 insertions(+), 31 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/7646/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7646
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9e90f9596ee984438e3373af05e84d361702ca6a
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
