Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4023: don't attach buffered tuple streams to batches

Patch Set 1:


File be/src/runtime/buffered-tuple-stream-test.cc:

Line 1185: }
> Add a simple test that checks pinned blocks are attached to a non-NULL batc

File be/src/runtime/row-batch.h:

Line 129:   bool ALWAYS_INLINE AtCapacity() {
> The meaning of AtCapacity() has slightly changed and I think it might be a 
We didn't check need_to_return_ before, it was sufficient to check num_rows_ == 
capacity_. The DCHECK I removed was redundant with the remaining one.

I think you're right that it's changed slightly, but I think the new behaviour 
makes more sense. The cases are:
1. The stream is empty - it doesn't make sense to consider it AtCapacity()
2. The stream has only small buffers < 8MB - in this case I think it's fine to 
not consider the batch AtCapacity() until the memory adds up to >= 8MB.
3. The stream has at least one large buffer >= 8MB - the aux memory usage check 
will cause this to be AtCapacity().

There's a bigger preexisting problem with the accounting of the attached 
blocks: the accounting isn't updated for the purpose of MemTrackers and 
BufferedBlockMgr reservations, so it's possible for another exec node to 
accumulate batches that are accounted against a different node. I'm going to 
need to fix that for the BufferPool changes, but I'm punting on it for now. 
This change will make that change simpler.

To view, visit http://gerrit.cloudera.org:8080/4448
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6471422f86ce71e4c6ab277a276000051bc2e8ff
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-HasComments: Yes

Reply via email to