Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3611: track unused Disk IO buffer memory
......................................................................


Patch Set 9:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/3246/9/be/src/runtime/disk-io-mgr.cc
File be/src/runtime/disk-io-mgr.cc:

Line 246:   mem_tracker_->Release(buffer_len_);
> it's a bit confusing that the tracking happens both in BufferDescriptor met
I feel like this actually makes sense, since the tracking in DiskIoMgr methods 
is handling ownership transfer between DiskIoMgr and an external mem tracker, 
while this is handling transfer between external mem trackers. This method is 
self-contained.


PS9, Line 368: Untracked
> this is a little misleading given that they are tracked by this tracker.  W
I originally called it "Other Disk IO Buffers" but changed it as a a result of 
previous review comments. My comment then was "Went with untracked. I thought 
this might be a little confusing, since they are technically tracked, but seems 
clearer than 'other'.".  I haven't been able to come up with better 
nomenclature.

The tracker is easy to create, but I don't think we want it to hang off the 
query mem tracker, since it would mean we have temporary 8MB bumps in query mem 
usage while spilling that could put it over the limit. Doesn't seem worth the 
potential problems. I could have the buffered block mgr tracker hang off the 
process mem tracker, which probably works ok.

Do you think that's better? It seems more complicated and confusing to me than 
the current approach. My preference is to leave as-is and avoid investing 
further time since it's a temporary solution.


PS9, Line 650: Not a cached buffer. Return the io buffer.
> not your comment, but this comment is pretty useless, it just restates what
Done


Line 752:       delete[] buffer;
> would be nice to delete first then Release().
Done


http://gerrit.cloudera.org:8080/#/c/3246/9/be/src/runtime/disk-io-mgr.h
File be/src/runtime/disk-io-mgr.h:

PS9, Line 242: caller should check the memory limit
> do callers actually check the mem limit?
You're right, we don't transfer between queries, so there shouldn't be a net 
increase in memory usage as a result of this transfer. I was thinking that this 
would be a case where the QueryMaintenance() memory checks was still necessary, 
but I don't think it is for that reason.

Agree we need to revisit it, but I don't think there's a clear way to propagate 
the error, so I left the TODO.


PS9, Line 271: Non-NULL for non-cached
> double negative is hard to parse. maybe say "Can be NULL only for cached"
Done


PS9, Line 774: associated
> what does this mean? does this routine consume memory from 'mem_tracker', o
Updated the comment to reflect that this purely allocates and initializes the 
descriptor.


Line 792:   /// max_buffer_size_.
> would be good to comment on how mem tracking is affected.
Done


Line 802:   /// Returns the buffer in 'desc' (cannot be NULL), and sets 
desc->buffer_ to NULL.
> likewise.
Renamed to FreeBufferMemory() and updated the comment.


-- 
To view, visit http://gerrit.cloudera.org:8080/3246
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8777cf76f04d34a46f53d53005412e0f1d63b5b7
Gerrit-PatchSet: 9
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes

Reply via email to