Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3611: track unused Disk IO buffer memory
......................................................................


Patch Set 6:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/3246/6/be/src/runtime/disk-io-mgr-scan-range.cc
File be/src/runtime/disk-io-mgr-scan-range.cc:

PS6, Line 441: NULL
> Is there a reason to not use "reader->mem_tracker_ != NULL ? reader->mem_tr
We don't want to track this memory, since it's a cached buffer that's owned by 
the HDFS client in the JVM.

Added a comment.


http://gerrit.cloudera.org:8080/#/c/3246/6/be/src/runtime/disk-io-mgr.cc
File be/src/runtime/disk-io-mgr.cc:

PS6, Line 224:       
> nit: wrong indentation.
Done


PS6, Line 368: Other
> Untracked / Unowned ?
Went with untracked. I thought this might be a little confusing, since they are 
technically tracked, but seems clearer than 'other'.


PS6, Line 702: (1 << idx)
> 1LL to be consistent with the 64-bit-ness above. I think in practice, this 
Done (here and elsewhere in this file)


PS6, Line 790: // Note: we can't use TryConsume(), which can indirectly call 
GcIoBuffers().
> , which can indirectly call GcIoBuffers() and lead to deadlock.
Done


PS6, Line 857:     // TODO: we can do a lot better here.  The reader can likely 
make progress
             :     // with fewer io buffers.
> Can you please replace this TODO statement with one which indicates that "t
Done. Also added a TODO referencing IMPALA-3209, since it's related to making 
scanners operate in a memory limit.


PS6, Line 859: free_buffer_mem_tracker_->AnyLimitExceeded();
> Regarding your previous comment about not calling GcIoBuffer() here, I am n
It's safe to do that here since we're not holding free_buffers_lock_ at this 
point in the code.

The change doesn't make a functional difference, it's just for consistency 
since we are no longer tracking directly against process_mem_tracker_. The 
pattern generally in the codebase is to interact only with your local 
MemTracker.


http://gerrit.cloudera.org:8080/#/c/3246/6/be/src/runtime/disk-io-mgr.h
File be/src/runtime/disk-io-mgr.h:

PS6, Line 232: bool is_cached() { return scan_range_->cached_buffer_ != NULL; }
> This can be private.
Done


PS6, Line 237: MemTracker* mem_tracker() { return mem_tracker_; }
> Is this used outside of disk-io-mgr at all ?
Not now. Removed it. It was at one point when I tried to move this to RowBatch.


PS6, Line 243: 'tracker'
> 'dst' ? It may be clearer to say transfer from 'mem_tracker_' to 'dst_'. Pl
I think this is more related to IMPALA-3209, since this is part of the cycle of 
memory transfer between the disk io mgr, scanners, and upstream exec nodes. 
Added a reference to that JIRA.

I looked at just checking the memory limit and returning a status, but it 
looked like exiting early from some of the calling functions could leave things 
in an inconsistent state.


-- 
To view, visit http://gerrit.cloudera.org:8080/3246
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8777cf76f04d34a46f53d53005412e0f1d63b5b7
Gerrit-PatchSet: 6
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes

Reply via email to