Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-3201: headers and reservation logic for new buffer pool
......................................................................


Patch Set 17:

(37 comments)

http://gerrit.cloudera.org:8080/#/c/2569/17/be/src/bufferpool/buffer-allocator.h
File be/src/bufferpool/buffer-allocator.h:

Line 22: /// Memory allocator for buffers that are multiples of the minimum 
buffer length.
unclear what this is: who uses it, what is the minimum buffer length?


http://gerrit.cloudera.org:8080/#/c/2569/15/be/src/bufferpool/buffer-pool.h
File be/src/bufferpool/buffer-pool.h:

Line 34: /// Centralizing buffer management in a single object allows 
enforcement of buffer
this doesn't need to read like a design doc (ie, you don't have to present the 
pros and cons)


Line 37: /// buffers across queries, rather than reallocating buffers for every 
query.
rather than saying 'it allows us to...' you should simply describe what it 
does, possibly in bullet points


Line 40: /// of at least the minimum buffer length that won't be spilled can 
also be served by
"can also be" is vague; also, i didn't see that functionality here


PS15, Line 41: s are neede
what this is communicating is basically that there are minimum allocation 
sizes. 

rather than being vague and talking about clients, it's more helpful to 
describe what the audience of this interface is (exec nodes? other allocators? 
...).


PS15, Line 44: buffer
since the client is an important part of the interface, i feel there's 
information missing. who needs to allocate this client (ie, per exec node, per 
fragment, per query...)? is it thread-safe? are there other guarantees for the 
client?


PS15, Line 45: reservations
differentiate client against other classes (ReservationTracker)


Line 55: /// * A page is a logical block of memory that can reside in memory or 
on disk.
there is a need for an additional concept: buffers for mempools will never be 
spilled.


Line 59: /// * A page is pinned if it is pinned by least one buffer pool 
client. A pinned page
appears to assume sharing


Line 61: /// * An unpinned page can be written out to disk by the buffer pool 
so that the buffer
it's preferable to defer talking about implementation; here you should focus on 
user-visible semantics.

feel free to have an additional section below to talk about implementation 
highlights


Line 68: /// The buffer pool is of limited size and clients of the buffer pool 
often need
here also: don't talk about what clients "often need", but what this class 
provides


Line 75: /// If a client pins the same page multiple times, additional 
reservations are consumed.
i don't think this makes sense. it sounds like we should prevent pinning of 
already pinned pages, otherwise we're wasting memory (the reservation prevents 
other queries from claiming that memory)


Line 81: /// reserved buffers in any way, e.g. for keeping unpinned pages in 
memory, so long as
i don't think buffers should be "reserved" (which sounds like "set aside"). a 
reservation guarantees that you can later get a buffer, but buffers should only 
be either free or backing a pinned page (or otherwise in use; see mempool 
comment).


Line 88: /// in order to avoid memory lifetime issues such as dangling or 
invalid pointers to
very general comment/argument that doesn't need to be made here


PS15, Line 99:  a cli
> Done
why page? from my understanding, pinning only applies to page, not buffers 
(buffers either hold data or they don't).


PS15, Line 101: page c
> Done
i'd make the same argument here: transfer of resources involves resources. 
buffers are resources, pages are logical constructs. 

transfer of ownership also needs to work out of mempools (which don't have 
pages).


Line 102: ///   the same handle.
sound expensive. what's the rationale against handing off the handle?


Line 112: ///
implementation detail. worth pointing out in an 'implementation details' section


PS15, Line 147: On su
> Done
in the interest of modularity, you could also give PageHandle interface 
functions, such as Pin/Unpin/Close.


Line 149: 
create/close don't sound like duals of each other. create/destroy? get/release?


Line 157:   /// storage backing the page is freed. Idempotent.
what does it mean to call this on a handle that we previously closed?


Line 224: };
follow formatting convention


Line 253:   /// All pages pinned by the client count as consumption against 
'reservations_'
why?


Line 259: /// designed to only be used by a single thread at a time: 
concurrently calling
> Added to the class comment.
follow formatting conventions (and elsewhere)


Line 265:   PageHandle() { Reset(); }
might as well call this buffer


http://gerrit.cloudera.org:8080/#/c/2569/15/be/src/bufferpool/reservation-tracker.h
File be/src/bufferpool/reservation-tracker.h:

Line 35: /// is entitled to consumed and a 'consumption': the actual amount 
that is consumed.
to consume


http://gerrit.cloudera.org:8080/#/c/2569/17/be/src/bufferpool/reservation-tracker.h
File be/src/bufferpool/reservation-tracker.h:

Line 37: /// nodes. To increase its reservation, a tracker must consume some of 
its parent's
in l35 consumption refers to actual memory usage, not the reservation. now you 
use the same term 'consumption' to refer to 'subtracting from reservation'. 
keep the two concepts separated by separate terminology (decrease 
reservation/consume memory?).


Line 49: /// * A tracker's consumption is at least the sum of its childrens' 
reservations.
that doesn't make sense to me if consumption = memory consumption.

i would expect that a reservationtracker's reservation = the sum of its 
childrens' reservations.


Line 52: /// Each ReservationTracker can optionally have an associated 
MemTracker. Any
why optionally? if this class tracks memory consumption, i'd expect that to be 
done via a memtracker.


Line 57: /// MemTracker, so reservations count against both the exec node and 
the query MemTracker.
i don't understand that either, this seems to conflate concepts (reservation vs 
actual consumption).


Line 60: /// one level. Otherwise reservations will be double-counted against 
MemTrackers.
should the linking be in the other direction: a memtracker can have an 
associated reservationtracker.

that way, a memtracker for an execnode can check its reservation when 
increasing mem consumption (but memtrackers below the execnode level don't have 
associated reservationtrackers).


PS17, Line 67: s
don't refer to internal state (reservations_) in a function comment. the 
function comments should describe (all) externally observable behavior.


Line 73:   /// counted as consumption against 'mem_tracker'.
i don't understand why a reservation is reflected somewhere as consumption.


Line 99:   /// Decrease tracker's reservation by amount. This tracker must have 
used at least
standardize your terminology (here it's 'used', elsewhere 'consumed').


Line 103:   /// Consumes the given amount of the reservation. The tracker must 
have at least 'amount'
long line


Line 108:   void Release(int64_t amount);
release doesn't sound like the opposite of consume (in fact, i'd think it means 
'close').


Line 111:   int64_t Reservation();
Get...() (and the next 2)


-- 
To view, visit http://gerrit.cloudera.org:8080/2569
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I35cc89e863efb4cc506657bfdaaaf633a10bbab6
Gerrit-PatchSet: 17
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes

Reply via email to