Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3201: headers and reservation logic for new buffer pool
......................................................................


Patch Set 15:

(23 comments)

http://gerrit.cloudera.org:8080/#/c/2569/15/be/src/bufferpool/buffer-pool.h
File be/src/bufferpool/buffer-pool.h:

Line 34: /// Centralizing buffer management in a single object allows 
enforcement of buffer
> this doesn't need to read like a design doc (ie, you don't have to present 
I trimmed this down a little. I still like having some motivation or 
explanation since that's hard to infer from just looking at the code.


Line 37: /// buffers across queries, rather than reallocating buffers for every 
query.
> rather than saying 'it allows us to...' you should simply describe what it 
Reworded a bit.


Line 40: /// non-spillable allocations of at least the minimum buffer length 
can also be
> "can also be" is vague; also, i didn't see that functionality here
Added some explanation.


PS15, Line 41: allocations
> what this is communicating is basically that there are minimum allocation s
I reworded a bit. I think we're going to see a bunch of different clients (disk 
i/o mgr, exec nodes, runtime filters, other allocators, etc, etc) so it doesn't 
seem feasible to enumerate.

I'd rather not spend too much time fine-tuning this prose unless there's 
something really misleading in it.


PS15, Line 44: buffer
> since the client is an important part of the interface, i feel there's info
Initially exec nodes, but I don't think that will be the only use case.


Line 55: /// * A page is a logical block of memory that can reside in memory or 
on disk.
> there is a need for an additional concept: buffers for mempools will never 
I don't think we need an additional concept. I added an example of how it 
should be used in this case.


Line 59: /// * A page is pinned if it is pinned by least one buffer pool 
client. A pinned page
> appears to assume sharing
Right, that's a natural consequence of the API.


Line 61: /// * An unpinned page can be written out to disk by the buffer pool 
so that the buffer
> it's preferable to defer talking about implementation; here you should focu
I think the fact it does disk I/O is significant to users. It seems contrived 
to deliberately avoid mentioning that it spills pages to disk.


Line 68: /// The buffer pool is of limited size and clients of the buffer pool 
often need
> here also: don't talk about what clients "often need", but what this class 
I like having a sentence of motivation since it's pretty straightforward to 
infer what the class does from the code and method comments, but hard to infer 
the motivation.


Line 75: /// If a client pins the same page multiple times, additional 
reservations are consumed.
> i don't think this makes sense. it sounds like we should prevent pinning of
I strongly disagree. Implementing that optimisation by default makes the 
invariants around Pin() and Unpin() more complicated and doesn't buy us much. 
The current solution both has simple invariants and is very flexible.

I'm not sure if you were talking about the multiple clients case, but I don't 
think it works at all there, since you need to decide which client's 
reservation to use. If you count it against client A, and client A unpins the 
page, then you're stuck because you then need to count it against client B, and 
there's no guarantee that client B has the reservation for it.

If it's the single client/multiple pins case, it still complicates the 
invariants because Pin() and Unpin() are no longer actually inverses. E.g. if 
you have two iterators over the same sequence of pages, then it would be nice 
if each could always advance to the next page by calling Unpin() to free 
reservation then Pin(). That only works if Unpin() always frees reservations, 
even if both iterators have the same page pinned. Otherwise you have to do a 
bunch of bookkeeping in the operator to make sure you don't accidentally use 
the reservation that you need to advance to the next page.

If saving the reservations is important for some particular case, a client can 
implement the optimisation you're describing by checking if the handle is 
pinned before pinning it.

So the current solution is the best of both worlds - the default behaviour is 
easy to reason about, and implementing the optimisation in an operator is 
trivial if it's desired.

We're also not actually wasting the memory, since the buffer pool can make use 
of the spare buffers.


Line 81: /// reserved buffers in any way, e.g. for keeping unpinned pages in 
memory, so long as
> i don't think buffers should be "reserved" (which sounds like "set aside").
Done


Line 88: /// in order to avoid memory lifetime issues such as dangling or 
invalid pointers to
> very general comment/argument that doesn't need to be made here
I think it's helpful to explain why PageHandle exists - otherwise it's not 
obvious why there are two separate apparently-redundant concepts.


PS15, Line 99: buffer
> why page? from my understanding, pinning only applies to page, not buffers 
Should have been page. Reworded these points.


Line 102: ///   DuplicateHandle().
> sound expensive. what's the rationale against handing off the handle?
I don't really see the problem, the operation is lightweight and should be 
amortised fairly well (main cost would be the atomics and locks to update the 
various trackers). 

If we implemented a "Transfer" operation it wouldn't be any more efficient - it 
would just copy the handle, update any trackers, and then clean up the old 
handle. I actually added a Transfer() convenience method implemented on top of 
DuplicateHandle().

Having DuplicateHandle() just makes it a bit more flexible.


Line 112: /// * pages_::lock_ -> Page::lock_
> implementation detail. worth pointing out in an 'implementation details' se
Done


PS15, Line 147: Close
> in the interest of modularity, you could also give PageHandle interface fun
I deliberately didn't do this. It's significant which client the operations 
occur in the context of, so I think it should be explicit to avoid masking bugs 
like invoking page operations using someone else's client.

I'd also rather avoid a proliferation of convenience methods that do nothing 
but call other methods - that was one of the things that made the 
BufferedBlockMgr code harder to follow than necessary. I feel like we should 
just add them in if the exec node code gets too verbose.


Line 149:   void CloseHandle(Client* client, PageHandle* handle);
> create/close don't sound like duals of each other. create/destroy? get/rele
They're not exactly duals, since one creates the page, but this only closes the 
handle to a page, but doesn't necessarily destroy the page.

I can rename DestroyHandle() if that's clearer.


Line 157:   Status Pin(Client* client, PageHandle* handle);
> what does it mean to call this on a handle that we previously closed?
It's invalid. Added a slight clarification to CloseHandle() - the general rule 
is that you can't do anything after closing it.


Line 224:   bool is_registered() const { return reservations_ != NULL; }
> follow formatting convention
What formatting convention are you referring to?


Line 253:   // Allow move construction of page handles.
> why?
See response to Dan's comment below:

One is needed for STL classes. When a std::vector is resized, it needs to move 
its contents.

Another is needed to support std::move(), since we don't allow copying the 
handles.

Added comments to document this.


Line 259:   bool is_open() const { return page_ != NULL; }
> follow formatting conventions (and elsewhere)
I don't understand what formatting conventions you're referring to. We format 
accessors in the way all over the place.


Line 265:   uint8_t* buf() const {
> might as well call this buffer
Do you feel strongly about this? It's going to be a headache rebasing and 
rewrapping all the longer lines.


http://gerrit.cloudera.org:8080/#/c/2569/15/be/src/bufferpool/reservation-tracker.h
File be/src/bufferpool/reservation-tracker.h:

Line 35: /// is entitled to consumed and a 'consumption': the actual amount 
that is consumed.
> to consume
Reworded to "use" anyway.


-- 
To view, visit http://gerrit.cloudera.org:8080/2569
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I35cc89e863efb4cc506657bfdaaaf633a10bbab6
Gerrit-PatchSet: 15
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes

Reply via email to