Tim Armstrong has posted comments on this change. Change subject: IMPALA-3201: headers and reservation logic for new buffer pool ......................................................................
Patch Set 15: (23 comments) http://gerrit.cloudera.org:8080/#/c/2569/15/be/src/bufferpool/buffer-pool.h File be/src/bufferpool/buffer-pool.h: Line 34: /// Centralizing buffer management in a single object allows enforcement of buffer > this doesn't need to read like a design doc (ie, you don't have to present I trimmed this down a little. I still like having some motivation or explanation since that's hard to infer from just looking at the code. Line 37: /// buffers across queries, rather than reallocating buffers for every query. > rather than saying 'it allows us to...' you should simply describe what it Reworded a bit. Line 40: /// non-spillable allocations of at least the minimum buffer length can also be > "can also be" is vague; also, i didn't see that functionality here Added some explanation. PS15, Line 41: allocations > what this is communicating is basically that there are minimum allocation s I reworded a bit. I think we're going to see a bunch of different clients (disk i/o mgr, exec nodes, runtime filters, other allocators, etc, etc) so it doesn't seem feasible to enumerate. I'd rather not spend too much time fine-tuning this prose unless there's something really misleading in it. PS15, Line 44: buffer > since the client is an important part of the interface, i feel there's info Initially exec nodes, but I don't think that will be the only use case. Line 55: /// * A page is a logical block of memory that can reside in memory or on disk. > there is a need for an additional concept: buffers for mempools will never I don't think we need an additional concept. I added an example of how it should be used in this case. Line 59: /// * A page is pinned if it is pinned by least one buffer pool client. A pinned page > appears to assume sharing Right, that's a natural consequence of the API. Line 61: /// * An unpinned page can be written out to disk by the buffer pool so that the buffer > it's preferable to defer talking about implementation; here you should focu I think the fact it does disk I/O is significant to users. It seems contrived to deliberately avoid mentioning that it spills pages to disk. Line 68: /// The buffer pool is of limited size and clients of the buffer pool often need > here also: don't talk about what clients "often need", but what this class I like having a sentence of motivation since it's pretty straightforward to infer what the class does from the code and method comments, but hard to infer the motivation. Line 75: /// If a client pins the same page multiple times, additional reservations are consumed. > i don't think this makes sense. it sounds like we should prevent pinning of I strongly disagree. Implementing that optimisation by default makes the invariants around Pin() and Unpin() more complicated and doesn't buy us much. The current solution both has simple invariants and is very flexible. I'm not sure if you were talking about the multiple clients case, but I don't think it works at all there, since you need to decide which client's reservation to use. If you count it against client A, and client A unpins the page, then you're stuck because you then need to count it against client B, and there's no guarantee that client B has the reservation for it. If it's the single client/multiple pins case, it still complicates the invariants because Pin() and Unpin() are no longer actually inverses. E.g. if you have two iterators over the same sequence of pages, then it would be nice if each could always advance to the next page by calling Unpin() to free reservation then Pin(). That only works if Unpin() always frees reservations, even if both iterators have the same page pinned. Otherwise you have to do a bunch of bookkeeping in the operator to make sure you don't accidentally use the reservation that you need to advance to the next page. If saving the reservations is important for some particular case, a client can implement the optimisation you're describing by checking if the handle is pinned before pinning it. So the current solution is the best of both worlds - the default behaviour is easy to reason about, and implementing the optimisation in an operator is trivial if it's desired. We're also not actually wasting the memory, since the buffer pool can make use of the spare buffers. Line 81: /// reserved buffers in any way, e.g. for keeping unpinned pages in memory, so long as > i don't think buffers should be "reserved" (which sounds like "set aside"). Done Line 88: /// in order to avoid memory lifetime issues such as dangling or invalid pointers to > very general comment/argument that doesn't need to be made here I think it's helpful to explain why PageHandle exists - otherwise it's not obvious why there are two separate apparently-redundant concepts. PS15, Line 99: buffer > why page? from my understanding, pinning only applies to page, not buffers Should have been page. Reworded these points. Line 102: /// DuplicateHandle(). > sound expensive. what's the rationale against handing off the handle? I don't really see the problem, the operation is lightweight and should be amortised fairly well (main cost would be the atomics and locks to update the various trackers). If we implemented a "Transfer" operation it wouldn't be any more efficient - it would just copy the handle, update any trackers, and then clean up the old handle. I actually added a Transfer() convenience method implemented on top of DuplicateHandle(). Having DuplicateHandle() just makes it a bit more flexible. Line 112: /// * pages_::lock_ -> Page::lock_ > implementation detail. worth pointing out in an 'implementation details' se Done PS15, Line 147: Close > in the interest of modularity, you could also give PageHandle interface fun I deliberately didn't do this. It's significant which client the operations occur in the context of, so I think it should be explicit to avoid masking bugs like invoking page operations using someone else's client. I'd also rather avoid a proliferation of convenience methods that do nothing but call other methods - that was one of the things that made the BufferedBlockMgr code harder to follow than necessary. I feel like we should just add them in if the exec node code gets too verbose. Line 149: void CloseHandle(Client* client, PageHandle* handle); > create/close don't sound like duals of each other. create/destroy? get/rele They're not exactly duals, since one creates the page, but this only closes the handle to a page, but doesn't necessarily destroy the page. I can rename DestroyHandle() if that's clearer. Line 157: Status Pin(Client* client, PageHandle* handle); > what does it mean to call this on a handle that we previously closed? It's invalid. Added a slight clarification to CloseHandle() - the general rule is that you can't do anything after closing it. Line 224: bool is_registered() const { return reservations_ != NULL; } > follow formatting convention What formatting convention are you referring to? Line 253: // Allow move construction of page handles. > why? See response to Dan's comment below: One is needed for STL classes. When a std::vector is resized, it needs to move its contents. Another is needed to support std::move(), since we don't allow copying the handles. Added comments to document this. Line 259: bool is_open() const { return page_ != NULL; } > follow formatting conventions (and elsewhere) I don't understand what formatting conventions you're referring to. We format accessors in the way all over the place. Line 265: uint8_t* buf() const { > might as well call this buffer Do you feel strongly about this? It's going to be a headache rebasing and rewrapping all the longer lines. http://gerrit.cloudera.org:8080/#/c/2569/15/be/src/bufferpool/reservation-tracker.h File be/src/bufferpool/reservation-tracker.h: Line 35: /// is entitled to consumed and a 'consumption': the actual amount that is consumed. > to consume Reworded to "use" anyway. -- To view, visit http://gerrit.cloudera.org:8080/2569 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I35cc89e863efb4cc506657bfdaaaf633a10bbab6 Gerrit-PatchSet: 15 Gerrit-Project: Impala Gerrit-Branch: cdh5-trunk Gerrit-Owner: Tim Armstrong <[email protected]> Gerrit-Reviewer: Dan Hecht <[email protected]> Gerrit-Reviewer: Marcel Kornacker <[email protected]> Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-HasComments: Yes
