Dan Hecht has posted comments on this change. Change subject: IMPALA-3203: Part 2: per-core free lists in buffer pool ......................................................................
Patch Set 15: (15 comments) http://gerrit.cloudera.org:8080/#/c/6414/14/be/src/runtime/bufferpool/buffer-allocator.cc File be/src/runtime/bufferpool/buffer-allocator.cc: Line 347: lock_guard<SpinLock> pl(page->buffer_lock); > 'client_lock' would guard against the page being moved to a different arena Yup, realized that after writing this comment. oops. http://gerrit.cloudera.org:8080/#/c/6414/15/be/src/runtime/bufferpool/buffer-allocator.cc File be/src/runtime/bufferpool/buffer-allocator.cc: PS15, Line 59: . should be more explicit that it might return less memory than 'target_bytes'. Maybe say "Free as much memory as possible up to 'target_bytes' from this area back to the system allocator. PS15, Line 64: returns the total amount freed hmm, it's kinda confusing that the return value has this subtle difference in meaning depending on claim_memory. It seems the most intuitive return value would be "Returns the number of bytes of system memory claimed for the caller. Memory returned to the system in excess of that amount is added into 'system_bytes_remaining_'. Can the claim_memory=false path accommodate something like that? If we can't reconcile it to have a single return value, then it seems we really have two different abstractions bunched into one call. PS15, Line 70: FreeMemory do you think it makes sense to call this FreeSystemMemory() to make it clearer that this is about pushing memory back to the system? PS15, Line 254: ffinally typo Line 261: if (delta == len) break; since we can't get rid of the other DecreaseSystemBytesRemaining() probably better to move this back to where it was. PS15, Line 263: lock_arenas thought this would become 'final_attempt' Line 304: vector<std::unique_lock<SpinLock>> arena_locks; let's add a comment explaining all of this. Something like: There are two strategies for scavenging buffers: 1) Fast, opportunistic: Each arena is searched in succession. Although reservations guarantee that the memory we need is available somewhere, this can fail because we can race with another thread that returned buffers to an arena that we've already searched and took the buffers from an arena we haven't yet searched. 2) Slow, guaranteed to succeed: In order to ensure that we can find the memory in a single pass, we hold on to the locks for arenas we've already examined. That way, if another thread can't take the memory that we need from an arena that we haven't yet examined (or from 'system_bytes_available_') because in order to do so, it would have had to return the equivalent amount of memory to an earlier arena or added it back into 'systems_bytes_reamining_'. The former can't happen since we're still holding those locks, and the latter is solved by trying DecreaseSystemBytesRemaining() at the end. PS15, Line 317: // It's possible that 'system_bytes_remaining_' was incremented since we last checked : // it. maybe this sentence would now be explained in the proposed comment above. PS15, Line 319: another thread between when that thread released memory to the system and : // incremented 'system_bytes_remaining_'. I think it'd be good to explicitly say: ... since that happens while holding one of the arena locks. or similar. though the exception is when AllocateInternal() we add it back in without holding a lock, which isn't a correctness problem since that memory couldn't be needed by another client, but makes the statement a little confusing. PS15, Line 322: false why false instead of true? I guess it doesn't matter since it should succeed to get all of it, but 'true' seems more intuitive to me. http://gerrit.cloudera.org:8080/#/c/6414/15/be/src/runtime/bufferpool/buffer-allocator.h File be/src/runtime/bufferpool/buffer-allocator.h: Line 47: /// required memory from the OS). I think explaining a couple of more things would help: 1) The fact that BufferAllocator relies on the reservations (which are managed by the BufferPool & client) and the BufferPool's eviction policy, for correctness, to ensure that a client that has unused reservation can call Allocate() and the BufferAllocator() will always be able to find the memory to back the buffer somewhere. 2) Some implementation notes explaining that available memory is sort of kept in three forms: buffers in the free lists, buffers in the clean pages lists, and available memory that can be called up from the system. This information is kind of in the above paragraphs, but I think explaining it in that way will help understand the implementation. Probably paragraph two above can be moved into this section since the class abstraction itself doesn't expose arenas, right? PS15, Line 91: void does the caller care if it couldn't release 'bytes_to_free' bytes? PS15, Line 133: buffers memory? since some of it might come from 'system_bytes_remaining_'. PS15, Line 135: lock_arenas one thing that was confusing is that the name of this makes it sound like arenas are not locked when lock_arenas=false. But that's not true -- it's just they aren't concurrently locked. As mentioned in the cc file, I don't think the caller actually cares how the arena locking works, right? What this abstraction is really trying to expose is a "fast opportunistic path that might not succeed", and a "slow but guaranteed to succeed path". So, how about abstracting it and explaining it that way? -- To view, visit http://gerrit.cloudera.org:8080/6414 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I612bd1cd0f0e87f7d8186e5bedd53a22f2d80832 Gerrit-PatchSet: 15 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong <[email protected]> Gerrit-Reviewer: Dan Hecht <[email protected]> Gerrit-Reviewer: Taras Bobrovytsky <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-HasComments: Yes
