Alexey Serbin has posted comments on this change. Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode ......................................................................
Patch Set 15: (41 comments) Thank you for the review! Will post updated version soon. http://gerrit.cloudera.org:8080/#/c/3952/15//COMMIT_MSG Commit Message: PS15, Line 20: waterline > Nit: I think elsewhere you changed this to watermark; could you do the same Done http://gerrit.cloudera.org:8080/#/c/3952/15/src/kudu/client/client-internal.h File src/kudu/client/client-internal.h: Line 240: private: > In the PIMPL idiom, the impl class is typically completely public. Not sure I just noticed the DISSALLOW_COPY_AND_ASSIGN() does not do what it's supposed to since the private was missing. I think the idea is to protect against unintentional copying. http://gerrit.cloudera.org:8080/#/c/3952/15/src/kudu/client/client.cc File src/kudu/client/client.cc: Line 53: #include "kudu/gutil/strings/human_readable.h" > Is this used for anything? It turned out that it's not used. Will remove. Line 692: : data_(new KuduSession::Data(client, client->data_->messenger_)) { > Since the messenger is reachable through the client, why not have KuduSessi Exactly. It's not possible to declare an embedded class to be a friend (at least I don't know how to do that). I.e., KuduSession::Data is an embedded/internal class, and it's not possible to add it into the frendship list for the KuduClient class. Line 733: Status KuduSession::SetMutationBufferSpace(size_t size) { > One of the simplifying assumptions made by the Java client is that many (if OK, that's a good idea. I'll think in that direction, thanks! http://gerrit.cloudera.org:8080/#/c/3952/15/src/kudu/client/client.h File src/kudu/client/client.h: PS15, Line 1035: In this mode, the Flush() call can be used to block until a batch : /// is sent and the buffer has available space again. > To be clear, Flush() also causes a background flush that normally would hav Yes, exactly. Thank you for making this clear. Will update the comment. PS15, Line 1127: the > Nit: can drop this. Done PS15, Line 1129: pushing > Nit: sending Done PS15, Line 1130: into > Nit: to the Done PS15, Line 1133: OK > Nit: safe Done PS15, Line 1134: does not come into play > Nit: has no effect Done PS15, Line 1139: > Nit: got two spaces here, can you fix and check the rest? Done PS15, Line 1140: by be > What happened here? Oops, that were aliens, probably. PS15, Line 1141: n > And here? Done Line 1146: Status SetMutationBufferFlushWatermark(double watermark_pct) > It would also be useful to describe (in the comment) what the default water Will do. BTW, do we need getters for those parameters? I think getters might be useful here. However, having no getter for the buffer space hints that there was some deliberate decision not having getters. OK, I'll figure it out. PS15, Line 1161: OK > Nit: safe Done PS15, Line 1162: does not come into play > Nit: has no effect Done PS15, Line 1166: internal > Nit: interval Done http://gerrit.cloudera.org:8080/#/c/3952/15/src/kudu/client/session-internal.cc File src/kudu/client/session-internal.cc: Line 57: flush_interval_(MonoDelta::FromMilliseconds(10)) { > The default flush interval seems really low. Isn't it something like 5s in I just thought it supposed to be that low. This is not due to a mistake in the logic which would block sometimes. I'll put 1 second here as in AsyncKuduSesion, as you mentioned. PS15, Line 64: -1, > PeriodicFlushTask uses 0, not -1. Is there a significance to this differenc This is to force getting a new batcher and flushing the prior, if any, even if the prior hasn't any buffered data. That was the default behavior prior to my modifications here: there is a unit test which verifies that upon a call of FlushAsync() even the empty batcher is 'flushed'. Line 66: PeriodicFlushTask(Status::OK(), messenger_, session); > I don't think this is quite the periodic semantics we want. OK, I'll change it then. Basically, the idea is to run the periodic task as it, making re-scheduling itself, but check against the latest flush time when the task executes. There, schedule the task to run next after (last_flush_time + interval - now) interval, so next periodic flush happens not earlier than the period of the background flush task. Does this sound good? Line 74: bytes_flushed = batcher->buffer_bytes_used(); > Pull this out of the lock, presumably it's only flushed_batchers_ that need Done PS15, Line 81: there might be several data pushers > How? Shouldn't there be at most one outstanding thread in Apply(), since on David mentioned that in the very first review of my changes, at least I understood it like 'make sure there is a test which verifies it's safe to call Apply() from multiple threads'. And I implemented that test, actually a couple of those. I've just talked to Todd regarding this, and Todd also thinks there can be only one thread calling KuduSession::Apply() on a single session object. He mentioned that the Java client would not allow to safely call Apply() from multiple threads on the same session object. And yes, it would more efficient to use just Signal(), not Broadcast() here. Will change. PS15, Line 101: flusher > Not clear what a "flusher" is (now that there's no dedicated thread). Done Line 102: std::lock_guard<Mutex> l(cond_mutex_); > Why is this taken here? flush_mode_ is not protected by cond_mutex_. oh, I see -- it seems I missed one in the task running at reactor thread. Also forgot to document that. Will fix. Line 109: flow_control_cond_.Broadcast(); > Why do this? We've guaranteed there are no pending operations, so who would The original idea was to get rid of those limitations like empty buffer when changing the flushing mode. The idea emerged after David reviewed my first patch on AUTO_FLUSH_BACKGROUND. All right, I'll remove that since we are going to have those limitation in place. BTW, is there any reason for keeping those limitations besides making the implementation simpler? PS15, Line 115: // The data flow control logic needs to know if the buffer limit changes. : // There might be several threads calling KuduSession::Apply(). > Let's simplify by preventing buffer size changes when there are pending ope OK, that's a good idea, thanks! I was under impression that multiple threads can call KuduSession::Apply(). Line 122: CHECK_GE(watermark_pct, 0); > This seems harsh for a client library; we can return Status::IllegalArgumen Good idea. However, one of my tests uses 200% to exercise one scenario :) OK, I'll try to resolve this. Line 135: CHECK_GE(timeout_ms, 0); > Too harsh here too, but I guess we're locked into the void return value now Yep. What we can do instead is to set it to 0 if the specified parameter was negative. It's kind of contradiction to the POLA (Principle Of Least Astonishment), but it's an option. What do you think? Line 169: return 0; > Nit: indentation. It seems QtCreator with vi plugin sometimes does not follow the code style it is configured to keep. PS15, Line 230: RETURN_NOT_OK(TryApplyWriteOp(write_op, &required_size, : ¤t_flush_mode, ¤t_flush_watermark)); > The output parameters are confusing: Yep, having those facts simplifies this. Again, I was under impression there might be several threads calling Apply(). The third point is complicated because it would require making KuduSession::Data a friend of WriteOp which is impossible (since KuduSession::Data is an embedded/internal class). If you know how to circumvent this restriction not making the Batcher::SizeInBuffer() public, please let me know. PS15, Line 321: accomodate > Nit: accommodate Done Line 393: if (data->flush_mode_ != AUTO_FLUSH_BACKGROUND) { > Doesn't this mean flush_mode_ needs to be synchronized? Otherwise a client Yes, that's the place where I missed the mutex. Will fix -- thanks. Line 398: data->NextBatcher(session, 0, nullptr); > Seems KuduSession::Data::Init() with AUTO_FLUSH_BACKGROUND will flush an em Nope. It would, however, if the second parameter were -1 (as you noticed that in the Init() method). http://gerrit.cloudera.org:8080/#/c/3952/15/src/kudu/client/session-internal.h File src/kudu/client/session-internal.h: Line 83: void NextBatcher(const sp::weak_ptr<KuduSession>& session, > Please doc the significance of 'watermark'. Done PS15, Line 87: On successful return, the output flush_mode parameter is set : // to the effective flush mode. > Why? Why can't the caller just look at flush_mode_ directly? This is because the flush_mode_ can change in the middle and the code which calls this methods makes a decision whether to call Flush() based on the result value of the flush_mode parameter. To avoid introduction of additional locks and races (the Apply() method can be called by multiple threads), I decided to do it this way. Line 124: private: > How did you decide which members were private and which were public? Especi Originally, all members were public. I started putting all new members into the private area. Later on, I accidentally added more members into the public area. Yes, I'll clean this up. Line 131: Status TryApplyWriteOp(KuduWriteOperation* write_op, int64_t* op_raw_size, > This doesn't actually apply the operation, though. It just checks to see if Right, good observation. PS15, Line 159: Mutex for the condition variables below and for other : // condition-related counters and members. > There's only one condition variable now. Done Line 161: Mutex cond_mutex_; > What's the relationship between this new lock and lock_? Please add a comme Done http://gerrit.cloudera.org:8080/#/c/3952/15/src/kudu/client/write_op.h File src/kudu/client/write_op.h: PS15, Line 112: Once called, the result is cached : // so subsequent calls will return the size previously computed. Set the : // parameter to 'false' to re-compute the size. > Why cache at all? As far as I can tell, there's only one call to SizeInBuff Actually, there are two calls for SizeInBuffer() for a single call of KuduSession::Apply(). The first is from CanApplyWriteOp: it calls Batcher::GetOperationSizeInBuffer() which, in its turn, calls SizeInBuffer(). The second, as you mentioned, is when the Batcher::Apply() is called. The non-cached option is to have a way to re-compute the size, if operation has changed since the methods was called last time. I'm not sure whether it's a pure paranoia or that's a valid scenario, but as one can see, the idiom of cached size is not safe even given the fact that the method is private. I could not find better solution here. Probably, it's just paranoia. Let me remove the parameter. -- To view, visit http://gerrit.cloudera.org:8080/3952 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8 Gerrit-PatchSet: 15 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey Serbin <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Dan Burkert <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-HasComments: Yes
