Adar Dembo has posted comments on this change. Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode ......................................................................
Patch Set 22: (18 comments) Did another pass. I'm still blown away by the overall complexity; I wonder if reducing the number of locks will help here, or making other infrastructure improvements (e.g. allowing reactor tasks to be canceled). http://gerrit.cloudera.org:8080/#/c/3952/22/src/kudu/client/batcher.h File src/kudu/client/batcher.h: Line 189: // The time when the very first operation was added into the batcher. Should note that this is protected by lock_. While you're there, can you make sure all other protected fields are annotated as such? http://gerrit.cloudera.org:8080/#/c/3952/22/src/kudu/client/client.h File src/kudu/client/client.h: PS22, Line 1037: current Nit: "the current" PS22, Line 1143: when Nit: "only when", right? PS22, Line 1180: called Nit: "called the" PS22, Line 1183: taken care Nit: "taken care of" PS22, Line 1184: Once flushing of the prior : /// mutation buffer is initiated, the new one is created and set current : /// to accommodate incoming write operations, limit permitting. Nit: "Upon flushing the current mutation buffer, a new buffer is created to replace it, provided the limit is not exceeded." Line 1188: /// The default and the minimum setting for this parameter is 2 (two). Hmm, technically couldn't the minimum be 1? If set to 1, an Apply() with an outstanding background flush ought to block (and an ApplyAsync() should return some sort of backpressure error). Though I guess you'd need that kind of behavior when the limit is reached regardless of whether it's 1 or higher. PS22, Line 1191: the Nit: drop 'the' PS22, Line 1201: implicitly amended : /// as if it were 0. Nit: "implicitly set to 0". http://gerrit.cloudera.org:8080/#/c/3952/22/src/kudu/client/session-internal.cc File src/kudu/client/session-internal.cc: PS22, Line 80: the Nit: double the Line 130: buffer_bytes_limit_ = size; Likewise, I don't see why this needs to be protected. Line 145: buffer_watermark_pct_ = watermark_pct; Why does this need to be protected? It's only ever accessed by the writer thread (not a reactor thread). PS22, Line 191: Synchronizer s; : KuduStatusMemberCallback<Synchronizer> ksmcb(&s, &Synchronizer::StatusCB); : NextBatcher(session, BATCHER_WATERMARK_NON_EMPTY, &ksmcb, BLOCK); : RETURN_NOT_OK(s.Wait()); Why do we need to use the synchronizer to wait on this batch if, below, we wait until total_bytes_used_ reaches 0? Is it just to report errors? Would the error collector usage below capture that? PS22, Line 222: relevent Nit: relevant http://gerrit.cloudera.org:8080/#/c/3952/22/src/kudu/client/session-internal.h File src/kudu/client/session-internal.h: PS22, Line 104: // On successful return, the output flush_mode parameter : // is set to the effective flush mode. I thought we no longer needed the output parameter for flush mode? Oh, is this because we read flush_mode_ from the flusher task, which means we need to protect it with something, or make it an atomic type of some kind? And so you'd rather pass back the flush mode from ApplyWriteOp() to its caller to avoid taking the lock a second time? If so, it seems like that's working around the lack of "canceling" a scheduled task. That is, if we could cancel the flusher task (e.g. when the flush mode changes away from AUTO_BACKGROUND_FLUSH), it'd never need to read the flush mode. I wonder how much complexity is involved with adding such cancellation behavior; we've wanted it elsewhere and it would certainly simplify the flusher task considerably. PS22, Line 142: Nit: looks like there are two spaces here. Below too. PS22, Line 147: BATCHER_WATERMARK_NON_EMPTY Our convention for naming class constants like these "kCamelCaseFoo". We reserve ALL_UPPER_CASE for macros. PS22, Line 158: // A dedicated lock to prevent multiple threads executing the ApplyWriteOp() : // methods simultaneously. Should not be used for any other purpose. I don't understand; we've expressly forbidden multiple Apply() threads. So why bother to do this? It's just another lock to maintain; if this is to prevent people from shooting themselves in the foot, I'd rather remove it and simplify. -- To view, visit http://gerrit.cloudera.org:8080/3952 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8 Gerrit-PatchSet: 22 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Alexey Serbin <aser...@cloudera.com> Gerrit-Reviewer: Adar Dembo <a...@cloudera.com> Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com> Gerrit-Reviewer: Dan Burkert <d...@cloudera.com> Gerrit-Reviewer: Dinesh Bhat <din...@cloudera.com> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon <t...@apache.org> Gerrit-HasComments: Yes