Adar Dembo has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
......................................................................


Patch Set 22:

(18 comments)

Did another pass. I'm still blown away by the overall complexity; I wonder if 
reducing the number of locks will help here, or making other infrastructure 
improvements (e.g. allowing reactor tasks to be canceled).

http://gerrit.cloudera.org:8080/#/c/3952/22/src/kudu/client/batcher.h
File src/kudu/client/batcher.h:

Line 189:   // The time when the very first operation was added into the 
batcher.
Should note that this is protected by lock_. While you're there, can you make 
sure all other protected fields are annotated as such?


http://gerrit.cloudera.org:8080/#/c/3952/22/src/kudu/client/client.h
File src/kudu/client/client.h:

PS22, Line 1037: current
Nit: "the current"


PS22, Line 1143: when
Nit: "only when", right?


PS22, Line 1180: called
Nit: "called the"


PS22, Line 1183: taken care
Nit: "taken care of"


PS22, Line 1184: Once flushing of the prior
               :   /// mutation buffer is initiated, the new one is created and 
set current
               :   /// to accommodate incoming write operations, limit 
permitting.
Nit: "Upon flushing the current mutation buffer, a new buffer is created to 
replace it, provided the limit is not exceeded."


Line 1188:   /// The default and the minimum setting for this parameter is 2 
(two).
Hmm, technically couldn't the minimum be 1? If set to 1, an Apply() with an 
outstanding background flush ought to block (and an ApplyAsync() should return 
some sort of backpressure error). Though I guess you'd need that kind of 
behavior when the limit is reached regardless of whether it's 1 or higher.


PS22, Line 1191: the
Nit: drop 'the'


PS22, Line 1201: implicitly amended
               :   ///   as if it were 0.
Nit: "implicitly set to 0".


http://gerrit.cloudera.org:8080/#/c/3952/22/src/kudu/client/session-internal.cc
File src/kudu/client/session-internal.cc:

PS22, Line 80: the
Nit: double the


Line 130:   buffer_bytes_limit_ = size;
Likewise, I don't see why this needs to be protected.


Line 145:   buffer_watermark_pct_ = watermark_pct;
Why does this need to be protected? It's only ever accessed by the writer 
thread (not a reactor thread).


PS22, Line 191:   Synchronizer s;
              :   KuduStatusMemberCallback<Synchronizer> ksmcb(&s, 
&Synchronizer::StatusCB);
              :   NextBatcher(session, BATCHER_WATERMARK_NON_EMPTY, &ksmcb, 
BLOCK);
              :   RETURN_NOT_OK(s.Wait());
Why do we need to use the synchronizer to wait on this batch if, below, we wait 
until total_bytes_used_ reaches 0? Is it just to report errors? Would the error 
collector usage below capture that?


PS22, Line 222: relevent
Nit: relevant


http://gerrit.cloudera.org:8080/#/c/3952/22/src/kudu/client/session-internal.h
File src/kudu/client/session-internal.h:

PS22, Line 104:   // On successful return, the output flush_mode parameter
              :   // is set to the effective flush mode.
I thought we no longer needed the output parameter for flush mode?

Oh, is this because we read flush_mode_ from the flusher task, which means we 
need to protect it with something, or make it an atomic type of some kind? And 
so you'd rather pass back the flush mode from ApplyWriteOp() to its caller to 
avoid taking the lock a second time?

If so, it seems like that's working around the lack of "canceling" a scheduled 
task. That is, if we could cancel the flusher task (e.g. when the flush mode 
changes away from AUTO_BACKGROUND_FLUSH), it'd never need to read the flush 
mode. I wonder how much complexity is involved with adding such cancellation 
behavior; we've wanted it elsewhere and it would certainly simplify the flusher 
task considerably.


PS22, Line 142:   
Nit: looks like there are two spaces here. Below too.


PS22, Line 147: BATCHER_WATERMARK_NON_EMPTY
Our convention for naming class constants like these "kCamelCaseFoo". We 
reserve ALL_UPPER_CASE for macros.


PS22, Line 158:   // A dedicated lock to prevent multiple threads executing the 
ApplyWriteOp()
              :   // methods simultaneously. Should not be used for any other 
purpose.
I don't understand; we've expressly forbidden multiple Apply() threads. So why 
bother to do this? It's just another lock to maintain; if this is to prevent 
people from shooting themselves in the foot, I'd rather remove it and simplify.


-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 22
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <d...@cloudera.com>
Gerrit-Reviewer: Dinesh Bhat <din...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-HasComments: Yes

Reply via email to