Adar Dembo has posted comments on this change.
Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
Patch Set 22:
Did another pass. I'm still blown away by the overall complexity; I wonder if
reducing the number of locks will help here, or making other infrastructure
improvements (e.g. allowing reactor tasks to be canceled).
Line 189: // The time when the very first operation was added into the
Should note that this is protected by lock_. While you're there, can you make
sure all other protected fields are annotated as such?
PS22, Line 1037: current
Nit: "the current"
PS22, Line 1143: when
Nit: "only when", right?
PS22, Line 1180: called
Nit: "called the"
PS22, Line 1183: taken care
Nit: "taken care of"
PS22, Line 1184: Once flushing of the prior
: /// mutation buffer is initiated, the new one is created and
: /// to accommodate incoming write operations, limit
Nit: "Upon flushing the current mutation buffer, a new buffer is created to
replace it, provided the limit is not exceeded."
Line 1188: /// The default and the minimum setting for this parameter is 2
Hmm, technically couldn't the minimum be 1? If set to 1, an Apply() with an
outstanding background flush ought to block (and an ApplyAsync() should return
some sort of backpressure error). Though I guess you'd need that kind of
behavior when the limit is reached regardless of whether it's 1 or higher.
PS22, Line 1191: the
Nit: drop 'the'
PS22, Line 1201: implicitly amended
: /// as if it were 0.
Nit: "implicitly set to 0".
PS22, Line 80: the
Nit: double the
Line 130: buffer_bytes_limit_ = size;
Likewise, I don't see why this needs to be protected.
Line 145: buffer_watermark_pct_ = watermark_pct;
Why does this need to be protected? It's only ever accessed by the writer
thread (not a reactor thread).
PS22, Line 191: Synchronizer s;
: KuduStatusMemberCallback<Synchronizer> ksmcb(&s,
: NextBatcher(session, BATCHER_WATERMARK_NON_EMPTY, &ksmcb,
Why do we need to use the synchronizer to wait on this batch if, below, we wait
until total_bytes_used_ reaches 0? Is it just to report errors? Would the error
collector usage below capture that?
PS22, Line 222: relevent
PS22, Line 104: // On successful return, the output flush_mode parameter
: // is set to the effective flush mode.
I thought we no longer needed the output parameter for flush mode?
Oh, is this because we read flush_mode_ from the flusher task, which means we
need to protect it with something, or make it an atomic type of some kind? And
so you'd rather pass back the flush mode from ApplyWriteOp() to its caller to
avoid taking the lock a second time?
If so, it seems like that's working around the lack of "canceling" a scheduled
task. That is, if we could cancel the flusher task (e.g. when the flush mode
changes away from AUTO_BACKGROUND_FLUSH), it'd never need to read the flush
mode. I wonder how much complexity is involved with adding such cancellation
behavior; we've wanted it elsewhere and it would certainly simplify the flusher
PS22, Line 142:
Nit: looks like there are two spaces here. Below too.
PS22, Line 147: BATCHER_WATERMARK_NON_EMPTY
Our convention for naming class constants like these "kCamelCaseFoo". We
reserve ALL_UPPER_CASE for macros.
PS22, Line 158: // A dedicated lock to prevent multiple threads executing the
: // methods simultaneously. Should not be used for any other
I don't understand; we've expressly forbidden multiple Apply() threads. So why
bother to do this? It's just another lock to maintain; if this is to prevent
people from shooting themselves in the foot, I'd rather remove it and simplify.
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Owner: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <d...@cloudera.com>
Gerrit-Reviewer: Dinesh Bhat <din...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>