[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

Alexey Serbin (Code Review) Thu, 18 Aug 2016 18:53:46 -0700

Alexey Serbin has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
......................................................................



Patch Set 9:

(22 comments)

http://gerrit.cloudera.org:8080/#/c/3952/9/src/kudu/client/batcher.h
File src/kudu/client/batcher.h:

Line 64:   // Takes a reference on error_collector. Creates a weak_ptr to 
'session'.
> Could you update this comment, especially WRT the error_collector?  I can't
Done


Line 115:   static int64_t GetOperationSizeInBuffer(KuduWriteOperation* 
write_op);
> This doesn't seem too useful since it just calls write_op->SizeInBuffer(). 
I didn't want to introduce an additional KuduSession::Data friend to the 
Batcher class (and it's not possible because Data is a nested class in the 
KuduSession class).

Probably, there is a more elegant solution for this.  Basically, I want to get 
wire size for KuduWriteOperation.


http://gerrit.cloudera.org:8080/#/c/3952/9/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

PS9, Line 555: session
> the session
Done


PS9, Line 557: maximum 
> the maximum
Done


PS9, Line 557: 10x multiplier
> 10x seems like a lot of leeway.  Could we get away with 2 or 3x?  I would c
Done


PS9, Line 2221: Status s;
> looks like everywhere this is used the call could instead be wrapped in ASS
Somehow I overlooked that :)  Thanks!


PS9, Line 2307: scenrio
> 'scenario' here and below
Done


PS9, Line 2478: previos
> previous
Done


http://gerrit.cloudera.org:8080/#/c/3952/9/src/kudu/client/session-internal.cc
File src/kudu/client/session-internal.cc:

Line 36: DEFINE_int32(client_buffer_bytes_limit, 7 * 1024 * 1024,
> Remove this and use SetMutationBufferSpace instead.
Done


Line 49: DEFINE_int32(client_buffer_flush_watermark_pct, 75,
> Instead of a gflag, this should be an option on KuduSession, like SetMutati
Done


Line 61: DEFINE_int32(client_buffer_flush_timeout_ms, 10,
> Same here, this should be an option on KuduSession.  https://kudu.apache.or
Done


PS9, Line 82: :
> wrap init list.
Done


Line 115:       messenger->ScheduleOnReactor(
> It's not clear to me why this needs to be scheduled on a reactor thread.  I
Good point!  The original design had a separate thread which was busy with all 
flush-related activity, and I just transferred it on the new scheme with 
messenger/reactor threads.  This should not be blocking, so I can safely call 
it from the same thread where CheckAndRun is being executed.


PS9, Line 154:   if (PREDICT_FALSE(!status.ok())) {
             :     return;
             :   }
             :   // Check that the session is still alive to access the data 
safely.
             :   sp::shared_ptr<KuduSession> session(weak_session.lock());
             :   if (PREDICT_FALSE(!session)) {
             :     return;
             :   }
             :   // Flush mode could change during the operation.  If current 
mode
             :   // is no longer AUTO_FLUSH_BACKGROUND, then stop re-scheduling 
the task.
             :   if (PREDICT_FALSE(data_->flush_mode_ != 
AUTO_FLUSH_BACKGROUND)) {
             :     return;
             :   }
             :   // Detach the batcher, schedule its flushing,
             :   // and install a new one to accumulate incoming write 
operations.
             :   data_->NextBatcher(weak_session, 0, nullptr);
> Looks like this might be able to call Run instead of duplicating?
Done


PS9, Line 205: MutexLock
> prefer to use std::lock_guard if possible.
Done


PS9, Line 392: int64_t* op_raw_size
> is this still necessary now that the size is cached?
yep -- that's the issue of visibility of a private method.  I cannot make an 
internal class a friend to the Batcher class.


Line 421:       flow_control_cond_.Wait();
> Is a flush forced somewhere in this situation, or is it just waiting for th
The condition awaits till total size of pending operations drops below the 
limit.  This is enforced by the logic of the code.  If no updates happen on 
total_bytes_used_, this will wait forever.


http://gerrit.cloudera.org:8080/#/c/3952/9/src/kudu/client/session-internal.h
File src/kudu/client/session-internal.h:

PS9, Line 111: BACKGFOUND
> BACKGROUND
Done


Line 126:     ~BackgroundFlusher();
> I think this can use the default keyword instead of defining the no-op dtor
OK, I just removed it to rely on auto-generated ones.


Line 136:     void CheckAndRun(const std::shared_ptr<rpc::Messenger>& messenger,
> Could you doc this?  I can't tell exactly what it's for, or why it chooses 
Renamed and added more comments.


PS9, Line 151: KuduSession::Data* const
> It looks like all of the methods on the background flusher that do any work
It's a reference (a pointer, actually) to the outer/enclosing object.

It's possible to keep those weak pointers as members of the BackgroundFlusher, 
but when a BackgroundFlusher object, as a sub-object of KuduSession::Data, is 
deallocated on destruction of the enclosing KuduSession::Data object, those 
members cannot be used anyway.  So, I'm putting all those into the parameters 
of the methods which are being called by the reactor threads.  Basically, 
necessary weak references on involved object are kept in the functor object, 
which, in its turn, stored by corresponding reactor thread.

The BackgroundFlusher is a shim object; it's a leaky abstraction.  Actually, I 
was even thinking of having all of its methods static and passing everything by 
parameters.  But that was too uglier.

The destructor ordering thing is stale now.  It was actual in other version of 
this patch when a separate thread did flushing.  Will remove the stale comment 
-- thanks!


PS9, Line 159: write operation
> the write operation
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 9
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <d...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

Reply via email to