[kudu-CR] async background flush provision for C++ client

2016-07-21 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: async background flush provision for C++ client
..


Patch Set 2:

(59 comments)

Thank you for review.

I addressed most of the comments, and I'm working on testing the change more 
thoroughly by all means (more tests will follow) and also looking at the other 
option of implementing background flushing.  Adar recommended to look at the 
option to utilize one of the reactor threads for that purpose.

I'll post an update on this change, but more changes are expected to appear 
soon.

http://gerrit.cloudera.org:8080/#/c/3668/2//COMMIT_MSG
Commit Message:

PS2, Line 7: provision
> nit: remove provision
Done


Line 7: async background flush provision for C++ client
> nit: capitalize (Async)
Done


PS2, Line 10: is
> nit: avoid using the passive voice. i.e. "KuduSession starts a background f
Done


PS2, Line 13: virtual
> what's a virtual buffer?
There isn't a single buffer as is, there is a set of buffers used by multiple 
batchers.  Will add clarification comment.


Line 16: The flush criteria is based on buffer size for not-yet-scheduled
> I'm having trouble parsing this sentence.
Done


PS2, Line 17: operatations
> nit: type
Done


PS2, Line 20: operatations
> nit: typo
Done


PS2, Line 23: KUDU
> Mention this issue in the commit message title.
Done


http://gerrit.cloudera.org:8080/#/c/3668/2/src/kudu/client/batcher.cc
File src/kudu/client/batcher.cc:

Line 267: #ifndef NDEBUG
> spurious change
It's not spurious, it fixes compilation warning of unused variables for debug 
builds.


Line 383: 
> spurious change
Done


Line 553:   //since the other time it's called in KuduSession::Apply()
> nit: rephrase. How about:
Done


http://gerrit.cloudera.org:8080/#/c/3668/2/src/kudu/client/batcher.h
File src/kudu/client/batcher.h:

Line 109:   // Return number of bytes used for batcher buffer
> nit: "Returns the"
I see most of comments in this file does not use third person singular form.  
The only exception is external_consistency_mode().  Are you sure it should be 
'Returns the number', but not 'Return the number'?


PS2, Line 216: //
> no need for this comment
Done


http://gerrit.cloudera.org:8080/#/c/3668/2/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

Line 1991: // return an error with session running in any supported flush mode
> nit: period
Done


PS2, Line 1993: static
> not need for static.
Done


PS2, Line 1993: buffer_space_bytes_limit
> style: kBufferSpaceBytesLimit or BUFFER_SPACE_BYTES_LIMIT
Done


PS2, Line 1994: static
> same
Done


Line 1995:   static const KuduSession::FlushMode modes[] = {
> same
Done


Line 2001:   for (const auto mode: modes) {
> isn't 'auto' already const ?
Done


Line 2007: ASSERT_TRUE(s.IsIncomplete()) << "got unexpected status: " << 
s.ToString();
> nit: capitalize Got
Done


Line 2014:   // Applying a bunch of small rows without a flush should not 
result in
> this sentence is huge and hard to read. please break it down.
Done


Line 2016:   // since there is a flow control which makes Session::Apply() 
block and wait
> what: "a flow control"?
Done


Line 2018:   {
> no need for the extra { scope
Done


Line 2023:   ASSERT_OK(ApplyInsertToSession(session.get(), client_table_, 
1, 1, "x"));
> How this testing that Apply() is eventually blocking, or making sure that t
This test is to verify that there isn't an error about overrunning the buffer.  
Yes, it's not an explicit test to verify that Apply() is blocking.  Will 
address that separately.


Line 2029: // operations are put into the queue flushed after some time.
> you mean : "into the queue and flushed" right?
Done


Line 2030: // The exact timeout interval is implementation-dependent,
> which timeout interval?
Done


Line 2031: // but 100 ms is a good upper limit anyways.
> s/anyways/anyway
Done


Line 2033:   {
> no need for the extra {
Done


Line 2041: SleepFor(MonoDelta::FromMilliseconds(100));
> loop instead of wait, otherwise this will be brittle in jenkins
Done


http://gerrit.cloudera.org:8080/#/c/3668/2/src/kudu/client/client.cc
File src/kudu/client/client.cc:

Line 705:   "Cannot change flush mode when writes are buffered");
> nit period
Done


Line 707:   data_->SetFlushMode(m);
> would it simplify things to now allow the flush mode to change if there is 
I think we can allow the flush mode to change regardless presence of 
buffered/pending write operations, but it requires proper design and some 
additional logic to be implemented.  I would try to address this in a separate 
patch/change-list.


Line 728:   return data_->SetBufferBytesLimit(size);
> why not use the same method name?
Because from internal perspective there is no context of any mutation.  It's 
just a buffer that accommodates byte representation of write operations.

>From the other hand, the outer (API-facing) name cannot be changes now without 
>braking 

[kudu-CR] async background flush provision for C++ client

2016-07-18 Thread David Ribeiro Alves (Code Review)
David Ribeiro Alves has posted comments on this change.

Change subject: async background flush provision for C++ client
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3668/2/src/kudu/client/client.cc
File src/kudu/client/client.cc:

Line 707:   data_->SetFlushMode(m);
> would it simplify things to now allow the flush mode to change if there is 
"to _not_ allow the flush mode"


-- 
To view, visit http://gerrit.cloudera.org:8080/3668
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Idfde05bef342db24990c6e3da3b0270c3bb37a9d
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] async background flush provision for C++ client

2016-07-18 Thread David Ribeiro Alves (Code Review)
David Ribeiro Alves has posted comments on this change.

Change subject: async background flush provision for C++ client
..


Patch Set 2:

(62 comments)

overall I think this needs more tests. Since the flusher is a thread, likely 
the unit tests will also have to use threads to:
- make sure limits are not exceeded
- make sure that operations are in fact buffered up to a point
- Test the multiple triggers of a flush.

Beyond the the unit tests, I think this also needs an integration test that 
stresses it all.

http://gerrit.cloudera.org:8080/#/c/3668/2//COMMIT_MSG
Commit Message:

Line 7: async background flush provision for C++ client
nit: capitalize (Async)


PS2, Line 7: provision
nit: remove provision


PS2, Line 10: is
nit: avoid using the passive voice. i.e. "KuduSession starts a background 
flusher thread" not a thread is started


PS2, Line 13: virtual
what's a virtual buffer?


Line 16: The flush criteria is based on buffer size for not-yet-scheduled
I'm having trouble parsing this sentence.


PS2, Line 17: operatations
nit: type


PS2, Line 20: operatations
nit: typo


PS2, Line 23: KUDU
Mention this issue in the commit message title.


http://gerrit.cloudera.org:8080/#/c/3668/2/src/kudu/client/batcher.cc
File src/kudu/client/batcher.cc:

Line 267: #ifndef NDEBUG
spurious change


Line 383: 
spurious change


Line 553:   //since the other time it's called in KuduSession::Apply()
nit: rephrase. How about:
Optimize and call write_op->SizeInBuffer() only once. We're currently calling 
it twice, here and in KuduSession::Apply()


http://gerrit.cloudera.org:8080/#/c/3668/2/src/kudu/client/batcher.h
File src/kudu/client/batcher.h:

Line 109:   // Return number of bytes used for batcher buffer
nit: "Returns the"
nit: period at the end os sentences/


Line 114:   // Compute in-buffer size for the given write operation
nit: Computes
nit: period


PS2, Line 216: //
no need for this comment


http://gerrit.cloudera.org:8080/#/c/3668/2/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

Line 1991: // return an error with session running in any supported flush mode
nit: period


PS2, Line 1993: buffer_space_bytes_limit
style: kBufferSpaceBytesLimit or BUFFER_SPACE_BYTES_LIMIT


PS2, Line 1993: static
not need for static.


PS2, Line 1994: static
same


Line 1995:   static const KuduSession::FlushMode modes[] = {
same


Line 2001:   for (const auto mode: modes) {
isn't 'auto' already const ?


Line 2007: ASSERT_TRUE(s.IsIncomplete()) << "got unexpected status: " << 
s.ToString();
nit: capitalize Got


Line 2014:   // Applying a bunch of small rows without a flush should not 
result in
this sentence is huge and hard to read. please break it down.


Line 2016:   // since there is a flow control which makes Session::Apply() 
block and wait
what: "a flow control"?


Line 2018:   {
no need for the extra { scope


Line 2023:   ASSERT_OK(ApplyInsertToSession(session.get(), client_table_, 
1, 1, "x"));
How this testing that Apply() is eventually blocking, or making sure that the 
buffer value is not surpassed for that matter?


Line 2029: // operations are put into the queue flushed after some time.
you mean : "into the queue and flushed" right?


Line 2030: // The exact timeout interval is implementation-dependent,
which timeout interval?


Line 2031: // but 100 ms is a good upper limit anyways.
s/anyways/anyway


Line 2033:   {
no need for the extra {


Line 2041: SleepFor(MonoDelta::FromMilliseconds(100));
loop instead of wait, otherwise this will be brittle in jenkins


http://gerrit.cloudera.org:8080/#/c/3668/2/src/kudu/client/client.cc
File src/kudu/client/client.cc:

Line 705:   "Cannot change flush mode when writes are buffered");
nit period


Line 707:   data_->SetFlushMode(m);
would it simplify things to now allow the flush mode to change if there is a 
flusher thread? if not would it make sense to have this stop the thread (if the 
new mode is different) and wait for it to finish before allowing to change the 
mode? seems like there is some complexity attached to the fact that we allow 
this to change at anytime. simplifying the possible states there would be good


Line 728:   return data_->SetBufferBytesLimit(size);
why not use the same method name?


http://gerrit.cloudera.org:8080/#/c/3668/2/src/kudu/client/client.h
File src/kudu/client/client.h:

Line 678: // The Flush() call can be used to block until the latest batch 
is sent.
maybe add/replace: "a batch is sent and the buffer has available space again"


http://gerrit.cloudera.org:8080/#/c/3668/2/src/kudu/client/session-internal.cc
File src/kudu/client/session-internal.cc:

Line 48:   // intentionally left empty
no ned for this, you can also join the brackets in the line above, i.e. {}


Line 53: WARN_NOT_OK(Stop(), "unable to stop AUTO_FLUSH_BACKGROUND thread");
capitalize. Also a WARN? what happens if this fails?


Line 58:   

[kudu-CR] async background flush provision for C++ client

2016-07-18 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: async background flush provision for C++ client
..


Patch Set 2:

Build Started http://104.196.14.100/job/kudu-gerrit/2528/

-- 
To view, visit http://gerrit.cloudera.org:8080/3668
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Idfde05bef342db24990c6e3da3b0270c3bb37a9d
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] async background flush provision for C++ client

2016-07-18 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: async background flush provision for C++ client
..


Patch Set 1:

> What was the rationale behind using a dedicated thread to manage
 > background flushing? I think synchronization would be net simpler
 > if instead, we reused the messenger's reactor threads for
 > background flushing. They're already used to run callbacks whenever
 > a Proxy::FooAsync() call is made, and you can use 
 > Messenger::ScheduleOnReactor()
 > to schedule the background flushing to happen at some time in the
 > future.
 > 
 > Some other arguments in favor of using reactor threads:
 > 1. It reduces the thread count  of the client.
 > 2. Structurally it makes the implementation similar to the Java
 > client.

I thought messenger reactor threads are for messaging, and I was not so 
familiar with that part of code.  OK, I'll explore that option in this context.

-- 
To view, visit http://gerrit.cloudera.org:8080/3668
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Idfde05bef342db24990c6e3da3b0270c3bb37a9d
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] async background flush provision for C++ client

2016-07-18 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: async background flush provision for C++ client
..


Patch Set 1:

What was the rationale behind using a dedicated thread to manage background 
flushing? I think synchronization would be net simpler if instead, we reused 
the messenger's reactor threads for background flushing. They're already used 
to run callbacks whenever a Proxy::FooAsync() call is made, and you can use 
Messenger::ScheduleOnReactor() to schedule the background flushing to happen at 
some time in the future.

Some other arguments in favor of using reactor threads:
1. It reduces the thread count  of the client.
2. Structurally it makes the implementation similar to the Java client.

-- 
To view, visit http://gerrit.cloudera.org:8080/3668
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Idfde05bef342db24990c6e3da3b0270c3bb37a9d
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] async background flush provision for C++ client

2016-07-18 Thread Alexey Serbin (Code Review)
Alexey Serbin has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/3668

Change subject: async background flush provision for C++ client
..

async background flush provision for C++ client

Implemented AUTO_FLUSH_BACKGROUND for C++ client library.
A dedicated background flusher thread is started by KuduSession
instance for this purpose.  In AUTO_FLUSH_BACKGROUND mode the
KuduSession::Apply() method blocks if there is not enough place
in the virtual buffer.  The limit on the buffer size is set by
KuduSession::SetMutationBufferSpace().

The flush criteria is based on buffer size for not-yet-scheduled
session's write operatations and timeout.

The flow control criteria in KuduSession::Apply() is based on
total buffer size of session's write operatations.

Addressed the following JIRA issues:
  KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
  KUDU-1376 KuduSession::SetMutationBufferSpace is not defined

Change-Id: Idfde05bef342db24990c6e3da3b0270c3bb37a9d
---
M src/kudu/client/batcher.cc
M src/kudu/client/batcher.h
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/error_collector.cc
M src/kudu/client/session-internal.cc
M src/kudu/client/session-internal.h
7 files changed, 432 insertions(+), 103 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/68/3668/1
-- 
To view, visit http://gerrit.cloudera.org:8080/3668
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Idfde05bef342db24990c6e3da3b0270c3bb37a9d
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin