[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-09-01 Thread Todd Lipcon (Code Review)
Todd Lipcon has submitted this change and it was merged.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode

Implemented AUTO_FLUSH_BACKGROUND for the Kudu C++ client library.
In AUTO_FLUSH_BACKGROUND mode,
the KuduSession::Apply() method blocks if total amount of data
for pending operations reaches the buffer size limit.  The limit
on the buffer size can be set by the
KuduSession::SetMutationBufferSpace() method.

The background flush logic checks whether at least one of the
following two conditions is satisfied to determine whether it's time
to flush the accumulated write operations:
  * The over-the-watermark criterion: check whether the total size of
the freshly submitted (i.e. not-yet-scheduled-for-flush) write
operations is over the threshold.  The threshold can be set as
the percentage of the total buffer size using the
KuduSession::SetMutationBufferFlushWatermark() method.
  * The maximum wait time criterion: check whether the current batch
of operations has been accumulating for more than the maximum
wait time.  The maximum wait time can be specified in milliseconds
using the KuduSession::SetMutationBufferFlushInterval() method.
A KuduSession object uses RPC messenger's thread pool to monitor
batches' maximum wait time.

Added functionality to control the maximum number of batchers
per session.  If number of batchers is at the limit already,
KuduSession::Apply() blocks until it's possible to add a new batcher
to accommodate the incoming operation.

Modified behavior of the KuduSession::Flush(): now it waits until all
batchers are flushed before returning.

This change also addresses the following JIRA issue:
  KUDU-1376 KuduSession::SetMutationBufferSpace is not defined

Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Reviewed-on: http://gerrit.cloudera.org:8080/3952
Reviewed-by: Adar Dembo 
Tested-by: Kudu Jenkins
---
M python/kudu/tests/test_client.py
M src/kudu/client/batcher.cc
M src/kudu/client/batcher.h
M src/kudu/client/client-internal.h
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/error_collector.cc
M src/kudu/client/error_collector.h
M src/kudu/client/session-internal.cc
M src/kudu/client/session-internal.h
M src/kudu/client/write_op.cc
M src/kudu/client/write_op.h
13 files changed, 1,509 insertions(+), 251 deletions(-)

Approvals:
  Adar Dembo: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 29
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-09-01 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 28: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 28
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-09-01 Thread Alexey Serbin (Code Review)
Hello Dan Burkert, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3952

to look at the new patch set (#28).

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..

KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode

Implemented AUTO_FLUSH_BACKGROUND for the Kudu C++ client library.
In AUTO_FLUSH_BACKGROUND mode,
the KuduSession::Apply() method blocks if total amount of data
for pending operations reaches the buffer size limit.  The limit
on the buffer size can be set by the
KuduSession::SetMutationBufferSpace() method.

The background flush logic checks whether at least one of the
following two conditions is satisfied to determine whether it's time
to flush the accumulated write operations:
  * The over-the-watermark criterion: check whether the total size of
the freshly submitted (i.e. not-yet-scheduled-for-flush) write
operations is over the threshold.  The threshold can be set as
the percentage of the total buffer size using the
KuduSession::SetMutationBufferFlushWatermark() method.
  * The maximum wait time criterion: check whether the current batch
of operations has been accumulating for more than the maximum
wait time.  The maximum wait time can be specified in milliseconds
using the KuduSession::SetMutationBufferFlushInterval() method.
A KuduSession object uses RPC messenger's thread pool to monitor
batches' maximum wait time.

Added functionality to control the maximum number of batchers
per session.  If number of batchers is at the limit already,
KuduSession::Apply() blocks until it's possible to add a new batcher
to accommodate the incoming operation.

Modified behavior of the KuduSession::Flush(): now it waits until all
batchers are flushed before returning.

This change also addresses the following JIRA issue:
  KUDU-1376 KuduSession::SetMutationBufferSpace is not defined

Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
---
M python/kudu/tests/test_client.py
M src/kudu/client/batcher.cc
M src/kudu/client/batcher.h
M src/kudu/client/client-internal.h
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/error_collector.cc
M src/kudu/client/error_collector.h
M src/kudu/client/session-internal.cc
M src/kudu/client/session-internal.h
M src/kudu/client/write_op.cc
M src/kudu/client/write_op.h
13 files changed, 1,509 insertions(+), 251 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/52/3952/28
-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 28
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-09-01 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 28:

Build Started http://104.196.14.100/job/kudu-gerrit/3194/

-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 28
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-09-01 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 27:

(9 comments)

Will send a new version in a moment.

http://gerrit.cloudera.org:8080/#/c/3952/27/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

PS27, Line 95: sing std::thread;
 : using std::bind;
 : using std::function;
 : using std::for_each;
> Nit: can you order this list so it's alphabetical?
Done


Line 2173:   const string LONG_STRING(kBufferSizeBytes + 1, 'x');
> Nit: kLongString.
Done


Line 2206:   const string LONG_STRING(kBufferSizeBytes / 2, 'x');
> Nit: kLongString.
Done


Line 2378: thread monitor(monitor_func);
> How about:
I was under impression that elsewhere in the code we have those intermediate 
variables for clarity.  OK, but it seems this time not having that would be an 
advantage.  Will do.

I was thinking about the lambda as well, but preferred to keep it as is to be 
in line with the MonitorSessionBufferSize counterpart (as you noticed).  I 
would prefer to keep it as is.  Do you insist on using a lambda here?


PS27, Line 2455:   function monitor_func(
   : bind(::MonitorSessionBufferSize,
   :  session.get(), _run_ctl,
   :  _max_buffer_size));
   :   thread monitor(monitor_func);
> Likewise here, combine or use auto to elide the std::function syntax.
Done


PS27, Line 2493:   function monitor_func(
   : bind(::MonitorSessionBufferSize,
   :  session.get(), _run_ctl, 
_max_buffer_size));
   :   thread monitor(monitor_func);
> And here.
Done


PS27, Line 2498: BUFFER_SIZE
> Nit: replace with kBufferSizeBytes. Below too.
Done


Line 2501:   for (size_t i = 3; i < 256; ++i) {
> Nit: maybe store 256 in a constant like kRowNum and reuse it in EXPECT_EQ b
Done


PS27, Line 2533:   function monitor_func(
   : bind(::MonitorSessionBufferSize,
   :  session.get(), _run_ctl, 
_max_buffer_size));
   :   thread monitor(monitor_func);
> Combine or auto.
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 27
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-09-01 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 27:

(9 comments)

Almost there.

http://gerrit.cloudera.org:8080/#/c/3952/27/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

PS27, Line 95: sing std::thread;
 : using std::bind;
 : using std::function;
 : using std::for_each;
Nit: can you order this list so it's alphabetical?


Line 2173:   const string LONG_STRING(kBufferSizeBytes + 1, 'x');
Nit: kLongString.


Line 2206:   const string LONG_STRING(kBufferSizeBytes / 2, 'x');
Nit: kLongString.


Line 2378: thread monitor(monitor_func);
How about:

  thread monitor(bind(::MonitorSessionBatchersCount,
  session.get(), _run_ctl,
  _max_batchers_count));

That is, combine the two statements from above. This avoids exposing the 
"messy" std::function signature, though I suppose you could also elide it by 
using "auto" as the type.

Ideally you'd use a lambda here. I think MonitorSessionBatchersCount is only 
used in this test, so maybe make it a lambda instead? 
MonitorSessionBatchersSize is used in two places, so you can't use it there.


PS27, Line 2455:   function monitor_func(
   : bind(::MonitorSessionBufferSize,
   :  session.get(), _run_ctl,
   :  _max_buffer_size));
   :   thread monitor(monitor_func);
Likewise here, combine or use auto to elide the std::function syntax.


PS27, Line 2493:   function monitor_func(
   : bind(::MonitorSessionBufferSize,
   :  session.get(), _run_ctl, 
_max_buffer_size));
   :   thread monitor(monitor_func);
And here.


PS27, Line 2498: BUFFER_SIZE
Nit: replace with kBufferSizeBytes. Below too.


Line 2501:   for (size_t i = 3; i < 256; ++i) {
Nit: maybe store 256 in a constant like kRowNum and reuse it in EXPECT_EQ below?


PS27, Line 2533:   function monitor_func(
   : bind(::MonitorSessionBufferSize,
   :  session.get(), _run_ctl, 
_max_buffer_size));
   :   thread monitor(monitor_func);
Combine or auto.


-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 27
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-09-01 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 26:

Woops, I meant that to be a +1, definitely want to give Adar and others a 
chance to weigh in.

-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 26
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-09-01 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 26: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/3952/26/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

Line 2393: //monitor->Join();
This looks leftover?


PS26, Line 2503: every single
each individual


-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 26
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-09-01 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 26: Code-Review+1

couple spelling/phrasing nits, but code LGTM!

-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 26
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-09-01 Thread Alexey Serbin (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3952

to look at the new patch set (#26).

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..

KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode

Implemented AUTO_FLUSH_BACKGROUND for the Kudu C++ client library.
In AUTO_FLUSH_BACKGROUND mode,
the KuduSession::Apply() method blocks if total amount of data
for pending operations reaches the buffer size limit.  The limit
on the buffer size can be set by the
KuduSession::SetMutationBufferSpace() method.

The background flush logic checks whether at least one of the
following two conditions is satisfied to determine whether it's time
to flush the accumulated write operations:
  * The over-the-watermark criterion: check whether the total size of
the freshly submitted (i.e. not-yet-scheduled-for-flush) write
operations is over the threshold.  The threshold can be set as
the percentage of the total buffer size using the
KuduSession::SetMutationBufferFlushWatermark() method.
  * The maximum wait time criterion: check whether the current batch
of operations has been accumulating for more than the maximum
wait time.  The maximum wait time can be specified in milliseconds
using the KuduSession::SetMutationBufferFlushInterval() method.
A KuduSession object uses RPC messenger's thread pool to monitor
batches' maximum wait time.

Added functionality to control the maximum number of batchers
per session.  If number of batchers is at the limit already,
KuduSession::Apply() blocks until it's possible to add a new batcher
to accommodate the incoming operation.

Modified behavior of the KuduSession::Flush(): now it waits until all
batchers are flushed before returning.

This change also addresses the following JIRA issue:
  KUDU-1376 KuduSession::SetMutationBufferSpace is not defined

Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
---
M python/kudu/tests/test_client.py
M src/kudu/client/batcher.cc
M src/kudu/client/batcher.h
M src/kudu/client/client-internal.h
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/error_collector.cc
M src/kudu/client/error_collector.h
M src/kudu/client/session-internal.cc
M src/kudu/client/session-internal.h
M src/kudu/client/write_op.cc
M src/kudu/client/write_op.h
13 files changed, 1,521 insertions(+), 253 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/52/3952/26
-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 26
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-09-01 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 26:

Build Started http://104.196.14.100/job/kudu-gerrit/3184/

-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 26
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-31 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 25:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/3952/25/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

PS25, Line 187: Microseconds
> Just to be clear, I meant "coarser" where I wrote "higher" granularity. Tha
I understand the concern.  What if we put 1 ms? :)

Frankly, this kind of check is not really reliable anyway.  It can spot an 
issue if operations are slowly pushed to tablet servers, but otherwise this 
check might not catch issues.

I just could not find anything else which would not be invasive.  Keeping max 
value in the session itself does not seem to be a good solution.  The other 
solution involves building mock Batcher or Messenger which seems to be an 
overkill.


PS25, Line 2186: // Applying a big operation which does not fit into the buffer 
should
   : // return an error with session running in any supported flush 
mode.
> Can you help me see the difference? AFAICT the second test in TestApplyTooM
Yes, after looking at TestApplyTooMuchWithoutFlushing once more I understand 
you are right.  For some reason I was thinking that the second part of the 
TestApplyTooMuchWithoutFlushing was about big operations where each fits the 
buffer, but eventually they overflow it.  My bad. 

OK, I'll remove that second part of the TestApplyTooMuchWithoutFlushing test.  
Because this test is a superset of the former: it verifies the behavior is 
correct for all flush modes.


PS25, Line 2258:   WaitForAutoFlushBackground(session, FLUSH_INTERVAL_MS);
   :   // There should be some pending ops: the automatic 
background flush
   :   // should not be active after switch to the MANUAL_FLUSH 
mode.
> Nah, I think that's fine.
OK, I see.  Thanks for the guideline.


Line 2428:   // AUTO_FLUSH_BACKGROUND should be faster than AUTO_FLUSH_SYNC.
> Yes, and please try TSAN. It slows things down in unexpected ways (e.g. con
Yep, did that as well.  However, I'm not sure I got 1000 cycles -- that was too 
slow.  Will run tonight.


PS25, Line 2574:   ASSERT_TRUE(session->HasPendingOperations());
> That's fine, just loop it using dist-test.py in TSAN mode, and I'll be sati
ok, will do.


-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 25
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-31 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 25:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/3952/25/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

PS25, Line 187: Microseconds
> The idea was to monitor those values as closely as possible.  Without resor
Just to be clear, I meant "coarser" where I wrote "higher" granularity. That 
is, perhaps this could be as large of a value as 10ms, for example.

It just struck me as a very tight loop. Tight loops are useful when trying to 
exercise a data race, but in this case, I would expect a full end-to-end 
Flush() to take on the order of 1ms, especially in a debug or slower build. If 
that's true, checking every 10us is excessive.

But, it shouldn't slow down the tests significantly. If I'm the only one who 
feels this way, I'm fine with you leaving it as is.


PS25, Line 2186: // Applying a big operation which does not fit into the buffer 
should
   : // return an error with session running in any supported flush 
mode.
> Nope.  This is different: this test verifies that an attempt to apply a _si
Can you help me see the difference? AFAICT the second test in 
TestApplyTooMuchWithoutFlushing:

1. Starts a new session in MANUAL_FLUSH mode.
2. Customizes the buffer size.
3. Applies a single insert, sized to twice the buffer size.
4. Asserts that this fails with Incomplete.

TestCheckMutationBufferSpaceLimitInEffect does the same thing across all flush 
modes, except that the buffer size is a little different, and that the insert 
isn't twice the buffer size, but only one byte larger.


PS25, Line 2258:   WaitForAutoFlushBackground(session, FLUSH_INTERVAL_MS);
   :   // There should be some pending ops: the automatic 
background flush
   :   // should not be active after switch to the MANUAL_FLUSH 
mode.
> About 500 ms.  Is that too much?
Nah, I think that's fine.

Anything on the order of multiple seconds is something we should think about.


Line 2428:   // AUTO_FLUSH_BACKGROUND should be faster than AUTO_FLUSH_SYNC.
> Due to the nature of the things and chosen parameters, the difference is in
Yes, and please try TSAN. It slows things down in unexpected ways (e.g. context 
switches become much more expensive, as does thread startup).


PS25, Line 2574:   ASSERT_TRUE(session->HasPendingOperations());
> Yes, I also have this concern.  The principal solution would be delaying/st
That's fine, just loop it using dist-test.py in TSAN mode, and I'll be 
satisfied.


-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 25
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-31 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 25:

(45 comments)

I actually reviewed the changes to client-test this time.

http://gerrit.cloudera.org:8080/#/c/3952/25/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

PS25, Line 182:   // Must be public to use as a thread closure.
They can be static functions, though, right? I don't see them accessing any 
test state.


PS25, Line 187: Microseconds
How did you choose 10 us? It seems low to me (i.e. maybe we can get away with a 
higher granularity), but maybe I'm missing something.


PS25, Line 199:   
Nit: extra space here.


PS25, Line 553: oprations
Nit: operations


PS25, Line 556: WaitForAutoFlushBackground
Hmm, when this returns, there's no actual guarantee that a batch flushed, 
right? We may have just expired the deadline?


PS25, Line 558: MAX_WAIT_MS
kMaxWaitMs


Line 1207: << "unexpected status: " << result.ToString();
There's a better technique for handling situations like these:

  Status s = DoSomething();
  SCOPED_TRACE(s.ToString());
  ASSERT_TRUE(s.IsIOError());
  ASSERT_TRUE(...);

If a test fails while SCOPED_TRACE() is still in scope, gtest will print 
s.ToString(). That way you don't have to worry about printing it yourself 
following every potential ASSERT/EXPECT that may fail.


PS25, Line 2153: BUFFER_SIZE
kBufferSizeBytes (camel case, k prefix, and append the unit)

Below too. Please fix all of your "constants" to follow this naming convention.


PS25, Line 2186: // Applying a big operation which does not fit into the buffer 
should
   : // return an error with session running in any supported flush 
mode.
Doesn't this test make the second half of TestApplyTooMuchWithoutFlushing 
redundant?


PS25, Line 2197: for (auto mode: FLUSH_MODES)
Nit: auto mode : FLUSH_MODES

(separate 'mode' from the colon with a space)


PS25, Line 2215: STLDeleteElements();
Nit: use an ElementDeleter and declare it right after declaring 'errors'. Then 
there's no possibility for a leak.


PS25, Line 2258:   WaitForAutoFlushBackground(session, FLUSH_INTERVAL_MS);
   :   // There should be some pending ops: the automatic 
background flush
   :   // should not be active after switch to the MANUAL_FLUSH 
mode.
I see. How expensive is this in terms of test runtime? You've dropped the flush 
interval to 100 ms; how much time do we end up waiting here?


Line 2304: *result = t_end - t_begin;
Have you looked at the StopWatch class? May be easier for this kind of thing.


PS25, Line 2335: // Check that call to Flush()/FlushAsync() is no-op if there 
isn't current
   : // batcher for the session.
Doesn't this test make TestEmptyBatch redundant?


PS25, Line 2348:   ASSERT_OK(sync0.Wait());
   :   // The same with the synchronous flush.
   :   ASSERT_OK(session->Flush());
   :   ASSERT_FALSE(session->HasPendingOperations());
   :   ASSERT_EQ(0, session->CountPendingErrors());
How does this verify that flush was a no-op? Seems to me you need to find some 
metrics indicating that no Write RPCs were sent.


PS25, Line 2391: kudu::Thread::Create
Now that we're building with C++11, we prefer to use std::thread for test 
threads. There's less boilerplate with them (i.e. no need to make up a thread 
category and name), and they have better support for lambdas, which I think you 
can use here pretty effectively.

Same goes for the other new uses of kudu::Thread.


Line 2428:   // AUTO_FLUSH_BACKGROUND should be faster than AUTO_FLUSH_SYNC.
Hmm, is this expected to be true 100% of all test runs? Even with debug 
instrumentation, TSAN, ASAN, etc? Wondering if this test will be flaky.

A good way to find out is to use build-support/dist-test.py to loop your test 
1000 times. You can use --gtest_filter on the command line to run just this one 
test case (as opposed to all of client-test, much of which is already flaky).


Line 2429:   EXPECT_GT(t_diff_afs.ToMilliseconds(), 
t_diff_afb.ToMilliseconds());
Can't we just EXPECT_GT on the MonoDeltas directly now?


PS25, Line 2451:   // AUTO_FLUSH_BACKGROUND should be faster than 
AUTO_FLUSH_SYNC.
   :   EXPECT_GT(t_diff_afs.ToMilliseconds(), 
t_diff_afb.ToMilliseconds());
Same issues and questions as in the above test.


PS25, Line 2456: // applying a bunch of small rows without a flush should not 
result in
   : // an error, even with low limit on the buffer space.  This is 
because
   : // Session::Apply() blocks and waits for buffer space to 
become available.
For this test, you may want to scan once you're done writing to make sure that 
all of the rows really made it and that Apply() didn't just decide to throw 
some rows away because we ran out of buffer space.


Line 2488: // applying a bunch of rows every of 

[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-31 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 25:

Build Started http://104.196.14.100/job/kudu-gerrit/3167/

-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 25
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-31 Thread Alexey Serbin (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3952

to look at the new patch set (#25).

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..

KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode

Implemented AUTO_FLUSH_BACKGROUND for the Kudu C++ client library.
In AUTO_FLUSH_BACKGROUND mode,
the KuduSession::Apply() method blocks if total amount of data
for pending operations reaches the buffer size limit.  The limit
on the buffer size can be set by the
KuduSession::SetMutationBufferSpace() method.

The background flush logic checks whether at least one of the
following two conditions is satisfied to determine whether it's time
to flush the accumulated write operations:
  * The over-the-watermark criterion: check whether the total size of
the freshly submitted (i.e. not-yet-scheduled-for-flush) write
operations is over the threshold.  The threshold can be set as
the percentage of the total buffer size using the
KuduSession::SetMutationBufferFlushWatermark() method.
  * The maximum wait time criterion: check whether the current batch
of operations has been accumulating for more than the maximum
wait time.  The maximum wait time can be specified in milliseconds
using the KuduSession::SetMutationBufferFlushInterval() method.
A KuduSession object uses RPC messenger's thread pool to monitor
batches' maximum wait time.

Added functionality to control the maximum number of batchers
per session.  If number of batchers is at the limit already,
KuduSession::Apply() blocks until it's possible to add a new batcher
to accommodate the incoming operation.

Modified behavior of the KuduSession::Flush(): now it waits until all
batchers are flushed before returning.

This change also addresses the following JIRA issue:
  KUDU-1376 KuduSession::SetMutationBufferSpace is not defined

Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
---
M python/kudu/tests/test_client.py
M src/kudu/client/batcher.cc
M src/kudu/client/batcher.h
M src/kudu/client/client-internal.h
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/error_collector.cc
M src/kudu/client/error_collector.h
M src/kudu/client/session-internal.cc
M src/kudu/client/session-internal.h
M src/kudu/client/write_op.cc
M src/kudu/client/write_op.h
13 files changed, 1,516 insertions(+), 238 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/52/3952/25
-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 25
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-31 Thread Alexey Serbin (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3952

to look at the new patch set (#24).

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..

KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode

Implemented AUTO_FLUSH_BACKGROUND for the Kudu C++ client library.
In AUTO_FLUSH_BACKGROUND mode,
the KuduSession::Apply() method blocks if total amount of data
for pending operations reaches the buffer size limit.  The limit
on the buffer size can be set by the
KuduSession::SetMutationBufferSpace() method.

The background flush logic checks whether at least one of the
following two conditions is satisfied to determine whether it's time
to flush the accumulated write operations:
  * The over-the-watermark criterion: check whether the total size of
the freshly submitted (i.e. not-yet-scheduled-for-flush) write
operations is over the threshold.  The threshold can be set as
the percentage of the total buffer size using the
KuduSession::SetMutationBufferFlushWatermark() method.
  * The maximum wait time criterion: check whether the current batch
of operations has been accumulating for more than the maximum
wait time.  The maximum wait time can be specified in milliseconds
using the KuduSession::SetMutationBufferFlushInterval() method.
A KuduSession object uses RPC messenger's thread pool to monitor
batches' maximum wait time.

Added functionality to control the maximum number of batchers
per session.  If number of batchers is at the limit already,
KuduSession::Apply() blocks until it's possible to add a new batcher
to accommodate the incoming operation.

Modified behavior of the KuduSession::Flush(): now it waits until all
batchers are flushed before returning.

This change also addresses the following JIRA issue:
  KUDU-1376 KuduSession::SetMutationBufferSpace is not defined

Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
---
M python/kudu/tests/test_client.py
M src/kudu/client/batcher.cc
M src/kudu/client/batcher.h
M src/kudu/client/client-internal.h
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/error_collector.cc
M src/kudu/client/error_collector.h
M src/kudu/client/session-internal.cc
M src/kudu/client/session-internal.h
M src/kudu/client/write_op.cc
M src/kudu/client/write_op.h
13 files changed, 1,517 insertions(+), 238 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/52/3952/24
-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 24
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-31 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 24:

Build Started http://104.196.14.100/job/kudu-gerrit/3166/

-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 24
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-30 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 23:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/3952/23/src/kudu/client/session-internal.cc
File src/kudu/client/session-internal.cc:

Line 118:   // Thread-safety note: the buffer_bytes_limit_ is not supposed to 
be accessed
> I think it's important to have safety notes like this in the header so that
But from the method caller's perspective all what's needed to know is that it's 
not supposed to be called from multiple threads, plus generic info on the 
parameters.  I don't see more.

Am I missing something?

I can put notion regarding the thread-safety constraint into the header.  Would 
it be enough?


http://gerrit.cloudera.org:8080/#/c/3952/23/src/kudu/client/session-internal.h
File src/kudu/client/session-internal.h:

Line 191:   std::unordered_set flushed_batchers_;
> Isn't the count of bytes in currently pending operations kept in the buffer
Correct.

As a matter of fact, I've removed flished_batchers_ already in the patchset I'm 
preparing.  I think now it's looks better.  Probably, it would be nice to get 
some feedback on that from the original author of the Batcher class (Todd 
and/or David?).


-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 23
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-30 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 23:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/3952/23/src/kudu/client/batcher.cc
File src/kudu/client/batcher.cc:

Line 474: Status s = had_errors_ ? Status::Incomplete("Some errors 
occurred")
> OK, that reasoning makes sense.
Yah, I agree IOError isn't a great fit, but the Status codes are pretty general.


http://gerrit.cloudera.org:8080/#/c/3952/23/src/kudu/client/session-internal.cc
File src/kudu/client/session-internal.cc:

Line 118:   // Thread-safety note: the buffer_bytes_limit_ is not supposed to 
be accessed
> I tried that, but from the reading perspective it looked worse -- there is 
I think it's important to have safety notes like this in the header so that 
callers can see these invariants without going to the implementation.  This is, 
after all, an invariant that is imposed on the caller.


http://gerrit.cloudera.org:8080/#/c/3952/23/src/kudu/client/session-internal.h
File src/kudu/client/session-internal.h:

Line 191:   std::unordered_set flushed_batchers_;
> That's a good question!  Besides using those for current implementation of 
Isn't the count of bytes in currently pending operations kept in the 
buffer_bytes_used_ counter?


-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 23
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-30 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 23:

(27 comments)

Dan & Adar, thank you for the review.  I'll post updated version as soon as I 
make sure the tests are passing OK.

http://gerrit.cloudera.org:8080/#/c/3952/23/src/kudu/client/batcher.cc
File src/kudu/client/batcher.cc:

Line 474: Status s = had_errors_ ? Status::Incomplete("Some errors 
occurred")
> I don't think Incomplete is the right status here; the flush is complete.
OK, that reasoning makes sense.

However, I thought IOError might be irrelevant here as well.  Or this is the 
best match among all possible status codes?  There is seemingly more generic 
RunTimeError, but I'm not sure that's the right case to use it.

I'll try to return IOError back to get back to the original, and meanwhile we 
can discuss it separately.


http://gerrit.cloudera.org:8080/#/c/3952/23/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

PS23, Line 2360: there the current batcher is at place
> Not sure what this is trying to say, maybe:
ops, my bad.

Should have been something like 'OK, now the current batcher should be at 
place.'

Will fix.


http://gerrit.cloudera.org:8080/#/c/3952/23/src/kudu/client/client.cc
File src/kudu/client/client.cc:

Line 322:hp);
> this can go on the line above.
Done


http://gerrit.cloudera.org:8080/#/c/3952/23/src/kudu/client/session-internal.cc
File src/kudu/client/session-internal.cc:

Line 63:   TimeBasedFlushInit(session);
> How about checking if we are in AUTO_FLUSH_BACKGROUND first?  I know this i
The check you mentioned is done by TimeBasedFlushTask() method _before_ 
scheduling a task (i.e. no task scheduled if current mode is not 
AUTO_FLUSH_BACKGROUND).  So, I see no need to perform that check here.

I put all those ugly guts (synchronization, etc.) into that method with 
intention to avoid doing those checks elsewhere.


Line 63:   TimeBasedFlushInit(session);
> The funny thing is that, by definition, we cannot be in AUTO_FLUSH_BACKGROU
TimeBasedFlushInit() is no-op in other than AUTO_FLUSH_BACKGROUND mode.

Yes, it might be a bad name for the method.

What name would you recommend instead?


PS23, Line 105:std::lock_guard l(lock_);
> Since this isn't acquired on any hot path, what do you think of removing it
I thought it was better to keep those locks separate to avoid having one giant 
lock for all the methods.  Those locks based on simple_spinlock look lighter 
than based on mutex.  But as you mentioned, since we already have that for the 
most critical paths, and the rest with simple_spinlock are just for methods 
which almost never called, I think it's a good idea.

Will do, thanks!


Line 118:   // Thread-safety note: the buffer_bytes_limit_ is not supposed to 
be accessed
> These thread safety notes (here and below) would be better in the header de
I tried that, but from the reading perspective it looked worse -- there is not 
much of context when looking at the signature of a function/method.  However, 
in the body of a method, there is much more relevant context since you can see 
the logic of the method, more comments, corresponding variables/members, etc.

That's said, I'm hesitant to move those in the header.

However, if you feel strong about that and have a compelling reasoning which 
addresses my concern regarding the lack of context, I'll move those.


Line 156: Status KuduSession::Data::SetMaxBatchersNum(unsigned int max_num) {
> How come this one doesn't check HasPendingOperations?
I didn't see a restriction that would arise from the current design.

Actually, we can remove that for SetBufferFlushInterval() and other except for 
the KuduSession::SetFlushMode().

But as I see from our recent HipChat discussion, we want to keep us more space 
for the future, because once published, it's easier to make those restrictions 
less strict.  But making these more strict could be almost impossible in the 
future.

So, I added the artificial restriction here as well.


Line 181:   FlushCurrentBatcher(1, cb);
> Nit: add a comment explaining the significance of 1 (i.e. why not 0?)
Done


PS23, Line 188: nullptr
> I spent a lot of time wondering if the loss of the synchronizer and RETURN_
I did a research on that prior to using nullptr as a callback in the new code 
and I didn't see any cases where an error would be dropped on the floor (i.e. 
in all cases where Batcher::had_errors_ is set to true, the error is registered 
into the Batcher::error_collector_).  So, it's safe as far as I can see.  But 
I'll add corresponding comments into the batcher and for the 
FlushCurrentBatcher() methods to clarify on that.

I think it's a good idea to document the difference.  I even thought I had done 
it, but re-visiting the client.h comments revealed that I missed that part.  
Thank you for the 

[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-30 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 23:

(16 comments)

Nice.  This is much more understandable than previously.

http://gerrit.cloudera.org:8080/#/c/3952/23/src/kudu/client/batcher.cc
File src/kudu/client/batcher.cc:

Line 474: Status s = had_errors_ ? Status::Incomplete("Some errors 
occurred")
I don't think Incomplete is the right status here; the flush is complete.


http://gerrit.cloudera.org:8080/#/c/3952/23/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

PS23, Line 2360: there the current batcher is at place
Not sure what this is trying to say, maybe:

there is a current batch


http://gerrit.cloudera.org:8080/#/c/3952/23/src/kudu/client/client.cc
File src/kudu/client/client.cc:

Line 322:hp);
this can go on the line above.


http://gerrit.cloudera.org:8080/#/c/3952/23/src/kudu/client/session-internal.cc
File src/kudu/client/session-internal.cc:

Line 63:   TimeBasedFlushInit(session);
How about checking if we are in AUTO_FLUSH_BACKGROUND first?  I know this is 
basically a no-op if not, but it does schedule a background task.


Line 118:   // Thread-safety note: the buffer_bytes_limit_ is not supposed to 
be accessed
These thread safety notes (here and below) would be better in the header 
declaration doc.


Line 156: Status KuduSession::Data::SetMaxBatchersNum(unsigned int max_num) {
How come this one doesn't check HasPendingOperations?


PS23, Line 283: nil
NULL


PS23, Line 356: }
  : if (
should this be an else if?


Line 368: buffer_bytes_used_ += required_size;
hmm, this seems a little early.  I can't think of a reason it would matter, but 
it seems to me this counter shouldn't be updated until the batcher is available 
and we are just about to or just have added the op to the batcher.  Otherwise 
we are updating the counter, potentially dropping the lock via the cond_.Wait() 
call, and then later actually adding the op.


Line 379:   if (!batcher_) {
Will the batcher_ ever be non-null here?  I can't think of a reason why it 
would be.  Maybe a DCHECK is in order instead of the conditional.


Line 399:   buffer_bytes_used_ -= required_size;
Another good reason not to increment this counter until after the op is 
successfully added to the batcher_.


http://gerrit.cloudera.org:8080/#/c/3952/23/src/kudu/client/session-internal.h
File src/kudu/client/session-internal.h:

PS23, Line 46: Aply
Apply


PS23, Line 91: Return once no more pending operations are left.
this sentence seems redundant.


PS23, Line 91: finished their flushing
finish flushing


Line 191:   std::unordered_set flushed_batchers_;
Why is it necessary to hold on to these batchers?  I see that it's used for 
calling HasPendingOperations, but wouldn't it be enough to check that 
batches_num_ > 0 in that case?


PS23, Line 199: // protected by batch_ctl_mutex_
the setter doesn't take batch_ctl_mutex_


-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 23
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-29 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 23:

Build Started http://104.196.14.100/job/kudu-gerrit/3136/

-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 23
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-29 Thread Alexey Serbin (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3952

to look at the new patch set (#23).

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..

KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode

Implemented AUTO_FLUSH_BACKGROUND for the Kudu C++ client library.
In AUTO_FLUSH_BACKGROUND mode,
the KuduSession::Apply() method blocks if total amount of data
for pending operations reaches the buffer size limit.  The limit
on the buffer size can be set by the
KuduSession::SetMutationBufferSpace() method.

The background flush logic checks whether at least one of the
following two conditions is satisfied to determine whether it's time
to flush the accumulated write operations:
  * The over-the-watermark criterion: check whether the total size of
the freshly submitted (i.e. not-yet-scheduled-for-flush) write
operations is over the threshold.  The threshold can be set as
the percentage of the total buffer size using the
KuduSession::SetMutationBufferFlushWatermark() method.
  * The maximum wait time criterion: check whether the current batch
of operations has been accumulating for more than the maximum
wait time.  The maximum wait time can be specified in milliseconds
using the KuduSession::SetMutationBufferFlushInterval() method.
A KuduSession object uses RPC messenger's thread pool to monitor
batches' maximum wait time.

Added functionality to control the maximum number of batchers
per session.  If number of batchers is at the limit already,
KuduSession::Apply() blocks until it's possible to add a new batcher
to accommodate the incoming operation.

Modified behavior of the KuduSession::Flush(): now it waits until all
batchers are flushed before returning.

This change also addresses the following JIRA issue:
  KUDU-1376 KuduSession::SetMutationBufferSpace is not defined

Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
---
M python/kudu/tests/test_client.py
M src/kudu/client/batcher.cc
M src/kudu/client/batcher.h
M src/kudu/client/client-internal.h
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/error_collector.cc
M src/kudu/client/error_collector.h
M src/kudu/client/session-internal.cc
M src/kudu/client/session-internal.h
M src/kudu/client/write_op.cc
M src/kudu/client/write_op.h
13 files changed, 1,440 insertions(+), 209 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/52/3952/23
-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 23
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-25 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 22:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3952/22/src/kudu/client/session-internal.cc
File src/kudu/client/session-internal.cc:

PS22, Line 191:   Synchronizer s;
  :   KuduStatusMemberCallback ksmcb(, 
::StatusCB);
  :   NextBatcher(session, BATCHER_WATERMARK_NON_EMPTY, , 
BLOCK);
  :   RETURN_NOT_OK(s.Wait());
> Cool. So regardless of what change you do or don't make to NextBatcher(), c
Sure, I'll add those comments.

And just another thought to justify separating the responsibilities of the 
NextBatcher() method (most likely you have spotted that already):  it will 
allow more robust flushes in case if maximum number of batchers is reached.  
This is because NextBatch() in blocking mode first waits until it's possible to 
allocate a new batcher and only after that initiates flushing of the 
corresponding batcher.


-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 22
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-25 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 22:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3952/22/src/kudu/client/session-internal.cc
File src/kudu/client/session-internal.cc:

PS22, Line 191:   Synchronizer s;
  :   KuduStatusMemberCallback ksmcb(, 
::StatusCB);
  :   NextBatcher(session, BATCHER_WATERMARK_NON_EMPTY, , 
BLOCK);
  :   RETURN_NOT_OK(s.Wait());
> OK, I see the point now.
Cool. So regardless of what change you do or don't make to NextBatcher(), could 
you add a comment here explaining the various events we're waiting on?

Separately, could you look into splitting NextBatcher()'s responsibilities? 
Naively I think it'd simplify things, but it might be one of those things that 
you can only really evaluate once you're staring at the new code.


-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 22
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-25 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 22:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3952/22/src/kudu/client/session-internal.cc
File src/kudu/client/session-internal.cc:

PS22, Line 191:   Synchronizer s;
  :   KuduStatusMemberCallback ksmcb(, 
::StatusCB);
  :   NextBatcher(session, BATCHER_WATERMARK_NON_EMPTY, , 
BLOCK);
  :   RETURN_NOT_OK(s.Wait());
> I wasn't asking about NextBatcher()'s wait; I was asking about s.Wait(). To
OK, I see the point now.

Right -- waiting on total_bytes_used_ should be more than enough given the fact 
the flush of the current batcher has been initiated.

I was concerned with collecting the exact error, if any, from the current 
batcher and missed the point that error_collector_ would account for that error 
as well.

And yes, if we separate two responsibilities of the NextBatch() method, then 
all that inconsistency will go away, so the next batcher would be allocated in 
Apply()/ApplyAsync() and FlushAsync() would not report on problems due to max 
number of batchers.


-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 22
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-25 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 22:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/3952/22/src/kudu/client/client.h
File src/kudu/client/client.h:

Line 1188:   /// The default and the minimum setting for this parameter is 2 
(two).
> Yes, technically the minimum could be 1.
OK. If it simplifies the implementation, let's keep the minimum at 2. We can 
always remove the restriction later.


http://gerrit.cloudera.org:8080/#/c/3952/22/src/kudu/client/session-internal.cc
File src/kudu/client/session-internal.cc:

PS22, Line 191:   Synchronizer s;
  :   KuduStatusMemberCallback ksmcb(, 
::StatusCB);
  :   NextBatcher(session, BATCHER_WATERMARK_NON_EMPTY, , 
BLOCK);
  :   RETURN_NOT_OK(s.Wait());
> NextBatcher() not only flushes the current batcher, but it also allocates a
I wasn't asking about NextBatcher()'s wait; I was asking about s.Wait(). To 
help answer the question, here are all of the waits going on:

1. BLOCK in NextBatcher(). This ensures that a new batcher has become available.
2. s.Wait(). This ensures that the batcher we've flushed has finished flushing.
3. total_bytes_used_ == 0. This ensures that any other batchers in flight have 
finished flushing.

Do we need all of these? Naively, I think waiting for total_bytes_used_ to 
reach 0 is a superset of s.Wait(), no?

Moreover, don't we only need to get (and wait for) a new batcher on the next 
Apply()/ApplyAsync(), rather than here?


-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 22
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-25 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 22:

(17 comments)

Thank you for the review.

I'll try to take a fresh look in the context of simplifying the code.

Frankly, it would help a lot if it were clear from the very beginning that the 
KuduSession interface is not supposed to be thread-safe.  De-facto I started 
with the idea to have most of its methods thread-safe (as it was advertised by 
the comments in client.h).

http://gerrit.cloudera.org:8080/#/c/3952/22/src/kudu/client/client.h
File src/kudu/client/client.h:

PS22, Line 1037: current
> Nit: "the current"
Done


PS22, Line 1143: when
> Nit: "only when", right?
Done


PS22, Line 1180: called
> Nit: "called the"
Done


PS22, Line 1183: taken care
> Nit: "taken care of"
Done


PS22, Line 1184: Once flushing of the prior
   :   /// mutation buffer is initiated, the new one is created and 
set current
   :   /// to accommodate incoming write operations, limit 
permitting.
> Nit: "Upon flushing the current mutation buffer, a new buffer is created to
Done


Line 1188:   /// The default and the minimum setting for this parameter is 2 
(two).
> Hmm, technically couldn't the minimum be 1? If set to 1, an Apply() with an
Yes, technically the minimum could be 1.

Yes, the behavior of the KuduSession::Apply() you described is there, but it 
now comes into play only when the incoming operation triggers flush of the 
current buffer.

I put 2 to keep current semantics of always having the current batcher 
available for incoming operations and allocating it when starting flush of the 
prior one.

I can modify the code and make it 1.  That would involve checking for the 
current batcher availability at every Apply() call.  Right now it's guaranteed 
the current batcher is always at place.


PS22, Line 1191: the
> Nit: drop 'the'
Done


PS22, Line 1201: implicitly amended
   :   ///   as if it were 0.
> Nit: "implicitly set to 0".
Done


http://gerrit.cloudera.org:8080/#/c/3952/22/src/kudu/client/session-internal.cc
File src/kudu/client/session-internal.cc:

PS22, Line 80: the
> Nit: double the
Done


Line 130:   buffer_bytes_limit_ = size;
> Likewise, I don't see why this needs to be protected.
It seems that's a remnant from the older code which was written with 
multi-threading safety requirement in mind.

Thanks, will remove.


Line 145:   buffer_watermark_pct_ = watermark_pct;
> Why does this need to be protected? It's only ever accessed by the writer t
Right.  That's the same as for the KuduSession::SetBufferBytesLimit() -- 
remnant from the past.


PS22, Line 191:   Synchronizer s;
  :   KuduStatusMemberCallback ksmcb(, 
::StatusCB);
  :   NextBatcher(session, BATCHER_WATERMARK_NON_EMPTY, , 
BLOCK);
  :   RETURN_NOT_OK(s.Wait());
> Why do we need to use the synchronizer to wait on this batch if, below, we 
NextBatcher() not only flushes the current batcher, but it also allocates a new 
one for next incoming operations.

Probably, it's better to separate those functions, and then it will be possible 
to make 1 the minimum number for the batchers_num_limit_.  I'll do that.

Back to your question.  The NextBatcher() method has two modes of operation: 
blocking and non-blocking.  In non-blocking mode it reports an error if it's 
not possible to allocate a new batcher due to the batchers_num_limit_ setting; 
in blocking mode it blocks until it's possible to allocate a new one and do its 
job.

So, it's necessary to initiate flushing current batcher and then wait for 
total_bytes_used_ to become 0.

BTW, the error collector does not capture errors due to batchers_num_limit_ 
limitations -- those errors are reported  via the provided callback for the 
NextBatcher() (which usually comes from the FlushAsync()).


PS22, Line 222: relevent
> Nit: relevant
Done


http://gerrit.cloudera.org:8080/#/c/3952/22/src/kudu/client/session-internal.h
File src/kudu/client/session-internal.h:

PS22, Line 104:   // On successful return, the output flush_mode parameter
  :   // is set to the effective flush mode.
> I thought we no longer needed the output parameter for flush mode?
Right.

The reason of returning the effective flush_mode in output parameter is that I 
don't want to acquire the lock protecting flush_mode_ again just in the 
upper-level KuduSession::Apply() -- that needs to know the flush mode to decide 
whether it's necessary to do Flush() in case of AUTO_FLUSH_SYNC mode.  Strictly 
speaking, it's necessary either to access it under lock or make it atomic.

As for the task cancelation, I don't think it would be a better approach.


PS22, Line 142:   
> Nit: looks like there are two spaces here. Below too.
Done


PS22, Line 147: BATCHER_WATERMARK_NON_EMPTY
> Our convention for naming class constants like these "kCamelCaseFoo". We 

[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-25 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 22:

(18 comments)

Did another pass. I'm still blown away by the overall complexity; I wonder if 
reducing the number of locks will help here, or making other infrastructure 
improvements (e.g. allowing reactor tasks to be canceled).

http://gerrit.cloudera.org:8080/#/c/3952/22/src/kudu/client/batcher.h
File src/kudu/client/batcher.h:

Line 189:   // The time when the very first operation was added into the 
batcher.
Should note that this is protected by lock_. While you're there, can you make 
sure all other protected fields are annotated as such?


http://gerrit.cloudera.org:8080/#/c/3952/22/src/kudu/client/client.h
File src/kudu/client/client.h:

PS22, Line 1037: current
Nit: "the current"


PS22, Line 1143: when
Nit: "only when", right?


PS22, Line 1180: called
Nit: "called the"


PS22, Line 1183: taken care
Nit: "taken care of"


PS22, Line 1184: Once flushing of the prior
   :   /// mutation buffer is initiated, the new one is created and 
set current
   :   /// to accommodate incoming write operations, limit 
permitting.
Nit: "Upon flushing the current mutation buffer, a new buffer is created to 
replace it, provided the limit is not exceeded."


Line 1188:   /// The default and the minimum setting for this parameter is 2 
(two).
Hmm, technically couldn't the minimum be 1? If set to 1, an Apply() with an 
outstanding background flush ought to block (and an ApplyAsync() should return 
some sort of backpressure error). Though I guess you'd need that kind of 
behavior when the limit is reached regardless of whether it's 1 or higher.


PS22, Line 1191: the
Nit: drop 'the'


PS22, Line 1201: implicitly amended
   :   ///   as if it were 0.
Nit: "implicitly set to 0".


http://gerrit.cloudera.org:8080/#/c/3952/22/src/kudu/client/session-internal.cc
File src/kudu/client/session-internal.cc:

PS22, Line 80: the
Nit: double the


Line 130:   buffer_bytes_limit_ = size;
Likewise, I don't see why this needs to be protected.


Line 145:   buffer_watermark_pct_ = watermark_pct;
Why does this need to be protected? It's only ever accessed by the writer 
thread (not a reactor thread).


PS22, Line 191:   Synchronizer s;
  :   KuduStatusMemberCallback ksmcb(, 
::StatusCB);
  :   NextBatcher(session, BATCHER_WATERMARK_NON_EMPTY, , 
BLOCK);
  :   RETURN_NOT_OK(s.Wait());
Why do we need to use the synchronizer to wait on this batch if, below, we wait 
until total_bytes_used_ reaches 0? Is it just to report errors? Would the error 
collector usage below capture that?


PS22, Line 222: relevent
Nit: relevant


http://gerrit.cloudera.org:8080/#/c/3952/22/src/kudu/client/session-internal.h
File src/kudu/client/session-internal.h:

PS22, Line 104:   // On successful return, the output flush_mode parameter
  :   // is set to the effective flush mode.
I thought we no longer needed the output parameter for flush mode?

Oh, is this because we read flush_mode_ from the flusher task, which means we 
need to protect it with something, or make it an atomic type of some kind? And 
so you'd rather pass back the flush mode from ApplyWriteOp() to its caller to 
avoid taking the lock a second time?

If so, it seems like that's working around the lack of "canceling" a scheduled 
task. That is, if we could cancel the flusher task (e.g. when the flush mode 
changes away from AUTO_BACKGROUND_FLUSH), it'd never need to read the flush 
mode. I wonder how much complexity is involved with adding such cancellation 
behavior; we've wanted it elsewhere and it would certainly simplify the flusher 
task considerably.


PS22, Line 142:   
Nit: looks like there are two spaces here. Below too.


PS22, Line 147: BATCHER_WATERMARK_NON_EMPTY
Our convention for naming class constants like these "kCamelCaseFoo". We 
reserve ALL_UPPER_CASE for macros.


PS22, Line 158:   // A dedicated lock to prevent multiple threads executing the 
ApplyWriteOp()
  :   // methods simultaneously. Should not be used for any other 
purpose.
I don't understand; we've expressly forbidden multiple Apply() threads. So why 
bother to do this? It's just another lock to maintain; if this is to prevent 
people from shooting themselves in the foot, I'd rather remove it and simplify.


-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 22
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dinesh Bhat 

[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-25 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 22:

Build Started http://104.196.14.100/job/kudu-gerrit/3072/

-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 22
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-25 Thread Alexey Serbin (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3952

to look at the new patch set (#22).

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..

KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode

Implemented AUTO_FLUSH_BACKGROUND for the Kudu C++ client library.
In AUTO_FLUSH_BACKGROUND mode,
the KuduSession::Apply() method blocks if total amount of data
for pending operations reaches the buffer size limit.  The limit
on the buffer size can be set by the
KuduSession::SetMutationBufferSpace() method.

The background flush logic checks whether at least one of the
following two conditions is satisfied to determine whether it's time
to flush the accumulated write operations:
  * The over-the-watermark criterion: check whether the total size of the
freshly submitted (i.e. not-yet-scheduled-for-flush) write operations
is over the threshold.  The threshold can be set as a percentage
of the total buffer size using the
KuduSession::SetMutationBufferFlushWatermark() method.
  * The maximum wait time criterion: check whether the current batch
of operations has been accumulating for more than the maximum
wait time.  The maximum wait time can be specified in milliseconds
using the KuduSession::SetMutationBufferFlushInterval() method.
A KuduSession object uses RPC messenger's thread pool to monitor
batches' maximum wait time.

This change also addresses the following JIRA issue:
  KUDU-1376 KuduSession::SetMutationBufferSpace is not defined

Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
---
M python/kudu/tests/test_client.py
M src/kudu/client/batcher.cc
M src/kudu/client/batcher.h
M src/kudu/client/client-internal.h
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/session-internal.cc
M src/kudu/client/session-internal.h
M src/kudu/client/write_op.cc
M src/kudu/client/write_op.h
11 files changed, 1,445 insertions(+), 183 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/52/3952/22
-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 22
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-25 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 15:

(8 comments)

Thank you for your feedback.  I'm posting a new version in a minute.

http://gerrit.cloudera.org:8080/#/c/3952/15/src/kudu/client/client.h
File src/kudu/client/client.h:

Line 1146:   Status SetMutationBufferFlushWatermark(double watermark_pct)
> I don't think it's a big deal either way. Sessions are cheap (especially no
OK, that seems to be good reasoning to me.  No getters then.


http://gerrit.cloudera.org:8080/#/c/3952/15/src/kudu/client/session-internal.cc
File src/kudu/client/session-internal.cc:

PS15, Line 64: -1,
> I don't think the existing behavior is necessarily worth preserving; an emp
All right, I removed the semantics of flushing an empty batcher.  That will 
make a difference when there is a tight limit on maximum number of batchers 
allowed to exist simultaneously.


Line 66: PeriodicFlushTask(Status::OK(), messenger_, session);
> Sorry, I don't understand what you're suggesting. Can you explain it in mor
I'm sorry for that poorly articulated and poorly written piece.

To clarify on that thought, I'm copy-pasting a comment from the code which 
implements that idea, please see below.



Let's measure age of a batcher as the time elapsed from the moment of adding 
the very first operation into the batcher.
The idea is to flush a batcher when its age is very close
to the flush_interval_: let's call it 'batcher flush age'.
If current batcher hasn't reached its flush age yet,
just re-schedule the task to re-evaluate the age of
then-will-be-current batcher, so if the current batcher
is still current at that time, it will be exactly
of its flush age.



Doing so allows to address the issue you pointed at.


Line 109:   flow_control_cond_.Broadcast();
> Nope, simplifying the implementation is the underlying motivation. But I th
OK, I see.  It seems that after recent discussions an additional reason 
appeared: we want to have some sort of parity with the Java client.


Line 135:   CHECK_GE(timeout_ms, 0);
> I think that's reasonable provided it's documented.
Done


PS15, Line 230:   RETURN_NOT_OK(TryApplyWriteOp(write_op, _size,
  : _flush_mode, 
_flush_watermark));
> I don't understand why friendship factors into this: both ApplyWriteOp and 
I thought you were talking about calling the batcher directly.  Yes, I 
addressed the required_size so it's no longer an output parameter of the 
TryApplyWriteOp() method.


Line 398:   data->NextBatcher(session, 0, nullptr);
> I find the semantics of -1, 0, and >0 to be non-obvious. Perhaps we can con
Done


http://gerrit.cloudera.org:8080/#/c/3952/15/src/kudu/client/session-internal.h
File src/kudu/client/session-internal.h:

PS15, Line 87: On successful return, the output flush_mode parameter is set
 :   // to the effective flush mode.
> I don't understand how flush_mode_ can change in the middle of ApplyWriteOp
Yes, this is a holdover.  The comment was written when I was under impression 
we allow to have multiple threads working with a single KuduSession object.  
Also, calls of Apply() from different threads are not relevant here -- 
actually, the reason was that other thread might call 
KuduSession::SetFlushMode() while waiting on the condition variable.

But now all those concerns are gone.  We don't allow to change the mode while 
operations are pending and we don't allow making calls to a single KuduSession 
object from multiple threads.


-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 15
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-24 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 15:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/3952/15/src/kudu/client/client.h
File src/kudu/client/client.h:

Line 1146:   Status SetMutationBufferFlushWatermark(double watermark_pct)
> Will do.  BTW, do we need getters for those parameters?  I think getters mi
I don't think it's a big deal either way. Sessions are cheap (especially now 
that they don't have a dedicated thread to flush in the background) and so I 
tend to think they're short-lived, which means it's easy for the application to 
remember what value was provided here.


http://gerrit.cloudera.org:8080/#/c/3952/15/src/kudu/client/session-internal.cc
File src/kudu/client/session-internal.cc:

PS15, Line 64: -1,
> This is to force getting a new batcher and flushing the prior, if any, even
I don't think the existing behavior is necessarily worth preserving; an empty 
flush doesn't actually accomplish anything.

In any case, that wasn't what my comment was about; I was asking about the 
semantic difference between '-1' and '0' as the watermark value.


Line 66: PeriodicFlushTask(Status::OK(), messenger_, session);
> OK, I'll change it then.
Sorry, I don't understand what you're suggesting. Can you explain it in more 
detail?


Line 109:   flow_control_cond_.Broadcast();
> The original idea was to get rid of those limitations like empty buffer whe
Nope, simplifying the implementation is the underlying motivation. But I think 
it's enough; the clients are generally complicated code and aren't as well 
tested as the rest of Kudu.


Line 135:   CHECK_GE(timeout_ms, 0);
> Yep.  What we can do instead is to set it to 0 if the specified parameter w
I think that's reasonable provided it's documented.


PS15, Line 230:   RETURN_NOT_OK(TryApplyWriteOp(write_op, _size,
  : _flush_mode, 
_flush_watermark));
> Yep, having those facts simplifies this.  Again, I was under impression the
I don't understand why friendship factors into this: both ApplyWriteOp and 
TryApplyWriteOp are members of KuduSession::Data, so they should be equivalent 
w.r.t. calling Batcher::SizeInBuffer().


Line 398:   data->NextBatcher(session, 0, nullptr);
> Nope.  It would, however, if the second parameter were -1 (as you noticed t
I find the semantics of -1, 0, and >0 to be non-obvious. Perhaps we can convey 
the desired behavior from caller to PeriodicFlushTask() in another way? Through 
an additional parameter, an enum, or something like that?


http://gerrit.cloudera.org:8080/#/c/3952/15/src/kudu/client/session-internal.h
File src/kudu/client/session-internal.h:

PS15, Line 87: On successful return, the output flush_mode parameter is set
 :   // to the effective flush mode.
> This is because the flush_mode_ can change in the middle and the code which
I don't understand how flush_mode_ can change in the middle of ApplyWriteOp if 
the session may only be used by a single thread. Or is this a holdover from 
before, when it was OK for multiple threads to write to a session?


-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 15
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-24 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 21:

(7 comments)

Thank you for the review.  I addressed the issues you pointed at and will send 
the updated version with additional changes soon.

http://gerrit.cloudera.org:8080/#/c/3952/21//COMMIT_MSG
Commit Message:

PS21, Line 27: va(
> Interval
Done


http://gerrit.cloudera.org:8080/#/c/3952/21/src/kudu/client/batcher.h
File src/kudu/client/batcher.h:

PS21, Line 69: const scoped_refptr& error_collector,
 :   const client::sp::weak_ptr& session,
> both of these should be taken by value, and then moved into their field.
Done


http://gerrit.cloudera.org:8080/#/c/3952/21/src/kudu/client/client.h
File src/kudu/client/client.h:

PS21, Line 1143: when newly submitted
   :   ///   operations are about to overflow the buffer
> when the newly applied operation would overflow the buffer
Done


http://gerrit.cloudera.org:8080/#/c/3952/21/src/kudu/client/session-internal.cc
File src/kudu/client/session-internal.cc:

PS21, Line 428: requied
> required
Done


PS21, Line 451: Aplly
> Apply
Done


http://gerrit.cloudera.org:8080/#/c/3952/21/src/kudu/client/session-internal.h
File src/kudu/client/session-internal.h:

PS21, Line 86: apply write operation
> apply a write operation
Done


PS21, Line 86: Check whether it's possible
> Does this not actually apply the op?  If not I think the method should be r
Oops, exactly.  I messed up those comments, my bad.  Will fix.


-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 21
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-23 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 21:

Build Started http://104.196.14.100/job/kudu-gerrit/3023/

-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 21
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-23 Thread Alexey Serbin (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3952

to look at the new patch set (#21).

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..

KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode

Implemented AUTO_FLUSH_BACKGROUND for the Kudu C++ client library.
In AUTO_FLUSH_BACKGROUND mode,
the KuduSession::Apply() method blocks if total amount of data
for pending operations reaches the buffer size limit.  The limit
on the buffer size can be set by the
KuduSession::SetMutationBufferSpace() method.

The background flush logic checks whether at least one of the
following two conditions is satisfied to determine whether it's time
to flush the accumulated write operations:
  * The over-the-watermark criterion: check whether the total size of the
freshly submitted (i.e. not-yet-scheduled-for-flush) write operations
is over the threshold.  The threshold can be set as a percentage
of the total buffer size using the
KuduSession::SetMutationBufferFlushWatermark() method.
  * The maximum wait time criterion: check whether the current batch
of operations has been accumulating for more than the maximum
wait time.  The maximum wait time can be specified in milliseconds
using the KuduSession::SetMutationBufferFlushInterva() method.
A KuduSession object uses RPC messenger's thread pool to monitor
batches' maximum wait time.

This change also addresses the following JIRA issue:
  KUDU-1376 KuduSession::SetMutationBufferSpace is not defined

Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
---
M python/kudu/tests/test_client.py
M src/kudu/client/batcher.cc
M src/kudu/client/batcher.h
M src/kudu/client/client-internal.h
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/session-internal.cc
M src/kudu/client/session-internal.h
M src/kudu/client/write_op.cc
M src/kudu/client/write_op.h
11 files changed, 1,179 insertions(+), 174 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/52/3952/21
-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 21
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-22 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 20:

Build Started http://104.196.14.100/job/kudu-gerrit/3019/

-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 20
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-22 Thread Alexey Serbin (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3952

to look at the new patch set (#20).

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..

KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode

Implemented AUTO_FLUSH_BACKGROUND for the Kudu C++ client library.
In AUTO_FLUSH_BACKGROUND mode,
the KuduSession::Apply() method blocks if total amount of data
for pending operations reaches the buffer size limit.  The limit
on the buffer size can be set by the
KuduSession::SetMutationBufferSpace() method.

The background flush logic checks whether at least one of the
following two conditions is satisfied to determine whether it's time
to flush the accumulated write operations:
  * The over-the-watermark criterion: check whether the total size of the
freshly submitted (i.e. not-yet-scheduled-for-flush) write operations
is over the threshold.  The threshold can be set as a percentage
of the total buffer size using the
KuduSession::SetMutationBufferFlushWatermark() method.
  * The maximum wait time criterion: check whether the current batch
of operations has been accumulating for more than the maximum
wait time.  The maximum wait time can be specified in milliseconds
using the KuduSession::SetMutationBufferFlushInterva() method.
A KuduSession object uses RPC messenger's thread pool to monitor
batches' maximum wait time.

This change also addresses the following JIRA issue:
  KUDU-1376 KuduSession::SetMutationBufferSpace is not defined

Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
---
M python/kudu/tests/test_client.py
M src/kudu/client/batcher.cc
M src/kudu/client/batcher.h
M src/kudu/client/client-internal.h
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/session-internal.cc
M src/kudu/client/session-internal.h
M src/kudu/client/write_op.cc
M src/kudu/client/write_op.h
11 files changed, 1,178 insertions(+), 174 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/52/3952/20
-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 20
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-22 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 18:

(15 comments)

http://gerrit.cloudera.org:8080/#/c/3952/18/src/kudu/client/batcher.cc
File src/kudu/client/batcher.cc:

PS18, Line 522: Add
> Does it make sense to add a coment that callers are expected to serialize d
I think the original design of the Batch class is to accommodate multiple 
callers of this method.


http://gerrit.cloudera.org:8080/#/c/3952/16/src/kudu/client/batcher.h
File src/kudu/client/batcher.h:

Line 168: 
> typo: the the
Done


PS16, Line 169: ck
> s/is/are/
Done


http://gerrit.cloudera.org:8080/#/c/3952/18/src/kudu/client/client.cc
File src/kudu/client/client.cc:

Line 692:   : data_(new KuduSession::Data(client, client->data_->messenger_)) {
> I haven't looked at how data_ being set, just wondering if data_ could be N
The data_ is allocated in the constructor of the Kudu::Client() object, so it 
should not be nullptr: if there is not enough memory, std::bad_alloc() should 
be thrown and it will be termination with SIGABRT due to the unexpected 
exception.


PS18, Line 736: int
> uint ?
Good point -- I added this to be in line with SetTimeoutMillis(), but using 
unit would be better.  Will change.


PS18, Line 740: int
> uint ?
This comes from the interface which is already published.  I'm not sure whether 
it's OK to change it in an incompatible way now.


http://gerrit.cloudera.org:8080/#/c/3952/16/src/kudu/client/client.h
File src/kudu/client/client.h:

Line 1017:   /// Modes of flush operations.
> Is there a default mode of op here ? if so, can we mention it in comment ?
Yes, there is.  You could find it in the description for the AUTO_FLUSH_SYNC 
mode.


Line 1038: /// batch is sent and the reclamed space is available for new 
operations.
> typo: reclamed
Done


Line 1040: /// @todo Provide an API for the user to specify a callback to 
do their own
> If this is not specific to this mode alone, we could move this todo in the 
No, this is specific only for this mode.  In all other cases it's either 
possible to specify a callback or it's clear that the callback is not needed 
(because the flush operation is synchronous).


Line 1044: ///   (probably the messenger IO threads?).
> I guess this todo is already addressed ?
Yes, this is clear that it's not needed now -- will remove.


http://gerrit.cloudera.org:8080/#/c/3952/19/src/kudu/client/session-internal.cc
File src/kudu/client/session-internal.cc:

PS19, Line 233: weak_session
> any significance of usage of 'weak' session ? :)
Yes -- this is to avoid circular dependencies later when NextBatcher is called. 
 It's possible to use shared_ptr here as well, but it should be transformed 
into weak_ptr in the batcher anyway.


PS19, Line 263: (
> Isn't this redundant ? We could as well place NextBatcher above ?
No, it's not.  Only in rare cases the additional pre-flush is needed (i.e. call 
of NextBatcher()).  Please read on the diagram which you liked.


PS19, Line 270: lock_
> I thought whole routine is serialized, so why another spinlock here ?
Because there is another actor -- the background 'periodic' flush task which 
calls NextBatcher() and replaces batcher_.


http://gerrit.cloudera.org:8080/#/c/3952/18/src/kudu/client/session-internal.h
File src/kudu/client/session-internal.h:

PS18, Line 126: to
> s/to/for ?
Done


PS18, Line 165: cond_mutex_
> Trivial nit: I suggest to rename this to flow_control_cond_mutex_ which cou
Good idea -- thanks!

I'll rename it and leave the comment since it explains why this is different 
from the lock_ and gives info in which order it's necessary to acquire both, if 
needed (actually, the code never acquires both).


-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 18
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-22 Thread Dinesh Bhat (Code Review)
Dinesh Bhat has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 19:

(17 comments)

Hi Alexey,

went through this patch more from knowledge point of view, so feel free to 
ignore any review comments you find not useful. Also I haven't completely 
digested all the routines in sesion-internal.cc, so I may as well take a second 
pass tomorrow later if time permits.

http://gerrit.cloudera.org:8080/#/c/3952/18/src/kudu/client/batcher.cc
File src/kudu/client/batcher.cc:

PS18, Line 522: we 
Does it make sense to add a coment that callers are expected to serialize 
during Add ?


http://gerrit.cloudera.org:8080/#/c/3952/16/src/kudu/client/batcher.h
File src/kudu/client/batcher.h:

Line 168: 
typo: the the


PS16, Line 169: ck
s/is/are/


http://gerrit.cloudera.org:8080/#/c/3952/17/src/kudu/client/batcher.h
File src/kudu/client/batcher.h:

PS17, Line 122: int64_t
Naive doubt here: If Load() is returning a non-atomic value we seem to be 
accessing the atomic value of buffer_bytes_used_ in write path vs non-atomic in 
read path ?


http://gerrit.cloudera.org:8080/#/c/3952/18/src/kudu/client/client.cc
File src/kudu/client/client.cc:

Line 692:   : data_(new KuduSession::Data(client, client->data_->messenger_)) {
I haven't looked at how data_ being set, just wondering if data_ could be NULL 
since it seems to be a regular ptr.


PS18, Line 736: int
uint ?


PS18, Line 740: int
uint ?


http://gerrit.cloudera.org:8080/#/c/3952/16/src/kudu/client/client.h
File src/kudu/client/client.h:

Line 1017:   /// Modes of flush operations.
Is there a default mode of op here ? if so, can we mention it in comment ?


Line 1038: /// batch is sent and the reclamed space is available for new 
operations.
typo: reclamed


Line 1040: /// @todo Provide an API for the user to specify a callback to 
do their own
If this is not specific to this mode alone, we could move this todo in the 
beginning perhaps ?


Line 1044: ///   (probably the messenger IO threads?).
I guess this todo is already addressed ?


http://gerrit.cloudera.org:8080/#/c/3952/18/src/kudu/client/session-internal.cc
File src/kudu/client/session-internal.cc:

PS18, Line 448:   
Nice !! Liked these ascii diagrams.


http://gerrit.cloudera.org:8080/#/c/3952/19/src/kudu/client/session-internal.cc
File src/kudu/client/session-internal.cc:

PS19, Line 233: weak_session
any significance of usage of 'weak' session ? :)


PS19, Line 263: (
Isn't this redundant ? We could as well place NextBatcher above ?
do  {
   NextBatcher();
  CanApplyWriteOp(_pre_flush);
} while (need_pre_flush)


PS19, Line 270: lock_
I thought whole routine is serialized, so why another spinlock here ?


http://gerrit.cloudera.org:8080/#/c/3952/18/src/kudu/client/session-internal.h
File src/kudu/client/session-internal.h:

PS18, Line 126: to
s/to/for ?


PS18, Line 165: cond_mutex_
Trivial nit: I suggest to rename this to flow_control_cond_mutex_ which could 
be self explanatory(and remove al the comment associated). Just a suggestion, 
feel free to ignore this.


-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 19
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Dinesh Bhat 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-22 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 19:

Build Started http://104.196.14.100/job/kudu-gerrit/3016/

-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 19
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-22 Thread Alexey Serbin (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3952

to look at the new patch set (#19).

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..

KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode

Implemented AUTO_FLUSH_BACKGROUND for the Kudu C++ client library.
In AUTO_FLUSH_BACKGROUND mode,
the KuduSession::Apply() method blocks if total amount of data
for pending operations reaches the buffer size limit.  The limit
on the buffer size can be set by the
KuduSession::SetMutationBufferSpace() method.

The background flush logic checks whether at least one of the
following two conditions is satisfied to determine whether it's time
to flush the accumulated write operations:
  * The over-the-watermark criterion: check whether the total size of the
freshly submitted (i.e. not-yet-scheduled-for-flush) write operations
is over the threshold.  The threshold can be set as a percentage
of the total buffer size using the
KuduSession::SetMutationBufferFlushWatermark() method.
  * The maximum wait time criterion: check whether the current batch
of operations has been accumulating for more than the maximum
wait time.  The maximum wait time can be specified in milliseconds
using the KuduSession::SetMutationBufferFlushInterva() method.
A KuduSession object uses RPC messenger's thread pool to monitor
batches' maximum wait time.

This change also addresses the following JIRA issue:
  KUDU-1376 KuduSession::SetMutationBufferSpace is not defined

Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
---
M python/kudu/tests/test_client.py
M src/kudu/client/batcher.cc
M src/kudu/client/batcher.h
M src/kudu/client/client-internal.h
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/session-internal.cc
M src/kudu/client/session-internal.h
M src/kudu/client/write_op.cc
M src/kudu/client/write_op.h
11 files changed, 1,184 insertions(+), 171 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/52/3952/19
-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 19
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-22 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 18:

Build Started http://104.196.14.100/job/kudu-gerrit/3015/

-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 18
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-22 Thread Alexey Serbin (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3952

to look at the new patch set (#18).

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..

KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode

Implemented AUTO_FLUSH_BACKGROUND for the Kudu C++ client library.
In AUTO_FLUSH_BACKGROUND mode,
the KuduSession::Apply() method blocks if total amount of data
for pending operations reaches the buffer size limit.  The limit
on the buffer size can be set by the
KuduSession::SetMutationBufferSpace() method.

The background flush logic checks whether at least one of the
following two conditions is satisfied to determine whether it's time
to flush the accumulated write operations:
  * The over-the-watermark criterion: check whether the total size of the
freshly submitted (i.e. not-yet-scheduled-for-flush) write operations
is over the threshold.  The threshold can be set as a percentage
of the total buffer size using the
KuduSession::SetMutationBufferFlushWatermark() method.
  * The maximum wait time criterion: check whether the current batch
of operations has been accumulating for more than the maximum
wait time.  The maximum wait time can be specified in milliseconds
using the KuduSession::SetMutationBufferFlushInterva() method.
A KuduSession object uses RPC messenger's thread pool to monitor
batches' maximum wait time.

This change also addresses the following JIRA issue:
  KUDU-1376 KuduSession::SetMutationBufferSpace is not defined

Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
---
M python/kudu/tests/test_client.py
M src/kudu/client/batcher.cc
M src/kudu/client/batcher.h
M src/kudu/client/client-internal.h
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/session-internal.cc
M src/kudu/client/session-internal.h
M src/kudu/client/write_op.cc
M src/kudu/client/write_op.h
11 files changed, 1,184 insertions(+), 171 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/52/3952/18
-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 18
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-22 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 17:

Build Started http://104.196.14.100/job/kudu-gerrit/3014/

-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 17
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-22 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 16:

Build Started http://104.196.14.100/job/kudu-gerrit/3011/

-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 16
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-19 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 15:

(41 comments)

Thank you for the review!  Will post updated version soon.

http://gerrit.cloudera.org:8080/#/c/3952/15//COMMIT_MSG
Commit Message:

PS15, Line 20: waterline
> Nit: I think elsewhere you changed this to watermark; could you do the same
Done


http://gerrit.cloudera.org:8080/#/c/3952/15/src/kudu/client/client-internal.h
File src/kudu/client/client-internal.h:

Line 240:  private:
> In the PIMPL idiom, the impl class is typically completely public. Not sure
I just noticed the DISSALLOW_COPY_AND_ASSIGN() does not do what it's supposed 
to since the private was missing.  I think the idea is to protect against 
unintentional copying.


http://gerrit.cloudera.org:8080/#/c/3952/15/src/kudu/client/client.cc
File src/kudu/client/client.cc:

Line 53: #include "kudu/gutil/strings/human_readable.h"
> Is this used for anything?
It turned out that it's not used.  Will remove.


Line 692:   : data_(new KuduSession::Data(client, client->data_->messenger_)) {
> Since the messenger is reachable through the client, why not have KuduSessi
Exactly.  It's not possible to declare an embedded class to be a friend (at 
least I don't know how to do that).  I.e., KuduSession::Data is an 
embedded/internal class, and it's not possible to add it into the frendship 
list for the KuduClient class.


Line 733: Status KuduSession::SetMutationBufferSpace(size_t size) {
> One of the simplifying assumptions made by the Java client is that many (if
OK, that's a good idea.  I'll think in that direction, thanks!


http://gerrit.cloudera.org:8080/#/c/3952/15/src/kudu/client/client.h
File src/kudu/client/client.h:

PS15, Line 1035: In this mode, the Flush() call can be used to block until a 
batch
   : /// is sent and the buffer has available space again.
> To be clear, Flush() also causes a background flush that normally would hav
Yes, exactly.  Thank you for making this clear.  Will update the comment.


PS15, Line 1127: the
> Nit: can drop this.
Done


PS15, Line 1129: pushing
> Nit: sending
Done


PS15, Line 1130: into
> Nit: to the
Done


PS15, Line 1133: OK
> Nit: safe
Done


PS15, Line 1134: does not come into play
> Nit: has no effect
Done


PS15, Line 1139:   
> Nit: got two spaces here, can you fix and check the rest?
Done


PS15, Line 1140: by be 
> What happened here?
Oops, that were aliens, probably.


PS15, Line 1141: n
> And here?
Done


Line 1146:   Status SetMutationBufferFlushWatermark(double watermark_pct)
> It would also be useful to describe (in the comment) what the default water
Will do.  BTW, do we need getters for those parameters?  I think getters might 
be useful here.  However, having no getter for the buffer space hints that 
there was some deliberate decision not having getters.  OK, I'll figure it out.


PS15, Line 1161: OK
> Nit: safe
Done


PS15, Line 1162:  does not come into play
> Nit: has no effect
Done


PS15, Line 1166: internal
> Nit: interval
Done


http://gerrit.cloudera.org:8080/#/c/3952/15/src/kudu/client/session-internal.cc
File src/kudu/client/session-internal.cc:

Line 57:   flush_interval_(MonoDelta::FromMilliseconds(10)) {
> The default flush interval seems really low. Isn't it something like 5s in 
I just thought it supposed to be that low.  This is not due to a mistake in the 
logic which would block sometimes.  I'll put 1 second here as in 
AsyncKuduSesion, as you mentioned.


PS15, Line 64: -1,
> PeriodicFlushTask uses 0, not -1. Is there a significance to this differenc
This is to force getting a new batcher and flushing the prior, if any, even if 
the prior hasn't any buffered data.  That was the default behavior prior to my 
modifications here: there is a unit test which verifies that upon a call of 
FlushAsync() even the empty batcher is 'flushed'.


Line 66: PeriodicFlushTask(Status::OK(), messenger_, session);
> I don't think this is quite the periodic semantics we want.
OK, I'll change it then.

Basically, the idea is to run the periodic task as it, making re-scheduling 
itself, but check against the latest flush time when the task executes.  There, 
schedule the task to run next after (last_flush_time + interval - now) 
interval, so next periodic flush happens not earlier than the period of the 
background flush task.

Does this sound good?


Line 74: bytes_flushed = batcher->buffer_bytes_used();
> Pull this out of the lock, presumably it's only flushed_batchers_ that need
Done


PS15, Line 81: there might be several data pushers
> How? Shouldn't there be at most one outstanding thread in Apply(), since on
David mentioned that in the very first review of my changes, at least I 
understood it like 'make sure there is a test which verifies it's safe to call 
Apply() from multiple threads'.  And I implemented that test, actually a couple 
of 

[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-19 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 15:

(41 comments)

Here's my first pass.

http://gerrit.cloudera.org:8080/#/c/3952/15//COMMIT_MSG
Commit Message:

PS15, Line 20: waterline
Nit: I think elsewhere you changed this to watermark; could you do the same 
here, for consistency?


http://gerrit.cloudera.org:8080/#/c/3952/15/src/kudu/client/client-internal.h
File src/kudu/client/client-internal.h:

Line 240:  private:
In the PIMPL idiom, the impl class is typically completely public. Not sure 
what privatizing these deleted constructors/operators buys us.


http://gerrit.cloudera.org:8080/#/c/3952/15/src/kudu/client/client.cc
File src/kudu/client/client.cc:

Line 53: #include "kudu/gutil/strings/human_readable.h"
Is this used for anything?


Line 692:   : data_(new KuduSession::Data(client, client->data_->messenger_)) {
Since the messenger is reachable through the client, why not have 
KuduSession::Data do it in the constructor? Is it to avoid more friendship?


Line 733: Status KuduSession::SetMutationBufferSpace(size_t size) {
One of the simplifying assumptions made by the Java client is that many (if not 
all) of these setters may only be called when there are no pending operations. 
I suggest we do the same here; I think it'll obviate the need to broadcast on 
flow_control_cond_, for one.


http://gerrit.cloudera.org:8080/#/c/3952/15/src/kudu/client/client.h
File src/kudu/client/client.h:

PS15, Line 1035: In this mode, the Flush() call can be used to block until a 
batch
   : /// is sent and the buffer has available space again.
To be clear, Flush() also causes a background flush that normally would have 
happened at some point in the near future to happen _now_, right?


PS15, Line 1127: the
Nit: can drop this.


PS15, Line 1129: pushing
Nit: sending


PS15, Line 1130: into
Nit: to the


PS15, Line 1133: OK
Nit: safe


PS15, Line 1134: does not come into play
Nit: has no effect


PS15, Line 1139:   
Nit: got two spaces here, can you fix and check the rest?


PS15, Line 1140: by be 
What happened here?


PS15, Line 1141: n
And here?


Line 1146:   Status SetMutationBufferFlushWatermark(double watermark_pct)
It would also be useful to describe (in the comment) what the default watermark 
percentage is.


PS15, Line 1161: OK
Nit: safe


PS15, Line 1162:  does not come into play
Nit: has no effect


PS15, Line 1166: internal
Nit: interval


http://gerrit.cloudera.org:8080/#/c/3952/15/src/kudu/client/session-internal.cc
File src/kudu/client/session-internal.cc:

Line 57:   flush_interval_(MonoDelta::FromMilliseconds(10)) {
The default flush interval seems really low. Isn't it something like 5s in 
Java? Why so low here?


PS15, Line 64: -1,
PeriodicFlushTask uses 0, not -1. Is there a significance to this difference? 
If not, can you make them the same?


Line 66: PeriodicFlushTask(Status::OK(), messenger_, session);
I don't think this is quite the periodic semantics we want.

Take a look at the Java client. It only schedules a background flush when the 
first op is applied. The advantage of that approach is that if you create a 
session, wait (flush_interval - 1) ms, and apply an operation, you will have to 
wait an additional flush_interval ms before your operation is automatically 
flushed. With this approach, you'll get the background flush happening almost 
immediately, giving the client no time to accumulate more operations into the 
buffer.


Line 74: bytes_flushed = batcher->buffer_bytes_used();
Pull this out of the lock, presumably it's only flushed_batchers_ that needs to 
be protected.


PS15, Line 81: there might be several data pushers
How? Shouldn't there be at most one outstanding thread in Apply(), since one 
KuduSession is meant for one writing thread?


PS15, Line 101: flusher
Not clear what a "flusher" is (now that there's no dedicated thread).


Line 102:   std::lock_guard l(cond_mutex_);
Why is this taken here? flush_mode_ is not protected by cond_mutex_.

Now that I take another look, it seems flush_mode_ can be accessed by the 
reactor thread in PeriodicFlushTask(), so it does need some synchronization, 
but it's not being applied consistently yet (nor is it documented).


Line 109:   flow_control_cond_.Broadcast();
Why do this? We've guaranteed there are no pending operations, so who would 
care about this broadcast?


PS15, Line 115:   // The data flow control logic needs to know if the buffer 
limit changes.
  :   // There might be several threads calling 
KuduSession::Apply().
Let's simplify by preventing buffer size changes when there are pending 
operations.

Also, if it wasn't already obvious, remember that KuduSession is inherently 
single-threaded: clients must use a separate KuduSession for each writing 
thread.


Line 122:   CHECK_GE(watermark_pct, 0);
This seems harsh for a 

[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-19 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3952/6/src/kudu/client/session-internal.cc
File src/kudu/client/session-internal.cc:

Line 30: DEFINE_int32(client_buffer_bytes_limit, 7 * 1024 * 1024,
> The symbol hiding logic is only on Linux, which is, amongst other reasons, 
OK, I see.  Thanks!


-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-19 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3952/6/src/kudu/client/session-internal.cc
File src/kudu/client/session-internal.cc:

Line 30: DEFINE_int32(client_buffer_bytes_limit, 7 * 1024 * 1024,
> Thank you for the heads-up!
The symbol hiding logic is only on Linux, which is, amongst other reasons, why 
we don't actually ship a macOS C++ client.


-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-19 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3952/6/src/kudu/client/session-internal.cc
File src/kudu/client/session-internal.cc:

Line 30: DEFINE_int32(client_buffer_bytes_limit, 7 * 1024 * 1024,
> Chiming in late, but I agree with Dan: gflags are totally unusable to C++ c
Thank you for the heads-up!

Just to verify that the google flags in kudu_client library is not visible to 
the application it's linking with, I modified the sample.cc -- basically, added 
DECLARE_int32 and called ParseCommandLineFlags() and 
HandleCommandLineHelpFlags().  From that I see those flags are available for 
the client application.

At list I see it on MacOS X.  I can check that on Linux, if needed.  If it 
turns out the flags are available via the kudu_client library, do we consider 
it as an issue?


-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-19 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 15:

Build Started http://104.196.14.100/job/kudu-gerrit/2997/

-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 15
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-19 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 14:

Build Started http://104.196.14.100/job/kudu-gerrit/2990/

-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 14
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-19 Thread Alexey Serbin (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3952

to look at the new patch set (#13).

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..

KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode

Implemented AUTO_FLUSH_BACKGROUND for the Kudu C++ client library.
KuduSession objects use RPC messenger's thread pool to schedule
background flush tasks.  In AUTO_FLUSH_BACKGROUND mode,
the KuduSession::Apply() method blocks if total amount of data
for operations not pushed to server reaches the buffer size limit.
The limit on the buffer size can be set by the
KuduSession::SetMutationBufferSpace() method.

The background flusher checks whether at least one of the following two
conditions is satisfied to determine whether it's time to flush
the accumulated write operations:
  * the over-the-waterline criterion: check whether the total size of the
freshly submitted (i.e. not-scheduled-for-flush) write operations
is over the threshold.  The threshold can be set as a percentage
of the total buffer size using the
KuduSession::SetMutationBufferFlushWatermark() method.
  * the maximum wait time criterion: check whether the previous flush
happened more than the maximum wait time ago.  The maximum wait time
can be specified in milliseconds using the
KuduSession::SetMutationBufferFlushInterva() method.

This change also addresses the following JIRA issue:
  KUDU-1376 KuduSession::SetMutationBufferSpace is not defined

Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
---
M python/kudu/tests/test_client.py
M src/kudu/client/batcher.cc
M src/kudu/client/batcher.h
M src/kudu/client/client-internal.h
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/session-internal.cc
M src/kudu/client/session-internal.h
M src/kudu/client/write_op.cc
M src/kudu/client/write_op.h
M src/kudu/util/monotime.cc
M src/kudu/util/monotime.h
13 files changed, 1,121 insertions(+), 160 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/52/3952/13
-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 13
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-19 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 13:

Build Started http://104.196.14.100/job/kudu-gerrit/2989/

-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 13
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-19 Thread Alexey Serbin (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3952

to look at the new patch set (#12).

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..

KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode

Implemented AUTO_FLUSH_BACKGROUND for the Kudu C++ client library.
KuduSession objects use RPC messenger's thread pool to schedule
background flush tasks.  In AUTO_FLUSH_BACKGROUND mode,
the KuduSession::Apply() method blocks if total amount of data
for operations not pushed to server reaches the buffer size limit.
The limit on the buffer size can be set by the
KuduSession::SetMutationBufferSpace() method.

The background flusher checks whether at least one of the following two
conditions is satisfied to determine whether it's time to flush
the accumulated write operations:
  * the over-the-waterline criterion: check whether the total size of the
freshly submitted (i.e. not-scheduled-for-flush) write operations
is over the threshold.  The threshold can be set as a percentage
of the total buffer size using the
KuduSession::SetMutationBufferFlushWatermark() method.
  * the maximum wait time criterion: check whether the previous flush
happened more than the maximum wait time ago.  The maximum wait time
can be specified in milliseconds using the
KuduSession::SetMutationBufferFlushInterva() method.

This change also addresses the following JIRA issue:
  KUDU-1376 KuduSession::SetMutationBufferSpace is not defined

Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
---
M python/kudu/tests/test_client.py
M src/kudu/client/batcher.cc
M src/kudu/client/batcher.h
M src/kudu/client/client-internal.h
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/session-internal.cc
M src/kudu/client/session-internal.h
M src/kudu/client/write_op.cc
M src/kudu/client/write_op.h
M src/kudu/util/monotime.cc
M src/kudu/util/monotime.h
13 files changed, 1,124 insertions(+), 160 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/52/3952/12
-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 12
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-18 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 11:

Build Started http://104.196.14.100/job/kudu-gerrit/2984/

-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 11
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-18 Thread Alexey Serbin (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3952

to look at the new patch set (#11).

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..

KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode

Implemented AUTO_FLUSH_BACKGROUND for the Kudu C++ client library.
KuduSession objects use RPC messenger's thread pool to schedule
background flush tasks.  In AUTO_FLUSH_BACKGROUND mode,
the KuduSession::Apply() method blocks if total amount of data
for operations not pushed to server reaches the buffer size limit.
The limit on the buffer size can be set by the
KuduSession::SetMutationBufferSpace() method.

The background flusher checks whether at least one of the following two
conditions is satisfied to determine whether it's time to flush
the accumulated write operations:
  * the over-the-waterline criterion: check whether the total size of the
freshly submitted (i.e. not-scheduled-for-flush) write operations
is over the threshold.  The threshold can be set as a percentage
of the total buffer size using the
KuduSession::SetMutationBufferFlushWatermark() method.
  * the maximum wait time criterion: check whether the previous flush
happened more than the maximum wait time ago.  The maximum wait time
can be specified in milliseconds using the
KuduSession::SetMutationBufferFlushInterva() method.

This change also addresses the following JIRA issue:
  KUDU-1376 KuduSession::SetMutationBufferSpace is not defined

Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
---
M python/kudu/tests/test_client.py
M src/kudu/client/batcher.cc
M src/kudu/client/batcher.h
M src/kudu/client/client-internal.h
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/session-internal.cc
M src/kudu/client/session-internal.h
M src/kudu/client/write_op.cc
M src/kudu/client/write_op.h
M src/kudu/util/monotime.cc
M src/kudu/util/monotime.h
13 files changed, 1,124 insertions(+), 160 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/52/3952/11
-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 11
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-18 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3952/9/src/kudu/client/session-internal.cc
File src/kudu/client/session-internal.cc:

Line 421:   flow_control_cond_.Wait();
> The condition awaits till total size of pending operations drops below the 
I meant both the periodic- and waterline-triggerd flush will notify on update 
of total_bytes_used_, if any happened.


-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 9
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-18 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 10:

Build Started http://104.196.14.100/job/kudu-gerrit/2983/

-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 10
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-18 Thread Alexey Serbin (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3952

to look at the new patch set (#10).

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..

KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode

Implemented AUTO_FLUSH_BACKGROUND for the Kudu C++ client library.
KuduSession objects use RPC messenger's thread pool to schedule
background flush tasks.  In AUTO_FLUSH_BACKGROUND mode,
the KuduSession::Apply() method blocks if total amount of data
for operations not pushed to server reaches the buffer size limit.
The limit on the buffer size can be set by the
KuduSession::SetMutationBufferSpace() method or via the
--client_buffer_bytes_limit command-line flag.

The background flusher checks whether at least one of the following two
conditions is satisfied to determine whether it's time to flush
the accumulated write operations:
  * over-the-waterline criterion: check whether the total size of the
freshly submitted (i.e. not-scheduled-for-flush) write operations
is over the threshold.  The threshold can be set as a percentage
of the total buffer size using the
--client_buffer_flush_waterline_pct command-line flag.
  * maximum wait time criterion: check whether the previous flush
happened more than the maximum wait time ago.  The maximum wait time
can be specified in milliseconds using the
--client_buffer_flush_timeout_ms command-line flag.

This change also addresses the following JIRA issue:
  KUDU-1376 KuduSession::SetMutationBufferSpace is not defined

Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
---
M python/kudu/tests/test_client.py
M src/kudu/client/batcher.cc
M src/kudu/client/batcher.h
M src/kudu/client/client-internal.h
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/session-internal.cc
M src/kudu/client/session-internal.h
M src/kudu/client/write_op.cc
M src/kudu/client/write_op.h
M src/kudu/util/monotime.cc
M src/kudu/util/monotime.h
13 files changed, 1,149 insertions(+), 160 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/52/3952/10
-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 10
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-18 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 9:

(22 comments)

http://gerrit.cloudera.org:8080/#/c/3952/9/src/kudu/client/batcher.h
File src/kudu/client/batcher.h:

Line 64:   // Takes a reference on error_collector. Creates a weak_ptr to 
'session'.
> Could you update this comment, especially WRT the error_collector?  I can't
Done


Line 115:   static int64_t GetOperationSizeInBuffer(KuduWriteOperation* 
write_op);
> This doesn't seem too useful since it just calls write_op->SizeInBuffer(). 
I didn't want to introduce an additional KuduSession::Data friend to the 
Batcher class (and it's not possible because Data is a nested class in the 
KuduSession class).

Probably, there is a more elegant solution for this.  Basically, I want to get 
wire size for KuduWriteOperation.


http://gerrit.cloudera.org:8080/#/c/3952/9/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

PS9, Line 555: session
> the session
Done


PS9, Line 557: maximum 
> the maximum
Done


PS9, Line 557: 10x multiplier
> 10x seems like a lot of leeway.  Could we get away with 2 or 3x?  I would c
Done


PS9, Line 2221: Status s;
> looks like everywhere this is used the call could instead be wrapped in ASS
Somehow I overlooked that :)  Thanks!


PS9, Line 2307: scenrio
> 'scenario' here and below
Done


PS9, Line 2478: previos
> previous
Done


http://gerrit.cloudera.org:8080/#/c/3952/9/src/kudu/client/session-internal.cc
File src/kudu/client/session-internal.cc:

Line 36: DEFINE_int32(client_buffer_bytes_limit, 7 * 1024 * 1024,
> Remove this and use SetMutationBufferSpace instead.
Done


Line 49: DEFINE_int32(client_buffer_flush_watermark_pct, 75,
> Instead of a gflag, this should be an option on KuduSession, like SetMutati
Done


Line 61: DEFINE_int32(client_buffer_flush_timeout_ms, 10,
> Same here, this should be an option on KuduSession.  https://kudu.apache.or
Done


PS9, Line 82: :
> wrap init list.
Done


Line 115:   messenger->ScheduleOnReactor(
> It's not clear to me why this needs to be scheduled on a reactor thread.  I
Good point!  The original design had a separate thread which was busy with all 
flush-related activity, and I just transferred it on the new scheme with 
messenger/reactor threads.  This should not be blocking, so I can safely call 
it from the same thread where CheckAndRun is being executed.


PS9, Line 154:   if (PREDICT_FALSE(!status.ok())) {
 : return;
 :   }
 :   // Check that the session is still alive to access the data 
safely.
 :   sp::shared_ptr session(weak_session.lock());
 :   if (PREDICT_FALSE(!session)) {
 : return;
 :   }
 :   // Flush mode could change during the operation.  If current 
mode
 :   // is no longer AUTO_FLUSH_BACKGROUND, then stop re-scheduling 
the task.
 :   if (PREDICT_FALSE(data_->flush_mode_ != 
AUTO_FLUSH_BACKGROUND)) {
 : return;
 :   }
 :   // Detach the batcher, schedule its flushing,
 :   // and install a new one to accumulate incoming write 
operations.
 :   data_->NextBatcher(weak_session, 0, nullptr);
> Looks like this might be able to call Run instead of duplicating?
Done


PS9, Line 205: MutexLock
> prefer to use std::lock_guard if possible.
Done


PS9, Line 392: int64_t* op_raw_size
> is this still necessary now that the size is cached?
yep -- that's the issue of visibility of a private method.  I cannot make an 
internal class a friend to the Batcher class.


Line 421:   flow_control_cond_.Wait();
> Is a flush forced somewhere in this situation, or is it just waiting for th
The condition awaits till total size of pending operations drops below the 
limit.  This is enforced by the logic of the code.  If no updates happen on 
total_bytes_used_, this will wait forever.


http://gerrit.cloudera.org:8080/#/c/3952/9/src/kudu/client/session-internal.h
File src/kudu/client/session-internal.h:

PS9, Line 111: BACKGFOUND
> BACKGROUND
Done


Line 126: ~BackgroundFlusher();
> I think this can use the default keyword instead of defining the no-op dtor
OK, I just removed it to rely on auto-generated ones.


Line 136: void CheckAndRun(const std::shared_ptr& messenger,
> Could you doc this?  I can't tell exactly what it's for, or why it chooses 
Renamed and added more comments.


PS9, Line 151: KuduSession::Data* const
> It looks like all of the methods on the background flusher that do any work
It's a reference (a pointer, actually) to the outer/enclosing object.

It's possible to keep those weak pointers as members of the BackgroundFlusher, 
but when a BackgroundFlusher object, as a sub-object of KuduSession::Data, is 
deallocated on destruction of the enclosing KuduSession::Data object, those 
members 

[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-17 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 9:

(22 comments)

http://gerrit.cloudera.org:8080/#/c/3952/9/src/kudu/client/batcher.h
File src/kudu/client/batcher.h:

Line 64:   // Takes a reference on error_collector. Creates a weak_ptr to 
'session'.
Could you update this comment, especially WRT the error_collector?  I can't 
tell if the intent is that this constructor takes ownership of the error 
collector, or if it's borrowed.


Line 115:   static int64_t GetOperationSizeInBuffer(KuduWriteOperation* 
write_op);
This doesn't seem too useful since it just calls write_op->SizeInBuffer().  Any 
reason to prefer having it on the batcher?


http://gerrit.cloudera.org:8080/#/c/3952/9/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

PS9, Line 555: session
the session


PS9, Line 557: 10x multiplier
10x seems like a lot of leeway.  Could we get away with 2 or 3x?  I would 
consider it a bug if we routinely took 10x longer than the deadline.


PS9, Line 557: maximum 
the maximum


PS9, Line 2221: Status s;
looks like everywhere this is used the call could instead be wrapped in 
ASSERT_OK.  I think that also tends to give better error messages on failure.


PS9, Line 2307: scenrio
'scenario' here and below


PS9, Line 2478: previos
previous


http://gerrit.cloudera.org:8080/#/c/3952/9/src/kudu/client/session-internal.cc
File src/kudu/client/session-internal.cc:

Line 36: DEFINE_int32(client_buffer_bytes_limit, 7 * 1024 * 1024,
Remove this and use SetMutationBufferSpace instead.


Line 49: DEFINE_int32(client_buffer_flush_watermark_pct, 75,
Instead of a gflag, this should be an option on KuduSession, like 
SetMutationBufferSpace.  This also mirrors the equivalent Java options: 
https://kudu.apache.org/apidocs/org/kududb/client/AsyncKuduSession.html#setMutationBufferLowWatermark-float-


Line 61: DEFINE_int32(client_buffer_flush_timeout_ms, 10,
Same here, this should be an option on KuduSession.  
https://kudu.apache.org/apidocs/org/kududb/client/AsyncKuduSession.html#setFlushInterval-int-


PS9, Line 82: :
wrap init list.


Line 115:   messenger->ScheduleOnReactor(
It's not clear to me why this needs to be scheduled on a reactor thread.  I get 
that you don't want to block the user thread on the apply call, but if 
BackgroundFlusher::Run blocks, it's not appropriate to be running on the 
reactor thread anyway.


PS9, Line 154:   if (PREDICT_FALSE(!status.ok())) {
 : return;
 :   }
 :   // Check that the session is still alive to access the data 
safely.
 :   sp::shared_ptr session(weak_session.lock());
 :   if (PREDICT_FALSE(!session)) {
 : return;
 :   }
 :   // Flush mode could change during the operation.  If current 
mode
 :   // is no longer AUTO_FLUSH_BACKGROUND, then stop re-scheduling 
the task.
 :   if (PREDICT_FALSE(data_->flush_mode_ != 
AUTO_FLUSH_BACKGROUND)) {
 : return;
 :   }
 :   // Detach the batcher, schedule its flushing,
 :   // and install a new one to accumulate incoming write 
operations.
 :   data_->NextBatcher(weak_session, 0, nullptr);
Looks like this might be able to call Run instead of duplicating?


PS9, Line 205: MutexLock
prefer to use std::lock_guard if possible.


PS9, Line 392: int64_t* op_raw_size
is this still necessary now that the size is cached?


Line 421:   flow_control_cond_.Wait();
Is a flush forced somewhere in this situation, or is it just waiting for the 
periodic flush?


http://gerrit.cloudera.org:8080/#/c/3952/9/src/kudu/client/session-internal.h
File src/kudu/client/session-internal.h:

PS9, Line 111: BACKGFOUND
BACKGROUND


Line 126: ~BackgroundFlusher();
I think this can use the default keyword instead of defining the no-op dtor in 
session-internal.cc.


Line 136: void CheckAndRun(const std::shared_ptr& messenger,
Could you doc this?  I can't tell exactly what it's for, or why it chooses to 
run the flush on the reactor thread with no wait.


PS9, Line 151: KuduSession::Data* const
It looks like all of the methods on the background flusher that do any work 
take a weak_ptr.  I can't quite follow why this raw ptr is held, and the 
methods take the weak_ptr passed in.  Could a weak_ptr instead be passed in as 
part of the ctor and stored in a field, and then the methods above could use 
that?  I think that would also remove the dtor ordering thing below which is 
pretty fragile, since weak_ptrs can be checked before use.


PS9, Line 159: write operation
the write operation


-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8

[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-17 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 9:

Build Started http://104.196.14.100/job/kudu-gerrit/2968/

-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 9
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-17 Thread Alexey Serbin (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3952

to look at the new patch set (#9).

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..

KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode

Implemented AUTO_FLUSH_BACKGROUND for the Kudu C++ client library.
KuduSession objects use RPC messenger's thread pool to schedule
background flush tasks.  In AUTO_FLUSH_BACKGROUND mode,
the KuduSession::Apply() method blocks if total amount of data
for operations not pushed to server reaches the buffer size limit.
The limit on the buffer size can be set by the
KuduSession::SetMutationBufferSpace() method or via the
--client_buffer_bytes_limit command-line flag.

The background flusher checks whether at least one of the following two
conditions is satisfied to determine whether it's time to flush
the accumulated write operations:
  * over-the-waterline criterion: check whether the total size of the
freshly submitted (i.e. not-scheduled-for-flush) write operations
is over the threshold.  The threshold can be set as a percentage
of the total buffer size using the
--client_buffer_flush_waterline_pct command-line flag.
  * maximum wait time criterion: check whether the previous flush
happened more than the maximum wait time ago.  The maximum wait time
can be specified in milliseconds using the
--client_buffer_flush_timeout_ms command-line flag.

This change also addresses the following JIRA issue:
  KUDU-1376 KuduSession::SetMutationBufferSpace is not defined

Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
---
M python/kudu/tests/test_client.py
M src/kudu/client/batcher.cc
M src/kudu/client/batcher.h
M src/kudu/client/client-internal.h
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/session-internal.cc
M src/kudu/client/session-internal.h
M src/kudu/client/write_op.cc
M src/kudu/client/write_op.h
M src/kudu/util/monotime.cc
M src/kudu/util/monotime.h
13 files changed, 1,162 insertions(+), 154 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/52/3952/9
-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 9
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-16 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 6:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/3952/6/src/kudu/client/client.cc
File src/kudu/client/client.cc:

Line 400:   return shared_ptr();
> I'm suggesting that maybe it's not needed, because the operation is infalli
That's true -- the method returned Status since I brought it from the previous 
revision where I used a separate flusher thread, and Start() could return an 
error if it could not start the flusher thread.

Thank you for pointing at that -- I removed the Status return code, so now it 
returns void.


http://gerrit.cloudera.org:8080/#/c/3952/6/src/kudu/client/session-internal.cc
File src/kudu/client/session-internal.cc:

Line 30: DEFINE_int32(client_buffer_bytes_limit, 7 * 1024 * 1024,
> We don't want to impose a configuration framework on clients.  Imagine if t
Yes, this reasoning makes sense to me.  Since David suggested to use glags for 
those parameters, I though that was a good idea.  Probably, David did not mean 
to expose those to clients.  But then the only usage for them is throughout the 
tests.  OK, that sounds good to me.  If there is a necessity to expose those to 
client applications, we can add that later.


Line 98:   messenger->ScheduleOnReactor(
> I'm not quite following.  It sounds like the C++ client does not have an eq
This code is about determine whether to perform flush.  This is not about 
determine whether to block (that when-to-block code you could find in 
KuduSession::Data::ApplyWriteOp()).

The waterline/watermark criterion is to determine when it's necessary to flush 
the current batcher.  The idea is too check how much of the buffer is taken by 
the freshly submitted (not yet scheduled for flush) operations: if it's more 
than the specified percentage, then flush the current batcher.  Only the 
current batcher has freshly submitted (not yet scheduled for flush) operations.

In my previous comment, I tried to point to the presence of batchers in the C++ 
client code. Probably, the confusion is due to the fact that Java 
implementation does not have those.  In C++ client, there might be several 
batchers accommodating pending operations.  However, the limit on the total 
size of pending operations is enforced -- in total, all currently active 
batchers cannot consume more space for their pending operations than the limit 
specified via the KuduSession::SetMutationBufferSpace() method.

Does it help to resolve the confusion (if any)?


http://gerrit.cloudera.org:8080/#/c/3952/6/src/kudu/util/monotime.h
File src/kudu/util/monotime.h:

Line 133:   friend MonoTime operator -(const MonoTime& t, const MonoDelta& 
delta);
> Thanks!
Actually, thank you for pointing to this.  For some reason I overlooked the 
fact that having small independent patches would benefit everyone.


-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: Yes


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-16 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 6:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/3952/6/src/kudu/client/client.cc
File src/kudu/client/client.cc:

Line 400:   return shared_ptr();
> OK, that sounds reasonable.  Do I understand correctly that you suggest to 
I'm suggesting that maybe it's not needed, because the operation is infallible. 
 It looks like you added Status returns instead of void in a bunch of places, 
but I couldn't find any places which could actually fail.


http://gerrit.cloudera.org:8080/#/c/3952/6/src/kudu/client/session-internal.cc
File src/kudu/client/session-internal.cc:

Line 30: DEFINE_int32(client_buffer_bytes_limit, 7 * 1024 * 1024,
> Why not?  If we exposed them in the documentation, a user could link a bina
We don't want to impose a configuration framework on clients.  Imagine if the 
application used multiple libraries, all of which came with their own 
configuration framework.  This is a real issue in Hadoop where the HDFS client 
libraries ship with a configuration framework, and it's an absolute disaster.


PS6, Line 91: waterline
> That's fun, because originally I had put 'watermark' and then figured that 
Yah, I just suggest watermark to keep it consistent with the Java impl.  In 
reality the words are synonymous.


Line 98:   messenger->ScheduleOnReactor(
> Yes, the Java implementation is different -- it does not have batchers and 
I'm not quite following.  It sounds like the C++ client does not have an 
equivalent to the PleaseThrottle exception, and instead we block when there are 
no buffers available.  That makes sense.  But why in that circumstance is there 
a watermark?  Why would you want to block earlier than necessary?


http://gerrit.cloudera.org:8080/#/c/3952/6/src/kudu/util/monotime.h
File src/kudu/util/monotime.h:

Line 133:   friend MonoTime operator -(const MonoTime& t, const MonoDelta& 
delta);
> I added them here because the tests I added rely on those.
Thanks!


-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: Yes


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-15 Thread Alexey Serbin (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3952

to look at the new patch set (#8).

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..

KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode

Implemented AUTO_FLUSH_BACKGROUND for the Kudu C++ client library.
KuduSession objects use RPC messenger's thread pool to schedule
background flush tasks.  In AUTO_FLUSH_BACKGROUND mode,
the KuduSession::Apply() method blocks if total amount of data
for operations not pushed to server reaches the buffer size limit.
The limit on the buffer size can be set by the
KuduSession::SetMutationBufferSpace() method or via the
--client_buffer_bytes_limit command-line flag.

The background flusher checks whether at least one of the following two
conditions is satisfied to determine whether it's time to flush
the accumulated write operations:
  * over-the-waterline criterion: check whether the total size of the
freshly submitted (i.e. not-scheduled-for-flush) write operations
is over the threshold.  The threshold can be set as a percentage
of the total buffer size using the
--client_buffer_flush_waterline_pct command-line flag.
  * maximum wait time criterion: check whether the previous flush
happened more than the maximum wait time ago.  The maximum wait time
can be specified in milliseconds using the
--client_buffer_flush_timeout_ms command-line flag.

This change also addresses the following JIRA issue:
  KUDU-1376 KuduSession::SetMutationBufferSpace is not defined

Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
---
M python/kudu/tests/test_client.py
M src/kudu/client/batcher.cc
M src/kudu/client/batcher.h
M src/kudu/client/client-internal.h
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/session-internal.cc
M src/kudu/client/session-internal.h
M src/kudu/client/write_op.cc
M src/kudu/client/write_op.h
M src/kudu/util/monotime.cc
M src/kudu/util/monotime.h
13 files changed, 1,209 insertions(+), 154 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/52/3952/8
-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 8
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-15 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 8:

Build Started http://104.196.14.100/job/kudu-gerrit/2940/

-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 8
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: No


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-15 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 6:

(7 comments)

Thank you for the review!

I posted a new version (patchset 7) because the LINT build failed for patchset 
6, so I wanted to fix that.

http://gerrit.cloudera.org:8080/#/c/3952/6/src/kudu/client/client.cc
File src/kudu/client/client.cc:

Line 400:   return shared_ptr();
> I've traced this through, and I can't find any place that this can actually
OK, that sounds reasonable.  Do I understand correctly that you suggest to 
remove this check at all?


http://gerrit.cloudera.org:8080/#/c/3952/6/src/kudu/client/client.h
File src/kudu/client/client.h:

PS6, Line 1029: be
> s/be/become
Done


http://gerrit.cloudera.org:8080/#/c/3952/6/src/kudu/client/session-internal.cc
File src/kudu/client/session-internal.cc:

Line 30: DEFINE_int32(client_buffer_bytes_limit, 7 * 1024 * 1024,
> I think gflags don't apply to the client, so these will not really be confi
Why not?  If we exposed them in the documentation, a user could link a binary 
with the library having those flags exposed.

But I agree: making those accessible via the client API would be easier to use, 
otherwise those are left hidden.  However, I don't like idea of mutators for 
there -- I would better set them in the constructor and keep then unchanged.  
Probably, those might be properties of some sort of configuration object.

How do you like the following: I'll mark those as hidden flags, which we will 
use only for testing purposes.  Later on, we can expose controls for those up 
to the client API level.  Does this make sense?


Line 59: KuduSession::Data::BackgroundFlusher::BackgroundFlusher(
> formatting (no wrap on initial argument, and 4 spaces for initializer list)
OK, will fix.


PS6, Line 91: waterline
> We typically use 'watermark' instead of 'waterline' (at least in the Java i
That's fun, because originally I had put 'watermark' and then figured that that 
was not the best term for that:

  https://en.wikipedia.org/wiki/Watermark_(disambiguation)

OK, I'll replace it with watermark.


Line 98:   messenger->ScheduleOnReactor(
> This looks like it's flushing the current buffer if it's over the waterline
Yes, the Java implementation is different -- it does not have batchers and 
other stuff.  Also, here the waterline is about amount of freshly added 
operations, not total size of the pending operations.


http://gerrit.cloudera.org:8080/#/c/3952/6/src/kudu/util/monotime.h
File src/kudu/util/monotime.h:

Line 133:   friend MonoTime operator -(const MonoTime& t, const MonoDelta& 
delta);
> Maybe split the changes to MonoTime into its own review.
I added them here because the tests I added rely on those.

All right, having that as a separate change will be better from codereview's 
point.  Will separate into a stand-alone change.


-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: Yes


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-15 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 7:

(7 comments)

Incremental review, since I see a new revision has been put up.  I don't think 
this is going to make 0.10, so I'm going to preempt it in the meantime.

http://gerrit.cloudera.org:8080/#/c/3952/6/src/kudu/client/client.cc
File src/kudu/client/client.cc:

Line 400:   return shared_ptr();
I've traced this through, and I can't find any place that this can actually 
fail.  I think as it's written will be quite problematic, I can see this being 
a common source of nullptr dereferences (if in fact there are cases where it 
can fail).


http://gerrit.cloudera.org:8080/#/c/3952/6/src/kudu/client/client.h
File src/kudu/client/client.h:

PS6, Line 1029: be
s/be/become


http://gerrit.cloudera.org:8080/#/c/3952/6/src/kudu/client/session-internal.cc
File src/kudu/client/session-internal.cc:

Line 30: 
I think gflags don't apply to the client, so these will not really be 
configurable.  I think setting the default to be a constant and exposing 
mutators on the Session instance is the best we can do.


Line 59: 
formatting (no wrap on initial argument, and 4 spaces for initializer list). 
https://google.github.io/styleguide/cppguide.html#Constructor_Initializer_Lists.

A lot of the methods below are wrapped too much as well.


PS6, Line 91: 
We typically use 'watermark' instead of 'waterline' (at least in the Java impl).


Line 98: const bool over_waterline =
This looks like it's flushing the current buffer if it's over the waterline, 
but that's not exactly what the Java client does.  If a write comes in over the 
waterline, a PleaseThrottle exception is passed back to the user.  Also, this 
only ever happens if there isn't a free buffer available.


http://gerrit.cloudera.org:8080/#/c/3952/6/src/kudu/util/monotime.h
File src/kudu/util/monotime.h:

Line 133:   friend MonoTime operator -(const MonoTime& t, const MonoDelta& 
delta);
Maybe split the changes to MonoTime into its own review.


-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 7
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: Yes


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-15 Thread Alexey Serbin (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3952

to look at the new patch set (#7).

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..

KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode

Implemented AUTO_FLUSH_BACKGROUND for the Kudu C++ client library.
KuduSession objects use RPC messenger's thread pool to schedule
background flush tasks.  In AUTO_FLUSH_BACKGROUND mode,
the KuduSession::Apply() method blocks if total amount of data
for operations not pushed to server reaches the buffer size limit.
The limit on the buffer size can be set by the
KuduSession::SetMutationBufferSpace() method or via the
--client_buffer_bytes_limit command-line flag.

The background flusher checks whether at least one of the following two
conditions is satisfied to determine whether it's time to flush
the accumulated write operations:
  * over-the-waterline criterion: check whether the total size of the
freshly submitted (i.e. not-scheduled-for-flush) write operations
is over the threshold.  The threshold can be set as a percentage
of the total buffer size using the
--client_buffer_flush_waterline_pct command-line flag.
  * maximum wait time criterion: check whether the previous flush
happened more than the maximum wait time ago.  The maximum wait time
can be specified in milliseconds using the
--client_buffer_flush_timeout_ms command-line flag.

This change also addresses the following JIRA issue:
  KUDU-1376 KuduSession::SetMutationBufferSpace is not defined

Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
---
M python/kudu/tests/test_client.py
M src/kudu/client/batcher.cc
M src/kudu/client/batcher.h
M src/kudu/client/client-internal.h
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/session-internal.cc
M src/kudu/client/session-internal.h
M src/kudu/client/write_op.cc
M src/kudu/client/write_op.h
M src/kudu/util/monotime.cc
M src/kudu/util/monotime.h
13 files changed, 1,221 insertions(+), 155 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/52/3952/7
-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 7
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-15 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 7:

Build Started http://104.196.14.100/job/kudu-gerrit/2920/

-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 7
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: No


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-15 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 6:

Build Started http://104.196.14.100/job/kudu-gerrit/2916/

-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: No


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-15 Thread Alexey Serbin (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3952

to look at the new patch set (#6).

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..

KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode

Implemented AUTO_FLUSH_BACKGROUND for the Kudu C++ client library.
KuduSession objects use RPC messenger's thread pool to schedule
background flush tasks.  In AUTO_FLUSH_BACKGROUND mode,
the KuduSession::Apply() method blocks if total amount of data
for operations not pushed to server reaches the buffer size limit.
The limit on the buffer size can be set by the
KuduSession::SetMutationBufferSpace() method or via the
--client_buffer_bytes_limit command-line flag.

The background flusher checks whether at least one of the following two
conditions is satisfied to determine whether it's time to flush
the accumulated write operations:
  * over-the-waterline criterion: check whether the total size of the
freshly submitted (i.e. not-scheduled-for-flush) write operations
is over the threshold.  The threshold can be set as a percentage
of the total buffer size using the
--client_buffer_flush_waterline_pct command-line flag.
  * maximum wait time criterion: check whether the previous flush
happened more than the maximum wait time ago.  The maximum wait time
can be specified in milliseconds using the
--client_buffer_flush_timeout_ms command-line flag.

This change also addresses the following JIRA issue:
  KUDU-1376 KuduSession::SetMutationBufferSpace is not defined

Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
---
M python/kudu/tests/test_client.py
M src/kudu/client/batcher.cc
M src/kudu/client/batcher.h
M src/kudu/client/client-internal.h
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/session-internal.cc
M src/kudu/client/session-internal.h
M src/kudu/client/write_op.cc
M src/kudu/client/write_op.h
M src/kudu/util/env_posix.cc
M src/kudu/util/monotime.cc
M src/kudu/util/monotime.h
14 files changed, 1,218 insertions(+), 155 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/52/3952/6
-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-15 Thread Alexey Serbin (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3952

to look at the new patch set (#5).

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..

KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode

Implemented AUTO_FLUSH_BACKGROUND for the Kudu C++ client library.
KuduSession objects use RPC messenger's thread pool to schedule
background flush tasks.  In AUTO_FLUSH_BACKGROUND mode,
the KuduSession::Apply() method blocks if total amount of data
for operations not pushed to server reaches the buffer size limit.
The limit on the buffer size can be set by the
KuduSession::SetMutationBufferSpace() method or via the
--client_buffer_bytes_limit command-line flag.

The background flusher checks whether at least one of the following two
conditions is satisfied to determine whether it's time to flush
the accumulated write operations:
  * over-the-waterline criterion: check whether the total size of the
freshly submitted (i.e. not-scheduled-for-flush) write operations
is over the threshold.  The threshold can be set as a percentage
of the total buffer size using the
--client_buffer_flush_waterline_pct command-line flag.
  * maximum wait time criterion: check whether the previous flush
happened more than the maximum wait time ago.  The maximum wait time
can be specified in milliseconds using the
--client_buffer_flush_timeout_ms command-line flag.

This change also addresses the following JIRA issue:
  KUDU-1376 KuduSession::SetMutationBufferSpace is not defined

Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
---
M python/kudu/tests/test_client.py
M src/kudu/client/batcher.cc
M src/kudu/client/batcher.h
M src/kudu/client/client-internal.h
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/session-internal.cc
M src/kudu/client/session-internal.h
M src/kudu/util/env_posix.cc
M src/kudu/util/monotime.cc
M src/kudu/util/monotime.h
12 files changed, 1,138 insertions(+), 151 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/52/3952/5
-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-15 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 5:

Build Started http://104.196.14.100/job/kudu-gerrit/2896/

-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: No


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-13 Thread Alexey Serbin (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3952

to look at the new patch set (#4).

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..

KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode

Implemented AUTO_FLUSH_BACKGROUND for the Kudu C++ client library.
KuduSession objects use RPC messenger's thread pool to schedule
background flush tasks.  In AUTO_FLUSH_BACKGROUND mode,
the KuduSession::Apply() method blocks if total amount of data
for operations not pushed to server reaches the buffer size limit.
The limit on the buffer size can be set by the
KuduSession::SetMutationBufferSpace() method or via the
--client_buffer_bytes_limit command-line flag.

The background flusher checks whether at least one of the following two
conditions is satisfied to determine whether it's time to flush
the accumulated write operations:
  * over-the-waterline criterion: check whether the total size of the
freshly submitted (i.e. not-scheduled-for-flush) write operations
is over the threshold.  The threshold can be set as a percentage
of the total buffer size using the
--client_buffer_flush_waterline_pct command-line flag.
  * maximum wait time criterion: check whether the previous flush
happened more than the maximum wait time ago.  The maximum wait time
can be specified in milliseconds using the
--client_buffer_flush_timeout_ms command-line flag.

This change also addresses the following JIRA issue:
  KUDU-1376 KuduSession::SetMutationBufferSpace is not defined

Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
---
M python/kudu/tests/test_client.py
M src/kudu/client/batcher.cc
M src/kudu/client/batcher.h
M src/kudu/client/client-internal.h
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/session-internal.cc
M src/kudu/client/session-internal.h
M src/kudu/util/monotime.cc
M src/kudu/util/monotime.h
M src/kudu/util/stopwatch.h
12 files changed, 990 insertions(+), 151 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/52/3952/4
-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-13 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 4:

Build Started http://104.196.14.100/job/kudu-gerrit/2861/

-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: No


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-12 Thread Alexey Serbin (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3952

to look at the new patch set (#3).

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..

KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode

Implemented AUTO_FLUSH_BACKGROUND for the Kudu C++ client library.
KuduSession objects use RPC messenger's thread pool to schedule
background flush tasks.  In AUTO_FLUSH_BACKGROUND mode,
the KuduSession::Apply() method blocks if total amount of data
for operations not pushed to server reaches the buffer size limit.
The limit on the buffer size can be set by the
KuduSession::SetMutationBufferSpace() method or via the
--client_buffer_bytes_limit command-line flag.

The background flusher checks whether at least one of the following two
conditions is satisfied to determine whether it's time to flush
the accumulated write operations:
  * over-the-waterline criterion: check whether the total size of the
freshly submitted (i.e. not-scheduled-for-flush) write operations
is over the threshold.  The threshold can be set as a percentage
of the total buffer size using the
--client_buffer_flush_waterline_pct command-line flag.
  * maximum wait time criterion: check whether the previous flush
happened more than the maximum wait time ago.  The maximum wait time
can be specified in milliseconds using the
--client_buffer_flush_timeout_ms command-line flag.

This change also addresses the following JIRA issue:
  KUDU-1376 KuduSession::SetMutationBufferSpace is not defined

Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
---
M python/kudu/tests/test_client.py
M src/kudu/client/batcher.cc
M src/kudu/client/batcher.h
M src/kudu/client/client-internal.h
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/session-internal.cc
M src/kudu/client/session-internal.h
M src/kudu/util/monotime.cc
M src/kudu/util/monotime.h
M src/kudu/util/stopwatch.h
12 files changed, 970 insertions(+), 151 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/52/3952/3
-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-12 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 3:

Build Started http://104.196.14.100/job/kudu-gerrit/2860/

-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: No


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-12 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 2:

Build Started http://104.196.14.100/job/kudu-gerrit/2837/

-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: No


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-12 Thread Alexey Serbin (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3952

to look at the new patch set (#2).

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..

KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode

Implemented AUTO_FLUSH_BACKGROUND for the Kudu C++ client library.
KuduSession objects use RPC messenger's thread pool to schedule
background flush tasks.  In AUTO_FLUSH_BACKGROUND mode,
the KuduSession::Apply() method blocks if total amount of data
for operations not pushed to server reaches the buffer size limit.
The limit on the buffer size can be set by the
KuduSession::SetMutationBufferSpace() method or via the
--client_buffer_bytes_limit command-line flag.

The background flusher checks whether at least one of the following two
conditions is satisfied to determine whether it's time to flush
the accumulated write operations:
  * over-the-waterline criterion: check whether the total size of the
freshly submitted (i.e. not-scheduled-for-flush) write operations
is over the threshold.  The threshold can be set as a percentage
of the total buffer size using the
--client_buffer_flush_waterline_pct command-line flag.
  * maximum wait time criterion: check whether the previous flush
happened more than the maximum wait time ago.  The maximum wait time
can be specified in milliseconds using the
--client_buffer_flush_timeout_ms command-line flag.

This change also addresses the following JIRA issue:
  KUDU-1376 KuduSession::SetMutationBufferSpace is not defined

Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
---
M python/kudu/tests/test_client.py
M src/kudu/client/batcher.cc
M src/kudu/client/batcher.h
M src/kudu/client/client-internal.h
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/session-internal.cc
M src/kudu/client/session-internal.h
M src/kudu/util/monotime.cc
M src/kudu/util/monotime.h
11 files changed, 966 insertions(+), 148 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/52/3952/2
-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-12 Thread Alexey Serbin (Code Review)
Alexey Serbin has abandoned this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Abandoned

Abandoned in favor of https://gerrit.cloudera.org/#/c/3952/

The new approach uses RPC messenger thread pool instead of spawning a thread 
per KuduSession object.

-- 
To view, visit http://gerrit.cloudera.org:8080/3668
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: abandon
Gerrit-Change-Id: Idfde05bef342db24990c6e3da3b0270c3bb37a9d
Gerrit-PatchSet: 8
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-11 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 1:

Build Started http://104.196.14.100/job/kudu-gerrit/2827/

-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: No


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-11 Thread Alexey Serbin (Code Review)
Alexey Serbin has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/3952

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..

KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode

Implemented AUTO_FLUSH_BACKGROUND for C++ client library.
KuduSession instance uses RPC messenger's thread pool to schedule
background flush tasks.  In AUTO_FLUSH_BACKGROUND mode,
the KuduSession::Apply() method blocks if total amount of data
for operations not pushed to server reaches limit on the buffer
size.  The limit on the buffer size can be set by
KuduSession::SetMutationBufferSpace() or via the
--client_buffer_bytes_limit command-line flag.

The background flusher uses high-watermark approach for
its flush criterion: it counts buffer size used by not-yet-flushed
(i.e. freshly submitted) write operations and checks whether
the specified threshold is reached.  The threshold can be set via the
--client_buffer_hw_pct command-line flag as a percentage of the total
buffer space.

This change also addresses the following JIRA issue(s):
  KUDU-1376 KuduSession::SetMutationBufferSpace is not defined

Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
---
M python/kudu/tests/test_client.py
M src/kudu/client/batcher.cc
M src/kudu/client/batcher.h
M src/kudu/client/client-internal.h
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/session-internal.cc
M src/kudu/client/session-internal.h
M src/kudu/util/monotime.cc
M src/kudu/util/monotime.h
11 files changed, 951 insertions(+), 149 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/52/3952/1
-- 
To view, visit http://gerrit.cloudera.org:8080/3952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I34905c30b3aad96f53cf7a1822b1cde6d25f33a8
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-11 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 8:

Build Started http://104.196.14.100/job/kudu-gerrit/2811/

-- 
To view, visit http://gerrit.cloudera.org:8080/3668
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Idfde05bef342db24990c6e3da3b0270c3bb37a9d
Gerrit-PatchSet: 8
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-11 Thread Alexey Serbin (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3668

to look at the new patch set (#8).

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..

KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode

Implemented AUTO_FLUSH_BACKGROUND for C++ client library.
KuduSession instance starts a dedicated background flusher thread
to perform flush in background.  In AUTO_FLUSH_BACKGROUND mode,
the KuduSession::Apply() method blocks if running out of space
in 'virtual buffer': there isn't a single buffer as is,
there is a set of buffered write operations used by multiple batchers.
The limit on the buffer size can be set by
KuduSession::SetMutationBufferSpace().

The background flusher uses high-watermark approach for
its flush criterion: it counts buffer size used by not-yet-flushed
write operations.

The data flow control implemented by KuduSession::Apply() combines
high-/low-watermark approach for its pause/continue criterion:
it counts total buffer size used by write operations.

This change also addresses the following JIRA issue(s):
  KUDU-1376 KuduSession::SetMutationBufferSpace is not defined

Change-Id: Idfde05bef342db24990c6e3da3b0270c3bb37a9d
---
M python/kudu/tests/test_client.py
M src/kudu/client/batcher.cc
M src/kudu/client/batcher.h
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/session-internal.cc
M src/kudu/client/session-internal.h
M src/kudu/util/monotime.cc
M src/kudu/util/monotime.h
10 files changed, 933 insertions(+), 149 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/68/3668/8
-- 
To view, visit http://gerrit.cloudera.org:8080/3668
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idfde05bef342db24990c6e3da3b0270c3bb37a9d
Gerrit-PatchSet: 8
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-08 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 7:

Build Started http://104.196.14.100/job/kudu-gerrit/2739/

-- 
To view, visit http://gerrit.cloudera.org:8080/3668
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Idfde05bef342db24990c6e3da3b0270c3bb37a9d
Gerrit-PatchSet: 7
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-08 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 6:

Build Started http://104.196.14.100/job/kudu-gerrit/2738/

-- 
To view, visit http://gerrit.cloudera.org:8080/3668
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Idfde05bef342db24990c6e3da3b0270c3bb37a9d
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-05 Thread Alexey Serbin (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3668

to look at the new patch set (#5).

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..

KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode

Implemented AUTO_FLUSH_BACKGROUND for C++ client library.
KuduSession instance starts a dedicated background flusher thread
to perform flush in background.  In AUTO_FLUSH_BACKGROUND mode,
the KuduSession::Apply() method blocks if running out of space
in 'virtual buffer': there isn't a single buffer as is,
there is a set of buffered write operations used by multiple batchers.
The limit on the buffer size can be set by
KuduSession::SetMutationBufferSpace().

The background flusher uses high-watermark approach for
its flush criterion: it counts buffer size used by not-yet-flushed
write operations.

The data flow control implemented by KuduSession::Apply() combines
high-/low-watermark approach for its pause/continue criterion:
it counts total buffer size used by write operations.

This change also addresses the following JIRA issue(s):
  KUDU-1376 KuduSession::SetMutationBufferSpace is not defined

Change-Id: Idfde05bef342db24990c6e3da3b0270c3bb37a9d
---
M src/kudu/client/batcher.cc
M src/kudu/client/batcher.h
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/session-internal.cc
M src/kudu/client/session-internal.h
M src/kudu/util/monotime.cc
M src/kudu/util/monotime.h
9 files changed, 876 insertions(+), 116 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/68/3668/5
-- 
To view, visit http://gerrit.cloudera.org:8080/3668
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idfde05bef342db24990c6e3da3b0270c3bb37a9d
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-05 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 5:

Build Started http://104.196.14.100/job/kudu-gerrit/2728/

-- 
To view, visit http://gerrit.cloudera.org:8080/3668
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Idfde05bef342db24990c6e3da3b0270c3bb37a9d
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-05 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..


Patch Set 4:

Build Started http://104.196.14.100/job/kudu-gerrit/2723/

-- 
To view, visit http://gerrit.cloudera.org:8080/3668
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Idfde05bef342db24990c6e3da3b0270c3bb37a9d
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: No


[kudu-CR] KUDU-456 Implement AUTO FLUSH BACKGROUND flush mode

2016-08-05 Thread Alexey Serbin (Code Review)
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3668

to look at the new patch set (#4).

Change subject: KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode
..

KUDU-456 Implement AUTO_FLUSH_BACKGROUND flush mode

Implemented AUTO_FLUSH_BACKGROUND for C++ client library.
KuduSession instance starts a dedicated background flusher thread
to perform flush in background.  In AUTO_FLUSH_BACKGROUND mode,
the KuduSession::Apply() method blocks if running out of space
in 'virtual buffer': there isn't a single buffer as is,
there is a set of buffered write operations used by multiple batchers.
The limit on the buffer size can be set by
KuduSession::SetMutationBufferSpace().

The background flusher uses high-watermark approach for
its flush criterion: it counts buffer size used by not-yet-flushed
write operations.

The data flow control implemented by KuduSession::Apply() combines
high-/low-watermark approach for its pause/continue criterion:
it counts total buffer size used by write operations.

This change also addresses the following JIRA issue(s):
  KUDU-1376 KuduSession::SetMutationBufferSpace is not defined

Change-Id: Idfde05bef342db24990c6e3da3b0270c3bb37a9d
---
M src/kudu/client/batcher.cc
M src/kudu/client/batcher.h
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/session-internal.cc
M src/kudu/client/session-internal.h
M src/kudu/util/monotime.cc
M src/kudu/util/monotime.h
9 files changed, 869 insertions(+), 114 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/68/3668/4
-- 
To view, visit http://gerrit.cloudera.org:8080/3668
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idfde05bef342db24990c6e3da3b0270c3bb37a9d
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: David Ribeiro Alves 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon