[kudu-CR] KUDU-2274 [itest] stress test for replica replacement

2018-02-20 Thread Alexey Serbin (Code Review)
Alexey Serbin has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/9255 )

Change subject: KUDU-2274 [itest] stress test for replica replacement
..

KUDU-2274 [itest] stress test for replica replacement

This is a stress test for the automatic replica replacement in Kudu.

Parameters of the test are configurable via run-time gflags, so it's
possible to run it in a 'standalone' mode, targeting it to be a sort
of an endurance test.

This test reproduces the race described in KUDU-2274: it triggers
DCHECK() assertions added in 194fd8b169f29aafbd78a47709ac51d2e8354a1a
before the relevant fixes for KUDU-2274 were checked in.

Change-Id: I036a882f7e9132a5c26013227c50c0699b59ed6e
Reviewed-on: http://gerrit.cloudera.org:8080/9255
Tested-by: Kudu Jenkins
Reviewed-by: Mike Percy 
---
M src/kudu/integration-tests/CMakeLists.txt
M src/kudu/integration-tests/cluster_verifier.cc
M src/kudu/integration-tests/cluster_verifier.h
A src/kudu/integration-tests/raft_consensus_stress-itest.cc
4 files changed, 305 insertions(+), 4 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Mike Percy: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/9255
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I036a882f7e9132a5c26013227c50c0699b59ed6e
Gerrit-Change-Number: 9255
Gerrit-PatchSet: 7
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-2274 [itest] stress test for replica replacement

2018-02-20 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9255 )

Change subject: KUDU-2274 [itest] stress test for replica replacement
..


Patch Set 6: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9255
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I036a882f7e9132a5c26013227c50c0699b59ed6e
Gerrit-Change-Number: 9255
Gerrit-PatchSet: 6
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Wed, 21 Feb 2018 01:37:02 +
Gerrit-HasComments: No


[kudu-CR] KUDU-2274 [itest] stress test for replica replacement

2018-02-20 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9255 )

Change subject: KUDU-2274 [itest] stress test for replica replacement
..


Patch Set 6:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/9255/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9255/5//COMMIT_MSG@11
PS5, Line 11: ar
> are
Done


http://gerrit.cloudera.org:8080/#/c/9255/5/src/kudu/integration-tests/raft_consensus_stress-itest.cc
File src/kudu/integration-tests/raft_consensus_stress-itest.cc:

http://gerrit.cloudera.org:8080/#/c/9255/5/src/kudu/integration-tests/raft_consensus_stress-itest.cc@236
PS5, Line 236:
> nit: why not iteration++ here?
Because I don't want to count retry due to a ksck failure as an iteration: if 
RunKsck() fails, the logic starts another loop, but  that case is not counted 
as 'iteration' in sense of this scenario.


http://gerrit.cloudera.org:8080/#/c/9255/5/src/kudu/integration-tests/raft_consensus_stress-itest.cc@277
PS5, Line 277:   if (ksck_failures_in_a_row > FLAGS_test_max_ksck_failures) {
 : // Suspecting a Raft consensus failure while running ksck 
with shorter
 : // timeout (see above). Run an extra round of ksck with the 
regular timeout
 : // to verify that replicas haven't really converged and, if 
so, just bail
 : // right at this point.
 : const auto& s = v.RunKsck();
 : if (!s.ok()) {
 :   FAIL() << Substitute("$0: tablet replicas haven't 
converged", s.ToString());
 : }
 :   }
> isn't this the same as line 288 where we run v.CheckCluster() ?
Yes, CheckCluster contains RunKsck().  However, it's wrapped into some extra 
logic for retries.  Here I wanted to short-circuit and report an error right 
away if the ksck has already failed too many times in a row.



--
To view, visit http://gerrit.cloudera.org:8080/9255
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I036a882f7e9132a5c26013227c50c0699b59ed6e
Gerrit-Change-Number: 9255
Gerrit-PatchSet: 6
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Tue, 20 Feb 2018 23:54:32 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2274 [itest] stress test for replica replacement

2018-02-20 Thread Alexey Serbin (Code Review)
Hello Tidy Bot, Mike Percy,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9255

to look at the new patch set (#6).

Change subject: KUDU-2274 [itest] stress test for replica replacement
..

KUDU-2274 [itest] stress test for replica replacement

This is a stress test for the automatic replica replacement in Kudu.

Parameters of the test are configurable via run-time gflags, so it's
possible to run it in a 'standalone' mode, targeting it to be a sort
of an endurance test.

This test reproduces the race described in KUDU-2274: it triggers
DCHECK() assertions added in 194fd8b169f29aafbd78a47709ac51d2e8354a1a
before the relevant fixes for KUDU-2274 were checked in.

Change-Id: I036a882f7e9132a5c26013227c50c0699b59ed6e
---
M src/kudu/integration-tests/CMakeLists.txt
M src/kudu/integration-tests/cluster_verifier.cc
M src/kudu/integration-tests/cluster_verifier.h
A src/kudu/integration-tests/raft_consensus_stress-itest.cc
4 files changed, 305 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/55/9255/6
--
To view, visit http://gerrit.cloudera.org:8080/9255
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I036a882f7e9132a5c26013227c50c0699b59ed6e
Gerrit-Change-Number: 9255
Gerrit-PatchSet: 6
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-2274 [itest] stress test for replica replacement

2018-02-20 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9255 )

Change subject: KUDU-2274 [itest] stress test for replica replacement
..


Patch Set 5: Code-Review+1

(3 comments)

http://gerrit.cloudera.org:8080/#/c/9255/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9255/5//COMMIT_MSG@11
PS5, Line 11: is
are


http://gerrit.cloudera.org:8080/#/c/9255/5/src/kudu/integration-tests/raft_consensus_stress-itest.cc
File src/kudu/integration-tests/raft_consensus_stress-itest.cc:

http://gerrit.cloudera.org:8080/#/c/9255/5/src/kudu/integration-tests/raft_consensus_stress-itest.cc@236
PS5, Line 236:
nit: why not iteration++ here?


http://gerrit.cloudera.org:8080/#/c/9255/5/src/kudu/integration-tests/raft_consensus_stress-itest.cc@277
PS5, Line 277:   if (ksck_failures_in_a_row > FLAGS_test_max_ksck_failures) {
 : // Suspecting a Raft consensus failure while running ksck 
with shorter
 : // timeout (see above). Run an extra round of ksck with the 
regular timeout
 : // to verify that replicas haven't really converged and, if 
so, just bail
 : // right at this point.
 : const auto& s = v.RunKsck();
 : if (!s.ok()) {
 :   FAIL() << Substitute("$0: tablet replicas haven't 
converged", s.ToString());
 : }
 :   }
isn't this the same as line 288 where we run v.CheckCluster() ?



--
To view, visit http://gerrit.cloudera.org:8080/9255
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I036a882f7e9132a5c26013227c50c0699b59ed6e
Gerrit-Change-Number: 9255
Gerrit-PatchSet: 5
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Tue, 20 Feb 2018 23:42:10 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2274 [itest] stress test for replica replacement

2018-02-14 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9255 )

Change subject: KUDU-2274 [itest] stress test for replica replacement
..


Patch Set 5: Verified+1

(1 comment)

Unreleated flakes in

  DiskReservationITest.TestFillMultipleDisks
  DiskReservationITest.TestWalWriteToFullDiskAborts

due to NTP failure

F0214 06:53:26.686384 28842 master_main.cc:74] Check failed: _s.ok() Bad 
status: Service unavailable: Cannot initialize clock: Error reading clock. 
Clock considered unsynchronized

http://gerrit.cloudera.org:8080/#/c/9255/3/src/kudu/integration-tests/raft_consensus_stress-itest.cc
File src/kudu/integration-tests/raft_consensus_stress-itest.cc:

http://gerrit.cloudera.org:8080/#/c/9255/3/src/kudu/integration-tests/raft_consensus_stress-itest.cc@113
PS3, Line 113:
> Since we are not using tablet_id_ etc, can we instead use ExternalMiniClust
This test uses tablet_servers_ and the code to populate that.  I'm also 
thinking that we might expand functionality of this test in the future, so I 
would prefer to keep RaftConsensusITestBase here.

However, if you feel strongly about that, I'll update this to use 
ExternalMiniClusterITestBase.



--
To view, visit http://gerrit.cloudera.org:8080/9255
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I036a882f7e9132a5c26013227c50c0699b59ed6e
Gerrit-Change-Number: 9255
Gerrit-PatchSet: 5
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Wed, 14 Feb 2018 19:49:02 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2274 [itest] stress test for replica replacement

2018-02-14 Thread Alexey Serbin (Code Review)
Alexey Serbin has removed Kudu Jenkins from this change.  ( 
http://gerrit.cloudera.org:8080/9255 )

Change subject: KUDU-2274 [itest] stress test for replica replacement
..


Removed reviewer Kudu Jenkins with the following votes:

* Verified-1 by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/9255
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: I036a882f7e9132a5c26013227c50c0699b59ed6e
Gerrit-Change-Number: 9255
Gerrit-PatchSet: 5
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-2274 [itest] stress test for replica replacement

2018-02-13 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9255 )

Change subject: KUDU-2274 [itest] stress test for replica replacement
..


Patch Set 5:

(15 comments)

http://gerrit.cloudera.org:8080/#/c/9255/3/src/kudu/integration-tests/raft_consensus_stress-itest.cc
File src/kudu/integration-tests/raft_consensus_stress-itest.cc:

http://gerrit.cloudera.org:8080/#/c/9255/3/src/kudu/integration-tests/raft_consensus_stress-itest.cc@66
PS3, Line 66: t
> magic number, can we add something like /* ERROR */ or ideally use the cons
Done


http://gerrit.cloudera.org:8080/#/c/9255/3/src/kudu/integration-tests/raft_consensus_stress-itest.cc@79
PS3, Line 79: "
> 20 for ASAN and 5 for everybody else?
Yep, that's the ultimate typo.  It should be inversed/swapped.


http://gerrit.cloudera.org:8080/#/c/9255/3/src/kudu/integration-tests/raft_consensus_stress-itest.cc@117
PS3, Line 117:  int
> nit: s/being/are/
Done


http://gerrit.cloudera.org:8080/#/c/9255/3/src/kudu/integration-tests/raft_consensus_stress-itest.cc@122
PS3, Line 122: ShortTimeout = MonoDelta::FromSeconds(
> nit: how about:
well, as it turned out, after fixing the ultimate typo it works for TSAN even 
without patch for TestWorkload.


http://gerrit.cloudera.org:8080/#/c/9255/3/src/kudu/integration-tests/raft_consensus_stress-itest.cc@170
PS3, Line 170:
> kNumReplicas ?
By my understanding (after looking into catalog_manager.cc), 
FLAGS_max_create_tablets_per_ts limits the number of partitions/tablets, not 
the number of total replicas.  My original idea was to have 
FLAGS_test_num_tablets_per_server as a proxy for 
FLAGS_max_create_tablets_per_ts.

However, I updated the signature and now the test speaks in number of replicas 
per tablet server: I agree it's more convenient.


http://gerrit.cloudera.org:8080/#/c/9255/3/src/kudu/integration-tests/raft_consensus_stress-itest.cc@175
PS3, Line 175: le (workload.
> TODO(KUDU-1188)
Done


http://gerrit.cloudera.org:8080/#/c/9255/3/src/kudu/integration-tests/raft_consensus_stress-itest.cc@212
PS3, Line 212:   const auto ts_
> this loop will run hot if do_pauses is set to false
Right.  I moved SleepFor(MonoDelta::FromMilliseconds(250)) out of the if() 
closure.


http://gerrit.cloudera.org:8080/#/c/9255/3/src/kudu/integration-tests/raft_consensus_stress-itest.cc@241
PS3, Line 241: workload.StopAndJoin();
 : rows_inserted = wo
> are these lines needed?
Yep, now I think we can get rid of them safely.  Maybe, in one of the former 
revisions they were useful.


http://gerrit.cloudera.org:8080/#/c/9255/3/src/kudu/integration-tests/raft_consensus_stress-itest.cc@265
PS3, Line 265:
> false
Done


http://gerrit.cloudera.org:8080/#/c/9255/3/src/kudu/integration-tests/raft_consensus_stress-itest.cc@266
PS3, Line 266: e
> false
Done


http://gerrit.cloudera.org:8080/#/c/9255/3/src/kudu/integration-tests/raft_consensus_stress-itest.cc@276
PS3, Line 276: T
> true
Done


http://gerrit.cloudera.org:8080/#/c/9255/3/src/kudu/integration-tests/raft_consensus_stress-itest.cc@277
PS3, Line 277: r
> true
Done


http://gerrit.cloudera.org:8080/#/c/9255/3/src/kudu/integration-tests/raft_consensus_stress-itest.cc@284
PS3, Line 284:   FAIL() << Substitute("$0: tablet replicas haven't 
converged", s.ToString());
 : }
> remove these lines?
Done


http://gerrit.cloudera.org:8080/#/c/9255/3/src/kudu/integration-tests/raft_consensus_stress-itest.cc@293
PS3, Line 293: FATALS(v.Che
> Consider saving this in a Status last_ksck_status defined next to ksck_fail
Good catch.  It seems like just another lost update due to multiple revisions 
of this test.  I corrected the code as necessary.


http://gerrit.cloudera.org:8080/#/c/9255/3/src/kudu/integration-tests/raft_consensus_stress-itest.cc@305
PS3, Line 305:
> Doesn't CheckRowCount() retry? Leader leases won't help here, I don't think
Indeed, thank you for pointing at the right direction.  I thought the message 
was about reading less than expected, but as it turned out it was about reading 
more than expected.



--
To view, visit http://gerrit.cloudera.org:8080/9255
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I036a882f7e9132a5c26013227c50c0699b59ed6e
Gerrit-Change-Number: 9255
Gerrit-PatchSet: 5
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Wed, 14 Feb 2018 06:56:13 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2274 [itest] stress test for replica replacement

2018-02-13 Thread Alexey Serbin (Code Review)
Hello Tidy Bot, Mike Percy, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9255

to look at the new patch set (#5).

Change subject: KUDU-2274 [itest] stress test for replica replacement
..

KUDU-2274 [itest] stress test for replica replacement

This is a stress test for the automatic replica replacement in Kudu.

Parameters of the test is configurable via run-time gflags, so it's
possible to run it in a 'standalone' mode, targeting it to be a sort
of an endurance test.

This test reproduces the race described in KUDU-2274: it triggers
DCHECK() assertions added in 194fd8b169f29aafbd78a47709ac51d2e8354a1a
before the relevant fixes for KUDU-2274 were checked in.

Change-Id: I036a882f7e9132a5c26013227c50c0699b59ed6e
---
M src/kudu/integration-tests/CMakeLists.txt
M src/kudu/integration-tests/cluster_verifier.cc
M src/kudu/integration-tests/cluster_verifier.h
A src/kudu/integration-tests/raft_consensus_stress-itest.cc
4 files changed, 305 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/55/9255/5
--
To view, visit http://gerrit.cloudera.org:8080/9255
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I036a882f7e9132a5c26013227c50c0699b59ed6e
Gerrit-Change-Number: 9255
Gerrit-PatchSet: 5
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-2274 [itest] stress test for replica replacement

2018-02-13 Thread Mike Percy (Code Review)
Mike Percy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9255 )

Change subject: KUDU-2274 [itest] stress test for replica replacement
..


Patch Set 4:

(16 comments)

http://gerrit.cloudera.org:8080/#/c/9255/3/src/kudu/integration-tests/raft_consensus_stress-itest.cc
File src/kudu/integration-tests/raft_consensus_stress-itest.cc:

http://gerrit.cloudera.org:8080/#/c/9255/3/src/kudu/integration-tests/raft_consensus_stress-itest.cc@66
PS3, Line 66: 2
magic number, can we add something like /* ERROR */ or ideally use the constant 
from glog if possible? Also an explanation as to why we chose this as the 
default would be helpful for posterity.


http://gerrit.cloudera.org:8080/#/c/9255/3/src/kudu/integration-tests/raft_consensus_stress-itest.cc@79
PS3, Line 79: 2
20 for ASAN and 5 for everybody else?


http://gerrit.cloudera.org:8080/#/c/9255/3/src/kudu/integration-tests/raft_consensus_stress-itest.cc@113
PS3, Line 113: RaftConsensusITestBase
Since we are not using tablet_id_ etc, can we instead use 
ExternalMiniClusterITestBase as the base class?


http://gerrit.cloudera.org:8080/#/c/9255/3/src/kudu/integration-tests/raft_consensus_stress-itest.cc@117
PS3, Line 117: being
nit: s/being/are/


http://gerrit.cloudera.org:8080/#/c/9255/3/src/kudu/integration-tests/raft_consensus_stress-itest.cc@122
PS3, Line 122: this test is not fit to run under TSAN
nit: how about:

  this test is not reliable under TSAN; skipping test


http://gerrit.cloudera.org:8080/#/c/9255/3/src/kudu/integration-tests/raft_consensus_stress-itest.cc@170
PS3, Line 170: 2
kNumReplicas ?


http://gerrit.cloudera.org:8080/#/c/9255/3/src/kudu/integration-tests/raft_consensus_stress-itest.cc@175
PS3, Line 175: TODO(aserbin)
TODO(KUDU-1188)


http://gerrit.cloudera.org:8080/#/c/9255/3/src/kudu/integration-tests/raft_consensus_stress-itest.cc@212
PS3, Line 212: if (do_pauses) {
this loop will run hot if do_pauses is set to false


http://gerrit.cloudera.org:8080/#/c/9255/3/src/kudu/integration-tests/raft_consensus_stress-itest.cc@241
PS3, Line 241: do_elections = false;
 : do_pauses = false;
are these lines needed?


http://gerrit.cloudera.org:8080/#/c/9255/3/src/kudu/integration-tests/raft_consensus_stress-itest.cc@265
PS3, Line 265: 0
false


http://gerrit.cloudera.org:8080/#/c/9255/3/src/kudu/integration-tests/raft_consensus_stress-itest.cc@266
PS3, Line 266: 0
false


http://gerrit.cloudera.org:8080/#/c/9255/3/src/kudu/integration-tests/raft_consensus_stress-itest.cc@276
PS3, Line 276: 1
true


http://gerrit.cloudera.org:8080/#/c/9255/3/src/kudu/integration-tests/raft_consensus_stress-itest.cc@277
PS3, Line 277: 1
true


http://gerrit.cloudera.org:8080/#/c/9255/3/src/kudu/integration-tests/raft_consensus_stress-itest.cc@284
PS3, Line 284:   do_elections = false;
 :   do_pauses = false;
remove these lines?


http://gerrit.cloudera.org:8080/#/c/9255/3/src/kudu/integration-tests/raft_consensus_stress-itest.cc@293
PS3, Line 293: onst auto& s
Consider saving this in a Status last_ksck_status defined next to 
ksck_failures_in_a_row (outside the loop) and saved on line 273, since it would 
be confusing to print a success message on failure if the final attempt 
succeeds.


http://gerrit.cloudera.org:8080/#/c/9255/3/src/kudu/integration-tests/raft_consensus_stress-itest.cc@305
PS3, Line 305:   //NO_FATALS(v.CheckRowCount(workload.table_name(),
Doesn't CheckRowCount() retry? Leader leases won't help here, I don't think. If 
we allow writes to time out, then we can actually end up writing more rows than 
got acknowledged. Since we are using workload.set_timeout_allowed(true), we 
should be using (ClusterVerifier::AT_LEAST, workload.rows_inserted()), not 
ClusterVerifier::EXACTLY.



--
To view, visit http://gerrit.cloudera.org:8080/9255
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I036a882f7e9132a5c26013227c50c0699b59ed6e
Gerrit-Change-Number: 9255
Gerrit-PatchSet: 4
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Wed, 14 Feb 2018 02:19:16 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-2274 [itest] stress test for replica replacement

2018-02-13 Thread Alexey Serbin (Code Review)
Hello Tidy Bot, Mike Percy, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9255

to look at the new patch set (#4).

Change subject: KUDU-2274 [itest] stress test for replica replacement
..

KUDU-2274 [itest] stress test for replica replacement

This is a stress test for the automatic replica replacement in Kudu.

Parameters of the test is configurable via run-time gflags, so it's
possible to run it in a 'standalone' mode, targeting it to be a sort
of an endurance test.

This test reproduces the race described in KUDU-2274: it triggers
DCHECK() assertions added in 194fd8b169f29aafbd78a47709ac51d2e8354a1a
before the relevant fixes for KUDU-2274 were checked in.

Change-Id: I036a882f7e9132a5c26013227c50c0699b59ed6e
---
M src/kudu/integration-tests/CMakeLists.txt
M src/kudu/integration-tests/cluster_verifier.cc
M src/kudu/integration-tests/cluster_verifier.h
A src/kudu/integration-tests/raft_consensus_stress-itest.cc
4 files changed, 317 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/55/9255/4
--
To view, visit http://gerrit.cloudera.org:8080/9255
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I036a882f7e9132a5c26013227c50c0699b59ed6e
Gerrit-Change-Number: 9255
Gerrit-PatchSet: 4
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot


[kudu-CR] KUDU-2274 [itest] stress test for replica replacement

2018-02-13 Thread Alexey Serbin (Code Review)
Hello Tidy Bot, Mike Percy, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9255

to look at the new patch set (#3).

Change subject: KUDU-2274 [itest] stress test for replica replacement
..

KUDU-2274 [itest] stress test for replica replacement

This is a stress test for the automatic replica replacement in Kudu.

Parameters of the test is configurable via run-time gflags, so it's
possible to run it in a 'standalone' mode, targeting it to be a sort
of an endurance test.

This test reproduces the race described in KUDU-2274: it triggers
DCHECK() assertions added in 194fd8b169f29aafbd78a47709ac51d2e8354a1a
before the relevant fixes for KUDU-2274 were checked in.

Change-Id: I036a882f7e9132a5c26013227c50c0699b59ed6e
---
M src/kudu/integration-tests/CMakeLists.txt
M src/kudu/integration-tests/cluster_verifier.cc
M src/kudu/integration-tests/cluster_verifier.h
A src/kudu/integration-tests/raft_consensus_stress-itest.cc
4 files changed, 317 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/55/9255/3
--
To view, visit http://gerrit.cloudera.org:8080/9255
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I036a882f7e9132a5c26013227c50c0699b59ed6e
Gerrit-Change-Number: 9255
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy 
Gerrit-Reviewer: Tidy Bot