Volodymyr Verovkin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14300 )

Change subject: KUDU-2800 draft 2
......................................................................


Patch Set 7:

(16 comments)

http://gerrit.cloudera.org:8080/#/c/14300/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14300/7//COMMIT_MSG@7
PS7, Line 7: KUDU-2800 draft 2
> Add proper description
Done


http://gerrit.cloudera.org:8080/#/c/14300/6/src/kudu/integration-tests/raft_config_change-itest.cc
File src/kudu/integration-tests/raft_config_change-itest.cc:

http://gerrit.cloudera.org:8080/#/c/14300/6/src/kudu/integration-tests/raft_config_change-itest.cc@460
PS6, Line 460: // Check re-replication during slow tablet replica bootstrap.
> nit: add a period in the end of the sentence.
Done


http://gerrit.cloudera.org:8080/#/c/14300/6/src/kudu/integration-tests/raft_config_change-itest.cc@464
PS6, Line 464:                               const string& evicted_replica_uuid,
             :                               const MonoDelta& kTimeout)
> I think it's better to explicitly say that replica is not evicted.
Done


http://gerrit.cloudera.org:8080/#/c/14300/6/src/kudu/integration-tests/raft_config_change-itest.cc@467
PS6, Line 467:     ASSERT_OK(GetLeaderReplicaWithRetries(tablet_id_, &leader));
> Here and in other new test scenarios: add SKIP_IF_SLOW_NOT_ALLOWED() since
Done


http://gerrit.cloudera.org:8080/#/c/14300/6/src/kudu/integration-tests/raft_config_change-itest.cc@483
PS6, Line 483:
> nit: the tablet's replica
Done


http://gerrit.cloudera.org:8080/#/c/14300/6/src/kudu/integration-tests/raft_config_change-itest.cc@492
PS6, Line 492: // Extra sanity checks.
> nit: if it's used only once, maybe put the lambda's code inside ASSERT_EVEN
Done


http://gerrit.cloudera.org:8080/#/c/14300/6/src/kudu/integration-tests/raft_config_change-itest.cc@500
PS6, Line 500:
> server
Done


http://gerrit.cloudera.org:8080/#/c/14300/6/src/kudu/integration-tests/raft_config_change-itest.cc@585
PS6, Line 585: _servers.front();
> Drop this: I think it's too vague to say that the replica  removes its data
Done


http://gerrit.cloudera.org:8080/#/c/14300/6/src/kudu/integration-tests/raft_config_change-itest.cc@590
PS6, Line 590:   // Inject delay in
> I think it's quite obvious what happens in two lines below; maybe drop this
Done


http://gerrit.cloudera.org:8080/#/c/14300/6/src/kudu/integration-tests/raft_config_change-itest.cc@597
PS6, Line 597:   replica_ets->Shutdown();
             :   ASSERT_OK(ts->Restart());
             :
             :   // Find a leader. In case we paused the leader above, this 
will wait until
             :   // we have elected a new one.
             :   TServerDetails* leader = nullptr;
             :   w
> I think GetLeaderReplicaWithRetries() takes care of retries already.  Why i
I think if we shutdown leader, it may not quickly propagate across the cluster 
and we can see it during GetLeaderReplicaWithRetries() till consensus detect 
loss of leader and elect new one.


http://gerrit.cloudera.org:8080/#/c/14300/6/src/kudu/integration-tests/raft_config_change-itest.cc@628
PS6, Line 628:   // Some WAL segments must exist, but wal segment 1 must not 
exist.
             :   ASSERT_OK(inspect_->WaitForFilePatte
> Given that there is ASSERT_EVENTUALLY() below, is it necessary at all?
We give that replica time equal to tablet_open_bootstrap_inject_latency_ms to 
finish bootstrap in order to make sure that, once replica boot up and try to 
re-join consensus, it will fail doing that.


http://gerrit.cloudera.org:8080/#/c/14300/6/src/kudu/integration-tests/raft_config_change-itest.cc@632
PS6, Line 632:
             :   //Let the replica finish slow bootstrap
             :   SleepFor(MonoDelta::FromSeconds(3));
             :
             :   ASSERT_EVENTUALLY([&]() {
             :     ValidateConsensusState(no_replica_servers.front(), 
follower_uuid, kTimeout);
             :   });
             : }
             :
> This is a repeating pattern.  Maybe, factor it out into a class method?
Done


http://gerrit.cloudera.org:8080/#/c/14300/7/src/kudu/integration-tests/raft_config_change-itest.cc
File src/kudu/integration-tests/raft_config_change-itest.cc:

http://gerrit.cloudera.org:8080/#/c/14300/7/src/kudu/integration-tests/raft_config_change-itest.cc@461
PS7, Line 461: SlowBootstrapTest
> nit: maybe, SlowTabletBootstrapTest would be more precise?
Done


http://gerrit.cloudera.org:8080/#/c/14300/7/src/kudu/integration-tests/raft_config_change-itest.cc@508
PS7, Line 508:     TServerDetails* leader = nullptr;
             :     ASSERT_OK(GetLeaderReplicaWithRetries(tablet_id_, &leader));
             :     consensus::ConsensusStatePB cstate;
             :     ASSERT_OK(GetConsensusState(leader, tablet_id_, kTimeout,
             :                                 EXCLUDE_HEALTH_REPORT, &cstate));
             :     // Check that new replica has not been added.
             :     ASSERT_FALSE(IsRaftConfigMember(no_replica_servers.front(), 
cstate.committed_config()));
             :     // Check that the replica at the tablet servers that was 
restarted is not removed.
             :     ASSERT_TRUE(IsRaftConfigMember(replica_servers.front(), 
cstate.committed_config()));
> Does it make sense to make the newly introduced ValidateConsensusState() to
This case is slightly different.
In ValidateConsensusState() we test
ASSERT_TRUE(IsRaftConfigMember(no_replica_servers.front(), 
cstate.committed_config()));
but here
ASSERT_FALSE(IsRaftConfigMember(no_replica_servers.front(), 
cstate.committed_config()));


http://gerrit.cloudera.org:8080/#/c/14300/7/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

http://gerrit.cloudera.org:8080/#/c/14300/7/src/kudu/tserver/ts_tablet_manager.cc@144
PS7, Line 144: hidden
> Change to 'unsafe'.  That way it's automatically hidden from the documentat
Done


http://gerrit.cloudera.org:8080/#/c/14300/7/src/kudu/tserver/ts_tablet_manager.cc@1077
PS7, Line 1077: "Injecting " << FLAGS_tablet_open_bootstrap_inject_latency_ms
              :                      << "ms delay in tablet bootstrapping"
> nit: maybe, use strings::Substitute() to format the message for easier read
It's a copy-paste of FLAGS_delete_tablet_inject_latency_ms code. I found 4 code 
entries with exactly same code.



--
To view, visit http://gerrit.cloudera.org:8080/14300
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie1fee42053194f51d7a869ce14788095d6627ed9
Gerrit-Change-Number: 14300
Gerrit-PatchSet: 7
Gerrit-Owner: Volodymyr Verovkin <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Volodymyr Verovkin <[email protected]>
Gerrit-Comment-Date: Wed, 02 Oct 2019 20:51:22 +0000
Gerrit-HasComments: Yes

Reply via email to