Mike Percy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11142 )

Change subject: KUDU-2463 pt 1: adjust MVCC when replaying no-ops
......................................................................


Patch Set 6:

(32 comments)

Approach looks good, test looks good. How long does the test take to run? Maybe 
should be considered a slow test? Is it stable on dist-test?

I mostly have just left nitpick feedback.

http://gerrit.cloudera.org:8080/#/c/11142/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11142/6//COMMIT_MSG@10
PS6, Line 10: timstamps
timestamps


http://gerrit.cloudera.org:8080/#/c/11142/6//COMMIT_MSG@17
PS6, Line 17: necessarily be said for timestamps of no-ops and change configs.
also schema changes (I think)


http://gerrit.cloudera.org:8080/#/c/11142/6//COMMIT_MSG@37
PS6, Line 37: This alone isn't enough to prevent KUDU-2463, but it reduces the 
window
The cases where it doesn't prevent the problem are super edge case, right? The 
only scenario I can think of is where the WAL is full of config change ops or 
alter table ops and not enough replicas come up to get a leader elected.


http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/consensus/consensus.proto
File src/kudu/consensus/consensus.proto:

http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/consensus/consensus.proto@235
PS6, Line 235: It
If


http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/consensus/consensus.proto@236
PS6, Line 236: that the timestamp is assigned serially
to be true


http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/raft_consensus-itest.cc
File src/kudu/integration-tests/raft_consensus-itest.cc:

http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/raft_consensus-itest.cc@259
PS6, Line 259: 10000
is this needed for this test to pass? or just for debuggability upon test 
failure? could use a small comment


http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc
File src/kudu/integration-tests/timestamp_advancement-itest.cc:

PS6:
Did clang-tidy run on this file?


http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@78
PS6, Line 78: using std::string;
nit: hrm, is there a reason you put these outside the namespace? seems easier 
to maintain with one "using" list


http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@79
PS6, Line 79: using std::shared_ptr;
nit: out of alpha order


http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@86
PS6, Line 86: class RaftPeerPB;
not used?


http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@90
PS6, Line 90: class RpcController;
not used?


http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@94
PS6, Line 94: class TabletServerServiceProxy;
not used


http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@107
PS6, Line 107: using tserver::TabletServerServiceProxy;
nit: not used


http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@108
PS6, Line 108: using rpc::RpcController;
nit: out of alpha order


http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@112
PS6, Line 112: NonAdvancedTimestampsITest
nit: TimestampAdvancementITest ?


http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@121
PS6, Line 121: write.set_num_tablets(1);
this is the default


http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@122
PS6, Line 122:     write.set_payload_bytes(10);
             :     write.set_num_write_threads(1);
             :     write.set_write_batch_size(1);
Why change this stuff? If no compelling reason, let's remove it so we don't 
have to wonder about it.


http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@135
PS6, Line 135:     const int64_t rows_inserted = write.rows_inserted();
             :     for (int i = 0; i < cluster_->num_tablet_servers(); i++) {
             :       HostPort hp;
             :       
hp.set_host(cluster_->mini_tablet_server(i)->bound_http_addr().host());
             :       
hp.set_port(cluster_->mini_tablet_server(i)->bound_http_addr().port());
             :       ASSERT_EVENTUALLY([&] {
             :         int64_t rows_in_replica;
             :         WARN_NOT_OK(itest::GetInt64Metric(hp,
             :             &METRIC_ENTITY_tablet, nullptr, 
&METRIC_rows_inserted, "value", &rows_in_replica),
             :             "Couldn't get metric... will retry");
             :         ASSERT_EQ(rows_inserted, rows_in_replica);
             :       });
             :     }
you should be able to replace this with:

    ASSERT_OK(WaitForServersToAgree(kTimeout, ts_map_, tablet_id, 
workload.batches_completed()));


http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@149
PS6, Line 149: segement
segment


http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@150
PS6, Line 150: onto
nit: to


http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@151
PS6, Line 151: assert eventually here
not seeing this


http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@181
PS6, Line 181:     return replicas[0];
nit: maybe add a CHECK_EQ(1, replicas.size());


http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@186
PS6, Line 186: HasNoWriteOps
nit: Rename to CheckForWriteOpsInLog() ?


http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@199
PS6, Line 199: s.IsCorruption()
nit: we don't expect a truncated log file in this test, do we?


http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@202
PS6, Line 202:           case log::COMMIT:
do we care about commit messages? if they don't have a matching replicate 
message then they are dropped during tablet bootstrap


http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@223
PS6, Line 223: leave
s/leave//


http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@224
PS6, Line 224: timestamps
safe time timestamp


http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@225
PS6, Line 225: TestTimestampsAdvancedFromBootstrapNoOp
nit: TestNoOpAdvancesMvccSafeTimeOnBootstrap ?


http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@226
PS6, Line 226:   // Set a low Raft heartbeat and encourage frequent elections.
... so that we can fill up the WAL with no-op entries naturally


http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@272
PS6, Line 272:
nit: extra blank line


http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/tablet/tablet_bootstrap.cc
File src/kudu/tablet/tablet_bootstrap.cc:

http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/tablet/tablet_bootstrap.cc@1097
PS6, Line 1097: assigned serially with respect to
              :   // one another.
suggestion: guaranteed to be monotonic with respect to the write-ahead log.


http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/tablet/tablet_bootstrap.cc@1113
PS6, Line 1113:     default:
Hmm. I wonder if we had a bug excluding alter schema here. Do you know whether 
alter schema provides the same guarantees that writes provide?

I think I would be more comfortable writing this with the default set to false, 
something like:

timestamp_assigned_in_opid_order = false;
switch (op_type) {
  case WRITE_OP:
    timestamp_assigned_in_opid_order = true;
    break;
  case NO_OP: {
    const auto& req = replicate->noop_request();
    if (req.has_timestamp_in_opid_order()) {
      timestamp_assigned_in_opid_order = req.timestamp_in_opid_order();
      break;
    }
    timestamp_assigned_in_opid_order = true;
    break;
  }
  default:
    break;
}



--
To view, visit http://gerrit.cloudera.org:8080/11142
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I26deff32da8c990cb8a2ba220bb81858ddd6d73f
Gerrit-Change-Number: 11142
Gerrit-PatchSet: 6
Gerrit-Owner: Andrew Wong <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Comment-Date: Sat, 15 Sep 2018 00:15:20 +0000
Gerrit-HasComments: Yes

Reply via email to