Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/11142 )
Change subject: KUDU-2463 pt 1: adjust MVCC when replaying no-ops ...................................................................... Patch Set 6: (32 comments) Approach looks good, test looks good. How long does the test take to run? Maybe should be considered a slow test? Is it stable on dist-test? I mostly have just left nitpick feedback. http://gerrit.cloudera.org:8080/#/c/11142/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11142/6//COMMIT_MSG@10 PS6, Line 10: timstamps timestamps http://gerrit.cloudera.org:8080/#/c/11142/6//COMMIT_MSG@17 PS6, Line 17: necessarily be said for timestamps of no-ops and change configs. also schema changes (I think) http://gerrit.cloudera.org:8080/#/c/11142/6//COMMIT_MSG@37 PS6, Line 37: This alone isn't enough to prevent KUDU-2463, but it reduces the window The cases where it doesn't prevent the problem are super edge case, right? The only scenario I can think of is where the WAL is full of config change ops or alter table ops and not enough replicas come up to get a leader elected. http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/consensus/consensus.proto File src/kudu/consensus/consensus.proto: http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/consensus/consensus.proto@235 PS6, Line 235: It If http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/consensus/consensus.proto@236 PS6, Line 236: that the timestamp is assigned serially to be true http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/raft_consensus-itest.cc File src/kudu/integration-tests/raft_consensus-itest.cc: http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/raft_consensus-itest.cc@259 PS6, Line 259: 10000 is this needed for this test to pass? or just for debuggability upon test failure? could use a small comment http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc File src/kudu/integration-tests/timestamp_advancement-itest.cc: PS6: Did clang-tidy run on this file? http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@78 PS6, Line 78: using std::string; nit: hrm, is there a reason you put these outside the namespace? seems easier to maintain with one "using" list http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@79 PS6, Line 79: using std::shared_ptr; nit: out of alpha order http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@86 PS6, Line 86: class RaftPeerPB; not used? http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@90 PS6, Line 90: class RpcController; not used? http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@94 PS6, Line 94: class TabletServerServiceProxy; not used http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@107 PS6, Line 107: using tserver::TabletServerServiceProxy; nit: not used http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@108 PS6, Line 108: using rpc::RpcController; nit: out of alpha order http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@112 PS6, Line 112: NonAdvancedTimestampsITest nit: TimestampAdvancementITest ? http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@121 PS6, Line 121: write.set_num_tablets(1); this is the default http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@122 PS6, Line 122: write.set_payload_bytes(10); : write.set_num_write_threads(1); : write.set_write_batch_size(1); Why change this stuff? If no compelling reason, let's remove it so we don't have to wonder about it. http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@135 PS6, Line 135: const int64_t rows_inserted = write.rows_inserted(); : for (int i = 0; i < cluster_->num_tablet_servers(); i++) { : HostPort hp; : hp.set_host(cluster_->mini_tablet_server(i)->bound_http_addr().host()); : hp.set_port(cluster_->mini_tablet_server(i)->bound_http_addr().port()); : ASSERT_EVENTUALLY([&] { : int64_t rows_in_replica; : WARN_NOT_OK(itest::GetInt64Metric(hp, : &METRIC_ENTITY_tablet, nullptr, &METRIC_rows_inserted, "value", &rows_in_replica), : "Couldn't get metric... will retry"); : ASSERT_EQ(rows_inserted, rows_in_replica); : }); : } you should be able to replace this with: ASSERT_OK(WaitForServersToAgree(kTimeout, ts_map_, tablet_id, workload.batches_completed())); http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@149 PS6, Line 149: segement segment http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@150 PS6, Line 150: onto nit: to http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@151 PS6, Line 151: assert eventually here not seeing this http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@181 PS6, Line 181: return replicas[0]; nit: maybe add a CHECK_EQ(1, replicas.size()); http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@186 PS6, Line 186: HasNoWriteOps nit: Rename to CheckForWriteOpsInLog() ? http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@199 PS6, Line 199: s.IsCorruption() nit: we don't expect a truncated log file in this test, do we? http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@202 PS6, Line 202: case log::COMMIT: do we care about commit messages? if they don't have a matching replicate message then they are dropped during tablet bootstrap http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@223 PS6, Line 223: leave s/leave// http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@224 PS6, Line 224: timestamps safe time timestamp http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@225 PS6, Line 225: TestTimestampsAdvancedFromBootstrapNoOp nit: TestNoOpAdvancesMvccSafeTimeOnBootstrap ? http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@226 PS6, Line 226: // Set a low Raft heartbeat and encourage frequent elections. ... so that we can fill up the WAL with no-op entries naturally http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/integration-tests/timestamp_advancement-itest.cc@272 PS6, Line 272: nit: extra blank line http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/tablet/tablet_bootstrap.cc File src/kudu/tablet/tablet_bootstrap.cc: http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/tablet/tablet_bootstrap.cc@1097 PS6, Line 1097: assigned serially with respect to : // one another. suggestion: guaranteed to be monotonic with respect to the write-ahead log. http://gerrit.cloudera.org:8080/#/c/11142/6/src/kudu/tablet/tablet_bootstrap.cc@1113 PS6, Line 1113: default: Hmm. I wonder if we had a bug excluding alter schema here. Do you know whether alter schema provides the same guarantees that writes provide? I think I would be more comfortable writing this with the default set to false, something like: timestamp_assigned_in_opid_order = false; switch (op_type) { case WRITE_OP: timestamp_assigned_in_opid_order = true; break; case NO_OP: { const auto& req = replicate->noop_request(); if (req.has_timestamp_in_opid_order()) { timestamp_assigned_in_opid_order = req.timestamp_in_opid_order(); break; } timestamp_assigned_in_opid_order = true; break; } default: break; } -- To view, visit http://gerrit.cloudera.org:8080/11142 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I26deff32da8c990cb8a2ba220bb81858ddd6d73f Gerrit-Change-Number: 11142 Gerrit-PatchSet: 6 Gerrit-Owner: Andrew Wong <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Mike Percy <[email protected]> Gerrit-Comment-Date: Sat, 15 Sep 2018 00:15:20 +0000 Gerrit-HasComments: Yes
