Abhishek Chennaka has posted comments on this change. ( http://gerrit.cloudera.org:8080/19445 )
Change subject: KUDU-1945 Update auto incrementing counter during bootstrap ...................................................................... Patch Set 6: (36 comments) http://gerrit.cloudera.org:8080/#/c/19445/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19445/6//COMMIT_MSG@35 PS6, Line 35: > I guess there is also a case when those non-commited operations are DELETEs Reading/Decoding DELETE ops will not affect the auto incrementing counter. It is only updated when reading INSERT ops. http://gerrit.cloudera.org:8080/#/c/19445/6//COMMIT_MSG@38 PS6, Line 38: Columns - A Non Unique Primary Key column of type INT32 > Thank you for the clarification. Yes that is correct. At a given point the in-memory counter values might be different for different replicas but since the followers update the value based on what the leader raft replicates with an INSERT op, the column value is going to be the same. http://gerrit.cloudera.org:8080/#/c/19445/6/src/kudu/integration-tests/auto_incrementing-itest.cc File src/kudu/integration-tests/auto_incrementing-itest.cc: http://gerrit.cloudera.org:8080/#/c/19445/6/src/kudu/integration-tests/auto_incrementing-itest.cc@71 PS6, Line 71: using kudu::tablet::TabletReplica; > consider remove this 'using'? Done http://gerrit.cloudera.org:8080/#/c/19445/6/src/kudu/integration-tests/auto_incrementing-itest.cc@132 PS6, Line 132: void DeleteRow > Why not to make this method returning Status and use RETURN_NOT_OK() instea Done http://gerrit.cloudera.org:8080/#/c/19445/6/src/kudu/integration-tests/auto_incrementing-itest.cc@161 PS6, Line 161: // Stringify Rows from the response. > Is it possible to re-use TabletServerTestBase::StringifyRowsFromResponse() Done http://gerrit.cloudera.org:8080/#/c/19445/6/src/kudu/integration-tests/auto_incrementing-itest.cc@168 PS6, Line 168: > nit: fix the indent? Done http://gerrit.cloudera.org:8080/#/c/19445/6/src/kudu/integration-tests/auto_incrementing-itest.cc@172 PS6, Line 172: > nit: fix the indent? Done http://gerrit.cloudera.org:8080/#/c/19445/6/src/kudu/integration-tests/auto_incrementing-itest.cc@206 PS6, Line 206: DCHECK_EQ > ASSERT_EQ? Done http://gerrit.cloudera.org:8080/#/c/19445/6/src/kudu/integration-tests/auto_incrementing-itest.cc@219 PS6, Line 219: opts.extra_tserver_flags.emplace_back("--log_segment_size_bytes_for_tests=1024"); : opts.extra_tserver_flags.emplace_back("--log_max_segments_to_retain=2"); : opts.extra_tserver_flags.emplace_back("--maintenance_manager_polling_interval_ms=10"); : opts.extra_tserver_flags.emplace_back("--maintenance_manager_num_threads=4"); : opts.extra_tserver_flags.emplace_back("--flush_threshold_secs=1"); : opts.extra_tserver_flags.emplace_back("--log_compression_codec=no_compression"); > nit: use initializers list for opts.extra_tserver_flags instead? Done http://gerrit.cloudera.org:8080/#/c/19445/6/src/kudu/integration-tests/auto_incrementing-itest.cc@238 PS6, Line 238: cluster_->tserver_proxy(0)->ListTablets(req, &resp, &rpc); > Wrap this into ASSERT_OK()? Done http://gerrit.cloudera.org:8080/#/c/19445/6/src/kudu/integration-tests/auto_incrementing-itest.cc@239 PS6, Line 239: DCHECK_EQ(1, resp.status_and_schema_size()); > Why not to use ASSERT_EQ() here instead? Done http://gerrit.cloudera.org:8080/#/c/19445/6/src/kudu/integration-tests/auto_incrementing-itest.cc@240 PS6, Line 240: string tablet_uuid > nit: could be 'const string& tablet_uuid'? Done http://gerrit.cloudera.org:8080/#/c/19445/6/src/kudu/integration-tests/auto_incrementing-itest.cc@245 PS6, Line 245: + > nit: separate '+' from the arguments with spaces Done http://gerrit.cloudera.org:8080/#/c/19445/6/src/kudu/integration-tests/auto_incrementing-itest.cc@267 PS6, Line 267: cluster_->Restart() > Wrap Restart() into ASSERT_OK()? Done http://gerrit.cloudera.org:8080/#/c/19445/6/src/kudu/integration-tests/auto_incrementing-itest.cc@284 PS6, Line 284: TEST_F(AutoIncrementingItest, BootstrapNoWalsNoData) { > Is it possible to separate the majority of the code in this BootstrapNoWals Done http://gerrit.cloudera.org:8080/#/c/19445/6/src/kudu/integration-tests/auto_incrementing-itest.cc@287 PS6, Line 287: opts.extra_tserver_flags.emplace_back("--log_segment_size_bytes_for_tests=1024"); : opts.extra_tserver_flags.emplace_back("--log_max_segments_to_retain=2"); : opts.extra_tserver_flags.emplace_back("--maintenance_manager_polling_interval_ms=10"); : opts.extra_tserver_flags.emplace_back("--maintenance_manager_num_threads=4"); : opts.extra_tserver_flags.emplace_back("--flush_threshold_secs=1"); : opts.extra_tserver_flags.emplace_back("--log_compression_codec=no_compression"); > nit: use initializers list for opts.extra_tserver_flags instead? Done http://gerrit.cloudera.org:8080/#/c/19445/6/src/kudu/integration-tests/auto_incrementing-itest.cc@306 PS6, Line 306: cluster_->tserver_proxy(0)->ListTablets(req, &resp, &rpc); > Wrap this into ASSERT_OK()? Done http://gerrit.cloudera.org:8080/#/c/19445/6/src/kudu/integration-tests/auto_incrementing-itest.cc@307 PS6, Line 307: DCHECK_EQ > Why not to use ASSERT_EQ() instead? Done http://gerrit.cloudera.org:8080/#/c/19445/6/src/kudu/integration-tests/auto_incrementing-itest.cc@334 PS6, Line 334: cluster_->Restart(); > Wrap this into ASSERT_OK()? Done http://gerrit.cloudera.org:8080/#/c/19445/6/src/kudu/integration-tests/auto_incrementing-itest.cc@341 PS6, Line 341: > nit: extra space Done http://gerrit.cloudera.org:8080/#/c/19445/6/src/kudu/integration-tests/auto_incrementing-itest.cc@380 PS6, Line 380: raft > nit: Raft Done http://gerrit.cloudera.org:8080/#/c/19445/6/src/kudu/integration-tests/auto_incrementing-itest.cc@381 PS6, Line 381: InsertData > Wrap into CHECK_OK()? Alternatively, store the status per thread and check This is now a part of the main thread and status will not be ok() as the inserts would fail due to servers being shutdown http://gerrit.cloudera.org:8080/#/c/19445/6/src/kudu/integration-tests/auto_incrementing-itest.cc@381 PS6, Line 381: 5); > Why to sleep between each row, at all? Would be great to comment on the re Done http://gerrit.cloudera.org:8080/#/c/19445/6/src/kudu/integration-tests/auto_incrementing-itest.cc@383 PS6, Line 383: // Ensure at least a couple of rows are attempted to be : // written before shutting down. : SleepFor(MonoDelta::FromMilliseconds(10)); : // Shutdown the tablet servers hosting followers while the data is still being written. : for (int i = 0; i < kNumTabletServers; i++) { : if (i != leader_ts_index) { : cluster_->tablet_server(i)->Shutdown(); : } : } > Would it be easier to run these in their own threads, but run the writer in Done http://gerrit.cloudera.org:8080/#/c/19445/6/src/kudu/integration-tests/auto_incrementing-itest.cc@394 PS6, Line 394: leader replica > nit: the leader replica Done http://gerrit.cloudera.org:8080/#/c/19445/6/src/kudu/integration-tests/auto_incrementing-itest.cc@394 PS6, Line 394: // At this point tablet server hosting leader replica would be having a replicate op without : // a corresponding COMMIT message > Is there a way to ensure that's exactly the case? Maybe, use smallest poss That should be the case, the reason being: The first write arrives at the leader replica and immediately Raft replicated. Raft heartbeat interval is 500ms and we send a write request every 5 ms. By the time the first few writes are sent to the followers and the leader replica is awaiting the ack, there would be a new write written to the WAL locally but not committed. By then we would shutdown the followers. I did try with smaller sized (10 bytes) WALs and the number of the WALs was not consistently more in the leader replica although I could see 1 pending op after startup consistently. http://gerrit.cloudera.org:8080/#/c/19445/6/src/kudu/integration-tests/auto_incrementing-itest.cc@400 PS6, Line 400: CHECK_OK > Why not to use ASSERT_OK() here? Done http://gerrit.cloudera.org:8080/#/c/19445/6/src/kudu/integration-tests/auto_incrementing-itest.cc@405 PS6, Line 405: ASSERT_OK(InsertData(kNumRows, kNumRows * 2)); > Is there a way to check for Incomplete comment? http://gerrit.cloudera.org:8080/#/c/19445/6/src/kudu/integration-tests/auto_incrementing-itest.cc@410 PS6, Line 410: CHECK_OK > Why not to use ASSERT_OK() here? Done http://gerrit.cloudera.org:8080/#/c/19445/6/src/kudu/integration-tests/auto_incrementing-itest.cc@411 PS6, Line 411: leader_ts_index + kNumTabletServers + 1 > Could this part be shorten to leader_ts_index + 1? Yes, thanks for pointing that out. http://gerrit.cloudera.org:8080/#/c/19445/6/src/kudu/integration-tests/auto_incrementing-itest.cc@419 PS6, Line 419: 3 > kNumTabletServers Done http://gerrit.cloudera.org:8080/#/c/19445/6/src/kudu/integration-tests/auto_incrementing-itest.cc@420 PS6, Line 420: 2 > kNumTabletServers - 1 Done http://gerrit.cloudera.org:8080/#/c/19445/6/src/kudu/integration-tests/auto_incrementing-itest.cc@421 PS6, Line 421: ASSERT_EQ(results[i].size(), results[i+1].size()); : for (int k = 0; k < results[i].size(); k++) { : ASSERT_EQ(results[i][k], results[i+1][k]); : } > Does it make sense to verify this across all the tablet replicas, not just Yes, I changed the condition of the loop to be based on the number of tablet servers. http://gerrit.cloudera.org:8080/#/c/19445/6/src/kudu/tablet/tablet.cc File src/kudu/tablet/tablet.cc: http://gerrit.cloudera.org:8080/#/c/19445/6/src/kudu/tablet/tablet.cc@437 PS6, Line 437: if (schema()->has_auto_incrementing()) { > Consider moving this code out into a separate method. Done http://gerrit.cloudera.org:8080/#/c/19445/6/src/kudu/tablet/tablet.cc@442 PS6, Line 442: TODO(achennaka): Materialize only > That's good point! Is it possible to do so in this patch or you are planni It would be in a follow up patch. There doesn't seem to be too much of an overhead in this approach considering the time taken. http://gerrit.cloudera.org:8080/#/c/19445/6/src/kudu/tablet/tablet.cc@466 PS6, Line 466: LOG_WITH_PREFIX(INFO) << "Fetching the auto incrementing counter took " : << delta.ToMilliseconds() << "ms"; > It's better to use Stopwatch(Stopwatch::THIS_THREAD) for this purpose -- i Done -- To view, visit http://gerrit.cloudera.org:8080/19445 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I61b305efd7d5a065a2976327567163956c0c2184 Gerrit-Change-Number: 19445 Gerrit-PatchSet: 6 Gerrit-Owner: Abhishek Chennaka <[email protected]> Gerrit-Reviewer: Abhishek Chennaka <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Marton Greber <[email protected]> Gerrit-Reviewer: Tidy Bot (241) Gerrit-Reviewer: Wenzhe Zhou <[email protected]> Gerrit-Comment-Date: Wed, 29 Mar 2023 07:39:52 +0000 Gerrit-HasComments: Yes
