Zoltan Martonka has posted comments on this change. ( http://gerrit.cloudera.org:8080/22867 )
Change subject: KUDU-3665 Send Send no-op heartbeat operations batched PART1 ...................................................................... Patch Set 5: (9 comments) http://gerrit.cloudera.org:8080/#/c/22867/5//COMMIT_MSG Commit Message: PS5: > I'd expect to see at least a few (unit?) tests for MultiRaftHeartbeatBatche So far, I’ve tested it by setting enable_multi_raft_heartbeat_batcher=1 and running all the unit tests. All tests that run a mini-cluster effectively test this functionality (and running some tests from client-test.cc was enough during development to keep checking if the cluster still works). Some become flaky, but now with discard they seem to be stable again. I think what we need to check is that there is no delay or increase in CPU usage for writes/reads (I launched a cluster for that, and also compared unit tests like ConcurrentWrites in client-test with the flag turned on and off). And tserver failures are detected, leadership elections still work etc. I should definitely add some more tests before removing the “experimental” tag, but first, I wanted to get the code reviewed. Most of the important things are already covered with unit tests; I just need to allow the flag. Also, we should consider running tests involving heartbeats with both buffering enabled and disabled. This would be logical from a correctness standpoint; however, it would significantly increase the testing load. http://gerrit.cloudera.org:8080/#/c/22867/5//COMMIT_MSG@7 PS5, Line 7: Add option to send no-op heartbeat operations batched PART1 > Consider opening a JIRA item at https://issues.apache.org/jira for tracking Done, KUDU-3665 http://gerrit.cloudera.org:8080/#/c/22867/5//COMMIT_MSG@10 PS5, Line 10: CPU and networking > maybe "CPU and network resources"? Done http://gerrit.cloudera.org:8080/#/c/22867/5/src/kudu/client/client-test.cc File src/kudu/client/client-test.cc: http://gerrit.cloudera.org:8080/#/c/22867/5/src/kudu/client/client-test.cc@10364 PS5, Line 10364: sort(row.begin(), row.end()); > Why do we need sort here? If it's really necessary, please add a comment t This change is no longer needed. I will revert it. Theoretically, there is no guarantee that the CLOSEST_REPLICA contains everything as soon as the LEADER committed the changes (since only a 2/3 majority of tservers has to finish). Also, there is no guarantee that we receive the data in the same order, and in theory, we should sort the two vectors before comparison if we only want to check whether they contain the same elements (but not necessarily in the same order). Of course, if there is no significant delay in the system, then this should almost never break on localhost. http://gerrit.cloudera.org:8080/#/c/22867/5/src/kudu/consensus/consensus.proto File src/kudu/consensus/consensus.proto: http://gerrit.cloudera.org:8080/#/c/22867/5/src/kudu/consensus/consensus.proto@419 PS5, Line 419: tablet_id > Is 'tablet_id' really shared between messages? no http://gerrit.cloudera.org:8080/#/c/22867/5/src/kudu/consensus/consensus.proto@438 PS5, Line 438: ConsensusRequestPB > Is this a typo? Yes http://gerrit.cloudera.org:8080/#/c/22867/5/src/kudu/consensus/consensus_peers.h File src/kudu/consensus/consensus_peers.h: http://gerrit.cloudera.org:8080/#/c/22867/5/src/kudu/consensus/consensus_peers.h@232 PS5, Line 232: NO_ACTIVE, : PENDING_NON_BUFFERED, : PENDING_BUFFERED_PREPARE, : PENDING_BUFFERED_POSSIBLY_IN_BUFFERED, : PENDING_BUFFERED_REQUEST_TO_FLUSH, : PENDING_BUFFERED_FLUSHED > Please add short description for each of the enumerators. Done http://gerrit.cloudera.org:8080/#/c/22867/5/src/kudu/consensus/consensus_peers.h@240 PS5, Line 240: std::uint64_t > nit: is it possible to have just 'uint64_t'? yes http://gerrit.cloudera.org:8080/#/c/22867/5/src/kudu/tserver/tablet_service.cc File src/kudu/tserver/tablet_service.cc: http://gerrit.cloudera.org:8080/#/c/22867/5/src/kudu/tserver/tablet_service.cc@1843 PS5, Line 1843: mzzzzz > nit for here and elsewhere: consider using TODO(github_name) instead for t Done -- To view, visit http://gerrit.cloudera.org:8080/22867 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie92ba4de5eae00d56cd513cb644dce8fb6e14538 Gerrit-Change-Number: 22867 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Martonka <[email protected]> Gerrit-Reviewer: Abhishek Chennaka <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Zoltan Chovan <[email protected]> Gerrit-Reviewer: Zoltan Martonka <[email protected]> Gerrit-Comment-Date: Wed, 04 Jun 2025 13:45:19 +0000 Gerrit-HasComments: Yes
