Zoltan Martonka has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/22867 )

Change subject: KUDU-3665 Send Send no-op heartbeat operations batched PART1
......................................................................


Patch Set 5:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/22867/5//COMMIT_MSG
Commit Message:

PS5:
> I'd expect to see at least a few (unit?) tests for MultiRaftHeartbeatBatche
So far, I’ve tested it by setting enable_multi_raft_heartbeat_batcher=1 and 
running all the unit tests. All tests that run a mini-cluster effectively test 
this functionality (and running some tests from client-test.cc was enough 
during development to keep checking if the cluster still works). Some become 
flaky, but now with discard they seem to be stable again.

I think what we need to check is that there is no delay or increase in CPU 
usage for writes/reads (I launched a cluster for that, and also compared unit 
tests like ConcurrentWrites in client-test with the flag turned on and off). 
And tserver failures are detected, leadership elections still work etc.

I should definitely add some more tests before removing the “experimental” tag, 
but first, I wanted to get the code reviewed. Most of the important things are 
already covered with unit tests; I just need to allow the flag.

Also, we should consider running  tests involving heartbeats with both 
buffering enabled and disabled. This would be logical from a correctness 
standpoint; however, it would significantly increase the testing load.


http://gerrit.cloudera.org:8080/#/c/22867/5//COMMIT_MSG@7
PS5, Line 7: Add option to send no-op heartbeat operations batched PART1
> Consider opening a JIRA item at https://issues.apache.org/jira for tracking
Done, KUDU-3665


http://gerrit.cloudera.org:8080/#/c/22867/5//COMMIT_MSG@10
PS5, Line 10: CPU and networking
> maybe "CPU and network resources"?
Done


http://gerrit.cloudera.org:8080/#/c/22867/5/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

http://gerrit.cloudera.org:8080/#/c/22867/5/src/kudu/client/client-test.cc@10364
PS5, Line 10364: sort(row.begin(), row.end());
> Why do we need sort here?  If it's really necessary, please add a comment t
This change is no longer needed. I will revert it.
Theoretically, there is no guarantee that the CLOSEST_REPLICA contains 
everything as soon as the LEADER committed the changes (since only a 2/3 
majority of tservers has to finish).
Also, there is no guarantee that we receive the data in the same order, and in 
theory, we should sort the two vectors before comparison if we only want to 
check whether they contain the same elements (but not necessarily in the same 
order).
Of course, if there is no significant delay in the system, then this should 
almost never break on localhost.


http://gerrit.cloudera.org:8080/#/c/22867/5/src/kudu/consensus/consensus.proto
File src/kudu/consensus/consensus.proto:

http://gerrit.cloudera.org:8080/#/c/22867/5/src/kudu/consensus/consensus.proto@419
PS5, Line 419: tablet_id
> Is 'tablet_id' really shared between messages?
no


http://gerrit.cloudera.org:8080/#/c/22867/5/src/kudu/consensus/consensus.proto@438
PS5, Line 438: ConsensusRequestPB
> Is this a typo?
Yes


http://gerrit.cloudera.org:8080/#/c/22867/5/src/kudu/consensus/consensus_peers.h
File src/kudu/consensus/consensus_peers.h:

http://gerrit.cloudera.org:8080/#/c/22867/5/src/kudu/consensus/consensus_peers.h@232
PS5, Line 232:     NO_ACTIVE,
             :     PENDING_NON_BUFFERED,
             :     PENDING_BUFFERED_PREPARE,
             :     PENDING_BUFFERED_POSSIBLY_IN_BUFFERED,
             :     PENDING_BUFFERED_REQUEST_TO_FLUSH,
             :     PENDING_BUFFERED_FLUSHED
> Please add short description for each of the enumerators.
Done


http://gerrit.cloudera.org:8080/#/c/22867/5/src/kudu/consensus/consensus_peers.h@240
PS5, Line 240: std::uint64_t
> nit: is it possible to have just 'uint64_t'?
yes


http://gerrit.cloudera.org:8080/#/c/22867/5/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

http://gerrit.cloudera.org:8080/#/c/22867/5/src/kudu/tserver/tablet_service.cc@1843
PS5, Line 1843: mzzzzz
> nit for here and elsewhere: consider using TODO(github_name)  instead for t
Done



--
To view, visit http://gerrit.cloudera.org:8080/22867
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie92ba4de5eae00d56cd513cb644dce8fb6e14538
Gerrit-Change-Number: 22867
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Martonka <[email protected]>
Gerrit-Reviewer: Abhishek Chennaka <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Zoltan Chovan <[email protected]>
Gerrit-Reviewer: Zoltan Martonka <[email protected]>
Gerrit-Comment-Date: Wed, 04 Jun 2025 13:45:19 +0000
Gerrit-HasComments: Yes

Reply via email to