iamaleksey commented on code in PR #4106: URL: https://github.com/apache/cassandra/pull/4106#discussion_r2070147166
##########
src/java/org/apache/cassandra/replication/Shard.java:
##########
@@ -94,6 +109,42 @@ void addSummaryForRange(AbstractBounds<PartitionPosition> range, boolean include
         });
     }
+    List<InetAddressAndPort> remoteReplicas()
+    {
+        List<InetAddressAndPort> replicas = new ArrayList<>(participants.size() - 1);
+        for (int i = 0, size = participants.size(); i < size; ++i)
+        {
+            int hostId = participants.get(i);
+            if (hostId != localHostId)
+                replicas.add(ClusterMetadata.current().directory.endpoint(new NodeId(hostId)));
+        }
+        return replicas;
+    }
+
+    /**
+     * Collects replicated offsets for the logs owned by this coordinator on this shard.
+     */
+    ShardReplicatedOffsets collectReplicatedOffsets()
+    {
+        Long2ObjectHashMap<LogReplicatedOffsets> offsets = new Long2ObjectHashMap<>();
+        for (CoordinatorLogPrimary log : primaryLogs())

Review Comment:
It's not about the broadcast payload size in isolation, which I agree is ultimately not a serious issue. There is also the work you need to do with that message when it arrives. Multiply that by the frequency of broadcasts, and - possibly - by RF, and you get the final cost. There is a maximum cost that we are willing to pay here, and the main variable - client write frequency being mostly outside of our control - is the frequency of broadcasts. If only the coordinator broadcasts its logs' states, then you can have a higher frequency of broadcasts. If every replica does, then you have to scale down the maximum broadcast frequency by a factor of RF. And we want the broadcasts to be *frequent* to make reads as cheap as possible. Every avoidable delay in propagation potentially costs us blocking on reconciles that don't really need to be done, and/or triggering SRP that could be avoided by a broadcast arriving earlier.
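To make the scaling argument concrete, here is a rough back-of-the-envelope sketch. It is not code from this PR: the class `BroadcastBudget`, its methods, and the budget figure are hypothetical, and it assumes each broadcaster sends one message per round to each of its RF - 1 peers.

```java
// Hypothetical cost model, not part of the PR: all names and numbers are illustrative.
final class BroadcastBudget
{
    /** Messages handled per broadcast round for one log: each broadcaster sends to its (rf - 1) peers. */
    static long messagesPerRound(int rf, boolean everyReplicaBroadcasts)
    {
        int broadcasters = everyReplicaBroadcasts ? rf : 1;
        return (long) broadcasters * (rf - 1);
    }

    /** Highest number of rounds per second that keeps message handling within the given budget. */
    static double maxRoundsPerSecond(double budgetMessagesPerSecond, int rf, boolean everyReplicaBroadcasts)
    {
        return budgetMessagesPerSecond / messagesPerRound(rf, everyReplicaBroadcasts);
    }

    public static void main(String[] args)
    {
        int rf = 3;
        double budget = 600.0; // assumed budget: messages per second we are willing to handle

        System.out.println(maxRoundsPerSecond(budget, rf, false)); // coordinator only: 300.0
        System.out.println(maxRoundsPerSecond(budget, rf, true));  // every replica:    100.0
    }
}
```

Under these assumptions, letting every replica broadcast cuts the affordable broadcast frequency by exactly RF for the same budget.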
Additionally, the broadcasts from non-coordinator nodes will always be almost entirely redundant subsets of the coordinator's broadcasts - the coordinator will always have the freshest and fullest picture, barring some in-flight write responses.