Hi,

Looking at the docs, I see that Kafka supports throttling of consumer and replication traffic, but I can't find anything suggesting that one traffic type can be prioritized over another.

The problem: when consumers start lagging, they consume messages as fast as they can to catch up. If there is enough lag to recover from and enough consumers are affected, that traffic can easily saturate the network on the leader, which in turn can affect the replicas of the partitions on that leader.

Is there a way to keep replicas from falling out of sync with the leader in such a scenario? Is there a way to prefer replication traffic over consumer traffic, or is throttling the only way to achieve this? Throttling solves the problem, but it requires setting limits that may need to change over time, so it's a bit more manual and needs more maintenance than a priority-based solution.
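To illustrate the maintenance concern with static limits, here is a rough sketch in Python. The function name, the link speed, the replication reserve, and the consumer counts are all made-up numbers for illustration, not taken from a real cluster or from any Kafka API. It only shows that a per-consumer byte-rate limit derived this way has to be recomputed whenever the number of consumers (or the replication volume) changes:

```python
def per_consumer_limit(link_bytes_per_sec: float,
                       replication_reserve_bytes_per_sec: float,
                       consumers: int) -> float:
    """Hypothetical helper: bytes/sec each consumer may fetch from one
    broker so that replication traffic keeps its reserved bandwidth."""
    if consumers <= 0:
        raise ValueError("consumers must be positive")
    available = link_bytes_per_sec - replication_reserve_bytes_per_sec
    if available <= 0:
        raise ValueError("replication reserve leaves no bandwidth for consumers")
    # Split what is left of the link evenly across all consumers.
    return available / consumers


# Assumed numbers: a 10 Gbit/s link (1.25e9 bytes/s) on the leader,
# with 250 MB/s set aside for replication.
LINK = 1_250_000_000
RESERVE = 250_000_000

limit_20 = per_consumer_limit(LINK, RESERVE, 20)  # limit with 20 consumers
limit_40 = per_consumer_limit(LINK, RESERVE, 40)  # limit after doubling consumers

print(f"20 consumers: {limit_20:,.0f} bytes/s each")
print(f"40 consumers: {limit_40:,.0f} bytes/s each")
```

With these assumed numbers, doubling the number of consumers halves the limit each one can be given, so a throttle value that was safe yesterday may no longer protect replication today. A priority-based scheme would not need this recalculation.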
-- Łukasz Mierzwa