Hi all,

I am fairly new to Kafka development, and I am trying to limit the amount of traffic that a client can produce to a Kafka cluster.
Using quotas on client-id, I was able to limit the traffic produced by a client to a value very close to the quota. Now I also want to make sure that the replicated traffic follows the same limit.

For example, I have 2 clients (client-1 and client-2) and 3 machines (mac-1, mac-2 and mac-3). For both clients, the master (leader) replica is on mac-1, with one slave (follower) replica each on mac-2 and mac-3 (replication factor = 3). I have set the quota of client-1 to 50 MBps and that of client-2 to 40 MBps.

The client quotas ensured that data is written to the master at the specified rates (approximately 50 and 40 MBps respectively). I tested this by running the producers of both clients simultaneously with bin/kafka-producer-perf-test.sh and acks set to 1 (acknowledge once the master has received the data). But when I ran the same script with acks set to -1 (acknowledge only after all in-sync replicas have the data), the throughput of both clients dropped to 30 MBps.

My first question: shouldn't Kafka still allow the producers to reach 50 and 40 MBps respectively, even in the presence of replicas?

I then thought that maybe this happens because the replica traffic is not subject to the same quota. I have been exploring the Kafka source code and found that there is already something called a replication quota, but I believe it throttles only during partition movement. I have looked at ReplicationQuotaManager.scala and ClientQuotaManager.scala, but I am not sure how to implement this (throttling replica traffic based on client-id).

Does anyone have any ideas or suggestions on how this could be done in the existing Kafka code?

Thanks,
Archie