[ 
https://issues.apache.org/jira/browse/KAFKA-10888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17286039#comment-17286039
 ] 

Jun Rao commented on KAFKA-10888:
---------------------------------

[~hachikuji]: Thanks for the analysis. One thing that leads to this behavior is 
that the sticky partitioner distributes data to different partitions in 
batches and, as you pointed out, the batches are sometimes not equal in size. 
So, one way of fixing this is to have the sticky partitioner distribute the 
data in units that are always equal (e.g., a fixed number of bytes based on the 
batch size).
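
A minimal sketch of that idea, not the Kafka producer's actual code: the partitioner stays on one partition until a fixed byte budget has been written to it, and only then moves on. The class name, the bytes_per_switch parameter, and the round-robin choice of the next partition are all assumptions made for illustration. The round-robin step only keeps the example deterministic; a real client could just as well pick the next partition at random.

```python
class FixedBytesStickyPartitioner:
    """Hypothetical sketch: stick to a partition until a byte budget is spent."""

    def __init__(self, num_partitions, bytes_per_switch):
        self.num_partitions = num_partitions
        self.bytes_per_switch = bytes_per_switch  # assumed to be derived from the batch size
        self._current = 0
        self._written = 0  # bytes sent to the current partition since the last switch

    def partition(self, record_size):
        # Switch when the byte budget is used up, not when a batch happens
        # to be sent, so every "turn" on a partition carries about the same
        # amount of data.
        if self._written >= self.bytes_per_switch:
            # Illustrative choice: cycle through partitions in order.
            self._current = (self._current + 1) % self.num_partitions
            self._written = 0
        self._written += record_size
        return self._current


if __name__ == "__main__":
    partitioner = FixedBytesStickyPartitioner(num_partitions=4, bytes_per_switch=1000)
    totals = [0] * 4
    for _ in range(400):
        totals[partitioner.partition(100)] += 100
    print(totals)
```

With variable record sizes, each turn can exceed the budget by at most one record, so the units are only approximately equal.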

>  Sticky partitioner leads to uneven message production, resulting in abnormal 
> delays in some partitions
> ----------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-10888
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10888
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients, producer 
>    Affects Versions: 2.4.1
>            Reporter: jr
>            Priority: Major
>         Attachments: image-2020-12-24-21-05-02-800.png, 
> image-2020-12-24-21-09-47-692.png, image-2020-12-24-21-10-24-407.png
>
>
>   110 producers, 550 partitions, 550 consumers, and a 5-node Kafka cluster.
>   The producers use null keys with the sticky partitioner; the total 
> production rate is about 1 million TPS (100w).
> The observed partition delay is abnormal and the message distribution is 
> uneven, so the partitions that receive more messages show abnormally high 
> maximum production and consumption delays.
>   I cannot find a reason why the sticky partitioner would make the message 
> distribution uneven at this production rate.
>   I can't switch to the round-robin partitioner, because that would increase 
> the delay and CPU cost. Does the sticky partitioner's design cause uneven 
> message distribution, or is this abnormal? How can it be solved?
>   !image-2020-12-24-21-09-47-692.png!
> As shown in the picture, the uneven distribution is concentrated on some 
> partitions and some brokers, and there seems to be a pattern.
> This problem does not occur in only one cluster, but in many high-TPS 
> clusters.
> The problem is more obvious on the test cluster we built.
> !image-2020-12-24-21-10-24-407.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
