https://issues.apache.org/jira/browse/KAFKA-16283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17819363#comment-17819363

Artem Livshits commented on KAFKA-16283:
----------------------------------------

The issue was introduced by the first attempt at a sticky partitioner, which 
added onNewBatch and relies heavily on it.  The old sticky partitioner and 
onNewBatch have since been deprecated for a while in favor of the new strictly 
uniform sticky partitioner 
(https://cwiki.apache.org/confluence/display/KAFKA/KIP-794%3A+Strictly+Uniform+Sticky+Partitioner).
  We should be able to remove the old sticky partitioner and the onNewBatch 
logic in 4.0, and that will fix the RoundRobin partitioner. 
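The skipping behavior can be illustrated with a small standalone simulation. 
This is not Kafka code: the counter mirrors the AtomicInteger that 
RoundRobinPartitioner advances on every partition() call, and the second call 
per record models the producer invoking partition() again after 
abortOnNewBatch/onNewBatch aborts the first batch (the assumed scenario is 
that every record starts a new batch, e.g. with linger.ms=0 and small batches):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class RoundRobinSkipDemo {
    static final AtomicInteger counter = new AtomicInteger(0);

    // Mirrors the round-robin idea: advance a counter on EVERY call.
    static int partition(int numPartitions) {
        return Math.abs(counter.getAndIncrement() % numPartitions);
    }

    public static void main(String[] args) {
        int numPartitions = 2;
        int[] recordsPerPartition = new int[numPartitions];
        for (int record = 0; record < 1000; record++) {
            int p = partition(numPartitions); // first call picks a partition
            // Hypothetical worst case: the record opens a new batch, the
            // producer aborts it (abortOnNewBatch) and calls partition()
            // again, silently skipping the partition chosen above.
            p = partition(numPartitions);
            recordsPerPartition[p]++;
        }
        // With 2 partitions and 2 counter increments per record, every record
        // lands on the same partition and the other one stays empty.
        System.out.println(recordsPerPartition[0] + " " + recordsPerPartition[1]);
    }
}
```

With more partitions the same double increment sends records only to every 
other partition, which matches the "half of the partitions" symptom reported 
below.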

> RoundRobinPartitioner will only send to half of the partitions in a topic
> -------------------------------------------------------------------------
>
>                 Key: KAFKA-16283
>                 URL: https://issues.apache.org/jira/browse/KAFKA-16283
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 3.1.0, 3.0.0, 3.6.1
>            Reporter: Luke Chen
>            Priority: Major
>
> When using `org.apache.kafka.clients.producer.RoundRobinPartitioner`, we 
> expect data to be sent to all partitions in a round-robin manner. But we 
> found that only half of the partitions receive data. This means half of the 
> resources (storage, consumers, ...) are wasted.
> {code:java}
> > bin/kafka-topics.sh --create --topic quickstart-events4 --bootstrap-server 
> > localhost:9092 --partitions 2 
> Created topic quickstart-events4.
> # send 1000 records to the topic, expecting 500 records in partition0, and 
> 500 records in partition1
> > bin/kafka-producer-perf-test.sh --topic quickstart-events4 --num-records 
> > 1000 --record-size 1024 --throughput -1 --producer-props 
> > bootstrap.servers=localhost:9092 
> > partitioner.class=org.apache.kafka.clients.producer.RoundRobinPartitioner
> 1000 records sent, 6535.947712 records/sec (6.38 MB/sec), 2.88 ms avg 
> latency, 121.00 ms max latency, 2 ms 50th, 7 ms 95th, 10 ms 99th, 121 ms 
> 99.9th.
> > ls -al /tmp/kafka-logs/quickstart-events4-1
> total 24
> drwxr-xr-x   7 lukchen  wheel       224  2 20 19:53 .
> drwxr-xr-x  70 lukchen  wheel      2240  2 20 19:53 ..
> -rw-r--r--   1 lukchen  wheel  10485760  2 20 19:53 00000000000000000000.index
> -rw-r--r--   1 lukchen  wheel   1037819  2 20 19:53 00000000000000000000.log
> -rw-r--r--   1 lukchen  wheel  10485756  2 20 19:53 00000000000000000000.timeindex
> -rw-r--r--   1 lukchen  wheel         8  2 20 19:53 leader-epoch-checkpoint
> -rw-r--r--   1 lukchen  wheel        43  2 20 19:53 partition.metadata
> # No records in partition 0
> > ls -al /tmp/kafka-logs/quickstart-events4-0
> total 8
> drwxr-xr-x   7 lukchen  wheel       224  2 20 19:53 .
> drwxr-xr-x  70 lukchen  wheel      2240  2 20 19:53 ..
> -rw-r--r--   1 lukchen  wheel  10485760  2 20 19:53 00000000000000000000.index
> -rw-r--r--   1 lukchen  wheel         0  2 20 19:53 00000000000000000000.log
> -rw-r--r--   1 lukchen  wheel  10485756  2 20 19:53 00000000000000000000.timeindex
> -rw-r--r--   1 lukchen  wheel         0  2 20 19:53 leader-epoch-checkpoint
> -rw-r--r--   1 lukchen  wheel        43  2 20 19:53 partition.metadata
> {code}
> Tested in Kafka 3.0.0, 3.2.3, and the latest trunk; they all have the same 
> issue, so it must have existed for a long time.
>  
> Had a quick look: it is because we call abortOnNewBatch each time a new 
> batch is created.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
