[
https://issues.apache.org/jira/browse/KAFKA-14171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Justinwins updated KAFKA-14171:
-------------------------------
Description:
- This seems like a small bug (or improvement), but it really impacts the
performance of MM2.
- When DistributedHerder starts, it calls startServices() -->
this.worker.start() --> offsetBackingStore.start() --> offsetLog.start(), and
finally, in the `KafkaBasedLog` class, we reach
`consumer.seekToBeginning(partitions)`. See
`org.apache.kafka.connect.util.KafkaBasedLog#start` for details.
- Basically, the mm2-offsets topic is kept for 7 days (as defined by
'retention.ms'). If there are many partitions for MM2 to replicate, the
mm2-offsets topic can grow quite large in 7 days, and it may take a few
minutes or more to poll until the consumer reaches the latest offset. This is
a VERY CPU-intensive operation, and it causes CPU throttling in the k8s
container.
- I think the mm2-offsets topic, or more specifically any topic backing a
KafkaBasedLog, is a special topic. At the very least, we can set a much
shorter TTL for it to avoid this bug.
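
To illustrate the startup cost described above, here is a minimal toy model, not Kafka code. It assumes a hypothetical workload in which each of 10 partition keys commits an offset once per minute for 7 days. The helper names (`retained`, `replay_from_beginning`, `build_log`) are made up for this sketch. It models only time-based deletion and ignores log compaction.

```python
from typing import Dict, List, Tuple

Record = Tuple[int, str, int]  # (timestamp_ms, key, value)

HOUR_MS = 60 * 60 * 1000
DAY_MS = 24 * HOUR_MS


def retained(log: List[Record], now_ms: int, retention_ms: int) -> List[Record]:
    """Keep only records no older than retention_ms (simplified time-based deletion)."""
    return [r for r in log if now_ms - r[0] <= retention_ms]


def replay_from_beginning(log: List[Record]) -> Tuple[Dict[str, int], int]:
    """Read every retained record in order; later values overwrite earlier ones."""
    state: Dict[str, int] = {}
    records_read = 0
    for _, key, value in log:
        state[key] = value
        records_read += 1
    return state, records_read


def build_log(partitions: int, commit_interval_ms: int, span_ms: int) -> List[Record]:
    """Hypothetical workload: every partition key is committed once per interval."""
    log: List[Record] = []
    for ts in range(0, span_ms, commit_interval_ms):
        for p in range(partitions):
            log.append((ts, f"topic-{p}", ts))
    return log


now = 7 * DAY_MS
log = build_log(partitions=10, commit_interval_ms=60_000, span_ms=now)

# Replay cost at startup under the two retention settings.
state_7d, read_7d = replay_from_beginning(retained(log, now, 7 * DAY_MS))
state_1h, read_1h = replay_from_beginning(retained(log, now, HOUR_MS))
print(read_7d, read_1h)
```

In this model both replays end with the same final state, because every key is rewritten within the last hour. A key that was not rewritten inside the retention window would be missing after the replay, which is worth weighing when choosing a shorter default.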
> mm2-offsets topic should be set retention.ms=1h or less as default
> ------------------------------------------------------------------
>
> Key: KAFKA-14171
> URL: https://issues.apache.org/jira/browse/KAFKA-14171
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 3.2.1
> Reporter: Justinwins
> Priority: Major
>
> - This seems like a small bug (or improvement), but it really impacts the
> performance of MM2.
> - When DistributedHerder starts, it calls startServices() -->
> this.worker.start() --> offsetBackingStore.start() --> offsetLog.start(),
> and finally, in the `KafkaBasedLog` class, we reach
> `consumer.seekToBeginning(partitions)`. See
> `org.apache.kafka.connect.util.KafkaBasedLog#start` for details.
> - Basically, the mm2-offsets topic is kept for 7 days (as defined by
> 'retention.ms'). If there are many partitions for MM2 to replicate, the
> mm2-offsets topic can grow quite large in 7 days, and it may take a few
> minutes or more to poll until the consumer reaches the latest offset. This
> is a VERY CPU-intensive operation, and it causes CPU throttling in the k8s
> container.
> - I think the mm2-offsets topic, or more specifically any topic backing a
> KafkaBasedLog, is a special topic. At the very least, we can set a much
> shorter TTL for it to avoid this bug.
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)