weiguang liu created KAFKA-15120:
------------------------------------
Summary: Optimizing Recovery Time for Non-Transactional and
Idempotent Partitions in Kafka
Key: KAFKA-15120
URL: https://issues.apache.org/jira/browse/KAFKA-15120
Project: Kafka
Issue Type: Improvement
Components: core
Reporter: weiguang liu
Kafka's recovery logic involves rebuilding the indexes and producer state from the
log segment after the recovery point. In scenarios where a broker has a large
number of partitions, the recovery time can become very long. For example, when
a broker has 1,000 partitions and the average log segment size is 1GB, the
broker may need to read as much as 500GB of log data during recovery, which can
be unacceptably slow. Most of the partitions might not be using transactions or
idempotency, so can we consider using a recovery method that starts from the
recovery point for those partitions that do not use transactions and
idempotency, instead of starting the recovery from the beginning of the entire
log segment? My understanding is that for partitions that are neither
transactional nor idempotent, the index is append-only and can be completely
recovered from the recovery point, rather than from the start offset of the log segment. I am not
sure what the potential risks of this approach might be or why the community
did not consider it.
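
To make the arithmetic above concrete, here is a rough back-of-the-envelope sketch. It is not Kafka code: the function name, the assumption that each partition's unflushed segment is on average half of a 1GB segment, and the 16MB of data written after the recovery point are all illustrative guesses, chosen only to show how the 500GB figure could arise and how much less would be read if recovery started at the recovery point.

```python
GB = 1024 ** 3
MB = 1024 ** 2


def bytes_to_scan(segment_bytes, recovery_point_pos, from_recovery_point):
    """Bytes read for one unflushed segment (hypothetical model, not Kafka code)."""
    if from_recovery_point:
        # Proposed behaviour: only read what was written after the recovery point.
        return segment_bytes - recovery_point_pos
    # Behaviour described in the issue: read the segment from its start.
    return segment_bytes


partitions = 1000
segment_bytes = GB // 2            # assumption: 1GB segments, half full on average
unflushed_tail = 16 * MB           # assumption: data written after the recovery point
recovery_point_pos = segment_bytes - unflushed_tail

full_scan = partitions * bytes_to_scan(segment_bytes, recovery_point_pos, False)
tail_scan = partitions * bytes_to_scan(segment_bytes, recovery_point_pos, True)

print(f"scan from segment start:  {full_scan / GB:.1f} GB")
print(f"scan from recovery point: {tail_scan / GB:.1f} GB")
```

Under these assumed numbers the difference is more than an order of magnitude, though the real saving depends on how far the recovery point lags the log end on each partition.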