Haruki Okada created KAFKA-16541:
------------------------------------

             Summary: Potential leader epoch checkpoint file corruption on OS crash
                 Key: KAFKA-16541
                 URL: https://issues.apache.org/jira/browse/KAFKA-16541
             Project: Kafka
          Issue Type: Bug
          Components: core
            Reporter: Haruki Okada
            Assignee: Haruki Okada

Pointed out by [~junrao] on [GitHub|https://github.com/apache/kafka/pull/14242#discussion_r1556161125].

[A patch for KAFKA-15046|https://github.com/apache/kafka/pull/14242] removed the fsync of the leader-epoch checkpoint file in some code paths for performance reasons.

However, since the checkpoint file is now flushed to the device asynchronously by the OS, its content could be corrupted if the OS crashes suddenly (e.g. due to a power failure or kernel panic) in the middle of a flush.

A corrupted checkpoint file could prevent the Kafka broker from starting up.
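
For illustration only, here is a minimal Java sketch of the general crash-safe pattern for rewriting a small file: write to a temporary file, fsync it, then atomically rename it over the target. This is not Kafka's actual checkpoint code and not a proposed fix. The class name `CheckpointSketch`, the method `writeDurably`, and the `.tmp` suffix are all hypothetical.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;
import java.util.List;

// Hypothetical helper, not part of Kafka.
final class CheckpointSketch {

    // Replaces `target` with `lines` so that a crash leaves either the old
    // content or the new content on disk, never a partially written file.
    static void writeDurably(Path target, List<String> lines) throws IOException {
        // Assumption: a sibling temp file named "<target>.tmp" is free to use.
        Path tmp = target.resolveSibling(target.getFileName() + ".tmp");
        byte[] bytes = (String.join("\n", lines) + "\n").getBytes(StandardCharsets.UTF_8);

        try (FileChannel ch = FileChannel.open(tmp,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buf = ByteBuffer.wrap(bytes);
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            // fsync: force data and metadata to the device before the rename.
            // Skipping this step is what leaves the flush to the OS.
            ch.force(true);
        }

        // Atomic rename. Whether an existing target is replaced is
        // implementation specific; on POSIX file systems it is.
        Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);

        // Best-effort fsync of the parent directory so the rename itself is
        // durable. Opening a directory this way fails on some platforms
        // (e.g. Windows), so the failure is ignored in this sketch.
        Path parent = target.toAbsolutePath().getParent();
        try (FileChannel dir = FileChannel.open(parent, StandardOpenOption.READ)) {
            dir.force(true);
        } catch (IOException ignored) {
            // Directory fsync is not supported here.
        }
    }
}
```

The cost of this pattern is the synchronous `force` call on every write, which is the kind of overhead that dropping the fsync avoids. The trade-off is the corruption risk on an OS crash described above.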