[ https://issues.apache.org/jira/browse/KAFKA-20663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bill Bejeck resolved KAFKA-20663.
---------------------------------
Resolution: Fixed
> KIP-1035: stale persisted changelog offset causes
> OffsetOutOfRangeException/TaskCorruptedException on restart
> -------------------------------------------------------------------------------------------------------------
>
> Key: KAFKA-20663
> URL: https://issues.apache.org/jira/browse/KAFKA-20663
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Affects Versions: 4.3.0
> Reporter: Bill Bejeck
> Assignee: Bill Bejeck
> Priority: Blocker
> Fix For: 4.3.1, 4.4.0
>
>
> In 4.3, KIP-1035 moved the changelog offset into RocksDB and removed the
> forced flush on commit, so the persisted offset is now made durable only by
> an organic memtable flush or a clean close. The offset can go stale in two
> ways: after an unclean exit, or after a clean shutdown followed by changelog
> truncation/compaction while the instance is down. If the changelog
> log-start offset has then advanced past the stale offset, the restore
> consumer seeks out of range and throws OffsetOutOfRangeException, which
> Streams converts to a TaskCorruptedException (full local-state wipe and
> rebuild). This happens far more often than in 4.2, where the forced flush
> kept the offset within roughly commit.interval.ms. It affects both
> at-least-once and exactly-once processing and hits windowed/segmented
> stores hardest.
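
The failure condition in the description can be sketched as a plain comparison
between the offset read back from the local store and the changelog's current
log-start offset. This is a minimal illustration only, not Kafka Streams code:
the class, enum, and method names below are hypothetical, and the two offsets
are passed in as plain longs rather than read from RocksDB or a broker.

```java
// Hypothetical model of the condition described in KAFKA-20663.
// None of these names exist in Kafka Streams; they are illustrative only.
class StaleOffsetCheck {

    /** What a restore would do with the offset read back from the local store. */
    enum RestoreOutcome { RESUME_FROM_PERSISTED, WIPE_AND_REBUILD }

    /**
     * True when the persisted offset is below the changelog's current
     * log-start offset, i.e. a seek to it can no longer be served.
     */
    static boolean isBelowLogStart(long persistedOffset, long logStartOffset) {
        return persistedOffset < logStartOffset;
    }

    /** Maps the comparison onto the two outcomes the report describes. */
    static RestoreOutcome planRestore(long persistedOffset, long logStartOffset) {
        return isBelowLogStart(persistedOffset, logStartOffset)
                ? RestoreOutcome.WIPE_AND_REBUILD
                : RestoreOutcome.RESUME_FROM_PERSISTED;
    }

    public static void main(String[] args) {
        // Offset persisted at 1_000; log start advanced to 5_000 while the instance was down.
        System.out.println(planRestore(1_000L, 5_000L)); // prints WIPE_AND_REBUILD
        // Persisted offset still inside the retained range.
        System.out.println(planRestore(6_000L, 5_000L)); // prints RESUME_FROM_PERSISTED
    }
}
```

The older the persisted offset is relative to the last processed position, the
more likely the first branch becomes, which matches the report's point that
removing the forced flush makes the wipe-and-rebuild path more frequent.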