[ 
https://issues.apache.org/jira/browse/KAFKA-20663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Bejeck resolved KAFKA-20663.
---------------------------------
    Resolution: Fixed

> KIP-1035: stale persisted changelog offset causes 
> OffsetOutOfRangeException/TaskCorruptedException on restart
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-20663
>                 URL: https://issues.apache.org/jira/browse/KAFKA-20663
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 4.3.0
>            Reporter: Bill Bejeck
>            Assignee: Bill Bejeck
>            Priority: Blocker
>             Fix For: 4.3.1, 4.4.0
>
>
> In 4.3, KIP-1035 moved the changelog offset into RocksDB and removed the 
> forced flush on commit, so the persisted offset is now made durable only by 
> an organic memtable flush or a clean close. That offset can go stale after 
> an unclean exit, or after a clean shutdown followed by changelog 
> truncation/compaction while the instance is down. If the changelog 
> log-start offset has advanced past the stale offset, the restore consumer 
> seeks out of range and throws OffsetOutOfRangeException, which Streams 
> converts to a TaskCorruptedException (full local-state wipe and rebuild). 
> This happens far more often than in 4.2, where the forced flush kept the 
> offset within roughly commit.interval.ms. It affects both at-least-once and 
> exactly-once processing and hits windowed/segmented stores hardest.
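
The failure condition described above can be sketched as a plain range check.
This is only an illustration, not the Streams restore code or the fix that was
merged: the class StaleOffsetCheck, the enum RestoreAction, and the offsets in
main are all hypothetical. In a real client the two bounds would presumably
come from something like Consumer#beginningOffsets and Consumer#endOffsets for
the changelog partition.

```java
/** Hypothetical helper for illustration only; not part of Kafka Streams. */
final class StaleOffsetCheck {

    /** What a restore would have to do for a given persisted offset. */
    enum RestoreAction { RESUME_FROM_PERSISTED, WIPE_AND_REBUILD }

    /**
     * A persisted changelog offset is only usable if it still lies inside the
     * changelog partition's current range [logStartOffset, logEndOffset].
     * Seeking outside that range is the situation that surfaces as
     * OffsetOutOfRangeException in the issue description.
     */
    static RestoreAction decide(long persistedOffset, long logStartOffset, long logEndOffset) {
        boolean inRange = persistedOffset >= logStartOffset
                && persistedOffset <= logEndOffset;
        return inRange ? RestoreAction.RESUME_FROM_PERSISTED
                       : RestoreAction.WIPE_AND_REBUILD;
    }

    public static void main(String[] args) {
        // Made-up offsets: the store last persisted offset 100, but retention
        // or compaction moved the changelog log-start offset to 250 while the
        // instance was down.
        System.out.println(decide(100L, 250L, 900L));
        // Made-up offsets: the persisted offset is still inside the range.
        System.out.println(decide(400L, 250L, 900L));
    }
}
```

The sketch treats an offset equal to the log-end offset as in range, since
seeking to the end of a partition is valid and simply means there is nothing
left to restore.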



--
This message was sent by Atlassian Jira
(v8.20.10#820010)