[
https://issues.apache.org/jira/browse/KAFKA-7072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Guozhang Wang resolved KAFKA-7072.
----------------------------------
Resolution: Fixed
> Kafka Streams may drop RocksDB window segments before they expire
> -----------------------------------------------------------------
>
> Key: KAFKA-7072
> URL: https://issues.apache.org/jira/browse/KAFKA-7072
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Affects Versions: 0.11.0.0, 0.11.0.1, 1.0.0, 0.11.0.2, 1.1.0, 2.0.0, 1.0.1
> Reporter: John Roesler
> Assignee: John Roesler
> Priority: Minor
> Fix For: 2.1.0
>
>
> The current implementation of Segments used by the RocksDB Session and Time
> window stores conflicts with our current timestamp management model.
> The current segmentation approach allows configuration of a fixed number of
> segments (let's say *4*) and a fixed retention time. We essentially divide up
> the retention time into the available number of segments:
> {quote}{{<---------|-----------------------------|}}
> {{   expiration date                     right now}}
> {{           -------retention time--------/}}
> {{  | seg 0 | seg 1 | seg 2 | seg 3 |}}
> {quote}
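A minimal sketch of the fixed-segment layout described above, assuming retention is divided evenly among the segments (names here are illustrative, not the actual Kafka Streams Segments implementation):

```java
// Sketch of the fixed-segment layout: the retention time is divided
// evenly among a fixed number of segments, and a record's segment is
// just its timestamp divided by the segment interval.
// Hypothetical names; not the real Segments class.
public class SegmentLayout {
    final long retentionMs;
    final int numSegments;
    final long segmentIntervalMs;

    public SegmentLayout(final long retentionMs, final int numSegments) {
        this.retentionMs = retentionMs;
        this.numSegments = numSegments;
        // Each segment covers retention / numSegments milliseconds.
        this.segmentIntervalMs = retentionMs / numSegments;
    }

    // The segment a record with this timestamp belongs to.
    public long segmentId(final long timestampMs) {
        return timestampMs / segmentIntervalMs;
    }
}
```

With a 40-second retention and 4 segments, each segment covers 10 seconds of event time.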
> Note that we keep one extra segment so that we can record new events while
> some events in seg 0 are actually expired (but we only drop whole segments,
> so expired events just get to hang around):
> {quote}{{<-------------|-----------------------------|}}
> {{       expiration date                     right now}}
> {{               -------retention time--------/}}
> {{  | seg 0 | seg 1 | seg 2 | seg 3 |}}
> {quote}
> When it's time to provision segment 4, we know that segment 0 is completely
> expired, so we drop it:
> {quote}{{<-------------------|-----------------------------|}}
> {{             expiration date                     right now}}
> {{                     -------retention time--------/}}
> {{          | seg 1 | seg 2 | seg 3 | seg 4 |}}
> {quote}
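The drop-on-provision behavior can be sketched like this (illustrative only, not the actual Segments code): provisioning a new segment evicts, wholesale, every segment that has fallen entirely outside the retained window.

```java
import java.util.TreeMap;

// Illustrative sketch of whole-segment expiry: provisioning a new
// segment keeps only the newest numSegments segments and drops older
// ones whole, as in the diagrams above. Hypothetical names.
public class SegmentRing {
    private final int numSegments;
    private final TreeMap<Long, String> liveSegments = new TreeMap<>();

    public SegmentRing(final int numSegments) {
        this.numSegments = numSegments;
    }

    public void provision(final long segmentId) {
        liveSegments.put(segmentId, "segment-" + segmentId);
        // Everything older than (newest - numSegments + 1) is expired.
        final long oldestToKeep = liveSegments.lastKey() - numSegments + 1;
        liveSegments.headMap(oldestToKeep).clear();
    }

    public boolean isLive(final long segmentId) {
        return liveSegments.containsKey(segmentId);
    }
}
```

With 4 segments, provisioning segment 4 is exactly what drops segment 0.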
>
> However, the current timestamp management model allows for records from the
> future. Namely, because we define stream time as the minimum buffered
> timestamp (nondecreasing), we can have a buffer like this: [ 5, 2, 6 ]. Our
> stream time will be 2, but we'll handle a record with timestamp 5 next.
> Referring to the example, this means we could wind up having to provision
> segment 4 before segment 0 expires!
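The [ 5, 2, 6 ] case can be sketched as follows (a simplification of the real task buffering): records are handed out in arrival order, while stream time is the minimum timestamp still buffered, so the record processed next can be "from the future" relative to stream time.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative sketch: records are processed in arrival (offset)
// order, but stream time is the minimum buffered timestamp, so a
// record ahead of stream time can be handled next. Hypothetical names.
public class RecordBuffer {
    private final Deque<Long> timestamps = new ArrayDeque<>();

    public void add(final long timestampMs) {
        timestamps.addLast(timestampMs);
    }

    // Stream time: the minimum timestamp still buffered.
    public long streamTime() {
        return timestamps.stream()
                .mapToLong(Long::longValue)
                .min()
                .orElse(Long.MIN_VALUE);
    }

    // The next record handled is simply the oldest by arrival order.
    public long next() {
        return timestamps.removeFirst();
    }
}
```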
> Let's say "f" is our future event:
> {quote}{{<-------------------|-----------------------------|----f}}
> {{             expiration date                     right now}}
> {{                     -------retention time--------/}}
> {{          | seg 1 | seg 2 | seg 3 | seg 4 |}}
> {quote}
> Should we drop segment 0 prematurely? Or should we crash and refuse to
> process "f"?
> Today, we do the former, and this is probably the better choice. If we refuse
> to process "f", then we cannot make progress ever again.
> Dropping segment 0 prematurely is a bummer, but users can also set the
> retention time high enough that they don't expect to actually get any events
> late enough to need segment 0. Worst case, we can have many future events
> without advancing stream time, sparse enough that each requires its own
> segment; that would eat deeply into the retention time, dropping many
> segments that should still be live.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)