Hi Jorge,

You should check the JIRA: https://issues.apache.org/jira/browse/KAFKA-16385
where we had some discussion.
You're welcome to share your thoughts there.

Thanks.
Luke

On Thu, Mar 21, 2024 at 3:33 PM Jorge Esteban Quilcate Otoya <
quilcate.jo...@gmail.com> wrote:

> Hi dev community,
>
> I'd like to share some findings on how rotation of the active segment
> differs depending on whether topic retention is time- or size-based.
>
> I was (wrongly) under the assumption that active segments are only rotated
> when segment configs (segment.bytes (1GiB) or segment.ms (7d)) or global
> log configs (log.roll.ms) force it -- regardless of the retention
> configuration.
> This seems to be different depending on how retention is defined:
>
> - If a topic has time-based retention[1], the condition to rotate the
> active segment is based on its latest timestamp. If the difference between
> the current time and that timestamp is larger than the retention time, the
> segment (including the active one) should be deleted. The active segment is
> rotated, and in the next round it is deleted.
>
> - If a topic has size-based retention[2], though, the condition depends not
> only on the size of the segment itself but first on the total log size,
> which forces the log to always keep at least a single (active) segment.
> First, the difference between the total log size and the retention size is
> calculated. Say there is a single segment of 5MB and retention is 1MB: the
> total difference is then 4MB. The delete condition checks whether the total
> difference minus the current segment's size is higher than zero; if so, the
> segment is deleted. As the segment size will always be higher than the
> total difference when there is a single segment, there will always be at
> least 1 segment. In this case, the active segment is only rotated when a
> new message arrives.
>
> Added steps to reproduce[3].
>
> Maybe I'm missing something obvious, but this seems inconsistent to me.
> Either both retention configs should rotate active segments, or neither
> should, and the active segment should be governed only by the segment
> bytes|ms configs or the log.roll config.
>
> I believe it's a useful feature to "force" active segment rotation without
> changing segment or global log rotation configs, given that features like
> Compaction and Tiered Storage can benefit from this; but I would like to
> clarify this behavior and make it consistent for both retention options,
> and/or call it out explicitly in the documentation.
>
> Looking forward to your feedback!
>
> Jorge.
>
> [1]:
>
> https://github.com/apache/kafka/blob/55a6d30ccbe971f4d2e99aeb3b1a773ffe5792a2/core/src/main/scala/kafka/log/UnifiedLog.scala#L1566
> [2]:
>
> https://github.com/apache/kafka/blob/55a6d30ccbe971f4d2e99aeb3b1a773ffe5792a2/core/src/main/scala/kafka/log/UnifiedLog.scala#L1575-L1583
>
> [3]: https://gist.github.com/jeqo/d32cf07493ee61f3da58ac5e77b192b2
>
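
For anyone following the thread, the two checks described in the quoted message can be modelled with a small sketch. This is not Kafka code: `Segment`, `time_breached` and `size_breached` are hypothetical names, the oldest-first walk that stops at the first surviving segment is an assumption, and the size comparison follows the wording of the message ("higher than zero"). The linked UnifiedLog.scala lines remain the authoritative version.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """Hypothetical stand-in for a log segment (not a Kafka class)."""
    size_bytes: int
    largest_timestamp_ms: int


def time_breached(segments: List[Segment], now_ms: int, retention_ms: int) -> List[Segment]:
    """Time-based check: a segment qualifies when its latest timestamp is
    older than the retention time. Only the segment itself is inspected,
    so a lone active segment can qualify."""
    if retention_ms < 0:  # assumption: a negative value disables the check
        return []
    deletable = []
    for seg in segments:  # oldest first (assumed ordering)
        if now_ms - seg.largest_timestamp_ms > retention_ms:
            deletable.append(seg)
        else:
            break
    return deletable


def size_breached(segments: List[Segment], retention_bytes: int) -> List[Segment]:
    """Size-based check: start from the total overshoot and only delete a
    segment while the overshoot minus its size stays above zero."""
    total = sum(seg.size_bytes for seg in segments)
    if retention_bytes < 0 or total < retention_bytes:
        return []
    diff = total - retention_bytes
    deletable = []
    for seg in segments:  # oldest first (assumed ordering)
        if diff - seg.size_bytes > 0:
            diff -= seg.size_bytes
            deletable.append(seg)
        else:
            break
    return deletable


MB = 1024 * 1024
single = [Segment(size_bytes=5 * MB, largest_timestamp_ms=0)]

# Time-based: the only (active) segment is old enough, so it qualifies.
old_enough = time_breached(single, now_ms=10_000, retention_ms=5_000)

# Size-based: overshoot is 4MB, the segment is 5MB, so nothing qualifies.
too_big = size_breached(single, retention_bytes=1 * MB)
```

With the 5MB segment and 1MB retention from the message, `too_big` is empty while `old_enough` contains the segment, which is the asymmetry being reported.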
