[
https://issues.apache.org/jira/browse/KAFKA-4099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15596965#comment-15596965
]
Jiangjie Qin commented on KAFKA-4099:
-------------------------------------
[~junrao] Thanks for the explanation. I agree that it is reasonable to roll the
log segment based on create time. However, I have a few concerns about the
original proposal:
1. It seems the rareness of replica movement is related to scale. E.g., today we
have over 1800 brokers at LI and 1-2 brokers die every day, so partition
reassignment happens almost every day. So I think there is a difference between
"rare at small scale" and "rare regardless of scale".
2. An incorrect create time does not only occur when partitions are moved. It
seems most Linux systems do not keep a create time for files, so the create
time of a segment would be lost when the brokers are rebooted.
Actually, after thinking about the oscillating timestamp case again, I am not
sure whether it would actually cause frequent log rolling. Let's say we have
two producers: one produces messages with the current timestamp, and the other
produces messages with timestamps that are 7 days old. Assume the current
active segment is segment 0 and the current time is T. Because log rolling is
based on the timestamp of the first message in a log segment, it is possible
that the first timestamp in segment 0 is 7 days old (T - 7 days). Once we
append a message with the current timestamp T, segment 1 is rolled out and its
first timestamp will be T. So segment 1 won't roll immediately like the
previous one, i.e. segment 2 will only be rolled out when segment 1 sees a
timestamp greater than (T + log.roll.ms), and so on.
In the above example, it is possible that segment 2 is rolled out because of
the segment size. In that case, segment 2 may have a first timestamp of (T -
7 days) and segment 3 may get rolled out immediately, but segment 3 will again
wait until either the segment is full or it sees a larger timestamp that
triggers log rolling. So in the worst case, we may roll out two new segments
in a row. I am not sure how bad that would be in terms of performance.
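To make the two cases above concrete, here is a rough sketch of the rolling
rule as I described it. It is a simplified model, not the actual broker code:
the function names (simulate_rolls, alternating_stream), the use of a message
count as a stand-in for segment size, and the 1-day roll interval are all
assumptions made for illustration.

```python
from typing import List

DAY_MS = 24 * 60 * 60 * 1000


def simulate_rolls(timestamps: List[int], roll_ms: int,
                   max_messages: int) -> List[List[int]]:
    """Simplified model: roll when the incoming timestamp is more than
    roll_ms ahead of the first timestamp in the active segment, or when
    the active segment is full (message count stands in for byte size)."""
    segments: List[List[int]] = [[]]
    for ts in timestamps:
        active = segments[-1]
        time_roll = bool(active) and ts - active[0] > roll_ms
        size_roll = len(active) >= max_messages
        if time_roll or size_roll:
            active = []
            segments.append(active)
        active.append(ts)
    return segments


def alternating_stream(now_ms: int, count: int, lag_ms: int) -> List[int]:
    """Two interleaved producers: even positions carry a timestamp that is
    lag_ms old, odd positions carry the current timestamp."""
    return [now_ms + i - (lag_ms if i % 2 == 0 else 0) for i in range(count)]


if __name__ == "__main__":
    T = 100 * DAY_MS
    # Case 1: segments are large, so only the time-based rule can fire.
    big = simulate_rolls(alternating_stream(T, 1000, 7 * DAY_MS),
                         roll_ms=DAY_MS, max_messages=10_000)
    print([len(s) for s in big])
    # Case 2: small segments, so a size-based roll can start a segment
    # with an old timestamp and the next current message rolls it again.
    small = simulate_rolls(alternating_stream(T, 9, 7 * DAY_MS),
                           roll_ms=DAY_MS, max_messages=3)
    print([len(s) for s in small])
```

In this model the first case rolls exactly once (segment sizes 1 and 999),
and the second case alternates between a one-message segment and a full
segment (sizes 1, 3, 1, 3, 1), which is the "two new segments in a row"
behavior mentioned above.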
Admittedly, with certain timestamp patterns, frequent log rolling may still
happen. I am curious: did you see any real timestamp pattern that caused
frequent log rolling?
> Change the time based log rolling to only based on the message timestamp.
> -------------------------------------------------------------------------
>
> Key: KAFKA-4099
> URL: https://issues.apache.org/jira/browse/KAFKA-4099
> Project: Kafka
> Issue Type: Bug
> Components: core
> Reporter: Jiangjie Qin
> Assignee: Jiangjie Qin
> Fix For: 0.10.1.0
>
>
> This is an issue introduced in KAFKA-3163. When partition relocation occurs,
> the newly created replica may have messages with old timestamps and cause a
> log segment roll for each message. The fix is to change the log rolling
> behavior to be based only on the message timestamp when the messages are in
> message format 0.10.0 or above. If the first message in the segment does not
> have a timestamp, we will fall back to using the wall clock time for log rolling.