[jira] [Commented] (KAFKA-10760) In compacted topic with max.compaction.lag.ms, the segments are not rolled until new messages arrive
[ https://issues.apache.org/jira/browse/KAFKA-10760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17619908#comment-17619908 ]

Jun Rao commented on KAFKA-10760:
---------------------------------

A potential solution is to implement time-based segment rolling based on the difference between the current time and the segment creation time. Since Java 7, we could use [https://docs.oracle.com/javase/8/docs/api/java/nio/file/attribute/BasicFileAttributes.html#creationTime--] to get the segment creation time.

> In compacted topic with max.compaction.lag.ms, the segments are not rolled until new messages arrive
> ----------------------------------------------------------------------------------------------------
>
>            Key: KAFKA-10760
>            URL: https://issues.apache.org/jira/browse/KAFKA-10760
>        Project: Kafka
>     Issue Type: Bug
>     Components: core
>       Reporter: Sarwar Bhuiyan
>       Assignee: Brajesh Kumar
>       Priority: Major
>
> Currently, if a compacted topic has min.cleanable.dirty.ratio set to
> something low and max.compaction.lag.ms set to a small time, according to KIP
> [https://cwiki.apache.org/confluence/display/KAFKA/KIP-354] the expectation
> is that the active segment will be rolled regardless of segment.ms or whether
> new data has come in to "advance" the time. However, in practice, the current
> implementation only rolls the segment when new data arrives, which means that
> there are situations where the topic is not fully compacted until new data
> arrives, which may not be until a while later. The implementation can be
> improved by rolling the segment purely based on the max.compaction.lag.ms
> setting.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
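For illustration, the creation-time idea in Jun Rao's comment above could look roughly like the following. This is only a sketch under stated assumptions, not Kafka code: the class SegmentAgeCheck, the method shouldRoll, and the maxLagMs parameter are hypothetical names made up for this example. Only Files.readAttributes and BasicFileAttributes.creationTime() are real JDK APIs. Note that on file systems that do not record a creation time, creationTime() returns an implementation-specific default (typically the last-modified time or the epoch), so a real fix would have to account for that.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

public class SegmentAgeCheck {

    // Hypothetical helper (not part of Kafka): decides whether a segment file
    // is older than maxLagMs, judged by its file creation time alone.
    static boolean shouldRoll(Path segmentFile, long maxLagMs, long nowMs) throws IOException {
        BasicFileAttributes attrs = Files.readAttributes(segmentFile, BasicFileAttributes.class);
        // May be the last-modified time or the epoch on file systems
        // without creation-time support.
        long createdMs = attrs.creationTime().toMillis();
        return nowMs - createdMs > maxLagMs;
    }

    public static void main(String[] args) throws IOException {
        // A temporary file stands in for an active log segment.
        Path segment = Files.createTempFile("segment", ".log");
        try {
            long now = System.currentTimeMillis();
            // Check against the current time, then against a time two minutes ahead.
            System.out.println("roll now? " + shouldRoll(segment, 60_000L, now));
            System.out.println("roll in two minutes? " + shouldRoll(segment, 60_000L, now + 120_000L));
        } finally {
            Files.deleteIfExists(segment);
        }
    }
}
```

The point of the design is that the check depends only on the wall clock and file metadata, so it could run from a periodic task without waiting for a new record to be appended.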
[jira] [Commented] (KAFKA-10760) In compacted topic with max.compaction.lag.ms, the segments are not rolled until new messages arrive
[ https://issues.apache.org/jira/browse/KAFKA-10760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17249409#comment-17249409 ]

Jun Rao commented on KAFKA-10760:
---------------------------------

[~kbrajesh176]: For max.compaction.lag.ms, we simply set segment.ms to that value to trigger a segment roll. Currently, time-based rolling is achieved by comparing an incoming record's timestamp with the timestamp of the first record in the segment. So, a segment will only be rolled if a new record comes in. I am not sure what the best way to improve this is, since the straightforward approach of comparing the record's timestamp with the current time for log rolling has a potential issue: sending records with old timestamps would trigger frequent log rolling.
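To make the trade-off in the comment above concrete, here is a deliberately simplified model of the two checks. It is a sketch, not the actual Kafka implementation: the class TimeBasedRollSketch and both method names are hypothetical, and the real broker logic involves more than is shown here. The first method mirrors the described current behaviour (it can only be evaluated when a record arrives). The second is the "straightforward" wall-clock variant and shows why records carrying old timestamps would cause a roll almost immediately.

```java
public class TimeBasedRollSketch {

    // Models the behaviour described above: the check runs only on append,
    // comparing the incoming record's timestamp with the first record's.
    static boolean shouldRollOnAppend(long firstRecordTimestampMs,
                                      long incomingTimestampMs,
                                      long segmentMs) {
        return incomingTimestampMs - firstRecordTimestampMs > segmentMs;
    }

    // Models the straightforward alternative: compare the first record's
    // timestamp with the wall clock, with no new record required.
    static boolean shouldRollOnWallClock(long firstRecordTimestampMs,
                                         long nowMs,
                                         long segmentMs) {
        return nowMs - firstRecordTimestampMs > segmentMs;
    }

    public static void main(String[] args) {
        long segmentMs = 60_000L;

        // Append-driven check: nothing happens until a late-enough record arrives.
        System.out.println(shouldRollOnAppend(1_000L, 30_000L, segmentMs)); // false
        System.out.println(shouldRollOnAppend(1_000L, 70_000L, segmentMs)); // true

        // Wall-clock check: a producer sending day-old timestamps makes the
        // segment look expired as soon as its first record is written.
        long now = 100_000_000L;
        long dayOldTimestamp = now - 86_400_000L;
        System.out.println(shouldRollOnWallClock(dayOldTimestamp, now, segmentMs)); // true
    }
}
```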
[jira] [Commented] (KAFKA-10760) In compacted topic with max.compaction.lag.ms, the segments are not rolled until new messages arrive
[ https://issues.apache.org/jira/browse/KAFKA-10760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17249315#comment-17249315 ]

Brajesh Kumar commented on KAFKA-10760:
---------------------------------------

[~junrao] Can I pick this up and reproduce the issue?