[jira] [Commented] (KAFKA-10760) In compacted topic with max.compaction.lag.ms, the segments are not rolled until new messages arrive

2022-10-18 Thread Jun Rao (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17619908#comment-17619908
 ] 

Jun Rao commented on KAFKA-10760:
-

A potential solution is to implement time-based segment rolling based on 
the difference between the current time and the segment creation time. Since 
Java 7, we could use 
[https://docs.oracle.com/javase/8/docs/api/java/nio/file/attribute/BasicFileAttributes.html#creationTime--]
to get the segment creation time.
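
A minimal sketch of the idea above, assuming a hypothetical helper class (`CreationTimeRollCheck`) and a hypothetical `maxSegmentMs` limit; none of these names come from the Kafka code base. It only shows how `BasicFileAttributes.creationTime()` could feed a wall-clock roll check. Note that the JDK documents an implementation-specific fallback (typically the last-modified time or the epoch) on file systems that do not record a creation time, so a real change would have to account for that.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical helper, not Kafka code: decides whether a segment file is old
// enough to roll, using only its file creation time and the wall clock.
final class CreationTimeRollCheck {

    // Pure comparison, kept separate so it can be exercised without touching disk.
    static boolean shouldRoll(long creationTimeMs, long nowMs, long maxSegmentMs) {
        return nowMs - creationTimeMs > maxSegmentMs;
    }

    // Reads the creation time through BasicFileAttributes (available since Java 7).
    static long creationTimeMs(Path segmentFile) throws IOException {
        BasicFileAttributes attrs =
                Files.readAttributes(segmentFile, BasicFileAttributes.class);
        return attrs.creationTime().toMillis();
    }

    public static void main(String[] args) throws IOException {
        // A temporary file stands in for a log segment.
        Path segment = Files.createTempFile("segment", ".log");
        try {
            long created = creationTimeMs(segment);
            long now = System.currentTimeMillis();
            System.out.println("roll with a 1h limit: "
                    + shouldRoll(created, now, 3_600_000L));
        } finally {
            Files.deleteIfExists(segment);
        }
    }
}
```

Because the check depends only on the file and the clock, it could be driven by a periodic task rather than by an append, which is the property the issue asks for.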

> In compacted topic with max.compaction.lag.ms, the segments are not rolled 
> until new messages arrive
>
> Key: KAFKA-10760
> URL: https://issues.apache.org/jira/browse/KAFKA-10760
> Project: Kafka
> Issue Type: Bug
> Components: core
> Reporter: Sarwar Bhuiyan
> Assignee: Brajesh Kumar
> Priority: Major
>
> Currently, if a compacted topic has min.cleanable.dirty.ratio set to 
> a low value and max.compaction.lag.ms set to a short interval, then according 
> to KIP-354 [https://cwiki.apache.org/confluence/display/KAFKA/KIP-354] the 
> expectation is that the active segment will be rolled regardless of segment.ms 
> or whether new data has come in to "advance" the time. However, in practice, 
> the current implementation only rolls the segment when new data arrives, which 
> means that there are situations where the topic is not fully compacted until 
> new data arrives, which may not be until much later. The implementation can be 
> improved by rolling the segment based purely on the max.compaction.lag.ms setting.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (KAFKA-10760) In compacted topic with max.compaction.lag.ms, the segments are not rolled until new messages arrive

2020-12-14 Thread Jun Rao (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17249409#comment-17249409
 ] 

Jun Rao commented on KAFKA-10760:
-

[~kbrajesh176]: For max.compaction.lag.ms, we simply set segment.ms to that 
value to trigger a segment roll. Currently, time-based rolling is done by 
comparing an incoming record's timestamp with the timestamp of the first 
record in the segment. So, a segment is only rolled when a new record comes in. 
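
The behaviour described above can be modelled in a few lines. This is a simplified sketch, not the broker's code: the class `AppendDrivenRoll`, its fields, and the `maxSegmentMs` parameter are hypothetical names chosen for illustration. The point is that the roll check lives inside the append path, so it never runs while the topic is idle.

```java
// Simplified model of append-driven, time-based rolling.
// All names here are hypothetical and do not mirror Kafka's classes.
final class AppendDrivenRoll {
    private final long maxSegmentMs;
    private long firstRecordTimestampMs = -1L; // -1 means the active segment is empty
    private int rolls = 0;

    AppendDrivenRoll(long maxSegmentMs) {
        this.maxSegmentMs = maxSegmentMs;
    }

    // The only place a roll can happen is here, when a record arrives.
    void append(long recordTimestampMs) {
        if (firstRecordTimestampMs >= 0
                && recordTimestampMs - firstRecordTimestampMs > maxSegmentMs) {
            rolls++;
            firstRecordTimestampMs = -1L; // start a new, empty active segment
        }
        if (firstRecordTimestampMs < 0) {
            firstRecordTimestampMs = recordTimestampMs;
        }
    }

    int rolls() {
        return rolls;
    }

    public static void main(String[] args) {
        AppendDrivenRoll log = new AppendDrivenRoll(1_000L);
        log.append(0L);
        log.append(500L);
        // However much wall-clock time passes here, nothing rolls without an append.
        System.out.println("rolls while idle: " + log.rolls());
        log.append(5_000L);
        System.out.println("rolls after a late record: " + log.rolls());
    }
}
```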

I'm not sure what the best way to improve this is. The straightforward 
approach of comparing the record's timestamp with the current time for log 
rolling has a potential issue: sending records with old timestamps would 
trigger frequent log rolling.
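
To illustrate the concern, here is a small simulation of that straightforward approach. It is a sketch under stated assumptions: the class `WallClockRollPitfall`, the per-append check, and the fixed `nowMs` are all hypothetical simplifications, not Kafka behaviour. When producers send records whose timestamps are already older than the limit, every new segment looks expired as soon as its first record lands.

```java
// Hypothetical simulation of rolling based on (now - first record timestamp).
final class WallClockRollPitfall {

    static boolean shouldRoll(long firstRecordTimestampMs, long nowMs, long maxSegmentMs) {
        return nowMs - firstRecordTimestampMs > maxSegmentMs;
    }

    // Appends each record, then runs the wall-clock check once.
    // Returns how many times the active segment was rolled.
    static int countRolls(long[] recordTimestampsMs, long nowMs, long maxSegmentMs) {
        int rolls = 0;
        long firstTs = -1L; // -1 means the active segment is empty
        for (long ts : recordTimestampsMs) {
            if (firstTs < 0) {
                firstTs = ts; // this record opens the active segment
            }
            if (shouldRoll(firstTs, nowMs, maxSegmentMs)) {
                rolls++;
                firstTs = -1L; // the next record opens a fresh segment
            }
        }
        return rolls;
    }

    public static void main(String[] args) {
        long now = 10_000_000L;
        long limit = 60_000L;
        long[] oldTimestamps = {0L, 1_000L, 2_000L};
        long[] freshTimestamps = {now - 10L, now - 5L, now};
        System.out.println("rolls with old timestamps: "
                + countRolls(oldTimestamps, now, limit));
        System.out.println("rolls with fresh timestamps: "
                + countRolls(freshTimestamps, now, limit));
    }
}
```

In this model the old-timestamp input rolls once per record, which is the "frequent log rolling" case, while fresh timestamps do not roll at all.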






[jira] [Commented] (KAFKA-10760) In compacted topic with max.compaction.lag.ms, the segments are not rolled until new messages arrive

2020-12-14 Thread Brajesh Kumar (Jira)


[ 
https://issues.apache.org/jira/browse/KAFKA-10760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17249315#comment-17249315
 ] 

Brajesh Kumar commented on KAFKA-10760:


[~junrao] Can I pick this up and reproduce the issue?



