[ https://issues.apache.org/jira/browse/KAFKA-475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438911#comment-13438911 ]
Neha Narkhede commented on KAFKA-475:
-------------------------------------

If you roll log segments based on retention time, it seems you can have only one segment for that log at any point in time. If you want to roll 5-minute segments, you can only have 5 minutes' worth of data for that partition. In contrast, if I choose size-based rolling and size-based retention, I can have multiple log segments, each of a specific size.

What seems desirable is for time-based rolling + retention to behave the same way. I would imagine applications wanting to roll segments every hour and retain 24 hours' worth of data. This is an advantage for applications using getOffsetsBefore() to do a time-indexed fetch of the data, since getOffsetsBefore only returns offsets at log segment granularity. It also gives applications a way to reason about the time window of the data retained for a partition.

One potential downside is that you can end up creating a large number of log segments for your partition if you choose too small a value for log.file.time.ms. But this problem exists today with size-based log segment rolling too, so we are not introducing any regression in behavior.

Other review comments -

1. Log
1.1 Rename currentMS to currentMs (follow the camel case convention).
1.2 How about renaming retentionMSInterval to retentionIntervalMs to be consistent with the naming convention?
1.3 In maybeRoll, it looks like currentMS is used only to compute the time difference. How about removing currentMS?

2. LogManager
2.1 This is unrelated to your patch, but let's also rename logRetentionMSMap to logRetentionMsMap.

> Time based log segment rollout
> ------------------------------
>
>                 Key: KAFKA-475
>                 URL: https://issues.apache.org/jira/browse/KAFKA-475
>             Project: Kafka
>          Issue Type: New Feature
>    Affects Versions: 0.7.1
>            Reporter: Swapnil Ghike
>            Assignee: Swapnil Ghike
>              Labels: features
>             Fix For: 0.7.2
>
>         Attachments: kafka-475-v1.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> Some applications might want their data to be deleted from the Kafka servers
> earlier than the default retention time.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
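
To illustrate the behavior the comment asks for (rolling segments on one interval while retaining them for a longer one), here is a minimal Python sketch. It is not Kafka code: `Segment`, `TimeRolledLog`, `simulate`, and the one-message-per-minute workload are hypothetical names and assumptions made up for this illustration, and the deletion rule (drop a non-active segment once its last append is older than the retention window) is a simplification.

```python
from dataclasses import dataclass, field
from typing import List

HOUR_MS = 60 * 60 * 1000
MINUTE_MS = 60 * 1000


@dataclass
class Segment:
    """Hypothetical stand-in for a log segment; tracks timestamps only."""
    created_ms: int
    last_append_ms: int
    messages: int = 0


@dataclass
class TimeRolledLog:
    """Toy log with separate roll and retention intervals (an assumption,
    not the structure of the actual patch)."""
    roll_interval_ms: int
    retention_ms: int
    segments: List[Segment] = field(default_factory=list)

    def append(self, now_ms: int) -> None:
        # Roll to a new segment when there is none yet, or when the active
        # segment has been open for at least the roll interval.
        if (not self.segments
                or now_ms - self.segments[-1].created_ms >= self.roll_interval_ms):
            self.segments.append(Segment(created_ms=now_ms, last_append_ms=now_ms))
        active = self.segments[-1]
        active.messages += 1
        active.last_append_ms = now_ms

    def cleanup(self, now_ms: int) -> None:
        # Delete non-active segments whose newest data is older than the
        # retention window; the active segment is always kept.
        active = self.segments[-1:]
        older = self.segments[:-1]
        kept = [s for s in older if now_ms - s.last_append_ms <= self.retention_ms]
        self.segments = kept + active


def simulate(roll_interval_ms: int, retention_ms: int, duration_minutes: int) -> TimeRolledLog:
    """Append one message per minute and run cleanup after each append."""
    log = TimeRolledLog(roll_interval_ms, retention_ms)
    for minute in range(duration_minutes):
        now_ms = minute * MINUTE_MS
        log.append(now_ms)
        log.cleanup(now_ms)
    return log
```

With a 1-hour roll interval and 24-hour retention, `simulate(HOUR_MS, 24 * HOUR_MS, 48 * 60)` leaves the log holding about a day's worth of hourly segments rather than a single one, which is the granularity a time-indexed lookup such as getOffsetsBefore would then have to work with.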