[ 
https://issues.apache.org/jira/browse/KAFKA-8270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiangtao Liu updated KAFKA-8270:
--------------------------------
    Attachment: Screen Shot 2020-04-15 at 11.02.55 AM.png

> Kafka timestamp-based retention policy is not working when Kafka client's 
> time is not reliable.
> -----------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-8270
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8270
>             Project: Kafka
>          Issue Type: Bug
>          Components: log, log cleaner, logging
>    Affects Versions: 1.1.1
>            Reporter: Jiangtao Liu
>            Priority: Major
>              Labels: storage
>
> What's the issue?
> {quote} # Some log segments are not deleted even after the configured 
> retention hours have elapsed.{quote}
> What are the impacts? 
> {quote} # Log disk usage keeps growing and eventually causes a space 
> shortage.
>  # Many log segments are rolled at a smaller size than expected, e.g. a 
> segment may be only 50 MB instead of the expected 1 GB.
>  # Kafka Streams or other clients may experience missing data.
>  # It could be used as a way to attack a Kafka server.{quote}
> What workaround resolves this issue?
> {quote} # If it has already happened on your Kafka system, you will need to 
> run some very tricky steps to resolve it.
>  # If it has not happened on your Kafka system yet, you may want to evaluate 
> whether you can switch log.message.timestamp.type to LogAppendTime. 
> {quote}
> What are the steps to reproduce?
> {quote} # Make sure the Kafka client and server are not hosted on the same 
> machine.
>  # Configure log.message.timestamp.type as *CreateTime*, not LogAppendTime.
>  # Set the Kafka client's system clock to a *future time*, e.g. 
> 03/04/*2025*, 3:25:52 PM 
> [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]
>  # Send a message from the Kafka client to the server.{quote}
> What should you check after the Kafka server has handled the message?
> {quote} # Check the timestamps in the segment's *.timeindex file and the 
> records in the segment's *.log file. You will see that all the timestamp 
> values in *.timeindex are skewed to a future time after `03/04/*2025*, 
> 3:25:52 PM 
> [GMT-08:00|https://www.epochconverter.com/timezones?q=1741130752]`.   (Let's 
> say 00000000035957300794.log is the log segment that first received the test 
> client's message. It is referenced in #3.)
>  # You will also see that log segments are rolled at a smaller size (e.g. 
> 50 MB) than the configured maximum segment size (e.g. 1 GB). 
>  # None of the log segments, including 00000000035957300794.* and the newly 
> rolled ones, will be deleted after the retention hours have passed.{quote}
> What particular logic causes this issue?
> {quote} # [private def deletableSegments(predicate: (LogSegment, 
> Option[LogSegment]) => 
> Boolean)|https://github.com/apache/kafka/blob/1.1/core/src/main/scala/kafka/log/Log.scala#L1227]
>  will always return an empty list of deletable log segments.{quote}
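
To illustrate the logic described above, here is a minimal Python sketch of a time-based retention check. It is a simplified model and not the actual Log.scala code: it assumes a segment counts as expired when "now" minus its largest timestamp exceeds the retention period, and that the scan starts at the oldest segment and stops at the first one that is not expired. The Segment class, the deletable_segments function, the 168-hour retention value, the "now" value, and every offset except 35957300794 are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from itertools import takewhile

DAY_MS = 24 * 60 * 60 * 1000
RETENTION_MS = 7 * DAY_MS  # assumed retention of 168 hours


@dataclass
class Segment:
    base_offset: int
    largest_timestamp_ms: int  # largest record timestamp in the segment


def deletable_segments(segments, now_ms, retention_ms=RETENTION_MS):
    """Return the leading run of expired segments, oldest first.

    Simplified model: the scan stops at the first segment that is not
    expired, so one segment with a future timestamp shields every
    segment behind it.
    """
    def expired(segment):
        return now_ms - segment.largest_timestamp_ms > retention_ms

    return list(takewhile(expired, segments))


# Future client time from the report: 03/04/2025, 3:25:52 PM GMT-08:00.
FUTURE_MS = 1741130752 * 1000
# Assumed broker "now": 2020-04-15 00:00:00 UTC.
NOW_MS = 1586908800 * 1000

# First segment carries the future timestamp; the later ones are old
# enough to expire on their own, but are never reached by the scan.
poisoned = [
    Segment(35957300794, FUTURE_MS),
    Segment(35957400000, NOW_MS - 30 * DAY_MS),
    Segment(35957500000, NOW_MS - 20 * DAY_MS),
]

# Same layout without the future timestamp, for comparison.
healthy = [
    Segment(35957400000, NOW_MS - 30 * DAY_MS),
    Segment(35957500000, NOW_MS - 20 * DAY_MS),
]

print(deletable_segments(poisoned, NOW_MS))
print(len(deletable_segments(healthy, NOW_MS)))
```

In this model the age of the first segment is negative until the broker clock passes the future timestamp plus the retention period, so nothing behind it becomes deletable before then.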



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
