[ https://issues.apache.org/jira/browse/KAFKA-3224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15548776#comment-15548776 ]

ASF GitHub Bot commented on KAFKA-3224:
---------------------------------------

GitHub user bill-warshaw opened a pull request:

    https://github.com/apache/kafka/pull/1972

    KAFKA-3224: New log deletion policy based on timestamp

    * adds a new topic-level broker configuration, `log.retention.min.timestamp`
      * if unset, this setting is ignored
      * setting this value to a Unix timestamp will allow the log cleaner to
        delete any segments for a given topic whose last timestamp is earlier
        than the set timestamp
    
    --
    
    ### [KIP-47](https://cwiki.apache.org/confluence/display/KAFKA/KIP-47+-+Add+timestamp-based+log+deletion+policy)
    ### [JIRA](https://issues.apache.org/jira/browse/KAFKA-3224)
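
The segment-level rule in the description above can be illustrated with a minimal sketch. This is not the code from the pull request: the `Segment` class and the `deletable_segments` helper are hypothetical names, and the use of millisecond timestamps is an assumption. It only shows that a segment becomes eligible once its last timestamp is earlier than the configured value, and that an unset value disables the policy.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    """Hypothetical stand-in for a log segment; not Kafka's actual class."""
    base_offset: int
    last_timestamp_ms: int  # largest message timestamp in the segment (assumed ms)


def deletable_segments(segments: List[Segment],
                       min_timestamp_ms: Optional[int]) -> List[Segment]:
    """Return the segments whose last timestamp is earlier than the cutoff.

    A cutoff of None models an unset `log.retention.min.timestamp`,
    in which case nothing is eligible under this policy.
    """
    if min_timestamp_ms is None:
        return []
    return [s for s in segments if s.last_timestamp_ms < min_timestamp_ms]


segments = [
    Segment(base_offset=0, last_timestamp_ms=1_000),
    Segment(base_offset=100, last_timestamp_ms=2_000),
    Segment(base_offset=200, last_timestamp_ms=3_000),
]

# Cutoff of 2500: the first two segments end before it, the third does not.
print([s.base_offset for s in deletable_segments(segments, 2_500)])  # [0, 100]

# Unset cutoff: the setting is ignored.
print(deletable_segments(segments, None))  # []
```

Because the check uses a segment's last timestamp, messages older than the cutoff can remain on disk while they share a segment with newer messages.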

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/bill-warshaw/kafka KAFKA-3224

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/1972.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1972
    
----
commit 008dc813d82e5938201f069712ba1bd44277e755
Author: Bill Warshaw <bill.wars...@appian.com>
Date:   2016-02-02T16:45:47Z

    KAFKA-3224: New log deletion policy based on timestamp
    
    * setting log.retention.min.timestamp will set a timestamp for a log,
      and any message before that timestamp is eligible for deletion

----


> Add timestamp-based log deletion policy
> ---------------------------------------
>
>                 Key: KAFKA-3224
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3224
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Bill Warshaw
>              Labels: kafka
>
> One of Kafka's officially-described use cases is a distributed commit log 
> (http://kafka.apache.org/documentation.html#uses_commitlog). In this case, 
> for a distributed service that needed a commit log, there would be a topic 
> with a single partition to guarantee log order. This service would use the 
> commit log to re-sync failed nodes. Kafka is generally an excellent fit for 
> such a system, but it does not expose an adequate mechanism for log cleanup 
> in such a case. With a distributed commit log, data can only be deleted when 
> the client application determines that it is no longer needed, so the 
> retained data can span an arbitrary range of time and size, which the existing 
> time- and size-based cleanup mechanisms can't handle smoothly.
> A new deletion policy based on the absolute timestamp of a message would work 
> perfectly for this case.  The client application will periodically update the 
> minimum timestamp of messages to retain, and Kafka will delete all messages 
> earlier than that timestamp using the existing log cleaner thread mechanism.
> This builds on the work being done in KIP-32 - Add timestamps to Kafka 
> message.
> h3. Initial Approach
> https://github.com/apache/kafka/compare/trunk...bill-warshaw:KAFKA-3224



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
