[ 
https://issues.apache.org/jira/browse/KAFKA-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15439827#comment-15439827
 ] 

ASF GitHub Bot commented on KAFKA-1981:
---------------------------------------

GitHub user ewasserman opened a pull request:

    https://github.com/apache/kafka/pull/1794

    KAFKA-1981 Make log compaction point configurable

    Now uses LogSegment.largestTimestamp to determine the age of a segment's messages.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ewasserman/kafka feat-1981

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/1794.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1794
    
----
commit 50bcc6036217720a69229868fbd7ab3a18c47ff1
Author: Eric Wasserman <eric.wasser...@gmail.com>
Date:   2016-08-26T19:09:26Z

    merge fixes

commit 7e5da446cee19e2db2f7f7f93306d7d81de4c3aa
Author: Eric Wasserman <eric.wasser...@gmail.com>
Date:   2016-08-26T19:57:58Z

    back out orig files

commit 6e8c1ea8832691f4bd8d0c08460dd24a82f676fc
Author: Eric Wasserman <eric.wasser...@gmail.com>
Date:   2016-08-26T20:48:00Z

    change logs to string interpolation

----


> Make log compaction point configurable
> --------------------------------------
>
>                 Key: KAFKA-1981
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1981
>             Project: Kafka
>          Issue Type: Improvement
>    Affects Versions: 0.8.2.0
>            Reporter: Jay Kreps
>              Labels: newbie++
>         Attachments: KIP for Kafka Compaction Patch.md
>
>
> Currently, if you enable log compaction, the compactor kicks in whenever 
> you hit a certain "dirty ratio", e.g. when 50% of your data is uncompacted. 
> In addition, we never compact the active segment (since it is still 
> being written to).
> Other than this, we don't give you fine-grained control over when compaction 
> occurs. The result is that you can't guarantee that a consumer 
> will get every update to a compacted topic: if the consumer falls behind a 
> bit, it might just get the compacted version.
> This is usually fine, but it would be nice to make this more configurable so 
> you could bound compaction by number of messages, size, or time.
> This would let you say, for example, "any consumer that is no more than 1 
> hour behind will get every message."
> This should be relatively easy to implement since it just impacts the 
> end-point the compactor considers available for compaction. I think we 
> already have that concept, so this would just be some other overrides to add 
> in when calculating that.
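
As a rough illustration of the time-bound idea described above, the sketch below picks the end-point of the compactable range from each segment's largest timestamp. It is not the code from the pull request: the class, the `Segment` type, the method name, and the `minLagMs` parameter are all hypothetical names chosen for this example, and segments are assumed to be sorted by base offset with the active segment last.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these names come from the Kafka code base.
final class CompactionPointSketch {

    /** Minimal stand-in for a log segment: its first offset and its newest message timestamp. */
    static final class Segment {
        final long baseOffset;
        final long largestTimestampMs;

        Segment(long baseOffset, long largestTimestampMs) {
            this.baseOffset = baseOffset;
            this.largestTimestampMs = largestTimestampMs;
        }
    }

    /**
     * Returns the first offset the compactor must leave untouched.
     * Assumes segments are sorted by base offset and the last one is the active segment.
     */
    static long firstUncleanableOffset(List<Segment> segments, long nowMs, long minLagMs) {
        if (segments.isEmpty()) {
            throw new IllegalArgumentException("at least the active segment is required");
        }
        // The active segment is never compacted, so its base offset is the default bound.
        long bound = segments.get(segments.size() - 1).baseOffset;
        for (Segment s : segments.subList(0, segments.size() - 1)) {
            // A segment whose newest message is younger than the lag is too fresh to
            // compact; it and every later segment stay outside the compactable range.
            if (nowMs - s.largestTimestampMs < minLagMs) {
                bound = s.baseOffset;
                break;
            }
        }
        return bound;
    }

    public static void main(String[] args) {
        List<Segment> log = Arrays.asList(
            new Segment(0L, 0L),
            new Segment(100L, 1_000_000L),
            new Segment(200L, 4_000_000L),
            new Segment(300L, 4_500_000L)); // active segment
        // With a one-hour lag, compaction may only touch offsets below the printed value.
        System.out.println(firstUncleanableOffset(log, 5_000_000L, 3_600_000L));
    }
}
```

With a lag of zero the bound falls back to the base offset of the active segment, which matches the current behaviour described in the issue.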



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
