[ https://issues.apache.org/jira/browse/KAFKA-77?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481140#comment-13481140 ]

Jun Rao commented on KAFKA-77:
------------------------------

I think what Jay meant is that in 0.8, a message is considered committed as 
long as it is written to memory on f brokers (f being the replication factor). 
This is probably as good as or better than forcing data to disk, assuming 
failures are rare. Therefore, flushing to disk does not need to be optimized 
for durability guarantees.
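
For concreteness, here is a minimal sketch of that commit semantics from the 
producer side, written against the Java producer API as it eventually shipped 
in 0.8 (the broker address, topic, and serializer below are illustrative, not 
from this issue). With request.required.acks=-1, the broker acknowledges a 
send only after all in-sync replicas have the message in memory, independent 
of any disk flush:

    import java.util.Properties;
    import kafka.javaapi.producer.Producer;
    import kafka.producer.KeyedMessage;
    import kafka.producer.ProducerConfig;

    public class AckAllProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("metadata.broker.list", "localhost:9092"); // illustrative
            props.put("serializer.class", "kafka.serializer.StringEncoder");
            // -1 = wait until all in-sync replicas have the message (in memory):
            // the "committed" notion described in the comment above
            props.put("request.required.acks", "-1");

            Producer<String, String> producer =
                new Producer<String, String>(new ProducerConfig(props));
            producer.send(new KeyedMessage<String, String>("test", "hello"));
            producer.close();
        }
    }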
                
> Implement "group commit" for kafka logs
> ---------------------------------------
>
>                 Key: KAFKA-77
>                 URL: https://issues.apache.org/jira/browse/KAFKA-77
>             Project: Kafka
>          Issue Type: Improvement
>    Affects Versions: 0.7
>            Reporter: Jay Kreps
>            Assignee: Jay Kreps
>             Fix For: 0.8
>
>         Attachments: kafka-group-commit.patch
>
>
> The most expensive operation for the server is usually the fsync() call that 
> syncs log data to disk; if you don't flush, your data is at greater risk of 
> being lost in a crash. Currently we give two knobs to tune this trade-off: 
> log.flush.interval and log.default.flush.interval.ms (no idea why one has 
> "default" and the other doesn't, since they are both defaults). However, if 
> you flush frequently, say on every write, performance is not that great.
> One trick that can be used to improve this worst case of continual flushes 
> is to allow a single fsync() to cover multiple writes that occur at the same 
> time. This is a lot like "group commit" in databases. It is unclear which 
> cases this would improve, and by how much, but it might be worth a try.
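
A note on the two knobs named above: here is a hedged server.properties 
sketch, assuming the 0.7 semantics where log.flush.interval counts messages 
between flushes and log.default.flush.interval.ms bounds the time between 
flushes (values illustrative):

    # flush after every single message -- the worst case the description warns about
    log.flush.interval=1

    # or amortize: flush after at most 10000 messages or 1000 ms, whichever comes first
    # log.flush.interval=10000
    # log.default.flush.interval.ms=1000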
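
Since the attached patch is not inlined here, the following is only a minimal 
Java sketch of the group-commit idea the description outlines (class and 
field names are made up for illustration; error handling omitted): writers 
append under a lock, and one FileChannel.force() call, i.e. one fsync(), then 
covers every append that queued up while the previous sync was in flight.

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.FileChannel;

    public class GroupCommitLog {
        private final FileChannel channel;
        private final Object lock = new Object();
        private long writtenOffset = 0;  // bytes handed to the OS buffer so far
        private long syncedOffset = 0;   // bytes known to be durable on disk
        private boolean syncInProgress = false;

        public GroupCommitLog(FileChannel channel) {
            this.channel = channel;
        }

        /** Append one message and block until some fsync() covers it. */
        public void appendAndSync(ByteBuffer data)
                throws IOException, InterruptedException {
            long myOffset;
            synchronized (lock) {
                writtenOffset += channel.write(data); // cheap buffered append
                myOffset = writtenOffset;
            }
            while (true) {
                long target;
                synchronized (lock) {
                    if (syncedOffset >= myOffset)
                        return;              // an earlier group's sync covered us
                    if (syncInProgress) {
                        lock.wait();         // piggyback on the sync in flight
                        continue;
                    }
                    syncInProgress = true;
                    target = writtenOffset;  // this sync covers all appends so far
                }
                channel.force(false);        // ONE fsync() for the whole group
                synchronized (lock) {
                    syncedOffset = target;
                    syncInProgress = false;
                    lock.notifyAll();        // release every writer in the group
                }
            }
        }
    }

Under load, writers that arrive during a sync simply wait and are covered by 
the next single force() call, so the fsync() cost is amortized across the 
whole group rather than paid once per write.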

