[
https://issues.apache.org/jira/browse/KAFKA-615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13729559#comment-13729559
]
Jun Rao commented on KAFKA-615:
-------------------------------
50.3 Yes, I think your reasoning is correct. I didn't look at the code
carefully enough.
52.2 For the first part, that was my initial analysis too. Then I started
thinking that the file system has to flush both the metadata and the data.
During a crash, could the last segment be left in a state where its metadata
(and thus its length) is flushed, but the actual data is not? Does flush
guarantee that data is flushed before the metadata? Forcing a flush on every
truncate is safe, but it will delay the processing of the LeaderAndIsr request
and is probably too pessimistic. That's why I was thinking of running recovery
on the last segment during startup if lastOffset < this.recoveryPoint.
For the second part, the hole that you described won't happen in the current
0.8, since we force a flush on log rolling.
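
A minimal sketch of the startup check proposed in 52.2, for illustration only.
The class and method names are hypothetical and are not taken from the Kafka
code base or from any of the attached patches; the only part that comes from
the comment above is the condition lastOffset < recoveryPoint. The reading of
recoveryPoint as "the offset up to which the log was believed to be flushed"
is an assumption.

```java
// Hypothetical sketch; names are illustrative, not from the Kafka source.
final class StartupRecoveryCheck {
    private StartupRecoveryCheck() {}

    /**
     * Decides whether the last segment should be recovered on startup.
     *
     * @param lastOffset    last offset found in the log at startup
     * @param recoveryPoint offset up to which the log was believed to be
     *                      flushed (assumed meaning of this.recoveryPoint)
     * @return true when the log ends before the recovery point
     */
    static boolean shouldRecoverLastSegment(long lastOffset, long recoveryPoint) {
        // Condition quoted in the comment: recover if lastOffset < recoveryPoint.
        return lastOffset < recoveryPoint;
    }

    public static void main(String[] args) {
        // Log ends at 90 but the recovery point says 100: recover.
        System.out.println(shouldRecoverLastSegment(90L, 100L));
        // Log ends exactly at the recovery point: no extra recovery.
        System.out.println(shouldRecoverLastSegment(100L, 100L));
    }
}
```

The point of such a check would be to pay the recovery cost once at startup,
and only for logs that look suspicious, instead of forcing a flush on every
truncate.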
> Avoid fsync on log segment roll
> -------------------------------
>
> Key: KAFKA-615
> URL: https://issues.apache.org/jira/browse/KAFKA-615
> Project: Kafka
> Issue Type: Bug
> Reporter: Jay Kreps
> Assignee: Neha Narkhede
> Attachments: KAFKA-615-v1.patch, KAFKA-615-v2.patch,
> KAFKA-615-v3.patch, KAFKA-615-v4.patch, KAFKA-615-v5.patch, KAFKA-615-v6.patch
>
>
> It still isn't feasible to run without an application-level fsync policy.
> This is a problem because fsync locks the file, and tuning such a policy is
> very challenging: the flushes must not be so frequent that seeks reduce
> throughput, yet not so infrequent that each fsync writes so much data that
> there is a noticeable jump in latency.
> The remaining problem is the way that log recovery works. Our current policy
> is that if a clean shutdown occurs we do no recovery. If an unclean shutdown
> occurs we recover the last segment of all logs. To make this correct we need
> to ensure that each segment is fsync'd before we create a new segment. Hence
> the fsync during roll.
> Obviously if the fsync during roll is the only time fsync occurs then it will
> potentially write out the entire segment, which for a 1GB segment at 50MB/sec
> might take about 20 seconds. The goal of this JIRA is to eliminate this and make
> it possible to run with no application-level fsyncs at all, depending
> entirely on replication and background writeback for durability.