[ https://issues.apache.org/jira/browse/FLUME-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13803140#comment-13803140 ]
Hari Shreedharan commented on FLUME-2181:
-----------------------------------------
Hmm, looks like there are a couple of changes that need to be made:
# Each of the files needs to be fsync-ed before the checkpoint is written out, else
it is possible that the checkpoint will have offsets to files that may not
exist.
# We also need to safeguard against a situation where a file containing the
take for an event is fsync-ed while the file containing the event itself is
not (maybe because of timing, maybe because the system crashed before the file
with the event was fsync-ed, etc.). In that case, the take should be ignored
during a replay. (This is a problem during a full replay; if a full checkpoint
is available, the above fix would handle that.)
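
To make the two points above concrete, here is a rough sketch in plain Java NIO. It is not the actual File Channel code or the attached patch: CheckpointSketch, Record, checkpoint() and replay() are hypothetical names, and the checkpoint payload is just an opaque string of offsets. It only illustrates the ordering (force every data file, then write the checkpoint) and a replay that skips takes whose put never made it to disk.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; none of these names come from Flume.
final class CheckpointSketch {

    /** Simplified log record: a put stores an event, a take removes it. */
    static final class Record {
        final boolean isPut;
        final long eventId;

        Record(boolean isPut, long eventId) {
            this.isPut = isPut;
            this.eventId = eventId;
        }
    }

    /**
     * Point 1: force every data file to disk before the checkpoint is
     * written, so the checkpoint never refers to data that is not durable.
     */
    static void checkpoint(List<FileChannel> dataFiles, Path checkpointFile,
                           String offsets) throws IOException {
        for (FileChannel dataFile : dataFiles) {
            dataFile.force(true); // fsync file content and metadata
        }
        try (FileChannel cp = FileChannel.open(checkpointFile,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buf = ByteBuffer.wrap(offsets.getBytes(StandardCharsets.UTF_8));
            while (buf.hasRemaining()) {
                cp.write(buf);
            }
            cp.force(true); // the checkpoint itself must be durable too
        }
    }

    /**
     * Point 2: during a full replay, ignore a take whose put was never
     * recovered (the file holding the event was not fsync-ed in time).
     * Returns the ids of events still considered to be in the channel.
     */
    static Set<Long> replay(List<Record> records) {
        // First pass: collect every put that actually survived on disk.
        Set<Long> recoveredPuts = new HashSet<>();
        for (Record r : records) {
            if (r.isPut) {
                recoveredPuts.add(r.eventId);
            }
        }
        // Second pass: apply takes, skipping those with no matching put.
        Set<Long> live = new HashSet<>(recoveredPuts);
        for (Record r : records) {
            if (!r.isPut) {
                if (recoveredPuts.contains(r.eventId)) {
                    live.remove(r.eventId);
                }
                // else: orphaned take, ignore it instead of failing the replay
            }
        }
        return live;
    }
}
```

The two-pass replay is a simplification chosen so the sketch does not depend on the order in which files are read; the real replay logic may well handle this differently.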
> Optionally disable File Channel fsyncs
> ---------------------------------------
>
> Key: FLUME-2181
> URL: https://issues.apache.org/jira/browse/FLUME-2181
> Project: Flume
> Issue Type: Improvement
> Reporter: Hari Shreedharan
> Assignee: Hari Shreedharan
> Attachments: FLUME-2181.patch
>
>
> This will give File Channel performance a big boost, at the cost of possible
> data loss if a crash happens between checkpoints.
> Also, we should make it configurable, with a default of false. If the user does
> not mind slight inconsistencies, this feature can be explicitly enabled
> through configuration. So if it is not configured, then the behavior will be
> exactly as it is now.