[ https://issues.apache.org/jira/browse/FLUME-1968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628207#comment-13628207 ]
Brock Noland commented on FLUME-1968:
-------------------------------------
We store exact offsets in two places:
1) for each event
2) the current offset for each file when we checkpoint
I am discussing getting rid of 2, not 1. The problem with 2 is that there are
all kinds of situations where we want to replay a log but the checkpoint
position doesn't match the checkpoint position of the log. This results in
reading tons of extra log data. Beyond that, because of 2, we update the .meta
file of each log during a checkpoint, which consumes a large amount of time.
> FileChannel new format while being backwards compatible
> -------------------------------------------------------
>
> Key: FLUME-1968
> URL: https://issues.apache.org/jira/browse/FLUME-1968
> Project: Flume
> Issue Type: Bug
> Components: Channel, File Channel
> Reporter: Brock Noland
>
> There are a couple of issues with the current format:
> 1) We have to track the offset at checkpoint time and write the offset to a
> special location so we can seek to that offset during replay. In FLUME-1516
> we are tracking two offsets.
> 2) We have no way to detect partial writes (FLUME-1967).
> 3) We can only checksum the body of the event, not the entire record
> (FLUME-1485), and therefore cannot detect corruption outside an event body.
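
To make points 2 and 3 concrete, here is a minimal Java sketch of one common
way to frame records so that both partial writes and corruption outside the
body can be detected. It is not the format proposed in this issue and not the
existing FileChannel code. The class name RecordFraming, the methods frame and
read, and the layout (4-byte length, payload, 8-byte CRC32 computed over
length plus payload) are all assumptions made for illustration.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.zip.CRC32;

// Hypothetical record layout (an assumption, not Flume's on-disk format):
//   [4-byte payload length][payload bytes][8-byte CRC32 over length + payload]
class RecordFraming {
    static final int HEADER_BYTES = Integer.BYTES;
    static final int TRAILER_BYTES = Long.BYTES;

    // Wrap a payload in a length header and a checksum trailer.
    static byte[] frame(byte[] payload) {
        ByteBuffer buf = ByteBuffer.allocate(HEADER_BYTES + payload.length + TRAILER_BYTES);
        buf.putInt(payload.length);
        buf.put(payload);
        CRC32 crc = new CRC32();
        // The checksum covers the header as well as the body.
        crc.update(buf.array(), 0, HEADER_BYTES + payload.length);
        buf.putLong(crc.getValue());
        return buf.array();
    }

    // Returns the payload, or null if the record is truncated or corrupt.
    // When the record is truncated, the buffer position is left where it started.
    static byte[] read(ByteBuffer in) {
        int start = in.position();
        if (in.remaining() < HEADER_BYTES) {
            return null; // partial write: header incomplete
        }
        int length = in.getInt();
        if (length < 0 || in.remaining() < (long) length + TRAILER_BYTES) {
            in.position(start);
            return null; // partial write, or a corrupted length field
        }
        byte[] payload = new byte[length];
        in.get(payload);
        long stored = in.getLong();
        CRC32 crc = new CRC32();
        crc.update(ByteBuffer.allocate(HEADER_BYTES).putInt(length).array());
        crc.update(payload);
        return crc.getValue() == stored ? payload : null; // null on checksum mismatch
    }

    public static void main(String[] args) {
        byte[] record = frame(new byte[] {1, 2, 3});
        System.out.println("intact record readable: "
                + Arrays.equals(new byte[] {1, 2, 3}, read(ByteBuffer.wrap(record))));

        // Simulate a partial write by dropping the last byte.
        byte[] truncated = Arrays.copyOf(record, record.length - 1);
        System.out.println("truncated record rejected: "
                + (read(ByteBuffer.wrap(truncated)) == null));

        // Simulate corruption outside the body by flipping a bit in the length header.
        byte[] corrupt = record.clone();
        corrupt[3] ^= 0x01;
        System.out.println("corrupt header rejected: "
                + (read(ByteBuffer.wrap(corrupt)) == null));
    }
}
```

In a layout like this, a replay can stop at the first record that fails to
read, so a torn write at the tail of a log is detected from the record itself
rather than from a separately stored offset.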