[
https://issues.apache.org/jira/browse/FLUME-1432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Brock Noland updated FLUME-1432:
--------------------------------
Attachment: FLUME-1432-6.patch
Here is the patch again, this time without --binary.
> FileChannel should replay logs in the order they were written
> -------------------------------------------------------------
>
> Key: FLUME-1432
> URL: https://issues.apache.org/jira/browse/FLUME-1432
> Project: Flume
> Issue Type: Bug
> Components: Channel
> Affects Versions: v1.2.0
> Reporter: Brock Noland
> Assignee: Brock Noland
> Attachments: FLUME-1432-6.patch, FLUME-1432-6.patch,
> flume-1432-resources.tar.gz
>
>
> Currently we replay the logs one at a time, causing us to build a large queue
> of pending takes. Additionally, there may be scenarios where this simply will
> not work. Take a queue which is full (via checkpoint) and two files:
> 1:
> put
> commit
> put
> commit
> 2:
> take
> commit
> take
> commit
> take
> commit
> Replaying these logs in the current form will not work because we will try
> to replay the puts first and exceed our queue size. For these reasons, we
> should replay them in the order they were written.
> However, at present there is no way to do this. Currently we have two
> identifiers in each record we write, a transaction id and a timestamp.
> Neither can be used to replay logs in order. The transaction id is created
> when we create the transaction, not when we write to the log: someone could
> create a transaction, sleep, and then do work. The timestamp is not granular
> enough, as we could have duplicates.
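
To make the failure mode concrete, here is a minimal Python sketch of the scenario described above. It is not Flume code. It assumes a hypothetical write-order ID that is assigned at the moment a record is written to a log (which, as the description notes, does not exist today), and the names replay, QueueFull, file_1 and file_2, as well as the capacity of 3, are illustrative only.

```python
import heapq
from collections import deque


class QueueFull(Exception):
    """Raised when a replayed put would exceed the queue capacity."""


def replay(records, capacity, initial):
    """Apply (write_order_id, op) records to a bounded queue.

    Returns the number of events left in the queue after replay.
    Commits are treated as no-ops to keep the sketch short.
    """
    queue = deque(initial)
    for write_order_id, op in records:
        if op == "put":
            if len(queue) >= capacity:
                raise QueueFull(f"put {write_order_id} exceeds capacity {capacity}")
            queue.append(write_order_id)
        elif op == "take":
            queue.popleft()
    return len(queue)


# The queue is already full according to the checkpoint.
checkpoint = ["e1", "e2", "e3"]

# Hypothetical write-order IDs: the two files were written concurrently,
# so their IDs interleave (take, put, take, put, take).
file_1 = [(3, "put"), (4, "commit"), (7, "put"), (8, "commit")]
file_2 = [(1, "take"), (2, "commit"), (5, "take"), (6, "commit"),
          (9, "take"), (10, "commit")]

# Replaying one file at a time hits the puts first and overflows.
try:
    replay(file_1 + file_2, capacity=3, initial=checkpoint)
except QueueFull as err:
    print("file-by-file replay failed:", err)

# Merging both files by write-order ID replays them as they were written.
remaining = replay(heapq.merge(file_1, file_2), capacity=3, initial=checkpoint)
print("ordered replay left", remaining, "events in the queue")
```

heapq.merge is used here because each file is already sorted by its own write order, so a streaming merge yields the global order without loading and sorting every record at once. Whether that matches the approach taken in the attached patch is not something this sketch claims.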