[ https://issues.apache.org/jira/browse/FLUME-1432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Brock Noland updated FLUME-1432:
--------------------------------
Summary: FileChannel should replay logs in the order they were written
(was: FileChannel should replay logs in order of them being written)
> FileChannel should replay logs in the order they were written
> -------------------------------------------------------------
>
> Key: FLUME-1432
> URL: https://issues.apache.org/jira/browse/FLUME-1432
> Project: Flume
> Issue Type: Bug
> Components: Channel
> Affects Versions: v1.2.0
> Reporter: Brock Noland
> Assignee: Brock Noland
>
> Currently we replay the logs one at a time, causing us to build a large queue
> of pending takes. Additionally, there may be scenarios where this simply will
> not work. Take a queue which is full (via checkpoint) and two files:
> 1:
> put
> commit
> put
> commit
> 2:
> take
> commit
> take
> commit
> take
> commit
> Replaying these logs in the current form will not work because we will try
> to replay the puts first and exceed our queue size. For these reasons, we
> should replay them in the order they were written.
> However, at present there is no way to do this. Currently we have two
> identifiers in each record we write: a transaction id and a timestamp. Neither
> can be used to replay logs in order. The transaction id is created when we
> create the transaction, not when we write to the log; someone could create a
> transaction, sleep, and then do work. The timestamp is not granular enough,
> as we could have duplicates.
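
The scenario in the description can be reproduced with a small sketch. It is only an illustration under assumptions, not FileChannel code: the class ReplayOrderSketch, the LogRecord type, the writeOrderId field, and the AtomicLong counter are all hypothetical names, commits are left out, and every put or take is treated as already committed. The id is assigned at the moment a record is appended, so it is unique and reflects write order, which neither the transaction id nor the timestamp guarantees.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

public class ReplayOrderSketch {
    enum Op { PUT, TAKE }

    /** Hypothetical log record; writeOrderId is assigned when the record is written. */
    static final class LogRecord {
        final long writeOrderId;
        final Op op;

        LogRecord(long writeOrderId, Op op) {
            this.writeOrderId = writeOrderId;
            this.op = op;
        }
    }

    /** Hypothetical process-wide counter: unique and increasing for every write. */
    private static final AtomicLong NEXT_WRITE_ORDER_ID = new AtomicLong();

    /** Appends a record to a log file, stamping it at write time. */
    static void append(List<LogRecord> file, Op op) {
        file.add(new LogRecord(NEXT_WRITE_ORDER_ID.incrementAndGet(), op));
    }

    /** Applies records in the given order; false if the queue bounds are violated. */
    static boolean replay(List<LogRecord> records, int queueSize, int capacity) {
        int size = queueSize;
        for (LogRecord r : records) {
            size += (r.op == Op.PUT) ? 1 : -1;
            if (size > capacity || size < 0) {
                return false;
            }
        }
        return true;
    }

    /** Current behaviour in the description: all of file 1, then all of file 2. */
    static List<LogRecord> fileByFile(List<LogRecord> file1, List<LogRecord> file2) {
        List<LogRecord> all = new ArrayList<>(file1);
        all.addAll(file2);
        return all;
    }

    /** Proposed behaviour: interleave both files by the write-time id. */
    static List<LogRecord> byWriteOrder(List<LogRecord> file1, List<LogRecord> file2) {
        List<LogRecord> all = fileByFile(file1, file2);
        all.sort(Comparator.comparingLong((LogRecord r) -> r.writeOrderId));
        return all;
    }

    public static void main(String[] args) {
        List<LogRecord> file1 = new ArrayList<>(); // holds the puts
        List<LogRecord> file2 = new ArrayList<>(); // holds the takes

        // One write order that yields the two files from the description.
        append(file2, Op.TAKE);
        append(file1, Op.PUT);
        append(file2, Op.TAKE);
        append(file1, Op.PUT);
        append(file2, Op.TAKE);

        int capacity = 2;
        int checkpointSize = 2; // the queue is full at the checkpoint

        System.out.println("file by file: "
                + replay(fileByFile(file1, file2), checkpointSize, capacity));
        System.out.println("write order:  "
                + replay(byWriteOrder(file1, file2), checkpointSize, capacity));
    }
}
```

With a capacity of 2 and a full checkpoint, the file-by-file replay fails on the first put, while the write-order replay never exceeds the queue size.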