Brock Noland created FLUME-1580:
-----------------------------------
Summary: FileChannel some sets of log files cannot be replayed
Key: FLUME-1580
URL: https://issues.apache.org/jira/browse/FLUME-1580
Project: Flume
Issue Type: Improvement
Components: Channel
Affects Versions: v1.3.0
Reporter: Brock Noland
Assignee: Brock Noland
When log files no longer contain puts referenced in the queue, we delete them.
Deleting these log files is necessary to free up space. However, it can
cause the error below in the following scenario:
Imagine a queue with capacity 2 and the following activity:
put a in log 1
put b in log 1
take a from log 2
take b from log 2
put c in log 1
put d in log 1
roll logs 1 & 2
checkpoint and delete log 2 since no puts in the queue reference it
for whatever reason the checkpoint is deleted
On replay, log 2 and the takes it recorded are gone, so we will see:
put a in log 1
put b in log 1
put c in log 1 <- this will exceed the queue capacity and throw the error below
put d in log 1
{noformat}
2012-09-13 17:45:14,095 (lifecycleSupervisor-1-0) [ERROR - org.apache.flume.channel.file.Log.replay(Log.java:354)] Failed to initialize Log on [channel=channel1]
java.lang.IllegalStateException: Unable to add FlumeEventPointer [fileID=15, offset=2104422]. Queue depth = 5000, Capacity = 5000
    at org.apache.flume.channel.file.ReplayHandler.processCommit(ReplayHandler.java:394)
    at org.apache.flume.channel.file.ReplayHandler.replayLog(ReplayHandler.java:329)
    at org.apache.flume.channel.file.Log.replay(Log.java:339)
    at org.apache.flume.channel.file.FileChannel.start(FileChannel.java:272)
    at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:236)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
{noformat}
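The scenario above can be modeled with a small, self-contained sketch. This is a toy simulation and not Flume's actual replay code: the class ReplaySketch, its Record type, and the deletedLogId parameter are hypothetical names introduced only for illustration, and events are plain strings rather than FlumeEventPointers.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Toy model of the replay problem; all names are hypothetical, not Flume APIs. */
class ReplaySketch {

    /** One log record: a put or a take of an event, stored in a given log file. */
    static final class Record {
        final boolean isPut;
        final String event;
        final int logId;

        Record(boolean isPut, String event, int logId) {
            this.isPut = isPut;
            this.event = event;
            this.logId = logId;
        }
    }

    static Record put(String event, int logId) {
        return new Record(true, event, logId);
    }

    static Record take(String event, int logId) {
        return new Record(false, event, logId);
    }

    /** The activity from the description: puts go to log 1, takes go to log 2. */
    static List<Record> scenario() {
        return Arrays.asList(
                put("a", 1), put("b", 1),
                take("a", 2), take("b", 2),
                put("c", 1), put("d", 1));
    }

    /**
     * Replays records in order into a bounded queue with no checkpoint.
     * Records stored in deletedLogId are skipped, as if that file were gone.
     * Pass a log id that is not used (for example -1) to replay every record.
     */
    static Deque<String> replay(List<Record> records, int capacity, int deletedLogId) {
        Deque<String> queue = new ArrayDeque<>();
        for (Record r : records) {
            if (r.logId == deletedLogId) {
                continue; // the file was deleted, so its records are never seen
            }
            if (r.isPut) {
                if (queue.size() >= capacity) {
                    throw new IllegalStateException("Unable to add " + r.event
                            + ". Queue depth = " + queue.size()
                            + ", Capacity = " + capacity);
                }
                queue.addLast(r.event);
            } else {
                queue.remove(r.event);
            }
        }
        return queue;
    }

    public static void main(String[] args) {
        // All log files present: the takes free up room for c and d.
        System.out.println(replay(scenario(), 2, -1));
        // Log 2 deleted: the third put overflows the queue.
        try {
            replay(scenario(), 2, 2);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
        // Larger capacity: replay succeeds, but a and b come back as duplicates.
        System.out.println(replay(scenario(), 4, 2));
    }
}
```

Running main prints [c, d] for the full replay, then "Unable to add c. Queue depth = 2, Capacity = 2" once log 2 is missing, then [a, b, c, d] with the capacity raised to 4. In the last case a and b were already taken before the restart, so they are delivered twice.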
The workaround at present is to delete the checkpoint, increase the capacity of
the channel, and restart. There will be duplicate events in this case, because
the takes recorded in the deleted log files are not replayed.