[
https://issues.apache.org/jira/browse/FLUME-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Brock Noland updated FLUME-1580:
--------------------------------
Description:
When log files no longer have puts referenced in the queue, we delete them.
Deleting these log files is necessary to free up space. However, this can
cause the error below in the following scenario:
Imagine a queue with capacity 2 and the following activity:
put a in log 1
put b in log 1
take a (recorded in log 2)
take b (recorded in log 2)
put c in log 1
put d in log 1
roll logs 1 & 2
checkpoint and delete log 2 since no puts in the queue reference it
for whatever reason the checkpoint is deleted
On replay we will see:
put a in log 1
put b in log 1
put c in log 1 <- the takes of a and b were lost with log 2, so this will exceed the queue capacity and throw the error below
put d in log 1
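The scenario above can be sketched as a small simulation. This is only an illustrative model, not Flume's replay code: the class `ReplaySketch`, the `Op` type, and the helpers `put`, `take`, and `replay` are hypothetical names, and the capacity check merely imitates the one that fails during replay.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical model of the scenario; none of these names are Flume classes.
class ReplaySketch {
    // One log record: a put or a take of a named event.
    static final class Op {
        final boolean isPut;
        final String event;
        Op(boolean isPut, String event) { this.isPut = isPut; this.event = event; }
    }

    static Op put(String event) { return new Op(true, event); }
    static Op take(String event) { return new Op(false, event); }

    // Replays records in order into a queue bounded by the given capacity.
    static Deque<String> replay(List<Op> records, int capacity) {
        Deque<String> queue = new ArrayDeque<>();
        for (Op op : records) {
            if (op.isPut) {
                if (queue.size() >= capacity) {
                    throw new IllegalStateException("Unable to add " + op.event
                        + ". Queue depth = " + queue.size() + ", Capacity = " + capacity);
                }
                queue.addLast(op.event);
            } else {
                queue.remove(op.event);
            }
        }
        return queue;
    }

    public static void main(String[] args) {
        // Both logs present: the takes free up space before c and d arrive.
        List<Op> bothLogs = Arrays.asList(
            put("a"), put("b"), take("a"), take("b"), put("c"), put("d"));
        System.out.println(replay(bothLogs, 2));

        // Log 2 deleted and no checkpoint: only the puts from log 1 remain.
        List<Op> log1Only = Arrays.asList(put("a"), put("b"), put("c"), put("d"));
        try {
            replay(log1Only, 2);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With both logs the replay ends with c and d in the queue; with only log 1 the third put hits the capacity check.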
Example error message:
{noformat}
2012-09-13 17:45:14,095 (lifecycleSupervisor-1-0) [ERROR - org.apache.flume.channel.file.Log.replay(Log.java:354)] Failed to initialize Log on [channel=channel1]
java.lang.IllegalStateException: Unable to add FlumeEventPointer [fileID=15, offset=2104422]. Queue depth = 5000, Capacity = 5000
    at org.apache.flume.channel.file.ReplayHandler.processCommit(ReplayHandler.java:394)
    at org.apache.flume.channel.file.ReplayHandler.replayLog(ReplayHandler.java:329)
    at org.apache.flume.channel.file.Log.replay(Log.java:339)
    at org.apache.flume.channel.file.FileChannel.start(FileChannel.java:272)
    at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:236)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
{noformat}
The workaround at present is to delete the checkpoint, increase the capacity of
the channel, and restart. Events that were already taken will be replayed, so
there will be duplicate events in this case.
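Why the workaround produces duplicates can be illustrated with a short sketch under the same capacity-2 scenario. The names `WorkaroundSketch`, `replayPuts`, and `duplicates` are hypothetical and only model the behavior described here; they are not part of Flume.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the workaround; not Flume code.
class WorkaroundSketch {
    // Replays put records only (the takes were lost with the deleted log)
    // into a queue bounded by the given capacity.
    static List<String> replayPuts(List<String> puts, int capacity) {
        List<String> queue = new ArrayList<>();
        for (String event : puts) {
            if (queue.size() >= capacity) {
                throw new IllegalStateException("Unable to add " + event
                    + ". Queue depth = " + queue.size() + ", Capacity = " + capacity);
            }
            queue.add(event);
        }
        return queue;
    }

    // Events in the replayed queue that had already been taken before the restart.
    static List<String> duplicates(List<String> replayed, List<String> alreadyTaken) {
        List<String> dups = new ArrayList<>(replayed);
        dups.retainAll(alreadyTaken);
        return dups;
    }

    public static void main(String[] args) {
        List<String> log1Puts = Arrays.asList("a", "b", "c", "d");
        // Capacity raised from 2 to 4 so that every put in log 1 fits.
        List<String> queue = replayPuts(log1Puts, 4);
        // a and b were taken before the restart, so they will be delivered twice.
        System.out.println(duplicates(queue, Arrays.asList("a", "b")));
    }
}
```

In this model the raised capacity lets the replay finish, but a and b are back in the queue even though they were taken earlier.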
> FileChannel some sets of log files cannot be replayed
> -----------------------------------------------------
>
> Key: FLUME-1580
> URL: https://issues.apache.org/jira/browse/FLUME-1580
> Project: Flume
> Issue Type: Improvement
> Components: Channel
> Affects Versions: v1.3.0
> Reporter: Brock Noland
> Assignee: Brock Noland