To add more information: I checked the file channel against which the "ChannelException: Cannot acquire capacity" is reported. In its data directory, log-1 has a size of 0 and log-2 holds over 300 MB of data. For comparison, another file channel has log-2 and log-3, both containing data, and no log-1 at all.
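For anyone who wants to repeat the same check on their own channels, here is a minimal Python sketch of what I did by hand. It only assumes the data files are named log-N, as in the listing above; the function name inspect_data_dir is made up for illustration and is not part of Flume. An empty file is only a hint worth investigating, not proof of corruption.

```python
import re
import sys
from pathlib import Path

# Matches data files such as log-1, log-2 (and nothing with a suffix).
LOG_NAME = re.compile(r"^log-(\d+)$")


def inspect_data_dir(data_dir):
    """Return ({log id: size in bytes}, [ids of zero-byte logs]) for data_dir.

    Hypothetical helper for illustration only; it just lists files and
    does not read or interpret their contents.
    """
    sizes = {}
    for path in Path(data_dir).iterdir():
        match = LOG_NAME.match(path.name)
        if match and path.is_file():
            sizes[int(match.group(1))] = path.stat().st_size
    empty = sorted(log_id for log_id, size in sizes.items() if size == 0)
    return sizes, empty


if __name__ == "__main__":
    # Pass one or more data directories on the command line.
    for directory in sys.argv[1:]:
        sizes, empty = inspect_data_dir(directory)
        print(directory)
        for log_id in sorted(sizes):
            print(f"  log-{log_id}: {sizes[log_id]} bytes")
        if empty:
            print("  zero-byte files:", ", ".join(f"log-{i}" for i in empty))
```

Running it against a directory such as /home/user/flume-ng/filechannel3/data (the path from my log output) would print each log-N file with its size and flag any that are empty.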
It sounds like log-1 is the file causing the "NullPointerException: LogFile is null for id 1" below. When I restarted Flume, I got the following warnings. I can confirm there was no manual tampering in the file channel directory.

2012-10-02 09:38:10,231 INFO [conf-file-poller-0] DefaultLogicalNodeManager.java - Starting Channel probeFileChannel1
2012-10-02 09:38:10,239 INFO [conf-file-poller-0] DefaultLogicalNodeManager.java - Starting Channel probeFileChannel3
2012-10-02 09:38:10,313 WARN [lifecycleSupervisor-1-2] ReplayHandler.java - Hit EOF on /home/user/flume-ng/filechannel3/data/log-1
2012-10-02 09:38:10,314 INFO [lifecycleSupervisor-1-1] DirectMemoryUtils.java - Unable to get maxDirectMemory from VM: NoSuchMethodException: sun.misc.VM.maxDirectMemory(null)
2012-10-02 09:38:10,317 INFO [lifecycleSupervisor-1-1] DirectMemoryUtils.java - Direct Memory Allocation: Allocation = 1048576, Allocated = 0, MaxDirectMemorySize = 954466304, Remaining = 954466304
2012-10-02 09:38:10,381 WARN [lifecycleSupervisor-1-1] LogFile.java - Checkpoint for file(/home/user/flume-ng/filechannel1/data/log-2) is: 1349166469095, which is beyond the requested checkpoint time: 0.
2012-10-02 09:38:10,381 WARN [lifecycleSupervisor-1-2] LogFile.java - Checkpoint for file(/home/user/flume-ng/filechannel3/data/log-2) is: 1349166991594, which is beyond the requested checkpoint time: 0.
2012-10-02 09:41:52,144 ERROR [lifecycleSupervisor-1-2] ReplayHandler.java - Pending takes 32103 exist after the end of replay. Duplicate messages will exist in destination.
2012-10-02 09:41:52,709 INFO [lifecycleSupervisor-1-2] MonitoredCounterGroup.java - Component type: CHANNEL, name: probeFileChannel3 started
2012-10-02 09:42:31,413 WARN [lifecycleSupervisor-1-1] LogFile.java - Checkpoint for file(/home/cluster_admin/flume-ng/filechannel1/data/log-3) is: 1349166981020, which is beyond the requested checkpoint time: 0.
2012-10-02 09:45:14,836 ERROR [lifecycleSupervisor-1-1] ReplayHandler.java - Pending takes 8409 exist after the end of replay. Duplicate messages will exist in destination.
2012-10-02 09:45:15,453 INFO [lifecycleSupervisor-1-1] MonitoredCounterGroup.java - Component type: CHANNEL, name: probeFileChannel1 started

On Tue, Oct 2, 2012 at 9:19 AM, Raymond Ng <[email protected]> wrote:

> Hi
>
> Could I have some advice for the following exception please, is this
> related to the "ChannelException: Cannot acquire capacity" which I
> experience from time to time
>
>
> 2012-10-02 09:16:53,563 ERROR [Log-BackgroundWorker] Log.java - General error in checkpoint worker
> java.lang.NullPointerException
>     at org.apache.flume.channel.file.Log.writeCheckpoint(Log.java:738)
>     at org.apache.flume.channel.file.Log.writeCheckpoint(Log.java:692)
>     at org.apache.flume.channel.file.Log.access$300(Log.java:57)
>     at org.apache.flume.channel.file.Log$BackgroundWorker.run(Log.java:892)
> 2012-10-02 09:16:56,317 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] HDFSEventSink.java - process failed
> java.lang.NullPointerException: LogFile is null for id 1
>     at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:204)
>     at org.apache.flume.channel.file.Log.get(Log.java:316)
>     at org.apache.flume.channel.file.FileChannel$FileBackedTransaction.doTake(FileChannel.java:373)
>     at org.apache.flume.channel.BasicTransactionSemantics.take(BasicTransactionSemantics.java:113)
>     at org.apache.flume.channel.BasicChannelSemantics.take(BasicChannelSemantics.java:91)
>     at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:383)
>     at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
>     at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>     at java.lang.Thread.run(Thread.java:662)
> 2012-10-02 09:16:56,318 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] SinkRunner.java - Unable to deliver event. Exception follows.
> org.apache.flume.EventDeliveryException: java.lang.NullPointerException: LogFile is null for id 1
>     at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:450)
>     at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
>     at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>     at java.lang.Thread.run(Thread.java:662)
> Caused by: java.lang.NullPointerException: LogFile is null for id 1
>     at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:204)
>     at org.apache.flume.channel.file.Log.get(Log.java:316)
>     at org.apache.flume.channel.file.FileChannel$FileBackedTransaction.doTake(FileChannel.java:373)
>     at org.apache.flume.channel.BasicTransactionSemantics.take(BasicTransactionSemantics.java:113)
>     at org.apache.flume.channel.BasicChannelSemantics.take(BasicChannelSemantics.java:91)
>     at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:383)
>     ... 3 more
> 2012-10-02 09:16:56,625 ERROR [Log-BackgroundWorker] Log.java - General error in checkpoint worker
> java.lang.NullPointerException
>     at org.apache.flume.channel.file.Log.writeCheckpoint(Log.java:738)
>     at org.apache.flume.channel.file.Log.writeCheckpoint(Log.java:692)
>     at org.apache.flume.channel.file.Log.access$300(Log.java:57)
>     at org.apache.flume.channel.file.Log$BackgroundWorker.run(Log.java:892)
> 2012-10-02 09:16:59,678 ERROR [Log-BackgroundWorker] Log.java - General error in checkpoint worker
> java.lang.NullPointerException
>     at org.apache.flume.channel.file.Log.writeCheckpoint(Log.java:738)
>     at org.apache.flume.channel.file.Log.writeCheckpoint(Log.java:692)
>     at org.apache.flume.channel.file.Log.access$300(Log.java:57)
>     at org.apache.flume.channel.file.Log$BackgroundWorker.run(Log.java:892)
> 2012-10-02 09:17:01,318 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] HDFSEventSink.java - process failed
> java.lang.NullPointerException: LogFile is null for id 1
>     at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:204)
>     at org.apache.flume.channel.file.Log.get(Log.java:316)
>     at org.apache.flume.channel.file.FileChannel$FileBackedTransaction.doTake(FileChannel.java:373)
>     at org.apache.flume.channel.BasicTransactionSemantics.take(BasicTransactionSemantics.java:113)
>     at org.apache.flume.channel.BasicChannelSemantics.take(BasicChannelSemantics.java:91)
>     at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:383)
>     at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
>     at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>     at java.lang.Thread.run(Thread.java:662)
> 2012-10-02 09:17:01,318 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] SinkRunner.java - Unable to deliver event. Exception follows.
> org.apache.flume.EventDeliveryException: java.lang.NullPointerException: LogFile is null for id 1
>     at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:450)
>     at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
>     at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>     at java.lang.Thread.run(Thread.java:662)
> Caused by: java.lang.NullPointerException: LogFile is null for id 1
>     at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:204)
>     at org.apache.flume.channel.file.Log.get(Log.java:316)
>     at org.apache.flume.channel.file.FileChannel$FileBackedTransaction.doTake(FileChannel.java:373)
>     at org.apache.flume.channel.BasicTransactionSemantics.take(BasicTransactionSemantics.java:113)
>     at org.apache.flume.channel.BasicChannelSemantics.take(BasicChannelSemantics.java:91)
>     at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:383)
>     ... 3 more
>
>
>
> --
> Rgds
> Ray
>

--
Rgds
Ray
