Can you send your full logs? I suspect the channel did a full replay because it 
was restarted during a restart. (If it did, the logs would show a 
BadCheckpointException). 


Hari 

-- 
Hari Shreedharan


On Monday, February 25, 2013 at 1:20 PM, Rahul Ravindran wrote:

> Thanks Hari. I had waited for 20 minutes and this did not move change. Now, 
> after more than an hour, I see it working
> 
> From: Hari Shreedharan <[email protected] 
> (mailto:[email protected])>
> To: [email protected] (mailto:[email protected]); Rahul Ravindran 
> <[email protected] (mailto:[email protected])> 
> Sent: Monday, February 25, 2013 12:46 PM
> Subject: Re: File Channel error stops flume
> 
> Rahul, 
> 
> Those messages actually just suggest that your channel is replaying. The 
> channel will complete the replay and the agent will start the sinks once the 
> channel is ready. It might take a few minutes based on how many events you 
> have in the channel.
> 
> 
> Hari 
> 
> -- 
> Hari Shreedharan
> 
> On Monday, February 25, 2013 at 12:07 PM, Rahul Ravindran wrote:
> > Hi,
> >    I modified a parameter to the HDFS sink on a flume config (added an 
> > idleInterval) on 2 machines. Things worked fine on one, and not on the 
> > other. I tried restarting flume a couple of times and I continue seeing the 
> > same log statement (bolded below) with no writes to HDFS
> > 
> > 25 Feb 2013 08:27:00,174 INFO  [Log-BackgroundWorker-ch2] 
> > (org.apache.flume.channel.file.EventQueueBackingStoreFile.checkpoint:109)  
> > - Start checkpoint for /flume2/checkpoint/checkpoint, elements to sync = 
> > 8506
> > :%
> > 25 Feb 2013 19:55:51,577 INFO  [lifecycleSupervisor-1-0] 
> > (org.apache.flume.channel.file.ReplayHandler.replayLog:236)  - Replaying 
> > /flume2/data/log-17
> > 25 Feb 2013 19:55:51,585 INFO  [lifecycleSupervisor-1-1] 
> > (org.apache.flume.channel.file.ReplayHandler.replayLog:236)  - Replaying 
> > /flume1/data/log-17
> > 25 Feb 2013 19:55:51,588 INFO  [lifecycleSupervisor-1-0] 
> > (org.apache.flume.tools.DirectMemoryUtils.getDefaultDirectMemorySize:113)  
> > - Unable to get maxDirectMemory from VM: NoSuchMethodException: 
> > sun.misc.VM.maxDirectMemory(null)
> > 25 Feb 2013 19:55:51,592 INFO  [lifecycleSupervisor-1-0] 
> > (org.apache.flume.tools.DirectMemoryUtils.allocate:47)  - Direct Memory 
> > Allocation:  Allocation = 1048576, Allocated = 0, MaxDirectMemorySize = 
> > 268435456, Remaining = 268435456
> > 25 Feb 2013 19:55:51,634 INFO  [lifecycleSupervisor-1-1] 
> > (org.apache.flume.channel.file.LogFile$SequentialReader.skipToLastCheckpointPosition:466)
> >   - fast-forward to checkpoint position: 1622812128
> > 25 Feb 2013 19:55:51,634 INFO  [lifecycleSupervisor-1-0] 
> > (org.apache.flume.channel.file.LogFile$SequentialReader.skipToLastCheckpointPosition:466)
> >   - fast-forward to checkpoint position: 1622720601
> > 25 Feb 2013 19:55:51,654 INFO  [lifecycleSupervisor-1-0] 
> > (org.apache.flume.channel.file.ReplayHandler.replayLog:236)  - Replaying 
> > /flume2/data/log-18
> > 25 Feb 2013 19:55:51,655 INFO  [lifecycleSupervisor-1-0] 
> > (org.apache.flume.channel.file.LogFile$SequentialReader.skipToLastCheckpointPosition:466)
> >   - fast-forward to checkpoint position: 1622821593
> > 25 Feb 2013 19:55:51,655 INFO  [lifecycleSupervisor-1-0] 
> > (org.apache.flume.channel.file.ReplayHandler.replayLog:236)  - Replaying 
> > /flume2/data/log-19
> > 25 Feb 2013 19:55:51,656 INFO  [lifecycleSupervisor-1-0] 
> > (org.apache.flume.channel.file.LogFile$SequentialReader.skipToLastCheckpointPosition:466)
> >   - fast-forward to checkpoint position: 1622678590
> > 25 Feb 2013 19:55:51,656 INFO  [lifecycleSupervisor-1-0] 
> > (org.apache.flume.channel.file.ReplayHandler.replayLog:236)  - Replaying 
> > /flume2/data/log-20
> > 25 Feb 2013 19:55:51,657 INFO  [lifecycleSupervisor-1-0] 
> > (org.apache.flume.channel.file.LogFile$SequentialReader.skipToLastCheckpointPosition:466)
> >   - fast-forward to checkpoint position: 244707334
> > 25 Feb 2013 19:55:51,657 INFO  [lifecycleSupervisor-1-0] 
> > (org.apache.flume.channel.file.ReplayHandler.replayLog:236)  - Replaying 
> > /flume2/data/log-21
> > 25 Feb 2013 19:55:51,657 INFO  [lifecycleSupervisor-1-0] 
> > (org.apache.flume.channel.file.LogFile$SequentialReader.skipToLastCheckpointPosition:466)
> >   - fast-forward to checkpoint position: 530601497
> > 25 Feb 2013 19:55:51,658 INFO  [lifecycleSupervisor-1-0] 
> > (org.apache.flume.channel.file.LogFile$SequentialReader.next:491)  - 
> > Encountered EOF at 530601497 in /flume2/data/log-21
> > 25 Feb 2013 19:55:51,658 INFO  [lifecycleSupervisor-1-0] 
> > (org.apache.flume.channel.file.ReplayHandler.replayLog:236)  - Replaying 
> > /flume2/data/log-22
> > 25 Feb 2013 19:55:51,658 INFO  [lifecycleSupervisor-1-1] 
> > (org.apache.flume.channel.file.ReplayHandler.replayLog:236)  - Replaying 
> > /flume1/data/log-18
> > 25 Feb 2013 19:55:51,658 WARN  [lifecycleSupervisor-1-0] 
> > (org.apache.flume.channel.file.LogFile$SequentialReader.skipToLastCheckpointPosition:470)
> >   - Checkpoint for file(/flume2/data/log-22) is: 1361844516782, which is 
> > beyond the requested checkpoint time: 1361844516783 and position 0
> > 25 Feb 2013 19:55:51,659 INFO  [lifecycleSupervisor-1-1] 
> > (org.apache.flume.channel.file.LogFile$SequentialReader.skipToLastCheckpointPosition:466)
> >   - fast-forward to checkpoint position: 1622674426
> > 25 Feb 2013 19:55:51,659 INFO  [lifecycleSupervisor-1-0] 
> > (org.apache.flume.channel.file.ReplayHandler.replayLog:236)  - Replaying 
> > /flume2/data/log-23
> > 25 Feb 2013 19:55:51,659 INFO  [lifecycleSupervisor-1-1] 
> > (org.apache.flume.channel.file.ReplayHandler.replayLog:236)  - Replaying 
> > /flume1/data/log-19
> > 25 Feb 2013 19:55:51,659 WARN  [lifecycleSupervisor-1-0] 
> > (org.apache.flume.channel.file.LogFile$SequentialReader.skipToLastCheckpointPosition:470)
> >   - Checkpoint for file(/flume2/data/log-23) is: 1361844516783, which is 
> > beyond the requested checkpoint time: 1361844516783 and position 0
> > 25 Feb 2013 19:55:51,660 INFO  [lifecycleSupervisor-1-1] 
> > (org.apache.flume.channel.file.LogFile$SequentialReader.skipToLastCheckpointPosition:466)
> >   - fast-forward to checkpoint position: 1622239091
> > 25 Feb 2013 19:55:51,660 INFO  [lifecycleSupervisor-1-1] 
> > (org.apache.flume.channel.file.ReplayHandler.replayLog:236)  - Replaying 
> > /flume1/data/log-20
> > 25 Feb 2013 19:55:51,661 INFO  [lifecycleSupervisor-1-1] 
> > (org.apache.flume.channel.file.LogFile$SequentialReader.skipToLastCheckpointPosition:466)
> >   - fast-forward to checkpoint position: 221490603
> > 25 Feb 2013 19:55:51,661 INFO  [lifecycleSupervisor-1-1] 
> > (org.apache.flume.channel.file.ReplayHandler.replayLog:236)  - Replaying 
> > /flume1/data/log-21
> > 25 Feb 2013 19:55:51,661 INFO  [lifecycleSupervisor-1-1] 
> > (org.apache.flume.channel.file.LogFile$SequentialReader.skipToLastCheckpointPosition:466)
> >   - fast-forward to checkpoint position: 532696754
> > 25 Feb 2013 19:55:52,048 INFO  [lifecycleSupervisor-1-1] 
> > (org.apache.flume.channel.file.LogFile$SequentialReader.next:491)  - 
> > Encountered EOF at 1623195597 in /flume1/data/log-17
> > 25 Feb 2013 19:55:52,103 INFO  [lifecycleSupervisor-1-0] 
> > (org.apache.flume.channel.file.LogFile$SequentialReader.next:491)  - 
> > Encountered EOF at 1623195583 in /flume2/data/log-17
> > 25 Feb 2013 19:55:52,308 INFO  [lifecycleSupervisor-1-0] 
> > (org.apache.flume.channel.file.LogFile$SequentialReader.next:491)  - 
> > Encountered EOF at 1623195536 in /flume2/data/log-18
> > 25 Feb 2013 19:55:52,319 INFO  [lifecycleSupervisor-1-1] 
> > (org.apache.flume.channel.file.LogFile$SequentialReader.next:491)  - 
> > Encountered EOF at 1623195584 in /flume1/data/log-18
> > 25 Feb 2013 19:55:52,418 INFO  [lifecycleSupervisor-1-0] 
> > (org.apache.flume.channel.file.LogFile$SequentialReader.next:491)  - 
> > Encountered EOF at 1623195600 in /flume2/data/log-19
> > 25 Feb 2013 19:55:52,439 INFO  [lifecycleSupervisor-1-0] 
> > (org.apache.flume.channel.file.LogFile$SequentialReader.next:491)  - 
> > Encountered EOF at 244816205 in /flume2/data/log-20
> > 25 Feb 2013 19:55:52,440 INFO  [lifecycleSupervisor-1-0] 
> > (org.apache.flume.channel.file.ReplayHandler.replayLog:320)  - read: 12348, 
> > put: 0, take: 0, rollback: 0, commit: 0, skip: 12348, eventCount:0
> > 25 Feb 2013 19:55:52,441 INFO  [lifecycleSupervisor-1-0] 
> > (org.apache.flume.channel.file.Log.replay:399)  - Rolling /flume2/data
> > 25 Feb 2013 19:55:52,441 INFO  [lifecycleSupervisor-1-0] 
> > (org.apache.flume.channel.file.Log.roll:811)  - Roll start /flume2/data
> > 25 Feb 2013 19:55:52,443 INFO  [lifecycleSupervisor-1-0] 
> > (org.apache.flume.channel.file.LogFile$Writer.<init>:171)  - Opened 
> > /flume2/data/log-24
> > 25 Feb 2013 19:55:52,449 INFO  [lifecycleSupervisor-1-0] 
> > (org.apache.flume.channel.file.Log.roll:826)  - Roll end
> > 25 Feb 2013 19:55:52,453 INFO  [lifecycleSupervisor-1-0] 
> > (org.apache.flume.channel.file.EventQueueBackingStoreFile.checkpoint:109)  
> > - Start checkpoint for /flume2/checkpoint/checkpoint, elements to sync = 0
> > 25 Feb 2013 19:55:52,455 INFO  [lifecycleSupervisor-1-0] 
> > (org.apache.flume.channel.file.EventQueueBackingStoreFile.checkpoint:117)  
> > - Updating checkpoint metadata: logWriteOrderID: 1361844516784, queueSize: 
> > 34525000, queueHead: 40625267
> > 25 Feb 2013 19:55:52,489 INFO  [lifecycleSupervisor-1-0] 
> > (org.apache.flume.channel.file.LogFileV3$MetaDataWriter.markCheckpoint:85)  
> > - Updating log-24.meta currentPosition = 0, logWriteOrderID = 1361844516784
> > 25 Feb 2013 19:55:52,491 INFO  [lifecycleSupervisor-1-0] 
> > (org.apache.flume.channel.file.Log.writeCheckpoint:886)  - Updated 
> > checkpoint for file: /flume2/data/log-24 position: 0 logWriteOrderID: 
> > 1361844516784
> > 25 Feb 2013 19:55:52,491 INFO  [lifecycleSupervisor-1-0] 
> > (org.apache.flume.channel.file.LogFile$RandomReader.close:356)  - Closing 
> > RandomReader /flume2/data/log-17
> > 25 Feb 2013 19:55:52,497 INFO  [lifecycleSupervisor-1-0] 
> > (org.apache.flume.channel.file.LogFileV3$MetaDataWriter.markCheckpoint:85)  
> > - Updating log-17.meta currentPosition = 1622720601, logWriteOrderID = 
> > 1361844516784
> > 25 Feb 2013 19:55:52,499 INFO  [lifecycleSupervisor-1-0] 
> > (org.apache.flume.channel.file.LogFile$RandomReader.close:356)  - Closing 
> > RandomReader /flume2/data/log-18
> > 25 Feb 2013 19:55:52,500 INFO  [lifecycleSupervisor-1-1] 
> > (org.apache.flume.channel.file.LogFile$SequentialReader.next:491)  - 
> > Encountered EOF at 1623195593 in /flume1/data/log-19
> > 25 Feb 2013 19:55:52,505 INFO  [lifecycleSupervisor-1-0] 
> > (org.apache.flume.channel.file.LogFileV3$MetaDataWriter.markCheckpoint:85)  
> > - Updating log-18.meta currentPosition = 1622821593, logWriteOrderID = 
> > 1361844516784
> > 25 Feb 2013 19:55:52,507 INFO  [lifecycleSupervisor-1-0] 
> > (org.apache.flume.channel.file.LogFile$RandomReader.close:356)  - Closing 
> > RandomReader /flume2/data/log-19
> > 25 Feb 2013 19:55:52,513 INFO  [lifecycleSupervisor-1-0] 
> > (org.apache.flume.channel.file.LogFileV3$MetaDataWriter.markCheckpoint:85)  
> > - Updating log-19.meta currentPosition = 1622678590, logWriteOrderID = 
> > 1361844516784
> > 25 Feb 2013 19:55:52,514 INFO  [lifecycleSupervisor-1-0] 
> > (org.apache.flume.channel.file.LogFile$RandomReader.close:356)  - Closing 
> > RandomReader /flume2/data/log-20
> > 25 Feb 2013 19:55:52,520 INFO  [lifecycleSupervisor-1-0] 
> > (org.apache.flume.channel.file.LogFileV3$MetaDataWriter.markCheckpoint:85)  
> > - Updating log-20.meta currentPosition = 244707334, logWriteOrderID = 
> > 1361844516784
> > 25 Feb 2013 19:55:52,521 INFO  [lifecycleSupervisor-1-0] 
> > (org.apache.flume.channel.file.LogFile$RandomReader.close:356)  - Closing 
> > RandomReader /flume2/data/log-21
> > 25 Feb 2013 19:55:52,527 INFO  [lifecycleSupervisor-1-0] 
> > (org.apache.flume.channel.file.LogFileV3$MetaDataWriter.markCheckpoint:85)  
> > - Updating log-21.meta currentPosition = 530601497, logWriteOrderID = 
> > 1361844516784
> > 25 Feb 2013 19:55:52,529 INFO  [lifecycleSupervisor-1-0] 
> > (org.apache.flume.channel.file.FileChannel.start:309)  - Queue Size after 
> > replay: 34525000 [channel=ch2]
> > 25 Feb 2013 19:55:52,594 INFO  [lifecycleSupervisor-1-0] 
> > (org.apache.flume.instrumentation.MonitoredCounterGroup.register:89)  - 
> > Monitoried counter group for type: CHANNEL, name: ch2, registered 
> > successfully.
> > 25 Feb 2013 19:55:52,594 INFO  [lifecycleSupervisor-1-0] 
> > (org.apache.flume.instrumentation.MonitoredCounterGroup.start:73)  - 
> > Component type: CHANNEL, name: ch2 started
> > 25 Feb 2013 19:55:52,619 INFO  [lifecycleSupervisor-1-1] 
> > (org.apache.flume.channel.file.LogFile$SequentialReader.next:491)  - 
> > Encountered EOF at 222290119 in /flume1/data/log-20
> > 
> > 
> > 
> > 
> > 
> 
> 
> 
> 

Reply via email to