Note the dual and backup checkpoint configs here: http://flume.apache.org/FlumeUserGuide.html#file-channel
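
For reference, a minimal sketch of what enabling dual checkpointing could look like for the mc1 channel quoted below. useDualCheckpoints and backupCheckpointDir are the property names listed in the File Channel section of the user guide linked above; the backup path is only a hypothetical example, and it should be a directory distinct from the checkpoint and data directories:

```properties
# Existing checkpoint directory, unchanged from the configuration quoted below
collector.channels.mc1.checkpointDir=/home/flume/collector1/channels/mc1/checkpoint
# Keep a backup copy of the checkpoint
collector.channels.mc1.useDualCheckpoints=true
# Hypothetical location for the backup; pick any directory that is not
# the checkpointDir or one of the dataDirs
collector.channels.mc1.backupCheckpointDir=/home/flume/collector1/channels/mc1/backupCheckpoint
```

The same two properties would need to be set for each of the other file channels, each with its own backup directory.
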
On Thu, Aug 8, 2013 at 7:37 AM, Anat Rozenzon wrote:

> I use 3.5G but I can change it to 5G.
>
> Not sure I understand what you mean about dual checkpoint. This is my
> configuration for each of the three channels, should I change it?
>
> collector.channels.mc1.type = file
> collector.channels.mc1.checkpointDir=/home/flume/collector1/channels/mc1/checkpoint
> collector.channels.mc1.dataDirs=/home/flume/collector1/channels/mc1/data
> collector.channels.mc1.capacity=100000000
> collector.channels.mc1.transactionCapacity=10000
> collector.channels.mc1.use-fast-replay=true
>
>
> On Thu, Aug 8, 2013 at 3:19 PM, Brock Noland wrote:
>
>> use-fast-replay would help but you'd need 4-5GB of heap per channel. With
>> heaps that large you should be using dual checkpointing to avoid this.
>>
>> Here is the thread doing the replay:
>>
>> "lifecycleSupervisor-1-0" prio=10 tid=0x00007f040472c800 nid=0x1332b runnable [0x00007f03f84ce000]
>>    java.lang.Thread.State: RUNNABLE
>>         at org.apache.flume.channel.file.FlumeEventQueue.remove(FlumeEventQueue.java:194)
>>         - locked <0x00000007256d3dc8> (a org.apache.flume.channel.file.FlumeEventQueue)
>>         at org.apache.flume.channel.file.ReplayHandler.processCommit(ReplayHandler.java:405)
>>         at org.apache.flume.channel.file.ReplayHandler.replayLog(ReplayHandler.java:328)
>>         at org.apache.flume.channel.file.Log.doReplay(Log.java:503)
>>         at org.apache.flume.channel.file.Log.replay(Log.java:430)
>>         at org.apache.flume.channel.file.FileChannel.start(FileChannel.java:302)
>>         - locked <0x00000007256d2e38> (a org.apache.flume.channel.file.FileChannel)
>>         at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)
>>         - locked <0x00000007256d2e38> (a org.apache.flume.channel.file.FileChannel)
>>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>         at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
>>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
>>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
>>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>         at java.lang.Thread.run(Thread.java:722)
>>
>>
>> On Thu, Aug 8, 2013 at 12:52 AM, Anat Rozenzon wrote:
>>
>>> Hi,
>>>
>>> I'm trying to restart Flume. My setup is:
>>>
>>> Avro source => File channel 1 => HDFS sink
>>>             => File channel 2 => Another HDFS sink
>>>             => File channel 3 => File sink
>>>
>>> But it seems to have been doing replayLog for hours now. After seeing this
>>> yesterday, I even tried setting use-fast-replay=true, but it didn't help.
>>>
>>> Each file channel's capacity is 100000000. Is this too high for Flume? I
>>> started with a lower number, but then it complained that the channel was
>>> getting filled, so I made it higher.
>>>
>>> My log is repeatedly writing lines like these:
>>> 08 Aug 2013 01:36:22,856 INFO [lifecycleSupervisor-1-1] (org.apache.flume.channel.file.ReplayHandler.replayLog:293) - Read 3240000 records
>>> 08 Aug 2013 01:36:41,324 INFO [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.ReplayHandler.replayLog:293) - Read 3350000 records
>>> 08 Aug 2013 01:38:35,794 INFO [lifecycleSupervisor-1-1] (org.apache.flume.channel.file.ReplayHandler.replayLog:293) - Read 3250000 records
>>> 08 Aug 2013 01:40:48,759 INFO [lifecycleSupervisor-1-1] (org.apache.flume.channel.file.ReplayHandler.replayLog:293) - Read 3260000 records
>>> 08 Aug 2013 01:41:01,684 INFO [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.ReplayHandler.replayLog:293) - Read 4090000 records
>>> 08 Aug 2013 01:41:36,691 INFO [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.ReplayHandler.replayLog:293) - Read 4100000 records
>>> 08 Aug 2013 01:42:27,528 INFO [lifecycleSupervisor-1-0] (org.apache.flume.channel.file.ReplayHandler.replayLog:293) - Read 4110000 records
>>> 08 Aug 2013 01:42:57,725 INFO [lifecycleSupervisor-1-1] (org.apache.flume.channel.file.ReplayHandler.replayLog:293) - Read 3270000 records
>>>
>>>
>>> I'm attaching jstack output. I wasn't sure what the threads are doing, but
>>> in any case many of them seem to be waiting.
>>>
>>> Any idea what I can do to make the server start?
>>>
>>> Thanks
>>> Anat
>>>
>>
>>
>> --
>> Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
>

--
Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
