Fast replay will start only if the checkpoint files are deleted (or don't 
exist). If a checkpoint exists, fast replay will not start, even if the 
checkpoint files are corrupt or unreadable. Depending on how many events are 
in the channel, fast replay can also fall victim to Java GC slowdowns, which 
is worth keeping in mind.  
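
In practice, "forcing" fast replay means removing the checkpoint before a 
restart. A minimal sketch (the agent/channel names and paths are 
illustrative, not from your setup):

```properties
# flume.conf -- enable fast replay for the file channel
agent.channels.c1.type = file
agent.channels.c1.use-fast-replay = true
agent.channels.c1.checkpointDir = /var/flume/checkpoint
agent.channels.c1.dataDirs = /var/flume/data
```

Then, with the agent stopped, remove the contents of checkpointDir (leave 
the data files under dataDirs intact) before restarting. Otherwise, as noted 
above, replay proceeds from the existing checkpoint and fast replay never 
kicks in.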

I have filed https://issues.apache.org/jira/browse/FLUME-2155 to improve 
certain aspects of the File Channel replay. I have some ideas, though I am 
not entirely sure when I will get time to work on this.  


Thanks,
Hari


On Friday, August 9, 2013 at 10:10 AM, Edwin Chiu wrote:

> Thanks, Brock! I'll check out this 1.4 feature. 
> 
> I already have 1.3 running on production machines, though, so I'd prefer to 
> stay on 1.3 unless there's no way around this potentially lengthy log 
> replay.  
> 
> In my scenario, there are about 4G of files under the data directory. My 
> system has about 40G of free memory. I've restarted flume with 36G max 
> memory in flume-env, after setting fast-replay to true. 
> 
> Resource monitoring shows 36G allocated to the flume process, but while the 
> replay was running it used about the same amount of memory as before, when 
> the lower max memory was set in flume-env and fast-replay was off.  
> 
> Any tips to "force" fast-replay to kick in?
> 
> thanks!
> 
> - e
> 
> On Fri, Aug 9, 2013 at 4:26 AM, Brock Noland <[email protected]> wrote:
> > If fast replay doesn't help then you don't have enough RAM. I'd suggest you 
> > use the new dual checkpoint feature. Note the dual and backup checkpoint 
> > configs here:
> > 
> > http://flume.apache.org/FlumeUserGuide.html#file-channel 
> > http://issues.apache.org/jira/browse/FLUME-1516
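> > 
> > A minimal sketch of the dual-checkpoint settings (the agent and channel 
> > names are illustrative):
> > 
> > ```properties
> > agent.channels.c1.type = file
> > agent.channels.c1.checkpointDir = /var/flume/checkpoint
> > agent.channels.c1.useDualCheckpoints = true
> > agent.channels.c1.backupCheckpointDir = /var/flume/checkpoint-backup
> > ```
> > 
> > With a backup checkpoint available, the channel can fall back to it 
> > instead of replaying all the data files when the primary checkpoint is 
> > bad or incomplete.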
> > 
> > Brock 
> > 
> > On Thu, Aug 8, 2013 at 2:48 PM, Edwin Chiu <[email protected]> wrote:
> > > Hi there!
> > > 
> > > I'm using flume-ng 1.3.1 (Hortonworks' latest stable production version 
> > > as of now) on CentOS 6 with JDK 1.6.  
> > > 
> > > I'm wondering how to speed up the replay of logs after changing file 
> > > channel parameters in flume.conf -- capacity and transactionCapacity.  
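> > > 
> > > For reference, the parameters in question look roughly like this in 
> > > flume.conf (agent and channel names are illustrative):
> > > 
> > > ```properties
> > > agent.channels.c1.type = file
> > > agent.channels.c1.capacity = 1000000
> > > agent.channels.c1.transactionCapacity = 10000
> > > ```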
> > > 
> > > It takes hours for the node to catch up and be able to receive and send 
> > > events again.
> > > 
> > > Setting use-fast-replay = true with a very large max memory doesn't 
> > > speed things up.  
> > > 
> > > Any recommendations for avoiding the downtime? 
> > > 
> > > thanks!
> > > 
> > > Ed 
> > 
> > 
> > -- 
> > Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org 
