Don't use -9

From: Shady Xu <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Thursday, July 23, 2015 1:23 AM
To: "[email protected]" <[email protected]>
Subject: Re: Replay log taking to much time

I didn't set this property, so it has its default value, true. Any other ideas? BTW, if I use `kill -9` to kill the flume process, flume will not be able to create a checkpoint, right?

2015-07-23 15:39 GMT+08:00 Roshan Naik <[email protected]>:

You can set checkpointOnClose = true if it's not already the case (the default is true). This setting was added in 1.6. It will create a checkpoint when flume is trying to shut down the file channel ... consequently replay on restart/reconfigure should be much quicker.
-roshan

From: Shady Xu <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Thursday, July 23, 2015 12:35 AM
To: "[email protected]" <[email protected]>
Subject: Re: Replay log taking to much time

Yes, I'm using Flume 1.6 now and dualCheckpoints are also enabled. Every time I restart the agent the replay takes less time than before, but it still takes dozens of minutes. This is not normal, right?

2015-06-25 23:15 GMT+08:00 Johny Rufus <[email protected]>:

If the checkpointing interval is 30 seconds (the default) and dualCheckpoints are enabled (to cover the case where the agent was interrupted while writing a checkpoint), then replay should only cover the last 30 secs (worst case 60 secs). Not sure if this is what is happening in your case, or if a full replay is happening.
Thanks,
Rufus

On Wed, Jun 24, 2015 at 10:40 PM, Shady Xu <[email protected]> wrote:

I have tried 1.6. Replaying the log is faster, but not fast enough. We have gigabytes of logs, and replaying them still takes us hours or even days. This is frustrating, and it has been our biggest concern about using Flume at a larger scale.

2015-06-01 15:32 GMT+08:00 Hari Shreedharan <[email protected]>:

1.6 has been released. We were waiting for Maven Central to sync up.
Now that it is on central, I will post the update on the site tomorrow.

On Sunday, May 31, 2015, Shady Xu <[email protected]> wrote:

I noticed that Flume 1.6 has been released on GitHub but not on the official website. I have compiled some of the modules from source myself (for other reasons), but I'm not sure compiling the whole project is a good idea. We have tons of data, and every time we change the configuration, replaying the log takes us way too many hours...

2015-04-17 12:38 GMT+08:00 Hari Shreedharan <[email protected]>:

Changes that went into Flume 1.6 should improve replay time. Flume 1.6 will be out in a few days.
Thanks,
Hari

On Thu, Apr 16, 2015 at 7:55 PM, Shady Xu <[email protected]> wrote:

Every time I restart Flume NG, it tries to replay the log, and the process usually takes hours. During this time, Flume does not take any data from the source. So how can I make the replay faster?

--
Thanks,
Hari
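
For readers landing on this thread, the settings discussed above (checkpointOnClose, dual checkpoints, and the 30-second checkpoint interval) could be written roughly as follows in a file channel configuration. This is only a sketch: the agent name `a1`, the channel name `c1`, and every directory path are placeholders that do not come from the thread, and the property names should be checked against the Flume User Guide for the version in use.

```properties
# Hypothetical agent "a1" with one file channel "c1"; all paths are placeholders.
a1.channels = c1
a1.channels.c1.type = file
a1.channels.c1.checkpointDir = /var/flume/checkpoint
a1.channels.c1.dataDirs = /var/flume/data

# Keep a backup copy of the checkpoint in case the agent is interrupted
# while a checkpoint is being written. The backup directory must differ
# from checkpointDir and from the data directories.
a1.channels.c1.useDualCheckpoints = true
a1.channels.c1.backupCheckpointDir = /var/flume/checkpoint-backup

# Milliseconds between checkpoints; 30000 is the 30-second default
# mentioned in the thread.
a1.channels.c1.checkpointInterval = 30000

# Write a checkpoint when the channel is closed (Flume 1.6+, default true).
a1.channels.c1.checkpointOnClose = true
```

checkpointOnClose only helps when the channel actually gets to close. `kill -9` sends SIGKILL, which the process cannot catch, so the agent never gets a chance to write that final checkpoint. A plain `kill` (SIGTERM) gives it that chance.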
