Hi, I'm the author of the backup checkpoint compression patch.
We backported it to 1.4 and are running it in production without a problem.

Abe

--
Abraham Fine | Software Engineer
(516) 567-2535
BrightRoll, Inc. | Smart Video Advertising | www.brightroll.com

On Mon, Sep 8, 2014 at 1:59 PM, Gary Malouf <[email protected]> wrote:

> Hi Hari,
>
> I'm a colleague of Michael's, if we are in need of a few of these patches,
> would you recommend we do our own custom build?
>
> Separate from Apache's release cycle, would these patches get included in
> the next CDH build that includes Flume? (Not sure what the schedule of
> that is...)
>
> Thanks,
>
> Gary
>
>
> On Mon, Sep 8, 2014 at 4:55 PM, Hari Shreedharan <[email protected]> wrote:
>
>> Flume releases are once every few months - since we just had one a couple
>> of months back, I don't think there will be one happening right away.
>>
>> Michael Diamant wrote:
>>
>>> Hari, thank you for your quick reply. A follow-up question to help me
>>> figure out how best to proceed on my end: Can you provide an estimate
>>> as to when the next Flume release will occur?
>>>
>>> On Mon, Sep 8, 2014 at 4:07 PM, Hari Shreedharan <[email protected]> wrote:
>>>
>>>> This patch should address the issue, if enabled:
>>>>
>>>> https://git-wip-us.apache.org/repos/asf?p=flume.git;a=commitdiff;h=69fd6b3ad5e5b9ae6f1293b3d8e57ed57fd6701c;hp=f15f20785262ac3cb3e35c2a12e669b7a836d35f
>>>>
>>>> It will be part of the next Flume release (or CDH5.2.0).
>>>>
>>>> --
>>>>
>>>> Thanks,
>>>> Hari
>>>>
>>>> Michael Diamant <[email protected]>
>>>> September 8, 2014 at 12:58 PM
>>>>
>>>>> My team uses Flume 1.4.0 packaged with CDH5.0.2 via an embedded
>>>>> agent to write to a file channel. From a previous thread started
>>>>> by my colleague, "FileChannel Replays consistently take a long
>>>>> time" and associated issue,
>>>>> https://issues.apache.org/jira/browse/FLUME-2450, it was
>>>>> suggested to use a backup checkpoint directory to avoid lengthy
>>>>> replays. When I enabled the backup checkpoint directory, I
>>>>> observed via iotop near 100% IO by my application with the
>>>>> embedded agent. This level of IO persists for about 30 seconds
>>>>> rendering the application unusable during this time period.
>>>>>
>>>>> For comparison, I monitored via iotop when backup checkpoint is
>>>>> disabled. IO activity occurs for at most several seconds. That
>>>>> is, there is a qualitative difference when enabling the backup
>>>>> checkpoint directory. Additionally, I also tried deleting the
>>>>> existing checkpoints/data directories to start with a clean
>>>>> slate. Those experiment results are in line with my above
>>>>> observations.
>>>>>
>>>>> Is this expected behavior when using a backup checkpoint
>>>>> directory? Is there any way in which the amount of IO can be
>>>>> reduced? I appreciate feedback and insights because the current
>>>>> behavior is untenable for a production environment.
>>>>>
>>>>> Thank you,
>>>>> Michael
