Re: Enabling file channel backup checkpoint causes significant disk IO at start-up

Hari Shreedharan Mon, 08 Sep 2014 13:56:11 -0700

Flume releases are once every few months - since we just had one a couple of months back, I don't think there will be one happening right away.


Michael Diamant wrote:


Hari, thank you for your quick reply. A follow-up question to help me
figure out how best to proceed on my end: Can you provide an estimate
as to when the next Flume release will occur?


On Mon, Sep 8, 2014 at 4:07 PM, Hari Shreedharan wrote:

This patch should address the issue, if enabled:
https://git-wip-us.apache.org/repos/asf?p=flume.git;a=commitdiff;h=69fd6b3ad5e5b9ae6f1293b3d8e57ed57fd6701c;hp=f15f20785262ac3cb3e35c2a12e669b7a836d35f

It will be part of the next Flume release (or CDH5.2.0).

--

Thanks,
Hari


Michael Diamant
September 8, 2014 at 12:58 PM
My team uses Flume 1.4.0 packaged with CDH5.0.2 via an embedded
agent to write to a file channel. From a previous thread started
by my colleague, "FileChannel Replays consistently take a long
time" and associated issue,
https://issues.apache.org/jira/browse/FLUME-2450, it was
suggested to use a backup checkpoint directory to avoid lengthy
replays. When I enabled the backup checkpoint directory, I
observed via iotop near 100% IO by my application with the
embedded agent. This level of IO persists for about 30 seconds
rendering the application unusable during this time period.

For comparison, I monitored via iotop when backup checkpoint is
disabled. IO activity occurs for at most several seconds. That
is, there is a qualitative difference when enabling the backup
checkpoint directory. I also tried deleting the
existing checkpoint and data directories to start with a clean
slate. The results of that experiment are in line with my
observations above.

Is this expected behavior when using a backup checkpoint
directory? Is there any way in which the amount of IO can be
reduced? I appreciate feedback and insights because the current
behavior is untenable for a production environment.

Thank you,
Michael
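
For readers following along, the setup described in the original message (an embedded agent writing to a file channel with a backup checkpoint directory) might be configured roughly as below. This is only a sketch under assumptions: the class name, method name, and directory layout are hypothetical, and it assumes the embedded agent accepts file channel settings under the "channel." prefix, with useDualCheckpoints and backupCheckpointDir being the file channel properties that turn on the backup checkpoint. The sink and processor settings that an embedded agent also needs are left out.

```java
import java.util.HashMap;
import java.util.Map;

public class BackupCheckpointConfig {

    // Builds only the channel-related properties for an embedded agent.
    // baseDir and the sub-directory names are illustrative assumptions.
    static Map<String, String> fileChannelProperties(String baseDir) {
        Map<String, String> props = new HashMap<>();
        props.put("channel.type", "file");
        props.put("channel.checkpointDir", baseDir + "/checkpoint");
        props.put("channel.dataDirs", baseDir + "/data");
        // Enable the backup checkpoint; it must live in a directory
        // separate from the primary checkpoint directory.
        props.put("channel.useDualCheckpoints", "true");
        props.put("channel.backupCheckpointDir", baseDir + "/checkpoint-backup");
        return props;
    }

    public static void main(String[] args) {
        // Print the properties so they can be compared against a running setup.
        fileChannelProperties("/var/lib/myapp/flume")
            .forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

The resulting map would be merged with the sink settings and passed to the embedded agent's configure call. Setting channel.useDualCheckpoints back to "false" gives the baseline used for the iotop comparison above.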
