Hey all,
I haven't been active on Flume lately, as my colleague Iijima-san has
taken over most of our internal Flume-related work while I've moved on
to other projects.
This new tail source is a Java version of an internal program we've been
using for over a year (it was written in Python, but had several
idiosyncrasies specific to our use that made it not particularly
useful to contribute). We've had this new source in production for a
month now, and it was in testing for a long time before that. To my
knowledge we have not lost any logs when using it (with an appropriate
pipeline, of course), though I believe there may be some extreme corner
cases involving either very fast rotation of log files or moving log
files intended to be read to another directory before they've been
opened (neither should happen in any kind of normal environment).
As was mentioned, it uses inodes to detect file rotations and is thus
only suitable for use on Unix operating systems. That being said, it
would be good to get this into Flume, since I think a reliable tailing
source is something a lot of people have asked for in the past.
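For anyone curious about the inode part, here is a minimal sketch of the idea (not the code in the patch; the class and method names are made up for illustration). It assumes a JDK and file system that expose the "unix" file attribute view through NIO.2, which is a Java 7 API:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, for illustration only.
class InodeLookup {
    // Reads the inode number through the JDK's "unix" attribute view.
    // This throws on file systems that do not provide that view (e.g. Windows).
    static long inodeOf(Path file) throws IOException {
        return (Long) Files.getAttribute(file, "unix:ino");
    }

    // True when the path now refers to a different file than the one
    // that was being tailed, i.e. the log has been rotated.
    static boolean wasRotated(Path file, long lastSeenInode) throws IOException {
        return inodeOf(file) != lastSeenInode;
    }
}
```

A rename keeps the inode, so the renamed file can still be matched to its saved state, while a fresh file created at the old path shows up with a different inode.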
Please let us know what you think,
Thanks,
Juhani
On 10/09/2014 03:22 PM, 飯島賢志 wrote:
Hello,
We would like to contribute a new tailing source to the Apache Flume
community.
We have operated it in production for one month and have not encountered
any technical problems so far.
A description of the source is below:
---
Taildir source watches the specified files and tails them in near
real time once appends to these files are detected.
This source is reliable and will not miss data even when the tailed files
are rotated.
It periodically writes the last read position, inode and absolute path of
each file to a position file in JSON format.
---
If Flume is stopped or down for some reason, it can restart tailing from
the position written in the position file.
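As a rough illustration of the restart behaviour (a sketch only, not the patch itself; the class name, method name and truncation handling are assumptions made for this example), resuming from a saved byte offset could look like this:

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;

// Hypothetical helper, for illustration only.
class ResumeSketch {
    // Returns everything that was appended to the file after `position`,
    // where `position` is the byte offset saved in the position file.
    static String readFrom(Path file, long position) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long length = raf.length();
            // Assumption: a saved offset beyond the end means the file was
            // truncated, so this sketch starts again from the beginning.
            long start = position > length ? 0 : position;
            raf.seek(start);
            byte[] rest = new byte[(int) (length - start)];
            raf.readFully(rest);
            return new String(rest, StandardCharsets.UTF_8);
        }
    }
}
```

After a restart, the saved offset for each file would be passed in as `position`, so only data written after the last recorded offset is read.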
This source is very convenient for tailing files and for recovering from
system failures such as the Flume process dying.
However, it requires a Unix-style file system and Java 1.7 or later because it
uses inodes to identify files.
Can we contribute it to the Apache Flume community?
If this is acceptable, I will attach the patch in JIRA.
Thank you,
-Satoshi Iijima