Hi Chris,
A few months back I actually ported Flume's original tail source, but
it was decided (and I agree with the reasoning) not to include it for a
number of reasons, which can be seen on the original ticket at
https://issues.apache.org/jira/browse/FLUME-931 . One of the big ones is
the fact that Java cannot access inode information, which makes it hard
to tell a rotated file from the one you were already tailing.
What we do is run a Python program that tracks the files in a directory
and sends the data in the Scribe format to the ScribeSource (we
were using Scribe until we switched to Flume, so we are just reusing our
ingest system from then). This gives us the freedom to customize the
ingest to our own expectations, and we write checkpoints recording how
far we have tailed. You could write this in whatever language you're
comfortable with and pass the data via Avro or Thrift.
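
To give a rough idea of the checkpointing part, here is a minimal sketch
in Python. It is not our actual program: the names tail_once,
load_checkpoint and save_checkpoint and the JSON checkpoint layout are made
up for illustration, and send stands in for whatever you use to ship a
line to Flume (Scribe, Avro, Thrift). It handles one file and assumes
newline-terminated records.

```python
import json
import os


def load_checkpoint(path):
    """Return the saved {"inode": ..., "offset": ...} dict, or None if missing/corrupt."""
    try:
        with open(path) as f:
            return json.load(f)
    except (FileNotFoundError, ValueError):
        return None


def save_checkpoint(path, inode, offset):
    """Write to a temp file and rename, so a crash never leaves a half-written checkpoint."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"inode": inode, "offset": offset}, f)
    os.replace(tmp, path)


def tail_once(log_path, checkpoint_path, send):
    """Pass every complete line added since the last checkpoint to send().

    send is a hypothetical callable taking one line as bytes.
    Returns the number of lines forwarded.
    """
    sent = 0
    with open(log_path, "rb") as f:
        st = os.fstat(f.fileno())
        cp = load_checkpoint(checkpoint_path)
        # Resume only if this is the same file (same inode) and it has not
        # been truncated; otherwise treat it as rotated and start from 0.
        same_file = cp is not None and cp.get("inode") == st.st_ino
        offset = cp["offset"] if same_file and cp["offset"] <= st.st_size else 0
        f.seek(offset)
        while True:
            line = f.readline()
            if not line.endswith(b"\n"):
                break  # end of file, or a partial line still being written
            send(line.rstrip(b"\r\n"))
            offset += len(line)
            sent += 1
    # The checkpoint is saved only after the lines were handed off, so a
    # crash in between re-sends lines rather than dropping them.
    save_checkpoint(checkpoint_path, st.st_ino, offset)
    return sent
```

You would call tail_once in a loop (or from cron) for each file you care
about. Note that this gives at-least-once delivery: if the process dies
after sending but before the checkpoint is written, those lines go out
again on the next run.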
On 08/30/2012 01:18 AM, Chris Neal wrote:
Hey guys,
I'm sure this is not a new question, but I haven't found an answer in
my searches. I'm curious why there is as of yet no Tail Source with
NG? It seems one of the most common use cases for Flume is to tail a
log file and dump it "somewhere". Given that, it sure would seem that
a Tail Source would be one of the first sources that gets written with
a new version.
I know about all the other ways to implement something *like* a Tail
Source: Exec Source, Avro, Log4jAppender... and unfortunately they
all have limitations with regard to either functionality or
reliability/recoverability.
What am I missing here?
Is there any work being done on a Tail Source for NG?
I promise I'm not complaining, just trying to understand the logic. :)
Much appreciated.
Chris