[ 
https://issues.apache.org/jira/browse/FLUME-2344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13988472#comment-13988472
 ] 

Otis Gospodnetic edited comment on FLUME-2344 at 5/3/14 12:28 AM:
------------------------------------------------------------------

Here is [~iekpo]'s FileSource: 
http://search-hadoop.com/m/DEeB4urEUJ&subj=Re+Flume+Jambalaya+A+Flume+Plugin+with+Multiple+Components
It relies on Tailer from Apache Commons IO, which doesn't use tail -F. 
Instead, it keeps track of its position in the file, seeks to that position, 
and reads forward. It also checks the file size/length to figure out whether 
the file has been rotated.
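
To make the idea concrete, here is a rough sketch of the "remember the offset, seek, read forward, compare the length" approach. It is not the Commons IO Tailer code and not [~iekpo]'s FileSource; the class name PositionTailer and its methods are made up for illustration, and it assumes UTF-8 text, newline-terminated lines, and that a file shorter than the stored offset means rotation or truncation.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; not the Commons IO Tailer and not the FLUME-2344 patch. */
public class PositionTailer {
    private final Path file;
    private long position; // byte offset of the next unread byte

    public PositionTailer(Path file, long startPosition) {
        this.file = file;
        this.position = startPosition;
    }

    public long getPosition() {
        return position;
    }

    /** Returns the complete lines appended since the previous call. */
    public List<String> poll() throws IOException {
        List<String> lines = new ArrayList<>();
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long length = raf.length();
            if (length < position) {
                // Assumption: a shorter file means it was rotated or truncated,
                // so start again from the beginning of the new file.
                position = 0;
            }
            raf.seek(position);
            // Assumes the unread part fits comfortably in memory (sketch only).
            byte[] buf = new byte[(int) (length - position)];
            raf.readFully(buf);
            int start = 0;
            for (int i = 0; i < buf.length; i++) {
                if (buf[i] == '\n') {
                    lines.add(new String(buf, start, i - start, StandardCharsets.UTF_8));
                    start = i + 1;
                }
            }
            // Advance only past complete lines; a trailing partial line is
            // re-read on the next poll once its newline has been written.
            position += start;
        }
        return lines;
    }
}
```

Note that the offset lives only in memory here, so a restart of the process loses it.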

This implementation from [~guillermo.of] seems to store the position in a 
separate file, which seems more robust because the position survives things 
like restarts. I think the Tailer from Apache Commons IO, and thus [~iekpo]'s 
implementation, doesn't handle that.
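
For illustration, persisting the offset in a side file could look roughly like the sketch below. I have not checked how the attached patch actually does it; the class name OffsetStore, the plain-text file format, and the ".tmp" naming are all assumptions of mine.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical sketch of an offset side file; not taken from the patch. */
public class OffsetStore {
    private final Path offsetFile;

    public OffsetStore(Path offsetFile) {
        this.offsetFile = offsetFile;
    }

    /** Returns the saved offset, or 0 if nothing has been saved yet. */
    public long load() throws IOException {
        if (!Files.exists(offsetFile)) {
            return 0L;
        }
        String text = new String(Files.readAllBytes(offsetFile), StandardCharsets.UTF_8).trim();
        return text.isEmpty() ? 0L : Long.parseLong(text);
    }

    /** Writes the offset to a temporary file, then renames it over the old one. */
    public void save(long offset) throws IOException {
        Path tmp = offsetFile.resolveSibling(offsetFile.getFileName() + ".tmp");
        Files.write(tmp, Long.toString(offset).getBytes(StandardCharsets.UTF_8));
        // Renaming reduces the chance of leaving a half-written offset behind
        // if the process dies in the middle of a save.
        Files.move(tmp, offsetFile, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

On startup the source would call load() and seek to that offset, and it would call save() after each batch of events has been handed off. How often to save is a trade-off between extra I/O and how many events get replayed after a crash.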




> New source for tailing files
> ----------------------------
>
>                 Key: FLUME-2344
>                 URL: https://issues.apache.org/jira/browse/FLUME-2344
>             Project: Flume
>          Issue Type: Improvement
>          Components: Sinks+Sources
>    Affects Versions: v1.4.0
>         Environment: Centos 6.4, Java 1.6.0_34
>            Reporter: Guillermo Ortiz Fernández, Pragsis.
>             Fix For: v1.4.0
>
>         Attachments: FLUME-2344-0.patch
>
>
> New source to be able to tail a file. There is an extra file where it saves 
> the last offset it has read, so if Flume goes down, it can read the data it 
> missed. Handling of rotated files has also been implemented. 
> The possible variables to configure this source are:
> -BufferSize gives us the possibility to send the data little by little.
> -Separator to cut the lines wherever we want to generate our events.
> -WatchedFile to indicate what file we want to watch.
> -RotatedFile to indicate where the watched file is going to rotate.
> -Type of events to indicate if we generate one event per line or we want to 
> group many lines and just emit one event.



