That's awesome, Patrick! Thank you so much. That would help us tremendously.
We are currently using Flume NG 1.2.0. Will we be able to use spooldir on that version, or do we have to upgrade to the latest version?

Thanks,
Sadu

On Tue, Oct 16, 2012 at 11:37 PM, Patrick Wendell <[email protected]> wrote:
> Hey Sadu, your use case is exactly what I'm writing this for. I'm
> hoping this patch will get committed within a few days; we're on a
> last rev of reviews.
>
> - Patrick
>
> On Tue, Oct 16, 2012 at 10:47 AM, Brock Noland <[email protected]> wrote:
> > Correct, it's only available in that patch. From the RB it looks like
> > it's not too far off from being committed.
> >
> > Brock
> >
> > On Tue, Oct 16, 2012 at 12:00 PM, Sadananda Hegde <[email protected]> wrote:
> >> Yes, it is very similar.
> >>
> >> The spool directory will keep getting new files. We need to scan through
> >> the directory, send the data in the existing files to HDFS, clean up the
> >> files (delete/move/rename, etc.), and scan for new files again. The
> >> spooldir source is not available yet, right?
> >>
> >> Thanks,
> >> Sadu
> >>
> >> On Tue, Oct 16, 2012 at 10:11 AM, Brock Noland <[email protected]> wrote:
> >>> Sounds like https://issues.apache.org/jira/browse/FLUME-1425 ?
> >>>
> >>> Brock
> >>>
> >>> On Mon, Oct 15, 2012 at 11:37 PM, Sadananda Hegde <[email protected]> wrote:
> >>> > Hello,
> >>> >
> >>> > I have a scenario in which the client application is continuously
> >>> > pushing XML messages. The application writes these messages to files
> >>> > (new files, same directory), so we keep getting new files throughout
> >>> > the day. I am trying to configure Flume agents on these application
> >>> > servers (4 of them) to pick up the new data and transfer it to HDFS
> >>> > on a Hadoop cluster. How should I configure my source to pick up new
> >>> > files (and exclude the files that have been processed already)?
> >>> >
> >>> > I don't think Exec source with tail -F will work in this scenario,
> >>> > because data is not getting added to existing files; rather, new
> >>> > files get created.
> >>> >
> >>> > Thank you very much for your time and support.
> >>> >
> >>> > Sadu
> >>>
> >>> --
> >>> Apache MRUnit - Unit testing MapReduce -
> >>> http://incubator.apache.org/mrunit/
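[Editor's note: once the spooling-directory source from FLUME-1425 is committed and released, an agent configuration for this use case might look roughly like the sketch below. The source type and property names (`spooldir`, `spoolDir`, `fileSuffix`) follow the patch under review and should be checked against the final release documentation; the agent/component names and paths are hypothetical.]

    # Hypothetical agent named "agent1": spooldir source -> memory channel -> HDFS sink
    agent1.sources = spool1
    agent1.channels = ch1
    agent1.sinks = hdfs1

    # Spooling-directory source: ingests completed files dropped into spoolDir,
    # then renames each processed file with fileSuffix so it is not re-read.
    agent1.sources.spool1.type = spooldir
    agent1.sources.spool1.spoolDir = /var/flume/spool        # hypothetical path
    agent1.sources.spool1.fileSuffix = .COMPLETED
    agent1.sources.spool1.channels = ch1

    agent1.channels.ch1.type = memory
    agent1.channels.ch1.capacity = 10000

    # HDFS sink; path and namenode address are placeholders.
    agent1.sinks.hdfs1.type = hdfs
    agent1.sinks.hdfs1.hdfs.path = hdfs://namenode:8020/flume/events
    agent1.sinks.hdfs1.hdfs.fileType = DataStream
    agent1.sinks.hdfs1.channel = ch1

Note that the source only picks up files that are complete when placed in the spool directory; the writing application would need to write elsewhere and move finished files into the directory atomically.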
