Yes, new files are created at a fixed interval, but the write times are not fixed; the files are written as and when requests come in. I was thinking of creating a utility to copy finished files to a new directory and pointing the Spooling Directory source at that directory.
Regards
Abhijeet

On Thu, Oct 10, 2013 at 11:41 AM, Steve Morin <[email protected]> wrote:

> If the files are continually written to I don't think there is a good
> option. Can new files be written to every time interval?
>
> On Wed, Oct 9, 2013 at 11:09 PM, Abhijeet Shipure <[email protected]> wrote:
>
>> Hi Steve,
>>
>> Thanks for the quick reply. As you pointed out, Exec Source does not
>> provide reliability, which is required in my case, and hence it is not
>> suitable.
>>
>> So which other inbuilt source could be used to read from many files?
>> Just one other requirement is that file names are also dynamically
>> generated using a time stamp after every 5 mins.
>>
>> Regards
>> Abhijeet
>>
>> On Thu, Oct 10, 2013 at 11:22 AM, Steve Morin <[email protected]> wrote:
>>
>>> If you read the Flume manual it doesn't support a tail source
>>>
>>> http://flume.apache.org/FlumeUserGuide.html#exec-source
>>>
>>> Warning
>>>
>>> The problem with ExecSource and other asynchronous sources is that the
>>> source can not guarantee that if there is a failure to put the event
>>> into the Channel the client knows about it. In such cases, the data
>>> will be lost. As a for instance, one of the most commonly requested
>>> features is the tail -F [file]-like use case where an application
>>> writes to a log file on disk and Flume tails the file, sending each
>>> line as an event. While this is possible, there’s an obvious problem;
>>> what happens if the channel fills up and Flume can’t send an event?
>>> Flume has no way of indicating to the application writing the log file
>>> that it needs to retain the log or that the event hasn’t been sent,
>>> for some reason. If this doesn’t make sense, you need only know this:
>>> Your application can never guarantee data has been received when using
>>> a unidirectional asynchronous interface such as ExecSource! As an
>>> extension of this warning - and to be completely clear - there is
>>> absolutely zero guarantee of event delivery when using this source.
>>> For stronger reliability guarantees, consider the Spooling Directory
>>> Source or direct integration with Flume via the SDK.
>>>
>>> On Wed, Oct 9, 2013 at 10:33 PM, Abhijeet Shipure <[email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am looking for a Flume NG source that can be used for reading many
>>>> files which are getting continuously updated.
>>>> I tried the Spool Dir source but it does not work if the file to be
>>>> read gets modified.
>>>>
>>>> Here is the scenario:
>>>> 100 files are generated at one time and these files are continuously
>>>> updated for a fixed interval, say 5 mins; after 5 mins a new set of
>>>> 100 files is generated and written to for the next 5 mins.
>>>>
>>>> Which Flume source is most suitable, and how should it be used
>>>> effectively without any data loss?
>>>>
>>>> Any help is greatly appreciated.
>>>>
>>>> Thanks
>>>> Abhijeet Shipure
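
A minimal sketch of the copy utility proposed at the top of this thread, assuming Python and assuming a file can be treated as finished once it has not been modified for a full quiet period. The function name ship_quiet_files, the separate staging directory, and the 300-second default are hypothetical choices for illustration, not anything Flume provides. Each copy is staged outside the spool directory and then renamed in, because the Spooling Directory Source expects a file to stay unchanged once it appears there.

```python
import os
import shutil
import time


def ship_quiet_files(src_dir, staging_dir, spool_dir, quiet_seconds=300, now=None):
    """Hand files that look finished over to the spool directory.

    A file counts as finished (an assumption of this sketch) when its
    modification time is at least quiet_seconds in the past.
    Returns the names shipped in this pass, in sorted order.
    """
    now = time.time() if now is None else now
    os.makedirs(staging_dir, exist_ok=True)
    os.makedirs(spool_dir, exist_ok=True)

    shipped = []
    for name in sorted(os.listdir(src_dir)):
        src = os.path.join(src_dir, name)
        if not os.path.isfile(src):
            continue
        if now - os.path.getmtime(src) < quiet_seconds:
            continue  # may still be receiving writes; retry on the next pass

        # Copy into staging first so a half-written copy never shows up
        # inside the spool directory.
        staged = os.path.join(staging_dir, name)
        shutil.copy2(src, staged)

        # Rename into the spool directory in one step.
        os.rename(staged, os.path.join(spool_dir, name))

        # Assumption: the original is dropped after a successful hand-off so
        # it is not shipped twice; archive it instead if it must be kept.
        os.remove(src)
        shipped.append(name)
    return shipped
```

This would be run periodically, for example from cron, with spool_dir set to the directory the Spooling Directory Source watches. The rename is only a single atomic step when staging_dir and spool_dir are on the same filesystem, and the timestamped file names mentioned in the thread keep the names in the spool directory unique.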
