I think that would be the best option.
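
For what it's worth, here is a rough sketch of the kind of copy utility described below, written in Python. It is only an illustration under some assumptions of mine, not a tested tool: the function name, the three directory arguments and the 300-second quiet period are all hypothetical. It treats a file as finished once its modification time is older than the quiet period, copies it into a staging directory, and then renames it into the spool directory. The rename matters because the Spooling Directory Source expects files to be complete and unchanged once they appear there; it is only atomic if the staging and spool directories are on the same filesystem.

```python
import os
import shutil
import time
from pathlib import Path


def ship_completed_files(source_dir, staging_dir, spool_dir,
                         quiet_seconds=300, now=None):
    """Hand files that have been idle for quiet_seconds over to spool_dir.

    All names and defaults here are hypothetical. Returns the names of the
    files shipped in this pass.
    """
    source_dir = Path(source_dir)
    staging_dir = Path(staging_dir)
    spool_dir = Path(spool_dir)
    now = time.time() if now is None else now

    shipped = []
    for path in sorted(source_dir.iterdir()):
        if not path.is_file():
            continue
        # Assumption: a file is complete once nobody has written to it
        # for quiet_seconds (e.g. the 5 minute window has passed).
        if now - path.stat().st_mtime < quiet_seconds:
            continue

        staged = staging_dir / path.name
        final = spool_dir / path.name

        # Copy outside the spool directory first, so Flume never sees
        # a partially written file.
        shutil.copy2(path, staged)
        # Rename into place; atomic only when staging_dir and spool_dir
        # are on the same filesystem.
        os.replace(staged, final)
        # Remove the original so the next pass does not ship it again.
        path.unlink()
        shipped.append(path.name)
    return shipped
```

You would call this periodically, for example from cron or a small loop, with a quiet period at least as long as the interval during which the application keeps writing to a file. Since the file names are generated from timestamps, they should already be unique, which the Spooling Directory Source also requires.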

On Wed, Oct 9, 2013 at 11:27 PM, Abhijeet Shipure <[email protected]> wrote:

> Yes, new files are created at a fixed interval, but the write time is not
> fixed; files are written as and when requests come in.
> I was thinking of creating a utility to copy files to a new directory and
> use the Spool Dir source.
>
> Regards
> Abhijeet
>
> On Thu, Oct 10, 2013 at 11:41 AM, Steve Morin <[email protected]> wrote:
>
>> If the files are continually written to, I don't think there is a good
>> option. Can new files be written to every time interval?
>>
>> On Wed, Oct 9, 2013 at 11:09 PM, Abhijeet Shipure <[email protected]> wrote:
>>
>>> Hi Steve,
>>>
>>> Thanks for the quick reply. As you pointed out, Exec Source does not
>>> provide reliability, which is required in my case, and hence it is not
>>> suitable.
>>>
>>> So which other built-in source could be used to read from many files?
>>> One other requirement: file names are also dynamically generated
>>> using a timestamp every 5 mins.
>>>
>>> Regards
>>> Abhijeet
>>>
>>> On Thu, Oct 10, 2013 at 11:22 AM, Steve Morin <[email protected]> wrote:
>>>
>>>> If you read the Flume manual, it doesn't support a tail source:
>>>>
>>>> http://flume.apache.org/FlumeUserGuide.html#exec-source
>>>>
>>>> Warning
>>>>
>>>> The problem with ExecSource and other asynchronous sources is that the
>>>> source can not guarantee that if there is a failure to put the event into
>>>> the Channel the client knows about it. In such cases, the data will be
>>>> lost. As a for instance, one of the most commonly requested features is the
>>>> tail -F [file]-like use case where an application writes to a log file
>>>> on disk and Flume tails the file, sending each line as an event. While this
>>>> is possible, there’s an obvious problem; what happens if the channel fills
>>>> up and Flume can’t send an event? Flume has no way of indicating to the
>>>> application writing the log file that it needs to retain the log or that
>>>> the event hasn’t been sent, for some reason. If this doesn’t make sense,
>>>> you need only know this: Your application can never guarantee data has been
>>>> received when using a unidirectional asynchronous interface such as
>>>> ExecSource! As an extension of this warning - and to be completely clear -
>>>> there is absolutely zero guarantee of event delivery when using this
>>>> source. For stronger reliability guarantees, consider the Spooling
>>>> Directory Source or direct integration with Flume via the SDK.
>>>>
>>>> On Wed, Oct 9, 2013 at 10:33 PM, Abhijeet Shipure <[email protected]> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I am looking for a Flume NG source that can be used for reading many
>>>>> files which are getting continuously updated.
>>>>> I tried the Spool Dir source, but it does not work if the file being
>>>>> read gets modified.
>>>>>
>>>>> Here is the scenario:
>>>>> 100 files are generated at one time, and these files are continuously
>>>>> updated for a fixed interval, say 5 mins. After 5 mins, a new set of
>>>>> 100 files is generated and written to for another 5 mins.
>>>>>
>>>>> Which Flume source is most suitable, and how should it be used
>>>>> effectively without any data loss?
>>>>>
>>>>> Any help is greatly appreciated.
>>>>>
>>>>> Thanks
>>>>> Abhijeet Shipure
