I've done this in the past with per-minute rolls (YY/MM/DD/HH/MM), closing the sink after 30 seconds. That worked almost perfectly in my cases, but it depends on your use case.
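
A minimal sketch of what such a setup might look like for the HDFS sink. The agent name (agent1), sink name (hdfsSink), and namenode URL are placeholders, not taken from a real deployment, and the values are only a starting point. The %M in the path needs a timestamp header on the events, and hdfs.idleTimeout may not exist in older Flume releases, so check the user guide for your version.

```properties
# Hypothetical agent and sink names; adjust to your own configuration.
agent1.sinks.hdfsSink.type = hdfs

# Per-minute buckets: YY/MM/DD/HH/MM (requires a "timestamp" event header).
agent1.sinks.hdfsSink.hdfs.path = hdfs://namenode/flume/events/%y/%m/%d/%H/%M

# Roll (close and rename) the current file 30 seconds after it was opened.
agent1.sinks.hdfsSink.hdfs.rollInterval = 30

# Disable size- and count-based rolling so only the timer applies.
agent1.sinks.hdfsSink.hdfs.rollSize = 0
agent1.sinks.hdfsSink.hdfs.rollCount = 0

# Assumption: if your version supports it, this closes files that have
# received no events for 60 seconds (0 disables the idle check).
agent1.sinks.hdfsSink.hdfs.idleTimeout = 60
```

With an hourly path (YY/MM/DD/HH) the same rollInterval would still close files on a timer, but you would get many more files per directory, so pick the interval with your downstream jobs in mind.
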
Cheers,
Alex

On Nov 16, 2012, at 5:16 AM, Mohit Anchlia <mohitanch...@gmail.com> wrote:

> Another question I had was about rollover. What's the best way to rollover
> files in reasonable timeframe? For instance our path is YY/MM/DD/HH so
> every hour there is new file and the -1 hr is just sitting with .tmp and it
> takes sometimes even hour before .tmp is closed and renamed to .snappy. In
> this situation is there a way to tell flume to rollover files sooner based
> on some idle time limit?
>
> On Thu, Nov 15, 2012 at 8:14 PM, Mohit Anchlia <mohitanch...@gmail.com> wrote:
>
>> Thanks Mike it makes sense. Anyway I can help?
>>
>> On Thu, Nov 15, 2012 at 11:54 AM, Mike Percy <mpe...@apache.org> wrote:
>>
>>> Hi Mohit, this is a complicated issue. I've filed
>>> https://issues.apache.org/jira/browse/FLUME-1714 to track it.
>>>
>>> In short, it would require a non-trivial amount of work to implement
>>> this, and it would need to be done carefully. I agree that it would be
>>> better if Flume handled this case more gracefully than it does today.
>>> Today, Flume assumes that you have some job that would go and clean up the
>>> .tmp files as needed, and that you understand that they could be partially
>>> written if a crash occurred.
>>>
>>> Regards,
>>> Mike
>>>
>>> On Sun, Nov 11, 2012 at 8:32 AM, Mohit Anchlia
>>> <mohitanch...@gmail.com> wrote:
>>>
>>>> What we are seeing is that if flume gets killed either because of server
>>>> failure or other reasons, it keeps around the .tmp file. Sometimes for
>>>> whatever reasons .tmp file is not readable. Is there a way to rollover .tmp
>>>> file more gracefully?

--
Alexander Alten-Lorenz
http://mapredit.blogspot.com
German Hadoop LinkedIn Group: http://goo.gl/N8pCF