Hi Maciek,

Currently this functionality is not supported, but this seems like a good feature to add.
Actually, given that the feature is rather new, we were thinking of opening a thread
on the dev mailing list in order to

i) discuss some current limitations of the Continuous File Processing source
ii) see how people use it and adjust our features accordingly

I will let you know as soon as I open this thread.

By the way, for your use-case we should probably have a callback that informs the 
source that a given checkpoint was successfully completed; the source can then 
purge the already processed files. This could be a good solution.
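To make the idea above concrete, here is a minimal, self-contained sketch of the checkpoint-notification pattern (it does not use the actual Flink API; the class and method names `PurgeOnCheckpointSource`, `fileFullyRead`, `snapshotState`, and `notifyCheckpointComplete` are hypothetical stand-ins). The source tracks which files were fully read between checkpoints, and only "deletes" a file once the checkpoint covering it is confirmed complete, so a restore can never need a file that is already gone:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical sketch of a purge-on-checkpoint source (not real Flink code).
public class PurgeOnCheckpointSource {
    // Files fully read since the last checkpoint barrier.
    private final List<String> filesSinceLastCheckpoint = new ArrayList<>();
    // checkpointId -> files that become safe to delete once that checkpoint completes.
    private final NavigableMap<Long, List<String>> pendingByCheckpoint = new TreeMap<>();
    // Stand-in for actual file-system deletion, so the effect is observable.
    private final List<String> deleted = new ArrayList<>();

    // The reader calls this when it has emitted all records (all splits) of a file.
    public void fileFullyRead(String path) {
        filesSinceLastCheckpoint.add(path);
    }

    // Called when the source takes a snapshot for checkpoint `checkpointId`:
    // everything read so far is attributed to this checkpoint.
    public void snapshotState(long checkpointId) {
        pendingByCheckpoint.put(checkpointId, new ArrayList<>(filesSinceLastCheckpoint));
        filesSinceLastCheckpoint.clear();
    }

    // The callback discussed above: the runtime tells the source that
    // `checkpointId` (and implicitly all earlier ones) succeeded, so the
    // files attributed to those checkpoints can be purged.
    public void notifyCheckpointComplete(long checkpointId) {
        Iterator<Map.Entry<Long, List<String>>> it =
                pendingByCheckpoint.headMap(checkpointId, true).entrySet().iterator();
        while (it.hasNext()) {
            deleted.addAll(it.next().getValue());
            it.remove();
        }
    }

    public List<String> deletedFiles() {
        return deleted;
    }
}
```

Note that files are purged only in the completion callback, never at read time, which is what makes the scheme safe with splits: even if a file's splits are read across several checkpoints, it is only registered via `fileFullyRead` once the last split is done.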


> On Oct 18, 2016, at 9:40 AM, Maciek Próchniak <m...@touk.pl> wrote:
> Hi,
> we want to monitor hdfs (or local) directory, read csv files that appear and 
> after successful processing - delete them (mainly not to run out of disk 
> space...)
> I'm not quite sure how to achieve it with current implementation. Previously, 
> when we read binary data (unsplittable files) we made a small hack and deleted 
> them
> in our FileInputFormat - but now we want to use splits and detecting which 
> split is 'the last one' is no longer so obvious - of course it's also 
> problematic when it comes to checkpointing...
> So my question is - is there an idiomatic way of deleting processed files?
> thanks,
> maciek
