Currently this functionality is not supported, but it seems like a good idea.
Actually, given that the feature is rather new, we were thinking of opening a
thread on the dev mailing list in order to:
i) discuss some current limitations of the Continuous File Processing source
ii) see how people use it and adjust our features accordingly.
I will let you know as soon as I open this thread.
By the way, for your use-case we should probably have a callback in the source
that informs it when a given checkpoint has been successfully completed, so that
we can purge the already processed files. This could be a good solution.
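To make the idea concrete, here is a minimal, self-contained Java sketch of the purge-on-checkpoint pattern. The class and method names are hypothetical except for notifyCheckpointComplete, which mirrors the hook Flink exposes via the CheckpointListener interface; a real source would delete files through Flink's FileSystem API instead of collecting them in a list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: remember which files were fully read as of each
// checkpoint, and purge them only once that checkpoint is confirmed durable.
public class PurgingFileSource {
    // files fully processed before checkpoint N, keyed by checkpoint id
    private final TreeMap<Long, List<String>> pendingByCheckpoint = new TreeMap<>();
    private final List<String> deleted = new ArrayList<>();

    // called when a checkpoint is taken: record which files are now "safe"
    public void snapshotState(long checkpointId, List<String> processedFiles) {
        pendingByCheckpoint.put(checkpointId, new ArrayList<>(processedFiles));
    }

    // the callback: once checkpointId is durable, purge its files and all
    // files belonging to earlier checkpoints
    public void notifyCheckpointComplete(long checkpointId) {
        Map<Long, List<String>> done = pendingByCheckpoint.headMap(checkpointId, true);
        for (List<String> files : done.values()) {
            for (String f : files) {
                deleted.add(f); // real code: FileSystem.get(uri).delete(path, false)
            }
        }
        done.clear(); // clearing the headMap view removes the entries
    }

    public List<String> deletedFiles() {
        return deleted;
    }
}
```

Deleting only on checkpoint completion matters because a file removed before its checkpoint is durable could not be re-read after a failure and restore.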
> On Oct 18, 2016, at 9:40 AM, Maciek Próchniak <m...@touk.pl> wrote:
> we want to monitor an hdfs (or local) directory, read the csv files that appear, and
> after successful processing - delete them (mainly so as not to run out of disk space).
> I'm not quite sure how to achieve it with the current implementation. Previously,
> when we read binary data (unsplittable files) we made a small hack and deleted them
> in our FileInputFormat - but now we want to use splits, and detecting which
> split is 'the last one' is no longer so obvious - of course it's also
> problematic when it comes to checkpointing...
> So my question is - is there an idiomatic way of deleting processed files?