Dear flinksters,

I'm using the class `ContinuousFileMonitoringFunction` as a source to
monitor a folder for new incoming files.* I have the problem that not all
the files that are sent to the folder get processed / triggered by the
function*. Specific details of my workflow is that I send up to 1k files
per minute, and that I consume the stream as a `AsyncDataStream`.

I myself raised an unrelated issue with the
`ContinuousFileMonitoringFunction` class some time ago (
https://issues.apache.org/jira/browse/FLINK-8046): if two or more files
shared the very same timestamp, only the first one (non-deterministically
chosen) would be processed. However, I patched the file myself to fix that
problem by using a LinkedHashMap to remember which files had been really
processed before or not. My patch is working fine as far as I can tell.

The problem seems to be rather that some files (when many are sent at once
to the same folder) do not even get triggered/activated/registered by the
class.


Am I properly explaining my problem?


Any hints to solve this challenge would be greatly appreciated ! ❤ THANK YOU

-- 
Juanmi, CEO and co-founder @ 🍃tagtog.net

    Follow tagtog updates on 🐦 Twitter: @tagtog_net
<https://twitter.com/tagtog_net>

Reply via email to