[
https://issues.apache.org/jira/browse/FLINK-8046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16248913#comment-16248913
]
ASF GitHub Bot commented on FLINK-8046:
---------------------------------------
Github user juanmirocks commented on the issue:
https://github.com/apache/flink/pull/4997
No. I don't think this is going to be a suitable solution: if `=` is
allowed in the comparison, the very same file will be triggered multiple times.
Note that the older, deprecated `FileMonitoringFunction` handles this
situation by keeping a map of filenames to modification times. That is more
robust but also more expensive memory-wise. The size of such a map could be
bounded by using a `LinkedHashMap` with `removeEldestEntry`.
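A minimal sketch of the bounded-map idea mentioned above, for illustration only. The class name `BoundedSeenFiles`, the method `shouldProcess`, and the `maxEntries` parameter are hypothetical and are not taken from Flink or from the PR; only `LinkedHashMap` and `removeEldestEntry` are real JDK APIs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: remembers the last modification time seen per file, with a size cap. */
final class BoundedSeenFiles {
    private final Map<String, Long> seen;

    BoundedSeenFiles(final int maxEntries) {
        // Insertion-ordered map. removeEldestEntry is consulted after every put,
        // so the oldest inserted entry is evicted once the cap is exceeded.
        this.seen = new LinkedHashMap<String, Long>(16, 0.75f, false) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Long> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Returns true if the file is unknown or was modified after it was last seen. */
    boolean shouldProcess(String path, long modificationTime) {
        Long previous = seen.get(path);
        if (previous != null && previous >= modificationTime) {
            return false; // same file, not modified since it was last processed
        }
        seen.put(path, modificationTime);
        return true;
    }

    int size() {
        return seen.size();
    }
}
```

With this approach two different files sharing a timestamp are both processed, and an unchanged file is not triggered twice. The caveat is that an evicted file is forgotten, so it would be triggered again if it is still present on a later scan; the cap trades memory for that risk.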
> ContinuousFileMonitoringFunction wrongly ignores files with exact same
> timestamp
> --------------------------------------------------------------------------------
>
> Key: FLINK-8046
> URL: https://issues.apache.org/jira/browse/FLINK-8046
> Project: Flink
> Issue Type: Bug
> Components: Streaming
> Affects Versions: 1.3.2
> Reporter: Juan Miguel Cejuela
> Labels: stream
> Fix For: 1.5.0
>
> Original Estimate: 24h
> Remaining Estimate: 24h
>
> The current monitoring of files sets the internal variable
> `globalModificationTime` to filter out files that are "older". However, the
> current test (to check "older") does
> `boolean shouldIgnore = modificationTime <= globalModificationTime;` (from
> `shouldIgnore`)
> The comparison should strictly be SMALLER (NOT smaller or equal). The method
> documentation also states "This happens if the modification time of the file
> is _smaller_ than...".
> Accepting equality as "older" causes some files with the exact same
> timestamp to be ignored. The behavior is also non-deterministic, as the first
> file to be accepted ("first" being pretty much random) causes the remaining
> files with the exact same timestamp to be ignored.
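
The trade-off between the two comparisons can be illustrated with a toy model. This is a simplified sketch, not Flink's actual implementation: the class `HighWaterMarkFilter` and its `accept` method are hypothetical, and the model assumes a single global timestamp that is advanced every time a file is accepted.

```java
/** Toy model of filtering files by a single global modification-time mark. */
final class HighWaterMarkFilter {
    private final boolean strict;
    private long globalModificationTime = Long.MIN_VALUE;

    /** strict = true ignores only strictly older files; false also ignores equal timestamps. */
    HighWaterMarkFilter(boolean strict) {
        this.strict = strict;
    }

    /** Returns true if a file with this modification time would be processed. */
    boolean accept(long modificationTime) {
        boolean shouldIgnore = strict
                ? modificationTime < globalModificationTime
                : modificationTime <= globalModificationTime;
        if (!shouldIgnore) {
            // Advance the mark to the newest accepted timestamp.
            globalModificationTime = Math.max(globalModificationTime, modificationTime);
        }
        return !shouldIgnore;
    }
}
```

In this model, with `<=` a second file carrying the same timestamp as an already accepted one is ignored, which is the bug reported here. With `<` the second file is accepted, but so is the first file every time it is seen again with an unchanged timestamp, which is the repeated triggering raised in the comment.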
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)