Koji Kawamura created NIFI-4069:
-----------------------------------
Summary: ListXXX processors can miss files that are created while the
processor is listing when the filesystem does not provide millisecond
timestamp precision
Key: NIFI-4069
URL: https://issues.apache.org/jira/browse/NIFI-4069
Project: Apache NiFi
Issue Type: Bug
Components: Extensions
Affects Versions: 1.0.0
Reporter: Koji Kawamura
Assignee: Koji Kawamura
Attachments: ListFilesWithoutMilliseconds.png
Some filesystems, such as Mac OS X HFS (Hierarchical File System) and EXT3,
are known to support timestamps only with second precision. Some FTP servers
are also reported to provide timestamps only with minute precision.
This can cause files to NOT be listed, because the ListXXX processors' logic
expects timestamps in milliseconds.
Specifically, if several files are generated within one second, not all of
them will be listed.
Steps to reproduce:
1. Start the ListFile processor.
2. Generate 10000 zero-size files with the following command:
{code}
for i in {1..10000}; do touch ./test_$i; done
{code}
3. Check the processor stats: out 3952 (0 bytes), i.e. only 3952 of the 10000 files were listed.
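To see how coarse the timestamps are on a given filesystem, the file-creation step can be paired with a small check. The following is only a sketch: the class and method names are hypothetical, and it inspects the temp directory's filesystem, which may differ from the one ListFile is watching. It creates a number of empty files and counts how many share each last-modified value.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical diagnostic helper; not part of NiFi.
final class TimestampSpreadSketch {

    // Creates `count` empty files in a fresh temp directory and returns
    // how many files were stamped with each last-modified value (in ms).
    static Map<Long, Integer> filesPerTimestamp(int count) {
        Map<Long, Integer> perTimestamp = new TreeMap<>();
        try {
            Path dir = Files.createTempDirectory("listing-precision");
            for (int i = 1; i <= count; i++) {
                Path file = Files.createFile(dir.resolve("test_" + i));
                long millis = Files.getLastModifiedTime(file).toMillis();
                perTimestamp.merge(millis, 1, Integer::sum);
                Files.delete(file); // clean up as we go
            }
            Files.delete(dir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return perTimestamp;
    }

    public static void main(String[] args) {
        Map<Long, Integer> spread = filesPerTimestamp(1000);
        // True when no timestamp carries a sub-second component.
        boolean wholeSecondsOnly =
                spread.keySet().stream().allMatch(ms -> ms % 1000L == 0L);
        System.out.println("distinct timestamps: " + spread.size());
        System.out.println("all on whole seconds: " + wholeSecondsOnly);
    }
}
```

If every timestamp falls on a whole second, that suggests (but does not prove) that the filesystem stores modification times with second precision, so many files from the loop above would share one timestamp.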
The current AbstractListProcessor logic applies LISTING_LAG_NANOS (100ms) and
postpones the files that have the latest timestamp within a listing iteration
to the next iteration. However, on filesystems without millisecond
precision, this logic does not work as expected.
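The failure mode can be illustrated with a deliberately simplified model. This is not the actual AbstractListProcessor code: the `Lister` class, its fields, and the file names are hypothetical, and the 100ms lag check is omitted. The model postpones the newest timestamp group for one pass, then emits it and treats everything at or before that timestamp as already listed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ListingPrecisionSketch {

    // Hypothetical, simplified listing state; not the AbstractListProcessor code.
    static final class Lister {
        private long lastProcessed = -1L; // newest timestamp already emitted
        private long lastHeldBack = -1L;  // newest timestamp seen by the previous pass

        List<String> list(Map<String, Long> files) {
            long newest = -1L;
            for (long ts : files.values()) {
                newest = Math.max(newest, ts);
            }
            // Release the newest group only if the previous pass already postponed it.
            boolean releaseNewest = newest == lastHeldBack;
            List<String> emitted = new ArrayList<>();
            long maxEmitted = lastProcessed;
            for (Map.Entry<String, Long> e : files.entrySet()) {
                long ts = e.getValue();
                if (ts <= lastProcessed) continue;            // treated as already listed
                if (ts == newest && !releaseNewest) continue; // postponed to a later pass
                emitted.add(e.getKey());
                maxEmitted = Math.max(maxEmitted, ts);
            }
            lastProcessed = maxEmitted;
            lastHeldBack = newest;
            return emitted;
        }
    }

    // Truncates to whole seconds to mimic a filesystem without millisecond precision.
    static long stamp(long millis, boolean secondsOnly) {
        return secondsOnly ? (millis / 1000L) * 1000L : millis;
    }

    // Two files exist, two passes run, a third file appears in the same second,
    // then two more passes run. Returns every file name that was emitted.
    static List<String> run(boolean secondsOnly) {
        Map<String, Long> files = new LinkedHashMap<>();
        Lister lister = new Lister();
        List<String> listed = new ArrayList<>();
        files.put("a", stamp(1_000L, secondsOnly));
        files.put("b", stamp(1_400L, secondsOnly));
        listed.addAll(lister.list(files));
        listed.addAll(lister.list(files));
        files.put("c", stamp(1_900L, secondsOnly)); // written after "a" and "b" were emitted
        listed.addAll(lister.list(files));
        listed.addAll(lister.list(files));
        return listed;
    }

    public static void main(String[] args) {
        System.out.println("millisecond precision: " + run(false)); // [a, b, c]
        System.out.println("second precision:      " + run(true));  // [a, b]
    }
}
```

With second precision, "c" carries the same timestamp as files that were already emitted, so the model skips it permanently. This matches the symptom in the reproduction steps, where only part of the generated files is listed.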