[ https://issues.apache.org/jira/browse/NIFI-512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joseph Witt resolved NIFI-512.
------------------------------
Resolution: Duplicate
> Allow GetFile to pull in data without deleting the local file
> -------------------------------------------------------------
>
> Key: NIFI-512
> URL: https://issues.apache.org/jira/browse/NIFI-512
> Project: Apache NiFi
> Issue Type: Task
> Components: Extensions
> Reporter: Mark Payne
>
> Several people have asked for this capability. Currently, when we do a file
> listing, the results are placed into a HashSet, so there is no ordering for
> how we pull the files in. My proposal is that we instead order the files so
> that we pull the oldest file first, and keep track of the latest timestamp
> that we've pulled in. This way, on restart, we can resume where we left off.
> I would create a FileOutputStream and keep it open, write out the timestamp
> each time we pull data in, and periodically flush the data to disk, perhaps
> every second or so (maybe this should be configurable). We need a tradeoff
> between how much duplication we risk and how much time we spend persisting
> the timestamp.
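
The approach described above could be sketched roughly as follows. This is
only an illustration under stated assumptions, not code from GetFile: the
class OldestFirstTracker and its methods are hypothetical names, and it uses
a RandomAccessFile rather than the FileOutputStream mentioned in the
description, so that the single timestamp can be overwritten in place and
read back on restart.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of NiFi: orders files oldest-first and
// persists the latest pulled timestamp so a restart can resume.
final class OldestFirstTracker implements AutoCloseable {
    private final RandomAccessFile state;     // kept open between pulls
    private final long flushIntervalMillis;   // assumed to be configurable
    private long latestTimestamp;
    private long lastFlushMillis;

    OldestFirstTracker(File stateFile, long flushIntervalMillis) throws IOException {
        this.state = new RandomAccessFile(stateFile, "rw");
        this.flushIntervalMillis = flushIntervalMillis;
        // Resume from the stored value if one exists; -1 means "nothing pulled yet".
        this.latestTimestamp = state.length() >= Long.BYTES ? state.readLong() : -1L;
        this.lastFlushMillis = System.currentTimeMillis();
    }

    // Returns files modified after the latest pulled timestamp, oldest first.
    // Files sharing the exact timestamp of the last pulled file are skipped,
    // which is one of the tradeoffs a real implementation would have to weigh.
    List<File> listNewFilesOldestFirst(File dir) {
        List<File> result = new ArrayList<>();
        File[] found = dir.listFiles(File::isFile);
        if (found == null) {
            return result; // not a directory, or an I/O error occurred
        }
        for (File f : found) {
            if (f.lastModified() > latestTimestamp) {
                result.add(f);
            }
        }
        result.sort(Comparator.comparingLong(File::lastModified));
        return result;
    }

    // Call after each file is pulled in. The write happens every time, but the
    // sync to disk happens at most once per flush interval.
    void recordPulled(long lastModified) throws IOException {
        latestTimestamp = Math.max(latestTimestamp, lastModified);
        state.seek(0);
        state.writeLong(latestTimestamp);
        long now = System.currentTimeMillis();
        if (now - lastFlushMillis >= flushIntervalMillis) {
            state.getFD().sync();
            lastFlushMillis = now;
        }
    }

    long latestTimestamp() {
        return latestTimestamp;
    }

    @Override
    public void close() throws IOException {
        state.getFD().sync(); // final flush so a clean shutdown loses nothing
        state.close();
    }
}
```

A longer flush interval means fewer syncs but more files re-pulled after a
crash, since anything recorded since the last sync may not have reached disk.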