[ 
https://issues.apache.org/jira/browse/NIFI-512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Witt resolved NIFI-512.
------------------------------
    Resolution: Duplicate

> Allow GetFile to pull in data without deleting the local file
> -------------------------------------------------------------
>
>                 Key: NIFI-512
>                 URL: https://issues.apache.org/jira/browse/NIFI-512
>             Project: Apache NiFi
>          Issue Type: Task
>          Components: Extensions
>            Reporter: Mark Payne
>
> Several people have asked for this capability. Currently, when we do a file
> listing, the results are placed into a HashSet, so there is no ordering for
> how we pull the files in. My proposal is that we instead order the files so
> that we pull the oldest file first and keep track of the latest timestamp
> that we've pulled in. This way, on restart, we can resume where we left off.
> I would create a FileOutputStream and keep it open, write out the timestamp
> each time we pull data in, and periodically flush the data to disk, perhaps
> every second or so (maybe this should be configurable). We need a tradeoff
> between how much possible duplication we get and how much time we spend
> persisting the timestamp.
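The proposal quoted above could be sketched roughly as follows. This is only an illustration under stated assumptions, not code from NiFi's GetFile: the class name OldestFirstLister and its methods are hypothetical, and it uses a FileChannel kept open (instead of the FileOutputStream mentioned above) so that the single timestamp can be overwritten in place rather than appended. Files that share the exact timestamp of the last pulled file would be skipped after a restart, which a real implementation would have to handle.

```java
import java.io.File;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical helper illustrating the proposal; not part of NiFi. */
public class OldestFirstLister implements AutoCloseable {
    private final FileChannel stateChannel;      // kept open between pulls
    private final ByteBuffer buffer = ByteBuffer.allocate(Long.BYTES);
    private final long flushIntervalMillis;      // the "every second or so" knob
    private long latestTimestamp = -1L;          // -1 means nothing pulled yet
    private long lastFlushMillis = 0L;

    public OldestFirstLister(Path stateFile, long flushIntervalMillis) throws IOException {
        this.flushIntervalMillis = flushIntervalMillis;
        this.stateChannel = FileChannel.open(stateFile, StandardOpenOption.CREATE,
                StandardOpenOption.READ, StandardOpenOption.WRITE);
        // On restart, resume from the last persisted timestamp if one exists.
        if (stateChannel.size() >= Long.BYTES) {
            while (buffer.hasRemaining()
                    && stateChannel.read(buffer, buffer.position()) > 0) {
                // keep reading until the 8-byte value is complete
            }
            buffer.flip();
            latestTimestamp = buffer.getLong();
        }
    }

    /** Returns files newer than the tracked timestamp, oldest first. */
    public List<File> listNewFiles(File directory) {
        List<File> result = new ArrayList<>();
        File[] found = directory.listFiles(File::isFile);
        if (found == null) {
            return result; // not a directory, or an I/O error occurred
        }
        for (File f : found) {
            if (f.lastModified() > latestTimestamp) {
                result.add(f);
            }
        }
        result.sort(Comparator.comparingLong(File::lastModified));
        return result;
    }

    /** Records that a file was pulled; forces the state to disk only periodically. */
    public void markPulled(File file, long nowMillis) throws IOException {
        latestTimestamp = Math.max(latestTimestamp, file.lastModified());
        buffer.clear();
        buffer.putLong(latestTimestamp);
        buffer.flip();
        // Overwrite the value at the start of the state file; the buffer position
        // doubles as the file offset because writing starts at offset 0.
        while (buffer.hasRemaining()) {
            stateChannel.write(buffer, buffer.position());
        }
        if (nowMillis - lastFlushMillis >= flushIntervalMillis) {
            stateChannel.force(false); // the expensive part: sync to disk
            lastFlushMillis = nowMillis;
        }
    }

    public long getLatestTimestamp() {
        return latestTimestamp;
    }

    @Override
    public void close() throws IOException {
        stateChannel.force(false);
        stateChannel.close();
    }
}
```

A larger flush interval means fewer disk syncs but more files that may be pulled again after a crash, which is the tradeoff described above. The source files themselves are never deleted in this sketch.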



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
