Mark Payne created NIFI-631:
-------------------------------

             Summary: Create ListFile and FetchFile processors
                 Key: NIFI-631
                 URL: https://issues.apache.org/jira/browse/NIFI-631
             Project: Apache NiFi
          Issue Type: Improvement
            Reporter: Mark Payne


This pair of Processors will provide several benefits over the existing GetFile 
processor:

1. Currently, GetFile will continually pull the same files if the "Keep Source 
File" property is set to true. There is no way to pull a file once and leave it 
in the directory. We could add state to GetFile, but it would either have to 
remember every file ever pulled, which could be a huge amount of state, or 
always pull the oldest file first so that it only needs to keep the Last 
Modified Date of the last file pulled, plus the files with that same Last 
Modified Date that have already been pulled.

2. If pulling from network-attached storage such as NFS, this would allow a 
single processor to run ListFile and then distribute the resulting FlowFiles to 
the cluster so that the cluster can share the work of pulling the data.

3. There are use cases where we may want to pull a specific file (for example, 
in conjunction with ProcessHttpRequest/ProcessHttpResponse) rather than pulling 
all files in a directory. GetFile does not support this.
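
As a rough illustration of the list/fetch split and of the "small state" idea 
from point 1, here is a minimal Python sketch. It is not NiFi code and not the 
proposed implementation: ListingState, list_new_files and fetch_file are 
hypothetical names invented for this example. It assumes that files are never 
added later with a Last Modified Date older than the newest one already 
listed; such files would be missed.

```python
import os
from dataclasses import dataclass, field


@dataclass
class ListingState:
    """Hypothetical minimal state: the newest Last Modified Date listed so
    far, plus the paths already listed that carry exactly that date."""
    last_modified: float = -1.0
    seen_at_last_modified: set = field(default_factory=set)


def list_new_files(directory, state):
    """Return paths not listed before, oldest first, and update state."""
    candidates = []
    for entry in os.scandir(directory):
        if not entry.is_file():
            continue
        mtime = entry.stat().st_mtime
        if mtime < state.last_modified:
            continue  # older than the newest listed file: assumed already listed
        if mtime == state.last_modified and entry.path in state.seen_at_last_modified:
            continue  # same date as the newest listed file and already listed
        candidates.append((mtime, entry.path))

    # Process oldest first so the state only ever moves forward in time.
    candidates.sort()
    for mtime, path in candidates:
        if mtime > state.last_modified:
            state.last_modified = mtime
            state.seen_at_last_modified = set()
        state.seen_at_last_modified.add(path)
    return [path for _, path in candidates]


def fetch_file(path):
    """Read one specific file by path and leave it in the directory."""
    with open(path, "rb") as handle:
        return handle.read()
```

Only the listing step needs state, so it can run in one place, while 
fetch_file can be called anywhere that can reach the path, including for a 
single file named by some other part of the flow.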



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
