[
https://issues.apache.org/jira/browse/NIFI-631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14650490#comment-14650490
]
Mark Payne commented on NIFI-631:
---------------------------------
Joe,
For #1, I'd say there's a very small advantage in creating the static final
collection, because it is built once rather than once per instance. But I don't
see that advantage as big enough to matter. I'd say it's much more important
that we make the code clean and easy to understand. This is why I have
generally avoided static collections: I think the code reads better without
them.
Regarding #2, I think it's more of the same thing - I feel the code reads
better if you create the collection inline. There are also some processors that
have to create them inline because the contents of the collection change, for
instance based on user-defined properties. So in order to keep it consistent, I
started just building the collections inline for all of my processors. I
wouldn't fault someone for creating the static final collections, though.
I guess that's the long way of saying "it's 6 of one, half dozen of the other."
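To make the two styles concrete, here is a minimal sketch. It assumes plain
strings in place of real relationship objects, and the class and method names
(StaticStyleProcessor, InlineStyleProcessor, onUserDefinedProperty) are
hypothetical stand-ins, not the NiFi API:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Style 1: the set is built once when the class loads and shared by all instances.
class StaticStyleProcessor {
    static final String REL_SUCCESS = "success";
    static final String REL_FAILURE = "failure";

    private static final Set<String> RELATIONSHIPS = Collections.unmodifiableSet(
            new HashSet<>(Arrays.asList(REL_SUCCESS, REL_FAILURE)));

    Set<String> getRelationships() {
        return RELATIONSHIPS;
    }
}

// Style 2: the set is built inline each time it is requested.
class InlineStyleProcessor {
    static final String REL_SUCCESS = "success";
    static final String REL_FAILURE = "failure";

    // Hypothetical stand-in for relationships derived from user-defined properties.
    private final Set<String> dynamicRelationships = new HashSet<>();

    void onUserDefinedProperty(final String propertyName) {
        dynamicRelationships.add(propertyName);
    }

    Set<String> getRelationships() {
        final Set<String> relationships = new HashSet<>();
        relationships.add(REL_SUCCESS);
        relationships.add(REL_FAILURE);
        relationships.addAll(dynamicRelationships);
        return relationships;
    }
}
```

The static version saves an allocation per call, while the inline version is
the only one of the two that can pick up relationships added at runtime.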
> Create ListFile and FetchFile processors
> ----------------------------------------
>
> Key: NIFI-631
> URL: https://issues.apache.org/jira/browse/NIFI-631
> Project: Apache NiFi
> Issue Type: Improvement
> Reporter: Mark Payne
>
> This pair of Processors will provide several benefits over the existing
> GetFile processor:
> 1. Currently, GetFile will continually pull the same files if the "Keep
> Source File" property is set to true. There is no way to pull the file and
> leave it in the directory without continually pulling the same file. We could
> implement state here, but it would either be a huge amount of state to
> remember everything pulled or it would have to always pull the oldest file
> first so that we can maintain just the Last Modified Date of the last file
> pulled plus all files with the same Last Modified Date that have already been
> pulled.
> 2. If pulling from network-attached storage such as NFS, this would allow a
> single processor to run ListFile and then distribute those FlowFiles to the
> cluster so that the cluster can share the work of pulling the data.
> 3. There are use cases when we may want to pull a specific file (for example,
> in conjunction with HandleHttpRequest/HandleHttpResponse) rather than just
> pull all files in a directory. GetFile does not support this.
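
The compact state described in item 1 (the Last Modified Date of the last file
pulled, plus the files already pulled with that same date) could look roughly
like the sketch below. This is only an illustration under stated assumptions:
FileEntry, ListingState, and selectNew are hypothetical names, directory
contents are passed in as a list rather than read from disk, and this is not
presented as the processors' actual implementation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a file in the listed directory.
class FileEntry {
    final String name;
    final long lastModified;

    FileEntry(final String name, final long lastModified) {
        this.name = name;
        this.lastModified = lastModified;
    }
}

// Remembers only the newest timestamp seen and the names already listed at that timestamp.
class ListingState {
    private long latestTimestamp = Long.MIN_VALUE;
    private final Set<String> namesAtLatest = new HashSet<>();

    // Returns the entries not yet listed, oldest first, and advances the state.
    List<FileEntry> selectNew(final List<FileEntry> directoryContents) {
        final List<FileEntry> sorted = new ArrayList<>(directoryContents);
        sorted.sort(Comparator.comparingLong((FileEntry e) -> e.lastModified));

        final List<FileEntry> fresh = new ArrayList<>();
        for (final FileEntry entry : sorted) {
            if (entry.lastModified < latestTimestamp) {
                continue; // older than the remembered timestamp: treated as already pulled
            }
            if (entry.lastModified == latestTimestamp && namesAtLatest.contains(entry.name)) {
                continue; // same timestamp and already listed
            }
            if (entry.lastModified > latestTimestamp) {
                latestTimestamp = entry.lastModified;
                namesAtLatest.clear();
            }
            namesAtLatest.add(entry.name);
            fresh.add(entry);
        }
        return fresh;
    }
}
```

The trade-off is the one the description points at: the state stays small only
because files are taken oldest first, so a file that shows up later with an
older Last Modified Date is skipped by this sketch.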