[ https://issues.apache.org/jira/browse/NIFI-7352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17080620#comment-17080620 ]
Joe Witt commented on NIFI-7352:
--------------------------------
Reduced priority of this to our default which is 'major'.
The situation this is used for is when you're putting files somewhere and there
is already a file with a matching name. The terms 'FAIL when there is a
conflict' and 'IGNORE when there is a conflict' and 'RENAME when there is a
conflict' are reasonable. The safest option and one which avoids data loss is
FAIL and it is also the default. A user would have to select a less safe
option or keep the default and intentionally route failures for termination.
Now, to your point, you interpreted what these meant differently than the
author or others have. Please consider submitting a PR to make the meaning of
each of these options clearer for the user.
To your other point, it might be nice for a user to be able to select FAIL as
a conflict path but still distinguish between failures due to conflict and
failures because it was unable to write the contents at a given time. This
could be achieved by adding a new relationship for that case. This is arguably
more disruptive but clear for users. Alternatively, an attribute could be
added to the flowfile indicating whether failure was due to conflict or some
other reason. This could allow someone who really needed the distinction to
use RouteOnAttribute to refine flow logic. That is more work for the user but
possibly an appropriate path for the relatively rare cases where this
distinction will change flow logic.
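To make the attribute idea concrete, here is a rough sketch in plain Python. It is not NiFi code and not how PutFile is implemented; it only models the decision logic. The attribute name `putfile.failure.reason`, its values, and the `put_file` and `route_failure` helpers are all hypothetical and chosen for illustration.

```python
import os
import tempfile
from enum import Enum


class Resolution(Enum):
    REPLACE = "replace"
    IGNORE = "ignore"
    FAIL = "fail"


# Hypothetical attribute name and values; not an existing NiFi attribute.
FAILURE_REASON_ATTR = "putfile.failure.reason"


def put_file(directory, filename, content, resolution, attributes):
    """Return 'success' or 'failure' for one flowfile-like write.

    On failure, record the reason in `attributes` so that a downstream
    router can tell a name conflict apart from an I/O problem.
    """
    target = os.path.join(directory, filename)
    if os.path.exists(target):
        if resolution is Resolution.FAIL:
            attributes[FAILURE_REASON_ATTR] = "conflict"
            return "failure"
        if resolution is Resolution.IGNORE:
            # The existing file is kept; the incoming content is not written.
            return "success"
        # REPLACE falls through and overwrites the existing file.
    try:
        with open(target, "wb") as fh:
            fh.write(content)
    except OSError:
        # Missing directory, lack of write permission, and similar errors.
        attributes[FAILURE_REASON_ATTR] = "io-error"
        return "failure"
    return "success"


def route_failure(attributes):
    """Stand-in for a RouteOnAttribute rule applied to the failure path."""
    if attributes.get(FAILURE_REASON_ATTR) == "conflict":
        return "conflict"
    return "other-failure"


if __name__ == "__main__":
    directory = tempfile.mkdtemp()
    attrs = {}
    put_file(directory, "a.txt", b"first", Resolution.FAIL, attrs)
    result = put_file(directory, "a.txt", b"second", Resolution.FAIL, attrs)
    print(result, route_failure(attrs))
```

With something like this, users who do not care about the distinction keep a single failure path, and those who do can branch on the attribute downstream.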
> Improve PutFile State Handling
> ------------------------------
>
> Key: NIFI-7352
> URL: https://issues.apache.org/jira/browse/NIFI-7352
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Core Framework
> Reporter: Frederick Pletz
> Priority: Major
> Labels: Processor, PutFile
>
> Currently PutFile has three conflict resolution states: REPLACE, IGNORE,
> FAIL. REPLACE writes the new file to disk over the old file and transfers
> the file to SUCCESS. FAIL does not replace the file on disk and transfers
> the file to FAIL. IGNORE does not replace the file on disk and transfers the
> file to SUCCESS. This breakout is less than useful; it actively invites
> misunderstanding and misuse. It is very easy to assume IGNORE would
> instead have the following behavior: write to disk, but keep both original
> and new file by appending notation information to the end of the filename -
> similar to how filename conflicts are handled in other programs. I have
> personal experience with this misinterpretation causing a project to drop
> data for an extended period of time without realizing it. Additionally, the
> FAIL state is not optimally useful in its current state as it is
> indistinguishable from other failure states, such as folder does not exist or
> lack of write permissions.
>
> Desired result: there should be a way to key off a greater degree of detail
> from a PutFile processor. The easiest from a user perspective would be
> correcting the output queues to include a "FAIL_DUPLICATE" output, opposed to
> a single generic "FAIL" output. This would remove the need for "IGNORE",
> since that function could be performed by using "FAIL_DUPLICATE" in the
> desired way - most likely by auto-terminating that relationship. Barring
> that, an attribute added to the flow file on output could give better
> indication of what happened related to the success or failure of the
> processor - was it ignored? Written to disk? If it failed, what was the
> failure: duplicate filename, write permission, folder didn't exist?
>
> A note toward backwards compatibility: I think the more likely result from
> the NiFi team is the attribute route since it prevents breaking backwards
> compatibility, however, I would caution that this also means teams which are
> using "IGNORE" with an incorrect understanding of what that option means will
> continue to be unaware they are dropping data.
>