Frederick Pletz created NIFI-7352:
-------------------------------------

             Summary: Improve PutFile State Handling
                 Key: NIFI-7352
                 URL: https://issues.apache.org/jira/browse/NIFI-7352
             Project: Apache NiFi
          Issue Type: Improvement
          Components: Core Framework
            Reporter: Frederick Pletz


Currently PutFile has three conflict resolution states: REPLACE, IGNORE, and 
FAIL.  REPLACE writes the new file to disk over the old file and transfers the 
flow file to SUCCESS.  FAIL does not replace the file on disk and transfers the 
flow file to FAIL.  IGNORE does not replace the file on disk and transfers the 
flow file to SUCCESS.  This breakdown is worse than unhelpful: it actively 
invites misunderstanding and misuse.  It is very easy to assume IGNORE would 
instead have the following behavior: write to disk, but keep both the original 
and the new file by appending a distinguishing suffix to the filename, similar 
to how filename conflicts are handled in other programs.  I have personal 
experience with this misinterpretation causing a project to drop data for an 
extended period of time without realizing it.  Additionally, the FAIL state is 
of limited use as it stands, because it is indistinguishable from other failure 
states, such as a missing folder or a lack of write permissions.
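
To make the current behavior concrete, here is a minimal sketch in plain Java. 
It is a simplified model of the behavior described above, not NiFi's actual 
PutFile code; the class and method names (PutFileConflictModel, route) are 
made up for illustration.

```java
// Simplified model of the behavior described in this report; not NiFi's PutFile code.
final class PutFileConflictModel {
    enum Strategy { REPLACE, IGNORE, FAIL }

    static final class Outcome {
        final String relationship;   // "SUCCESS" or "FAIL", as named in this report
        final boolean writtenToDisk; // whether the incoming content reached disk

        Outcome(String relationship, boolean writtenToDisk) {
            this.relationship = relationship;
            this.writtenToDisk = writtenToDisk;
        }
    }

    // Decide where a flow file goes, given whether the target filename already exists.
    static Outcome route(Strategy strategy, boolean targetExists) {
        if (!targetExists) {
            return new Outcome("SUCCESS", true); // no conflict: always written
        }
        switch (strategy) {
            case REPLACE:
                return new Outcome("SUCCESS", true);  // old file overwritten
            case IGNORE:
                return new Outcome("SUCCESS", false); // reported as success, data not written
            default:
                return new Outcome("FAIL", false);    // FAIL: same output as any other failure
        }
    }

    public static void main(String[] args) {
        for (Strategy s : Strategy.values()) {
            Outcome o = route(s, true);
            System.out.println(s + " -> " + o.relationship + ", written=" + o.writtenToDisk);
        }
    }
}
```

The IGNORE branch is the one at issue: downstream of SUCCESS, nothing 
distinguishes a flow file that was written from one that was silently skipped.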

 

Desired result: there should be a way to key off a greater degree of detail 
from a PutFile processor.  The easiest from a user perspective would be 
changing the output relationships to include a "FAIL_DUPLICATE" output, as 
opposed to a single generic "FAIL" output.  This would remove the need for 
"IGNORE", since that function could be performed by using "FAIL_DUPLICATE" in 
the desired way, most likely by auto-terminating that relationship.  Barring 
that, an attribute added to the flow file on output could give a better 
indication of what happened related to the success or failure of the 
processor: was it ignored?  Written to disk?  If it failed, what was the 
failure: a duplicate filename, missing write permission, or a folder that did 
not exist?
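
A rough sketch of how either option could classify the outcome is below, again 
in plain Java rather than against the NiFi API.  Everything named here is 
hypothetical: the attribute name "putfile.result", the Result values, and the 
order in which the checks run are assumptions made for illustration only.

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch of the proposal in this report; none of these names exist in NiFi.
final class PutFileOutcomeSketch {
    // Hypothetical attribute name for the attribute-based alternative.
    static final String RESULT_ATTRIBUTE = "putfile.result";

    enum Result {
        WRITTEN("SUCCESS"),                   // no obstacle found, the write would proceed
        DUPLICATE_FILENAME("FAIL_DUPLICATE"), // the proposed dedicated output
        NO_WRITE_PERMISSION("FAIL"),
        MISSING_DIRECTORY("FAIL");

        final String relationship; // output the flow file would be routed to

        Result(String relationship) {
            this.relationship = relationship;
        }
    }

    // Pure decision logic, checked in this order: directory, permission, name clash.
    static Result classify(boolean dirExists, boolean dirWritable, boolean targetExists) {
        if (!dirExists) return Result.MISSING_DIRECTORY;
        if (!dirWritable) return Result.NO_WRITE_PERMISSION;
        if (targetExists) return Result.DUPLICATE_FILENAME;
        return Result.WRITTEN;
    }

    // Convenience wrapper that inspects the real file system; it writes nothing.
    static Result classify(Path directory, String filename) {
        return classify(Files.isDirectory(directory),
                        Files.isWritable(directory),
                        Files.exists(directory.resolve(filename)));
    }
}
```

With the relationship route, the Result maps directly to an output; with the 
attribute route, the Result name would be stored under RESULT_ATTRIBUTE and the 
existing outputs would stay as they are.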

 

A note on backwards compatibility: I think the more likely result from the 
NiFi team is the attribute route, since it avoids breaking backwards 
compatibility.  However, I would caution that this also means teams that are 
using "IGNORE" with an incorrect understanding of what that option means will 
continue to be unaware they are dropping data.

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
