Brandon DeVries created NIFI-5702:
-------------------------------------

             Summary: FlowFileRepo should not discard data (at least not by default)
                 Key: NIFI-5702
                 URL: https://issues.apache.org/jira/browse/NIFI-5702
             Project: Apache NiFi
          Issue Type: Improvement
            Reporter: Brandon DeVries


The WriteAheadFlowFileRepository currently discards data it cannot find a queue 
for.  Unfortunately, we have run into issues where, when rejoining a node to a 
cluster, the flow.xml.gz can go "missing".  This results in the instance 
creating a new, empty flow.xml.gz and then continuing on... and not finding 
queues for any of its existing data, dropping it all.  Regardless of the 
circumstances leading to an empty (or unexpectedly modified) flow.xml.gz, 
dropping data without user input seems less than ideal.

Internally, my group has added a property 
"....remove.orphaned.flowfiles.on.startup", defaulting to "false".  With the 
default, if the repo finds data on startup that it has no queue for, it throws 
an exception that prevents startup rather than silently dropping the data.  The 
operator can then either "fix" any unexpected issues with the flow.xml.gz, or 
set the property to "true", which restores the original behavior and allows the 
system to be restarted.  When set to "true", the property also produces a 
warning message indicating that in this configuration the repo can drop data 
without (advance) warning.
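
To make the proposed startup behavior concrete, here is a minimal, 
self-contained sketch.  It is not the actual patch and does not use NiFi 
classes: the class name, method name, and the map/set representation of 
recovered records and known queues are all hypothetical, and the boolean 
parameter merely stands in for the "....remove.orphaned.flowfiles.on.startup" 
property described above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch only; every name here is hypothetical, not NiFi API. */
public class OrphanedFlowFileCheck {

    /**
     * Splits recovered records into restorable and orphaned ones.
     *
     * @param recordToQueue  recovered record ID -> ID of the queue it was last in
     * @param knownQueueIds  queue IDs present in the flow that was loaded
     * @param removeOrphaned stand-in for the "remove orphaned on startup" property
     * @return IDs of the records whose queue still exists
     */
    public static List<String> recover(Map<String, String> recordToQueue,
                                       Set<String> knownQueueIds,
                                       boolean removeOrphaned) {
        if (removeOrphaned) {
            // Warn up front that this configuration can drop data.
            System.err.println("WARN: orphaned FlowFiles will be dropped on startup");
        }

        List<String> restored = new ArrayList<>();
        List<String> orphaned = new ArrayList<>();
        for (Map.Entry<String, String> entry : recordToQueue.entrySet()) {
            if (knownQueueIds.contains(entry.getValue())) {
                restored.add(entry.getKey());
            } else {
                orphaned.add(entry.getKey());
            }
        }

        if (!orphaned.isEmpty() && !removeOrphaned) {
            // Default: refuse to start instead of silently discarding data.
            throw new IllegalStateException(orphaned.size()
                    + " recovered FlowFile(s) have no queue in the current flow;"
                    + " fix the flow or enable removal of orphaned FlowFiles");
        }
        return restored;
    }

    public static void main(String[] args) {
        Map<String, String> records = Map.of("ff-1", "queue-a");
        Set<String> emptyFlow = Set.of(); // e.g. a newly created, empty flow
        try {
            recover(records, emptyFlow, false);
        } catch (IllegalStateException e) {
            System.out.println("Startup prevented: " + e.getMessage());
        }
    }
}
```

With an empty flow and the flag left at "false", the call fails instead of 
returning an empty list, which is the point of the proposal: the operator 
gets a chance to restore the flow before any data is lost.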

[1] 
https://github.com/apache/nifi/blob/support/nifi-1.7.x/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/main/java/org/apache/nifi/controller/repository/WriteAheadFlowFileRepository.java#L596



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
