Brandon DeVries created NIFI-5702:
-------------------------------------
Summary: FlowFileRepo should not discard data (at least not by default)
Key: NIFI-5702
URL: https://issues.apache.org/jira/browse/NIFI-5702
Project: Apache NiFi
Issue Type: Improvement
Reporter: Brandon DeVries
The WriteAheadFlowFileRepository currently discards data it cannot find a queue
for [1]. Unfortunately, we have run into issues where, when rejoining a node to
a cluster, the flow.xml.gz can go "missing". The instance then creates a new,
empty flow.xml.gz and continues on. It finds no queues for any of its existing
data and drops it all. Regardless of the circumstances leading to an empty (or
unexpectedly modified) flow.xml.gz, dropping data without user input seems less
than ideal.
Internally, my group has added a property
"....remove.orphaned.flowfiles.on.startup", defaulting to "false". On startup,
rather than silently dropping data, the repo throws an exception that prevents
startup. The operator can then either "fix" any unexpected issues with the
flow.xml.gz, or set the above property to "true", which restores the original
behavior and allows the system to be restarted. When set to "true", the
property also triggers a warning message indicating that in this configuration
the repo can drop data without (advance) warning.
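The startup check described above can be sketched roughly as follows. This is a
simplified model and not the actual patch: the class OrphanedFlowFileCheck, the
method restore(), and the map-of-IDs representation of the repository are
hypothetical names chosen for illustration, and only the suffix of the property
name is used because the prefix is elided in this issue.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.logging.Logger;

// Hypothetical illustration only; this is not NiFi's WriteAheadFlowFileRepository.
class OrphanedFlowFileCheck {

    // Only the suffix is given in the issue; the full property name is elided there.
    static final String PROPERTY_SUFFIX = "remove.orphaned.flowfiles.on.startup";

    private static final Logger LOG =
            Logger.getLogger(OrphanedFlowFileCheck.class.getName());

    /**
     * @param recordQueueIds assumed model: FlowFile record ID -> ID of the queue it was in
     * @param knownQueueIds  queue IDs present in the flow that was loaded at startup
     * @param removeOrphaned value of the property (the issue proposes "false" as default)
     * @return the records whose queue still exists
     */
    static Map<String, String> restore(Map<String, String> recordQueueIds,
                                       Set<String> knownQueueIds,
                                       boolean removeOrphaned) {
        if (removeOrphaned) {
            // Warn up front that this configuration may drop data.
            LOG.warning(PROPERTY_SUFFIX + " is true: FlowFiles without a queue "
                    + "will be dropped without further warning");
        }

        Map<String, String> restored = new LinkedHashMap<>();
        List<String> orphaned = new ArrayList<>();
        for (Map.Entry<String, String> entry : recordQueueIds.entrySet()) {
            if (knownQueueIds.contains(entry.getValue())) {
                restored.put(entry.getKey(), entry.getValue());
            } else {
                orphaned.add(entry.getKey());
            }
        }

        if (!orphaned.isEmpty() && !removeOrphaned) {
            // Refuse to start rather than silently discard data.
            throw new IllegalStateException(orphaned.size()
                    + " FlowFile(s) have no matching queue; fix the flow or set "
                    + PROPERTY_SUFFIX + " to true to drop them");
        }
        return restored;
    }
}
```

With the default of "false", a call such as restore(records, knownQueues, false)
fails as soon as any record points at a queue missing from the loaded flow,
which is the case an empty flow.xml.gz would produce for every record.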
[1] https://github.com/apache/nifi/blob/support/nifi-1.7.x/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/main/java/org/apache/nifi/controller/repository/WriteAheadFlowFileRepository.java#L596
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)