[ 
https://issues.apache.org/jira/browse/NIFI-5702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17171719#comment-17171719
 ] 

Mark Payne commented on NIFI-5702:
----------------------------------

As part of some 1.12 work that is entirely unrelated to this concern, we made 
some changes to the FlowFile Repository, and as part of those changes I 
implemented exactly this. I added a property that controls whether or not data 
is dropped, and it defaults to retaining the data rather than dropping it. I 
don't think there's a backward compatibility concern here, since we are not 
changing an API. Not dropping the data is, I think, the behavior most users 
would expect, but more importantly the default should always err on the side 
of "Do not lose data" :)

> FlowFileRepo should not discard data (at least not by default)
> --------------------------------------------------------------
>
>                 Key: NIFI-5702
>                 URL: https://issues.apache.org/jira/browse/NIFI-5702
>             Project: Apache NiFi
>          Issue Type: Improvement
>    Affects Versions: 1.7.1, 1.9.2
>            Reporter: Brandon Rhys DeVries
>            Assignee: Mark Payne
>            Priority: Major
>
> The WriteAheadFlowFileRepository currently discards data it cannot find a 
> queue for.  Unfortunately, we have run into issues where, when rejoining a 
> node to a cluster, the flow.xml.gz can go "missing".  This results in the 
> instance creating a new, empty, flow.xml.gz and then continuing on... and not 
> finding queues for any of its existing data, dropping it all.  Regardless of 
> the circumstances leading to an empty (or unexpectedly modified) flow.xml.gz, 
> dropping data without user input seems less than ideal. 
> Internally, my group has added a property 
> "....remove.orphaned.flowfiles.on.startup", defaulting to "false".  On 
> startup, rather than silently dropping data, the repo will throw an exception 
> preventing startup.  The operator can then choose to either "fix" any 
> unexpected issues with the flow.xml.gz, or they can set the above property to 
> "true" which restores the original behavior allowing the system to be 
> restarted.  When set to "true" this property also results in a warning 
> message indicating that in this configuration the repo can drop data without 
> (advance) warning.  
>  
>  
> [1] 
> https://github.com/apache/nifi/blob/support/nifi-1.7.x/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/main/java/org/apache/nifi/controller/repository/WriteAheadFlowFileRepository.java#L596
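
The startup check described in the issue above could be sketched roughly as 
follows. This is only an illustration under stated assumptions, not the code 
that was added to NiFi: the class name, the method name, and the property key 
prefix are all hypothetical (the issue elides the real prefix), and the 
orphaned FlowFiles are represented as plain string IDs.

```java
import java.util.List;
import java.util.Properties;

public class OrphanedFlowFileCheck {

    // Placeholder key: the issue elides the real prefix, so this name is hypothetical.
    static final String REMOVE_ORPHANS_KEY = "example.remove.orphaned.flowfiles.on.startup";

    /**
     * Decides what to do with FlowFiles that have no queue in the loaded flow.
     * Returns the number of FlowFiles that would be dropped, or throws if
     * orphans exist and the operator has not opted in to dropping them.
     */
    static int checkOrphans(List<String> orphanedFlowFileIds, Properties props) {
        // Defaults to "false": never drop data unless explicitly asked to.
        boolean removeOrphans =
                Boolean.parseBoolean(props.getProperty(REMOVE_ORPHANS_KEY, "false"));

        if (removeOrphans) {
            // Warn up front that this configuration can lose data without notice.
            System.err.println("WARN: " + REMOVE_ORPHANS_KEY
                    + "=true; orphaned FlowFiles will be dropped on startup");
        }

        if (orphanedFlowFileIds.isEmpty()) {
            return 0; // nothing orphaned, normal startup
        }

        if (!removeOrphans) {
            // Refuse to start so the operator can repair the flow or opt in.
            throw new IllegalStateException("Found " + orphanedFlowFileIds.size()
                    + " FlowFile(s) with no matching queue; fix the flow or set "
                    + REMOVE_ORPHANS_KEY + "=true to drop them");
        }

        return orphanedFlowFileIds.size(); // original behavior: drop the orphans
    }

    public static void main(String[] args) {
        Properties props = new Properties(); // property unset, so the default applies
        try {
            checkOrphans(List.of("flowfile-1", "flowfile-2"), props);
        } catch (IllegalStateException e) {
            System.out.println("Startup aborted: " + e.getMessage());
        }
    }
}
```

With the property unset, the sketch aborts startup instead of dropping data; 
setting the placeholder key to "true" restores the drop-on-startup behavior 
and logs a warning.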



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
