[ 
https://issues.apache.org/jira/browse/NIFI-5702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16943661#comment-16943661
 ] 

Paul Gibeault commented on NIFI-5702:
-------------------------------------

One of the great features of NiFi is that it does not lose state.  We have 
come to depend on NiFi managing the workflow for us, knowing that each file 
will be processed at least once.  When this issue affected us, it took a 
great deal of investigation to recover the state for each of our flows in the 
cluster.

The idea of dropping data by default, in any scenario, seems out of 
character for NiFi.  However, it makes sense to maintain behavioral consistency 
when new features are added.  Having a setting that defaults to dropping data 
is much preferred to not having a setting at all.

> FlowFileRepo should not discard data (at least not by default)
> --------------------------------------------------------------
>
>                 Key: NIFI-5702
>                 URL: https://issues.apache.org/jira/browse/NIFI-5702
>             Project: Apache NiFi
>          Issue Type: Improvement
>    Affects Versions: 1.7.1, 1.9.2
>            Reporter: Brandon Rhys DeVries
>            Priority: Major
>
> The WriteAheadFlowFileRepository currently discards data it cannot find a 
> queue for.  Unfortunately, we have run into issues where, when rejoining a 
> node to a cluster, the flow.xml.gz can go "missing".  This results in the 
> instance creating a new, empty, flow.xml.gz and then continuing on... and not 
> finding queues for any of its existing data, dropping it all.  Regardless of 
> the circumstances leading to an empty (or unexpectedly modified) flow.xml.gz, 
> dropping data without user input seems less than ideal. 
> Internally, my group has added a property 
> "....remove.orphaned.flowfiles.on.startup", defaulting to "false".  On 
> startup, rather than silently dropping data, the repo will throw an exception 
> preventing startup.  The operator can then choose to either "fix" any 
> unexpected issues with the flow.xml.gz, or they can set the above property to 
> "true", which restores the original behavior, allowing the system to be 
> restarted.  When set to "true" this property also results in a warning 
> message indicating that in this configuration the repo can drop data without 
> (advance) warning.  
>  
>  
> [1] 
> https://github.com/apache/nifi/blob/support/nifi-1.7.x/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/main/java/org/apache/nifi/controller/repository/WriteAheadFlowFileRepository.java#L596
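
For illustration, the startup behavior proposed in the issue description 
could be sketched roughly as below.  This is only a sketch under stated 
assumptions, not the actual WriteAheadFlowFileRepository code: the class 
OrphanedFlowFileCheck, the recover() method, and the map of FlowFile id to 
queue id are all hypothetical, and the boolean parameter stands in for the 
"....remove.orphaned.flowfiles.on.startup" property, whose full name is 
elided in the description.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; none of these names come from the NiFi code base.
final class OrphanedFlowFileCheck {

    /**
     * Decides what to do with recovered FlowFiles on startup.
     *
     * @param recovered               FlowFile id mapped to the id of the queue it was in
     * @param knownQueueIds           ids of the queues defined by the loaded flow
     * @param removeOrphanedOnStartup stand-in for the proposed property
     * @return ids of the FlowFiles whose queue still exists
     */
    static List<String> recover(Map<String, String> recovered,
                                Set<String> knownQueueIds,
                                boolean removeOrphanedOnStartup) {
        if (removeOrphanedOnStartup) {
            // With the property set to "true", warn that data can be dropped silently.
            System.err.println("WARNING: orphaned FlowFiles will be dropped "
                    + "on startup without advance warning");
        }

        List<String> kept = new ArrayList<>();
        List<String> orphaned = new ArrayList<>();
        for (Map.Entry<String, String> entry : recovered.entrySet()) {
            if (knownQueueIds.contains(entry.getValue())) {
                kept.add(entry.getKey());
            } else {
                orphaned.add(entry.getKey());
            }
        }

        if (!orphaned.isEmpty() && !removeOrphanedOnStartup) {
            // Default ("false"): refuse to start rather than silently drop data.
            throw new IllegalStateException(orphaned.size()
                    + " FlowFile(s) have no matching queue; fix the flow or "
                    + "enable removal of orphaned FlowFiles to continue");
        }
        return kept;
    }
}
```

With the flag left at false, an empty or unexpectedly modified flow makes 
recover() throw, so the operator has to decide before anything is lost.  With 
it set to true, the orphaned entries are simply left out of the result, which 
corresponds to the original behavior.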



