[ 
https://issues.apache.org/jira/browse/NIFI-5702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16943648#comment-16943648
 ] 

Brandon Rhys DeVries commented on NIFI-5702:
--------------------------------------------

[~pwicks], the RocksDB-based FlowFile repository implementation, 
RocksDBFlowFileRepository [1], includes the property 
"remove.orphaned.flowfiles.on.startup" to prevent this.  It would be easy 
enough to add the same check to the WriteAheadFlowFileRepository's 
loadFlowFiles method [2].  However, we would then have to decide whether to 
make this the default behavior, or whether that would break backwards 
compatibility (i.e. people expect the node to come up, dropping data if 
necessary).
 
[~markap14], thoughts?
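
For discussion, here is a rough sketch of what such a check might look like. 
It is not the actual NiFi code: the class, the method, and the shortened 
property key are hypothetical, and the repository is modelled as a plain map 
from record ID to queue ID.  The only behavior taken from this issue is the 
choice between refusing to start and dropping orphaned records with a warning.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class OrphanedFlowFileCheck {

    // Hypothetical key for illustration only; the fully qualified property
    // name is not spelled out in this thread.
    public static final String REMOVE_ORPHANED_KEY = "remove.orphaned.flowfiles.on.startup";

    /**
     * Returns the IDs of records whose queue still exists in the flow.
     * Records pointing at an unknown queue are "orphaned": they are dropped
     * (with a warning) only if removeOrphaned is true; otherwise startup fails.
     */
    public static List<Long> restorableRecords(Map<Long, String> queueIdByRecord,
                                               Set<String> knownQueueIds,
                                               boolean removeOrphaned) {
        List<Long> restorable = new ArrayList<>();
        List<Long> orphaned = new ArrayList<>();
        for (Map.Entry<Long, String> entry : queueIdByRecord.entrySet()) {
            if (knownQueueIds.contains(entry.getValue())) {
                restorable.add(entry.getKey());
            } else {
                orphaned.add(entry.getKey());
            }
        }

        if (!orphaned.isEmpty() && !removeOrphaned) {
            // Refuse to start rather than silently discard data.
            throw new IllegalStateException(orphaned.size()
                    + " FlowFile record(s) reference queues missing from the flow; fix the flow or set "
                    + REMOVE_ORPHANED_KEY + "=true to drop them");
        }
        if (!orphaned.isEmpty()) {
            // Original behavior, but no longer silent.
            System.err.println("WARN: dropping " + orphaned.size() + " orphaned FlowFile record(s)");
        }
        return restorable;
    }
}
```

With the flag set to false, a single orphaned record is enough to abort 
startup; with it set to true, the method returns only the records that still 
have a queue.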
 
[1] 
[https://github.com/apache/nifi/blob/7d77b464ccfbd86ff1f2057c44bc35580d7f9fe2/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/main/java/org/apache/nifi/controller/repository/RocksDBFlowFileRepository.java#L1064-L1072]
[2] 
[https://github.com/apache/nifi/blob/7d77b464ccfbd86ff1f2057c44bc35580d7f9fe2/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/main/java/org/apache/nifi/controller/repository/WriteAheadFlowFileRepository.java#L685]

> FlowFileRepo should not discard data (at least not by default)
> --------------------------------------------------------------
>
>                 Key: NIFI-5702
>                 URL: https://issues.apache.org/jira/browse/NIFI-5702
>             Project: Apache NiFi
>          Issue Type: Improvement
>    Affects Versions: 1.7.1, 1.9.2
>            Reporter: Brandon Rhys DeVries
>            Priority: Major
>
> The WriteAheadFlowFileRepository currently discards data it cannot find a 
> queue for.  Unfortunately, we have run into issues where, when rejoining a 
> node to a cluster, the flow.xml.gz can go "missing".  This results in the 
> instance creating a new, empty flow.xml.gz and then continuing on, not 
> finding queues for any of its existing data and dropping it all.  Regardless 
> of the circumstances leading to an empty (or unexpectedly modified) 
> flow.xml.gz, dropping data without user input seems less than ideal. 
> Internally, my group has added a property 
> "....remove.orphaned.flowfiles.on.startup", defaulting to "false".  With 
> the default, rather than silently dropping data on startup, the repo throws 
> an exception that prevents startup.  The operator can then either "fix" any 
> unexpected issues with the flow.xml.gz, or set the above property to 
> "true", which restores the original behavior and allows the system to be 
> restarted.  When set to "true", this property also results in a warning 
> message indicating that in this configuration the repo can drop data without 
> (advance) warning.  
>  
>  
> [1] 
> https://github.com/apache/nifi/blob/support/nifi-1.7.x/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-framework-core/src/main/java/org/apache/nifi/controller/repository/WriteAheadFlowFileRepository.java#L596



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
