[ 
https://issues.apache.org/jira/browse/NIFI-4775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16371852#comment-16371852
 ] 

ASF GitHub Bot commented on NIFI-4775:
--------------------------------------

Github user markap14 commented on the issue:

    https://github.com/apache/nifi/pull/2416
  
    I would not consider that circumstance unusual. Once NIFI-4775 has been 
implemented, it will be a common scenario whenever power is lost. Given that 
NIFI-4775 was created and there were no objections, I took that as 
confirmation that it is intended to be implemented in the future. Once that is 
done, it will guarantee no loss of data (though it would still allow loss of 
processing). The proposed solution, however, still results in data loss if 
power is lost. It also prevents us from implementing NIFI-4775 effectively: 
with such a solution in place, NIFI-4775 would provide no real benefit, 
because the fsync'ed CREATE events would still be thrown out if another 
partition was not also fsync'ed.


> Allow FlowFile Repository to optionally perform fsync when writing CREATE 
> events but not other events
> -----------------------------------------------------------------------------------------------------
>
>                 Key: NIFI-4775
>                 URL: https://issues.apache.org/jira/browse/NIFI-4775
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Core Framework
>            Reporter: Mark Payne
>            Priority: Major
>
> Currently, when a FlowFile is written to the FlowFile Repository, the repo 
> can either fsync or not, depending on nifi.properties. We should allow a 
> third option, of fsync only for CREATE events. In this case, if we receive 
> new data from a source we can fsync the update to the FlowFile Repository 
> before ACK'ing the data from the source. This allows us to guarantee data 
> persistence without the overhead of an fsync for every FlowFile Repository 
> update.
> It may make sense, though, to be a bit more selective about when we do this. For 
> example if the source is a system that does not allow us to acknowledge the 
> receipt of data, such as a ListenUDP processor, this doesn't really buy us 
> much. In such a case, we could be smart about avoiding the high cost of an 
> fsync. However, for something like GetSFTP where we have to remove the file 
> in order to 'acknowledge receipt' we can ensure that we wait for the fsync 
> before proceeding.
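
As a rough illustration of the policy described in the issue, the sketch below 
appends events to a log file and calls fsync only when the mode requires it. 
This is not NiFi's FlowFile Repository code: the class name, the SyncMode and 
EventType enums, and the sourceAwaitsAck flag are all hypothetical names chosen 
for this example. The only real API it relies on is java.nio's 
FileChannel.force.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SelectiveFsyncSketch {
    // Hypothetical event and mode names; NiFi's real types may differ.
    enum EventType { CREATE, UPDATE, DELETE }
    enum SyncMode { ALWAYS, NEVER, CREATE_ONLY }

    // Decide whether an event must be forced to disk before we continue.
    // sourceAwaitsAck is an assumed flag: true for a source such as GetSFTP,
    // which deletes the remote file to acknowledge receipt; false for a
    // source such as ListenUDP, which cannot be acknowledged at all.
    static boolean shouldSync(SyncMode mode, EventType type, boolean sourceAwaitsAck) {
        switch (mode) {
            case ALWAYS:
                return true;
            case NEVER:
                return false;
            default: // CREATE_ONLY
                return type == EventType.CREATE && sourceAwaitsAck;
        }
    }

    // Append one record and fsync only if the policy asks for it.
    // Returns true when the record was forced to disk.
    static boolean append(FileChannel channel, SyncMode mode, EventType type,
                          boolean sourceAwaitsAck, String payload) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap(
                (type + " " + payload + "\n").getBytes(StandardCharsets.UTF_8));
        while (buf.hasRemaining()) {
            channel.write(buf);
        }
        boolean sync = shouldSync(mode, type, sourceAwaitsAck);
        if (sync) {
            channel.force(false); // flush file content; metadata is not required
        }
        return sync;
    }

    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("flowfile-repo-sketch", ".log");
        try (FileChannel ch = FileChannel.open(log,
                StandardOpenOption.WRITE, StandardOpenOption.APPEND)) {
            System.out.println(append(ch, SyncMode.CREATE_ONLY, EventType.CREATE, true, "ff-1"));
            System.out.println(append(ch, SyncMode.CREATE_ONLY, EventType.UPDATE, true, "ff-1"));
            System.out.println(append(ch, SyncMode.CREATE_ONLY, EventType.CREATE, false, "ff-2"));
        } finally {
            Files.deleteIfExists(log);
        }
    }
}
```

In this sketch the source would be acknowledged only after append returns true 
for the CREATE event. Note that it models a single log file; the 
multi-partition case discussed in the comment above is not covered here.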



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
