[ 
https://issues.apache.org/jira/browse/NIFI-4775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16905531#comment-16905531
 ] 

Joseph Witt edited comment on NIFI-4775 at 8/12/19 7:42 PM:
------------------------------------------------------------

Additionally, the RocksDB library is 12 MB in size.  We're already right at the
size limit that the ASF allows for our builds, so we cannot keep including new
libraries until we break extensions away from the core.  I would strongly
recommend that this be bundled only when someone activates it in a profile for
their own build, rather than in our default distribution.


was (Author: joewitt):
Additionally the rocksdb library is 12MB in size.  We're already right at the 
limit and therefore we cannot really keep including new libraries at this stage 
until we break extensions away from the core.  I would strongly recommend this 
only gets bundled by someone activating it in a profile for their build rather 
than our default distribution.

> Allow FlowFile Repository to optionally perform fsync when writing CREATE 
> events but not other events
> -----------------------------------------------------------------------------------------------------
>
>                 Key: NIFI-4775
>                 URL: https://issues.apache.org/jira/browse/NIFI-4775
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Core Framework
>            Reporter: Mark Payne
>            Assignee: Brandon DeVries
>            Priority: Major
>             Fix For: 1.10.0
>
>         Attachments: RocksDBFlowFileRepo.html, rocksdb-flowfile-repo.adoc
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently, when a FlowFile is written to the FlowFile Repository, the repo 
> can either fsync or not, depending on nifi.properties. We should allow a 
> third option, of fsync only for CREATE events. In this case, if we receive 
> new data from a source we can fsync the update to the FlowFile Repository 
> before ACK'ing the data from the source. This allows us to guarantee data 
> persistence without the overhead of an fsync for every FlowFile Repository 
> update.
> It may make sense, though, to be a bit more selective about when we do this. For 
> example if the source is a system that does not allow us to acknowledge the 
> receipt of data, such as a ListenUDP processor, this doesn't really buy us 
> much. In such a case, we could be smart about avoiding the high cost of an 
> fsync. However, for something like GetSFTP where we have to remove the file 
> in order to 'acknowledge receipt' we can ensure that we wait for the fsync 
> before proceeding.
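
The three-way choice in the description (always fsync, never fsync, or fsync
only for CREATE events) can be sketched roughly as follows. This is only an
illustration under stated assumptions: FsyncPolicy, RepoEventType and
JournalWriter are hypothetical names, not NiFi classes, and the real
repository code and its nifi.properties settings may look quite different.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical event types; the real repository has its own set.
enum RepoEventType { CREATE, UPDATE, DELETE }

// Hypothetical policy: the two existing behaviours plus the proposed third.
enum FsyncPolicy {
    ALWAYS, NEVER, CREATE_ONLY;

    // Decide whether a given event must reach disk before we continue.
    boolean shouldSync(RepoEventType type) {
        switch (this) {
            case ALWAYS:
                return true;
            case CREATE_ONLY:
                return type == RepoEventType.CREATE;
            default:
                return false;
        }
    }
}

// Hypothetical append-only journal that applies the policy per event.
final class JournalWriter implements AutoCloseable {
    private final FileChannel channel;
    private final FsyncPolicy policy;
    private int syncCount = 0;

    JournalWriter(Path file, FsyncPolicy policy) throws IOException {
        this.channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND);
        this.policy = policy;
    }

    void append(RepoEventType type, String record) throws IOException {
        byte[] bytes = (type + " " + record + "\n").getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        while (buf.hasRemaining()) {
            channel.write(buf);
        }
        if (policy.shouldSync(type)) {
            // Flush file content to the storage device; only after this
            // returns would the caller ACK the data to its source.
            channel.force(false);
            syncCount++;
        }
    }

    int syncCount() {
        return syncCount;
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }
}
```

With CREATE_ONLY, a writer that appends one CREATE followed by several UPDATE
events would call force() once. The further refinement mentioned above
(skipping the fsync for sources such as ListenUDP that cannot be acknowledged)
is not modelled here.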



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
