[ https://issues.apache.org/jira/browse/NIFI-3356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15866337#comment-15866337 ]

ASF GitHub Bot commented on NIFI-3356:
--------------------------------------

Github user markap14 commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/1493#discussion_r101106345
  
    --- Diff: nifi-docs/src/main/asciidoc/administration-guide.adoc ---
    @@ -2074,7 +2074,25 @@ The Provenance Repository contains the information related to Data Provenance. T
     
     |====
     |*Property*|*Description*
    -|nifi.provenance.repository.implementation|The Provenance Repository implementation. The default value is org.apache.nifi.provenance.PersistentProvenanceRepository and should only be changed with caution. To store provenance events in memory instead of on disk (at the risk of data loss in the event of power/machine failure), set this property to org.apache.nifi.provenance.VolatileProvenanceRepository.
    +|nifi.provenance.repository.implementation|The Provenance Repository implementation. The default value is org.apache.nifi.provenance.PersistentProvenanceRepository.
    +Two additional repositories are available as and should only be changed with caution.
    +To store provenance events in memory instead of on disk (at the risk of data loss in the event of power/machine failure),
    +set this property to org.apache.nifi.provenance.VolatileProvenanceRepository. This leaves a configurable number of Provenance Events in the Java heap, so the number
    +of events that can be retained is very limited. It has been used essentially as a no-op repository and is not recommended.
    --- End diff --
    
    I can agree with that.


> Provide a newly refactored provenance repository
> ------------------------------------------------
>
>                 Key: NIFI-3356
>                 URL: https://issues.apache.org/jira/browse/NIFI-3356
>             Project: Apache NiFi
>          Issue Type: Task
>          Components: Core Framework
>            Reporter: Mark Payne
>            Assignee: Mark Payne
>             Fix For: 1.2.0
>
>
> The Persistent Provenance Repository has been redesigned a few different 
> times over several years. The original design for the repository was to 
> provide storage of events and sequential iteration over those events via a 
> Reporting Task. After that, we added the ability to compress the data so that 
> it could be held longer. We then introduced the notion of indexing and 
> searching via Lucene. We've since made several more modifications to try to 
> boost performance.
> At this point, however, the repository is still the bottleneck for many flows 
> that handle large volumes of small FlowFiles. We need a new implementation 
> that is based around the current goals for the repository and that can 
> provide better throughput.



