[
https://issues.apache.org/jira/browse/NIFI-3376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16102152#comment-16102152
]
Brandon Zachary commented on NIFI-3376:
---------------------------------------
Content Repo has its own dedicated storage of 14GB.
nifi.content.repository.implementation=org.apache.nifi.controller.repository.FileSystemRepository
nifi.content.claim.max.appendable.size=10 MB
nifi.content.claim.max.flow.files=100
nifi.content.repository.directory.default=./content_repository
nifi.content.repository.archive.max.retention.period=12 hours
nifi.content.repository.archive.max.usage.percentage=50%
nifi.content.repository.archive.enabled=false
nifi.content.repository.always.sync=false
nifi.content.viewer.url=/nifi-content-viewer/
Everything in that section of the properties is the default, with the
exception of archiving, which we turned off.
> Implement content repository ResourceClaim compaction
> -----------------------------------------------------
>
> Key: NIFI-3376
> URL: https://issues.apache.org/jira/browse/NIFI-3376
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Core Framework
> Affects Versions: 0.7.1, 1.1.1
> Reporter: Michael Moser
> Assignee: Michael Hogue
>
> On NiFi systems that deal with many files whose size is less than 1 MB, we
> often see that the actual disk usage of the content_repository is much
> greater than the size of flowfiles that NiFi reports are in its queues. As
> an example, NiFi may report "50,000 / 12.5 GB" but the content_repository
> takes up 240 GB of its file system. This leads to scenarios where a 500 GB
> content_repository file system gets 100% full, but "I only had 40 GB of data
> in my NiFi!"
> When several content claims exist in a single resource claim, and most but
> not all content claims are terminated, the entire resource claim is still not
> eligible for deletion or archive. This could mean that only one 10 KB
> content claim out of a 1 MB resource claim is counted by NiFi as existing in
> its queues.
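To illustrate the accounting gap described above, here is a minimal sketch of the reference-counting behaviour. It is a toy model, not NiFi's implementation: the class `ResourceClaim` and its methods are hypothetical names chosen for this example, and the sizes simply mirror the "10 KB out of about 1 MB" case.

```python
from dataclasses import dataclass, field

KB = 1024


@dataclass
class ResourceClaim:
    """Toy model of one on-disk file that holds several content claims."""
    claim_sizes: dict = field(default_factory=dict)  # claim id -> size in bytes
    live: set = field(default_factory=set)           # ids still held by a flowfile

    def append(self, claim_id, size):
        self.claim_sizes[claim_id] = size
        self.live.add(claim_id)

    def release(self, claim_id):
        # The flowfile left the system; its bytes stay in the file.
        self.live.discard(claim_id)

    def size_on_disk(self):
        return sum(self.claim_sizes.values())

    def size_in_queues(self):
        return sum(self.claim_sizes[c] for c in self.live)

    def deletable(self):
        # The whole file can go only when no content claim is live.
        return not self.live


rc = ResourceClaim()
for i in range(100):
    rc.append(i, 10 * KB)   # 100 content claims of 10 KB each
for i in range(99):
    rc.release(i)           # all but one flowfile are terminated

print(rc.size_in_queues(), rc.size_on_disk(), rc.deletable())
```

In this model a single surviving 10 KB claim keeps the full file on disk, while only 10 KB would be reported as queued.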
> If a particular flow has a slow egress point where flowfiles could back up
> and remain on the system longer than expected, this problem is exacerbated.
> A potential solution is to compact resource claim files on disk. A background
> thread could examine all resource claims and rewrite the files of those that
> have become "old" and whose active content claim usage has dropped below a
> threshold.
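The compaction idea could be sketched roughly as follows. This is only an illustration under assumptions: the function names, the age threshold `min_age_s`, and the ratio threshold `max_live_ratio` are all hypothetical, and a real implementation would also have to update whatever offsets the flowfile records hold into the rewritten file.

```python
KB = 1024


def should_compact(claims, created_at, now, min_age_s=3600.0, max_live_ratio=0.25):
    """Decide whether a resource claim is worth rewriting.

    claims: list of (size_bytes, is_live) tuples, in file order.
    Both thresholds are illustrative values, not NiFi settings.
    """
    total = sum(size for size, _ in claims)
    live = sum(size for size, is_live in claims if is_live)
    if total == 0 or live == 0:
        return False  # empty, or already eligible for plain deletion
    old_enough = (now - created_at) >= min_age_s
    return old_enough and (live / total) < max_live_ratio


def compact(claims):
    """Keep only live claims and assign them new offsets.

    Returns (list of (new_offset, size_bytes), new_file_size).
    """
    kept, offset = [], 0
    for size, is_live in claims:
        if is_live:
            kept.append((offset, size))
            offset += size
    return kept, offset


# 99 terminated claims and 1 live claim, each 10 KB.
claims = [(10 * KB, False)] * 99 + [(10 * KB, True)]

if should_compact(claims, created_at=0.0, now=7200.0):
    kept, new_size = compact(claims)
    print(kept, new_size)
```

The age check keeps the background thread away from files that are still being appended to or are likely to empty out on their own.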
> A potential work-around is to allow modification of the FileSystemRepository
> MAX_APPENDABLE_CLAIM_LENGTH to make it a smaller number. This would increase
> the probability that the content claims reference count in a resource claim
> would reach 0 and the resource claim becomes eligible for deletion/archive.
> This would let users trade off performance for more accurate accounting of
> NiFi queue size to content repository size.
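A back-of-the-envelope estimate shows why a smaller appendable size should help. The sketch below assumes equal-sized flowfiles that are each terminated independently with the same probability, which is a simplification and not how any real flow behaves; the function name and the example numbers are made up for illustration.

```python
KB = 1024
MB = 1024 * KB


def expected_retained_fraction(cap_bytes, file_bytes, p_released):
    """Expected fraction of resource claims that still hold a live claim.

    cap_bytes:  maximum appendable size of one resource claim
    file_bytes: size of each flowfile's content (assumed uniform)
    p_released: probability that a given content claim was released
    """
    n = max(1, cap_bytes // file_bytes)  # content claims per resource claim
    # A resource claim is freed only if all n of its claims were released.
    return 1.0 - p_released ** n


for cap in (10 * MB, 1 * MB, 100 * KB):
    frac = expected_retained_fraction(cap, 10 * KB, p_released=0.99)
    print(f"cap={cap // KB:>6} KB  retained fraction ~ {frac:.3f}")
```

Under these assumptions the retained fraction falls as the cap shrinks, at the cost of more and smaller files on disk, which is the performance trade-off mentioned above.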