[
https://issues.apache.org/jira/browse/NIFI-2934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15611872#comment-15611872
]
Joseph Gresock commented on NIFI-2934:
--------------------------------------
I was able to reproduce the scenario described in the ticket. The log
statement that I see spammed after restarting NiFi is:
2016-10-27 13:18:20,991 INFO [main] o.a.n.c.repository.FileSystemRepository Found unknown file /data/nifi/content_repository/123/1477571460115-1147 (10485760 bytes) in File System Repository; archiving file
In the case I reproduced, this was printed for ~5600 files in the
content_repository. After this point, the disk usage went back down below the
configured max usage percentage.
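For anyone who wants to quantify the same thing on their own cluster, a minimal Python sketch along these lines could count the "Found unknown file" messages and sum the reported sizes. It assumes each message sits on a single line in the log file and matches the wording quoted above; the function name and the regular expression are illustrative and are not part of NiFi.

```python
import re

# Assumed pattern, derived only from the single log message quoted above.
UNKNOWN_FILE = re.compile(
    r"Found unknown file (?P<path>\S+) \((?P<size>\d+) bytes\) "
    r"in File System Repository; archiving file"
)


def summarize_unknown_files(lines):
    """Return (count, total_bytes) for 'Found unknown file' log messages."""
    count = 0
    total_bytes = 0
    for line in lines:
        match = UNKNOWN_FILE.search(line)
        if match:
            count += 1
            total_bytes += int(match.group("size"))
    return count, total_bytes
```

Passing an open log file (for example the application log, wherever it lives in a given install) to summarize_unknown_files() would give the number of files archived at startup and their combined size in bytes.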
One NiFi dev asked me if there were any files in the
content_repository/*/archive directories during this scenario. I just checked,
and in the worker that has not been restarted yet, there are no files in
archive. However, in the worker that has been restarted, ~1GB is in the
archive directories.
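The archive check can be repeated with a short script instead of by hand. The following is only a sketch: it assumes the content_repository/*/archive layout mentioned above, and the function name is hypothetical.

```python
from pathlib import Path


def archive_usage(repo_root):
    """Return (file_count, total_bytes) for files under <repo_root>/*/archive."""
    file_count = 0
    total_bytes = 0
    # Each section directory of the repository may hold its own archive directory.
    for archive_dir in Path(repo_root).glob("*/archive"):
        if not archive_dir.is_dir():
            continue
        for entry in archive_dir.rglob("*"):
            if entry.is_file():
                file_count += 1
                total_bytes += entry.stat().st_size
    return file_count, total_bytes
```

Running archive_usage("/data/nifi/content_repository") on a restarted worker and on one that has not been restarted would make the difference described above easy to compare.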
> Archiver still not respecting
> nifi.content.repository.archive.max.usage.percentage
> ----------------------------------------------------------------------------------
>
> Key: NIFI-2934
> URL: https://issues.apache.org/jira/browse/NIFI-2934
> Project: Apache NiFi
> Issue Type: Bug
> Affects Versions: 0.7.0, 0.7.1
> Reporter: Joseph Gresock
> Attachments: Disk-Usage-Increasing.png, NiFi-80-percent-disk.png,
> Queued.png, System-Diagnostics.png, content_repository usage.png, lsof.txt
>
>
> This seems related to NIFI-1726: we've noticed that the content repository
> takes up increasingly more space over time, even beyond the configured max
> usage percentage (see images). After restarting the NiFi cluster we get an
> immediate drop in disk usage with lots of log statements indicating that
> expired content is being removed.
> Not sure if this is related, but we also often get "Too many open files"
> during this expiration process after NiFi restart, despite lsof indicating a
> count far lower than our configured nofile and fs-max.
> In the environment indicated by the pictures,
> nifi.content.repository.archive.max.usage.percentage = 50%. Note that the
> flow itself only has ~240GB queued across the entire cluster, but each
> content_repository directory has over 360GB on each worker. Also note the
> disk usage graph increasing above 50% on each worker, until we finally
> restart and then the usage drops below 50%.