[
https://issues.apache.org/jira/browse/NIFI-1577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172612#comment-15172612
]
Mark Payne commented on NIFI-1577:
----------------------------------
The easiest way I've found to test this is to run a Processor such as
ListenSyslog, which supports batching and calls session.append(). With the Run
Duration set to 25 ms and a fairly large amount of data pushed to it, the logs
quickly fill with "Too many open files" errors. Once this patch is applied,
those errors go away.
Unfortunately, the patch does not lend itself well to unit tests: a test would
have to inspect a lot of internal private state of the StandardProcessSession,
which would make it very brittle. However, closing the streams loses nothing.
The session maps each ContentClaim (which belongs to exactly 1 RepositoryRecord
in the 'records' map) to an OutputStream. Since checkpoint() clears the
'records' map, there is no longer any way to reach those OutputStreams
afterward, so they were being held open without any benefit.
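
To make the reasoning above concrete, here is a minimal sketch in plain Java.
It is not the StandardProcessSession code and not the attached patch: Claim,
TrackedStream, SketchSession, and every method name below are hypothetical
stand-ins, and an in-memory stream takes the place of a content repository
file. It only illustrates the idea that streams kept open for append() should
be closed when checkpoint() clears the records that lead to them.

```java
import java.io.ByteArrayOutputStream;
import java.util.HashMap;
import java.util.Map;

public class AppendStreamSketch {

    /** Hypothetical stand-in for a content claim; not NiFi's ContentClaim. */
    static final class Claim {
        final String id;
        Claim(String id) { this.id = id; }
    }

    /** In-memory stream that remembers whether close() was called. */
    static final class TrackedStream extends ByteArrayOutputStream {
        boolean closed = false;
        @Override
        public void close() { closed = true; }
    }

    /** Hypothetical, heavily simplified session; assumes one stream per claim. */
    static final class SketchSession {
        // Stand-in for the 'records' map: claim -> bytes appended so far.
        private final Map<Claim, Integer> records = new HashMap<>();
        // Streams held open so that later append() calls can reuse them.
        private final Map<Claim, TrackedStream> appendableStreams = new HashMap<>();

        /** Appends data, opening a stream for the claim only on first use. */
        TrackedStream append(Claim claim, byte[] data) {
            TrackedStream out = appendableStreams.computeIfAbsent(claim, c -> new TrackedStream());
            out.write(data, 0, data.length);
            records.merge(claim, data.length, Integer::sum);
            return out;
        }

        /** Closes every stream left open for appending, then clears both maps. */
        void checkpoint() {
            for (TrackedStream out : appendableStreams.values()) {
                out.close();
            }
            appendableStreams.clear();
            records.clear();
        }

        int openStreamCount() { return appendableStreams.size(); }
    }

    public static void main(String[] args) {
        SketchSession session = new SketchSession();
        Claim a = new Claim("a");
        Claim b = new Claim("b");
        session.append(a, new byte[] {1, 2});
        session.append(a, new byte[] {3});   // reuses the stream opened for 'a'
        session.append(b, new byte[] {4});
        System.out.println("open before checkpoint: " + session.openStreamCount());
        session.checkpoint();
        System.out.println("open after checkpoint: " + session.openStreamCount());
    }
}
```

In this sketch, if checkpoint() cleared 'records' without the close loop, the
streams would stay open with nothing left that could reach them, which is the
situation described above.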
> NiFi holds open too many files when using a Run Duration > 0 ms and calling
> session.append
> ------------------------------------------------------------------------------------------
>
> Key: NIFI-1577
> URL: https://issues.apache.org/jira/browse/NIFI-1577
> Project: Apache NiFi
> Issue Type: Bug
> Components: Core Framework
> Reporter: Mark Payne
> Attachments:
> 0001-NIFI-1577-Close-any-streams-that-are-left-open-for-a.patch
>
>
> If a Processor calls ProcessSession.append() and is scheduled with a Run
> Duration greater than 0 ms, we quickly end up with "Too many open files"
> exceptions.
> This appears to happen because calling append() holds the content
> repository's stream open so that the session can keep appending to it, but on
> checkpoint() the session does not close these streams. It should close these
> streams on checkpoint, since the Processor is no longer allowed to reference
> these FlowFiles anyway at that point.