[
https://issues.apache.org/jira/browse/NIFI-8633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17362009#comment-17362009
]
ASF subversion and git services commented on NIFI-8633:
-------------------------------------------------------
Commit 172afac6ab32350caf6f882e6e7c18e86a5c8c82 in nifi's branch
refs/heads/main from Mark Payne
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=172afac ]
NIFI-8633: This closes #5104. When reading a Content/Resource Claim from
FileSystemRepository, avoid the unnecessary Files.exists call and instead just
create a FileInputStream, catching FileNotFoundException
Signed-off-by: Joe Witt <[email protected]>
> Content Repository can be improved to make fewer disk accesses on read
> -----------------------------------------------------------------------
>
> Key: NIFI-8633
> URL: https://issues.apache.org/jira/browse/NIFI-8633
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Core Framework
> Reporter: Mark Payne
> Assignee: Mark Payne
> Priority: Major
> Fix For: 1.14.0
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> When {{FileSystemRepository.read(ContentClaim)}} or
> {{FileSystemRepository.read(ResourceClaim)}} is called, the repository
> determines the file path for the claim via {{getPath(claim, true);}} where
> the {{true}} argument indicates that we should verify that the file exists.
> This is done so that if we were to pass in a ContentClaim that does not
> exist, we throw a more meaningful ContentNotFoundException instead of just
> letting a FileNotFoundException fly.
> However, this call to {{Files.exists(Path)}} is fairly expensive, as each
> call is an additional disk access. For a flow that handles a lot of small
> files, the cumulative cost can be extremely high.
> We can improve this by removing the call to {{Files.exists}} altogether.
> Instead, just create the {{FileInputStream}} in a try/catch block, catch
> {{FileNotFoundException}}, and wrap it in a
> {{ContentNotFoundException}}. This results in the same API and the same
> contracts as before but avoids the overhead of additional disk accesses/seeks.
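
For illustration only, the change described above might look roughly like the sketch below. This is not the code from the commit: the class name, the two method names, and the nested ContentNotFoundException are hypothetical stand-ins (the real NiFi exception may have a different constructor and superclass), and the sketch works on a plain Path rather than a ContentClaim or ResourceClaim.

```java
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class ClaimReadSketch {

    // Hypothetical, simplified stand-in for NiFi's ContentNotFoundException.
    static class ContentNotFoundException extends RuntimeException {
        ContentNotFoundException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Before: verify existence first, then open. Two file system accesses per read.
    static InputStream readWithExistsCheck(Path path) throws IOException {
        if (!Files.exists(path)) {
            throw new ContentNotFoundException("Could not find content at " + path, null);
        }
        return new FileInputStream(path.toFile());
    }

    // After: open directly and translate the failure into the more meaningful exception.
    static InputStream readWithoutExistsCheck(Path path) throws IOException {
        try {
            return new FileInputStream(path.toFile());
        } catch (FileNotFoundException e) {
            throw new ContentNotFoundException("Could not find content at " + path, e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed demo setup: a temp file plays the role of a claim on disk.
        Path claimFile = Files.createTempFile("claim", ".bin");
        Files.write(claimFile, new byte[] {1, 2, 3});

        try (InputStream in = readWithoutExistsCheck(claimFile)) {
            System.out.println("first byte: " + in.read());
        }

        Files.delete(claimFile);
        try {
            readWithoutExistsCheck(claimFile);
        } catch (ContentNotFoundException e) {
            System.out.println("missing claim reported as: " + e.getMessage());
        }
    }
}
```

One caveat with this approach: the FileInputStream constructor also throws FileNotFoundException when the path is a directory or cannot be opened for some other reason, so the wrapped exception can cover slightly more cases than a plain existence check would.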
--
This message was sent by Atlassian Jira
(v8.3.4#803005)