[ https://issues.apache.org/jira/browse/NIFI-4631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16264426#comment-16264426 ]
Jeroen Dries commented on NIFI-4631:
------------------------------------
Proposed implementation using the NIO Files API and streams. Note that
Files.walk also has an overload that takes a maximum recursion depth:
{code:java}
// Requires java.nio.file.Files, java.nio.file.Path,
// java.util.stream.Stream and java.util.stream.Collectors.
@Override
protected List<FileInfo> performListing(final ProcessContext context, final Long minTimestamp) throws IOException {
    final Path basePath = new File(getPath(context)).toPath();
    final boolean recurse = context.getProperty(RECURSE).asBoolean();

    // Files.walk descends the whole tree; Files.list reads only the top level.
    // Both streams keep a directory handle open, hence the try-with-resources.
    try (Stream<Path> allFiles = recurse ? Files.walk(basePath) : Files.list(basePath)) {
        return allFiles
                .map(Path::toFile)
                // A null minTimestamp means no lower bound on the modification time.
                .filter(file -> minTimestamp == null || file.lastModified() >= minTimestamp)
                .filter(file -> fileFilterRef.get().accept(file))
                .map(file -> new FileInfo.Builder()
                        .directory(file.isDirectory())
                        .filename(file.getName())
                        .fullPathFileName(file.getAbsolutePath())
                        .lastModifiedTime(file.lastModified())
                        .build())
                .collect(Collectors.toList());
    }
}
{code}
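The depth-limited overload mentioned above could be used roughly as follows. This is a standalone sketch rather than NiFi code: the class DepthLimitedListing and its listFiles helper are hypothetical names, and only Files.walk(Path, int) is the real JDK call.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class DepthLimitedListing {

    // Hypothetical helper, not part of NiFi: returns the regular files under
    // baseDir, descending at most maxDepth levels (1 = direct children only).
    static List<Path> listFiles(final Path baseDir, final int maxDepth) throws IOException {
        // The stream holds open directory handles, so close it when done.
        try (Stream<Path> paths = Files.walk(baseDir, maxDepth)) {
            return paths
                    // Files.walk also emits baseDir and subdirectories; keep files only.
                    .filter(Files::isRegularFile)
                    .sorted()
                    .collect(Collectors.toList());
        }
    }
}
```

With maxDepth set to 1 this behaves like the non-recursive branch, so a single code path might cover both cases if a depth property were ever added to the processor.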
> Improve ListFile performance (using walkFileTree)
> -------------------------------------------------
>
> Key: NIFI-4631
> URL: https://issues.apache.org/jira/browse/NIFI-4631
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Core Framework
> Affects Versions: 1.4.0
> Reporter: Jeroen Dries
> Labels: performance
>
> The ListFile processor is quite slow when recursing through large directory
> structures. (In my case an NFS mounted directory, on Linux.)
> A possible fix would be to use the Files.walkFileTree method, which is
> reported to be faster.
> Another option is mentioned in NIFI-4039.
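
For reference, the Files.walkFileTree approach named in the description might look like the sketch below. It is an illustration under assumptions, not the processor's actual code: WalkFileTreeListing and listFiles are hypothetical names, and the timestamp check only mirrors the minTimestamp filter from the comment above. The visitor receives each file's BasicFileAttributes from the walk itself, which may avoid a separate lastModified() lookup per file.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

class WalkFileTreeListing {

    // Hypothetical helper: collects regular files modified at or after
    // minTimestamp (epoch milliseconds); a null minTimestamp disables the check.
    static List<Path> listFiles(final Path baseDir, final Long minTimestamp) throws IOException {
        final List<Path> result = new ArrayList<>();
        Files.walkFileTree(baseDir, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(final Path file, final BasicFileAttributes attrs) {
                // attrs is supplied by the walk, so no extra attribute read is issued here.
                if (attrs.isRegularFile()
                        && (minTimestamp == null || attrs.lastModifiedTime().toMillis() >= minTimestamp)) {
                    result.add(file);
                }
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFileFailed(final Path file, final IOException exc) {
                // Skip unreadable entries instead of aborting the whole listing.
                return FileVisitResult.CONTINUE;
            }
        });
        return result;
    }
}
```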