[ 
https://issues.apache.org/jira/browse/NIFI-4631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16264426#comment-16264426
 ] 

Jeroen Dries commented on NIFI-4631:
------------------------------------

Proposed implementation using the NIO Files methods and streams. Note that 
Files.walk also has an overload that takes a maximum recursion depth:
{code:java}
    @Override
    protected List<FileInfo> performListing(final ProcessContext context, final Long minTimestamp) throws IOException {
        final File baseDir = new File(getPath(context));
        final Boolean recurse = context.getProperty(RECURSE).asBoolean();

        // Files.walk descends into subdirectories (and emits baseDir itself first);
        // Files.list returns only the direct children.
        // Both streams hold open directory handles, hence the try-with-resources.
        try (Stream<Path> allFiles = recurse ? Files.walk(baseDir.toPath()) : Files.list(baseDir.toPath())) {
            return allFiles.map(Path::toFile)
                    // a null minTimestamp means "no lower bound"
                    .filter(file -> minTimestamp == null || file.lastModified() >= minTimestamp)
                    .filter(file -> fileFilterRef.get().accept(file))
                    .map(file -> new FileInfo.Builder()
                            .directory(file.isDirectory())
                            .filename(file.getName())
                            .fullPathFileName(file.getAbsolutePath())
                            .lastModifiedTime(file.lastModified())
                            .build())
                    .collect(Collectors.toList());
        }
    }
{code}
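
To illustrate the max recursion depth mentioned above, here is a minimal standalone sketch, independent of the NiFi classes. The class name BoundedListing and the method listFiles are hypothetical names chosen for this example; the only API assumed is the standard Files.walk(Path, int, FileVisitOption...) overload.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of NiFi: lists regular files up to a given depth.
class BoundedListing {

    /**
     * Returns the regular files found under baseDir, descending at most
     * maxDepth directory levels. A maxDepth of 1 visits only the direct
     * children of baseDir, comparable to Files.list.
     */
    static List<Path> listFiles(final Path baseDir, final int maxDepth) throws IOException {
        // The stream keeps directory handles open until it is closed.
        try (Stream<Path> paths = Files.walk(baseDir, maxDepth)) {
            return paths
                    // Drops directories, including baseDir itself, which walk() emits first.
                    .filter(Files::isRegularFile)
                    .sorted()
                    .collect(Collectors.toList());
        }
    }
}
```

With a directory containing a.txt and sub/b.txt, listFiles(dir, 1) would return only a.txt, while listFiles(dir, 2) would return both files. A depth limit like this could presumably be exposed as a processor property, but that is a suggestion rather than something the current patch does.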


> Improve ListFile performance (using walkFileTree)
> -------------------------------------------------
>
>                 Key: NIFI-4631
>                 URL: https://issues.apache.org/jira/browse/NIFI-4631
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Core Framework
>    Affects Versions: 1.4.0
>            Reporter: Jeroen Dries
>              Labels: performance
>
> The ListFile processor is quite slow when recursing through large directory 
> structures (in my case, an NFS-mounted directory on Linux).
> A possible fix would be to use the Files.walkFileTree method, which is 
> reported to be faster.
> Another option is mentioned in NIFI-4039.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
