[ https://issues.apache.org/jira/browse/NIFI-4434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16566989#comment-16566989 ]
ASF GitHub Bot commented on NIFI-4434:
--------------------------------------
Github user bbende commented on the issue:
https://github.com/apache/nifi/pull/2930
I think the primary concern is ensuring that existing flows keep behaving
as they do today, which means the default must continue to apply the file
filter to both files and directories, with an option to apply it
differently. Besides that, I am fine with having three options.
> ListHDFS applies File Filter also to subdirectory names in recursive search
> ---------------------------------------------------------------------------
>
> Key: NIFI-4434
> URL: https://issues.apache.org/jira/browse/NIFI-4434
> Project: Apache NiFi
> Issue Type: Bug
> Affects Versions: 1.3.0
> Reporter: Holger Frydrych
> Assignee: Jeff Storck
> Priority: Major
>
> The File Filter regex configured in the ListHDFS processor is applied not
> just to files found, but also to subdirectories.
> If you try to set up a recursive search to list e.g. all csv files in a
> directory hierarchy via a regex like ".*\.csv", it will only pick up csv
> files in the base directory, not in any subdirectory. This is because
> subdirectories don't typically match that regex pattern.
> To fix this, either subdirectories should not be matched against the file
> filter, or the file filter should be applied to the full path of all files
> (relative to the base directory). The GetHDFS processor offers both options
> via a switch.
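
To make the reported behavior concrete, here is a minimal sketch in Python (not NiFi code) that walks a small in-memory directory tree and applies the filter in three ways. The tree, the function name `list_files`, and the mode names `files-and-dirs`, `files-only` and `full-path` are all hypothetical and chosen only for illustration; the use of a full-string regex match is also an assumption.

```python
import re
from posixpath import join, relpath

# Hypothetical in-memory tree standing in for an HDFS directory listing.
TREE = {
    "/data": {"dirs": ["sub"], "files": ["a.csv", "notes.txt"]},
    "/data/sub": {"dirs": [], "files": ["b.csv"]},
}


def list_files(base, pattern, mode="files-and-dirs"):
    """Recursively list files under base whose name (or path) matches pattern.

    mode is an illustrative switch, not an actual processor property:
      files-and-dirs: filter file names and subdirectory names (behavior in the report)
      files-only:     filter file names, always descend into subdirectories
      full-path:      filter the file path relative to base, always descend
    """
    regex = re.compile(pattern)
    found = []

    def walk(directory):
        entry = TREE[directory]
        for name in entry["files"]:
            path = join(directory, name)
            # In full-path mode the regex sees the path relative to base.
            target = relpath(path, base) if mode == "full-path" else name
            if regex.fullmatch(target):
                found.append(path)
        for name in entry["dirs"]:
            # Reported behavior: a subdirectory whose name fails the filter is skipped.
            if mode == "files-and-dirs" and not regex.fullmatch(name):
                continue
            walk(join(directory, name))

    walk(base)
    return sorted(found)


print(list_files("/data", r".*\.csv"))                     # misses /data/sub/b.csv
print(list_files("/data", r".*\.csv", mode="files-only"))  # finds both csv files
print(list_files("/data", r"sub/.*\.csv", mode="full-path"))
```

Keeping `files-and-dirs` as the default argument mirrors the backward-compatibility concern raised in the comment: callers that do not pass a mode get the same results as before.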