Yeah, there are about 4 million files in the directory, and NiFi wasn't too
happy about listing all of them. This is just for a test anyway, so I might be
able to use GetHDFS.


Thanks

Shawn Weeks

________________________________
From: Bryan Bende <[email protected]>
Sent: Friday, September 21, 2018 8:54:03 AM
To: [email protected]
Subject: Re: How to Use ListHDFS File Filter Regex

Shawn,

I believe this issue [1] is the fix you are looking for. Unfortunately, it
isn't in a release yet, but it should be in the next one.

As a workaround, you may be able to list everything with ListHDFS and
then send the results to a RouteOnAttribute that routes any flow file
whose filename attribute matches your .bat regex to FetchHDFS, and
auto-terminates anything unmatched.

Of course, if you have tons of files that are not .bat files, then this
may not work as well, since it will create lots of unneeded flow files,
but it should work for many reasonable cases.
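
The filtering step of this workaround can be sketched outside NiFi. The
snippet below is only an illustration in Python, not NiFi configuration: the
function name and the sample listing are hypothetical, and it assumes the
regex must match the whole file name (not the path), as the tool tip quoted
below suggests.

```python
import re

# Same pattern as in the original question. fullmatch() requires the whole
# file name to match, which is an assumption about how the filter is applied.
BAT_FILTER = re.compile(r".*\.bat")


def split_listing(filenames):
    """Split listed file names into (matched, unmatched) using the .bat filter."""
    matched, unmatched = [], []
    for name in filenames:
        if BAT_FILTER.fullmatch(name):
            matched.append(name)    # would be routed on to the fetch step
        else:
            unmatched.append(name)  # would be dropped (auto-terminated)
    return matched, unmatched


# Hypothetical listing, for illustration only.
to_fetch, to_drop = split_listing(["run.bat", "notes.txt", "setup.bat.bak"])
```

Every listed name still passes through the filter, matched or not, which is
why the cost grows with the total number of files rather than with the number
of .bat files.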

Thanks,

Bryan

[1] https://issues.apache.org/jira/browse/NIFI-4434

On Fri, Sep 21, 2018 at 9:33 AM Shawn Weeks <[email protected]> wrote:
>
> Much like the user commenting at the bottom of
> https://issues.apache.org/jira/browse/NIFI-4074 I can't figure out how to use
> the File Filter regex. For example, if I want to recursively find every file
> ending in .bat under the HDFS directory /data, I would use the regex ".*\.bat".
> However, that doesn't appear to work, and the ListHDFS processor returns
> nothing. This is in the Hortonworks HDF 3.1.2 release of NiFi 1.5. The tool
> tip seems to indicate the regex is only applied to the file name, so what am I
> missing?
>
>
> Thanks
>
> Shawn Weeks
