[ https://issues.apache.org/jira/browse/NIFI-2547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15429750#comment-15429750 ]
ASF subversion and git services commented on NIFI-2547:
-------------------------------------------------------
Commit 26d362b144e15ea4a224e346c340d74e978c134c in nifi's branch
refs/heads/master from [~rickysaltzer]
[ https://git-wip-us.apache.org/repos/asf?p=nifi.git;h=26d362b ]
NIFI-2547: Add DeleteHDFS Processor
This processor adds the capability to delete files or
directories inside of HDFS.
Paths support both static and expression language values,
as well as globs (e.g. /data/for/2016/07/*).
This processor may be used standalone, as well as part of a
downstream connection.
Signed-off-by: Matt Burgess <[email protected]>
Add Glob Matcher with Tests
Also set displayName on properties.
Signed-off-by: Matt Burgess <[email protected]>
This closes #850
> Add DeleteHDFS Processor
> -------------------------
>
> Key: NIFI-2547
> URL: https://issues.apache.org/jira/browse/NIFI-2547
> Project: Apache NiFi
> Issue Type: New Feature
> Reporter: Ricky Saltzer
> Assignee: Ricky Saltzer
> Fix For: 1.0.0
>
>
> There are times when a user may want to remove a file or directory from
> HDFS. The reasons for this vary, but to provide some context, I currently
> have a pipeline where I need to periodically delete files that my NiFi
> pipeline is producing. In my case, the rule is "delete files after they are
> 7 days old".
> Currently, I have to use the {{ExecuteStreamCommand}} processor and manually
> call {{hdfs dfs -rm}}, which is awful when dealing with a large number of
> files. First, an entire JVM is spun up for each delete. Second, when
> deleting directories with thousands of files, the command can sometimes
> hang indefinitely.
> With that being said, I am proposing we add a {{DeleteHDFS}} processor that
> meets the following criteria:
> * Can delete both directories and files
> * Can delete directories recursively
> * Supports the dynamic expression language
> * Supports using glob paths (e.g. /data/for/2017/08/*)
> * Capable of being a downstream processor as well as a standalone processor
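
To make the criteria above concrete, here is a minimal sketch of glob-based deletion with an optional recursive flag. It is not the processor's implementation: it uses java.nio against the local filesystem as a stand-in for HDFS, and the class and method names (GlobDeleteSketch, match, delete, deleteMatching) are hypothetical names chosen for illustration. Against a real cluster the same shape would presumably go through Hadoop's FileSystem API (which offers globStatus and a delete call with a recursive flag) rather than java.nio.

```java
import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration only: local-filesystem stand-in for HDFS deletes.
final class GlobDeleteSketch {

    // Returns the entries directly under `parent` whose file name matches `glob`
    // (for example "part-*"). Only the last path segment is globbed here.
    static List<Path> match(Path parent, String glob) throws IOException {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        List<Path> hits = new ArrayList<>();
        try (Stream<Path> entries = Files.list(parent)) {
            entries.filter(p -> matcher.matches(p.getFileName())).forEach(hits::add);
        }
        return hits;
    }

    // Deletes a file, or a directory. With recursive=false a non-empty
    // directory is rejected (Files.delete throws DirectoryNotEmptyException).
    static void delete(Path target, boolean recursive) throws IOException {
        if (recursive && Files.isDirectory(target)) {
            List<Path> deepestFirst;
            try (Stream<Path> walk = Files.walk(target)) {
                // Reverse order puts children before their parent directory.
                deepestFirst = walk.sorted(Comparator.reverseOrder())
                                   .collect(Collectors.toList());
            }
            for (Path p : deepestFirst) {
                Files.delete(p);
            }
        } else {
            Files.delete(target);
        }
    }

    // Deletes everything under `parent` that matches `glob`; returns how many
    // top-level matches were removed.
    static int deleteMatching(Path parent, String glob, boolean recursive) throws IOException {
        int deleted = 0;
        for (Path hit : match(parent, glob)) {
            delete(hit, recursive);
            deleted++;
        }
        return deleted;
    }
}
```

A call such as GlobDeleteSketch.deleteMatching(dir, "part-*", true) removes every matching file and directory tree under dir in one JVM, which is the main advantage over spawning {{hdfs dfs -rm}} once per delete.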
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)