[
https://issues.apache.org/jira/browse/NIFI-5004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16412200#comment-16412200
]
Greg Senia commented on NIFI-5004:
----------------------------------
[~joewitt] thank you for the explanation. I think we were just thinking of
being able to optimize certain types of flows, as in some cases the amount of
data being FTPed is extremely large and it would have been an interesting use
case. Can NiFi kick off, say, an Oozie job or some other function if you wanted
to manage the flow that way? In our case the data being moved comes from a
mainframe as an EBCDIC variable-block RDW copybook that we transform using
JRecord inside a program written a few years ago
[https://github.com/gss2002/copybook_formatter] and then make available as
Hive tables, etc.
> Ability to Execute File (FTP/CIFS/SFTP) Copy jobs on Mapreduce From Nifi
> ------------------------------------------------------------------------
>
> Key: NIFI-5004
> URL: https://issues.apache.org/jira/browse/NIFI-5004
> Project: Apache NiFi
> Issue Type: Wish
> Reporter: Greg Senia
> Priority: Critical
>
> Would like to see NiFi run programs on MapReduce; examples of these are
> FTP2HDFS [https://github.com/gss2002/ftp2hdfs] and CIFS2HDFS
> [https://github.com/gss2002/cifs2hdfs], run as a MapReduce application where the
> final resting place is HDFS without any type of data transform on the way in.
> This would reduce overhead on the NiFi node and move the incoming data
> directly to the datanode via short-circuit reads/writes. I currently have
> these two applications running as MR jobs, and would like to be able to do
> this from within NiFi pointing at HDFS/YARN.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)