[https://issues.apache.org/jira/browse/NIFI-5004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16412200#comment-16412200]

Greg Senia commented on NIFI-5004:
----------------------------------

[~joewitt] thank you for the explanation. I think we were just thinking of 
being able to optimize certain types of flows, as in some cases the amount of 
data being FTPed is extremely large, and it would have been an interesting use 
case. Can NiFi kick off, say, an Oozie job or some other function if you wanted 
to manage the flow that way? In our case the data being moved comes from a 
mainframe as an EBCDIC variable-block RDW copybook, which we transform using 
JRecord inside a program written a few years ago 
[https://github.com/gss2002/copybook_formatter] and then make available as 
Hive tables, etc. 
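One way NiFi could kick off an Oozie job today is by calling Oozie's REST job-submission endpoint (e.g. from an InvokeHTTP or ExecuteScript processor). A minimal sketch of building the submission payload follows; the Oozie host, user, and workflow path are hypothetical placeholders, not values from this issue:

```python
# Hedged sketch: assembling the XML body that Oozie's REST API
# (POST /oozie/v1/jobs?action=start, Content-Type: application/xml)
# expects for job submission. Host and HDFS paths are hypothetical.
from xml.sax.saxutils import escape

OOZIE_URL = "http://oozie-host:11000/oozie/v1/jobs?action=start"  # hypothetical host

def build_oozie_config(props):
    """Render a dict of job properties as an Oozie <configuration> document."""
    parts = ['<?xml version="1.0" encoding="UTF-8"?>', "<configuration>"]
    for name, value in props.items():
        # Escape names/values so arbitrary property text stays well-formed XML.
        parts.append(
            "  <property><name>%s</name><value>%s</value></property>"
            % (escape(name), escape(str(value)))
        )
    parts.append("</configuration>")
    return "\n".join(parts)

payload = build_oozie_config({
    "user.name": "nifi",                                            # hypothetical user
    "oozie.wf.application.path": "hdfs:///apps/ftp2hdfs/workflow.xml",  # hypothetical path
    "queueName": "default",
})
# This payload would then be POSTed to OOZIE_URL, e.g. with urllib.request
# from ExecuteScript, or directly by NiFi's InvokeHTTP processor.
```

NiFi's InvokeHTTP processor can send this body as-is, so no custom code is strictly required on the NiFi side.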

> Ability to Execute File (FTP/CIFS/SFTP) Copy jobs on Mapreduce From Nifi
> ------------------------------------------------------------------------
>
>                 Key: NIFI-5004
>                 URL: https://issues.apache.org/jira/browse/NIFI-5004
>             Project: Apache NiFi
>          Issue Type: Wish
>            Reporter: Greg Senia
>            Priority: Critical
>
> Would like to see NiFi run programs on MapReduce; examples of these include 
> FTP2HDFS [https://github.com/gss2002/ftp2hdfs] and CIFS2HDFS 
> [https://github.com/gss2002/cifs2hdfs], run as a MapReduce application where 
> the final resting place is HDFS, without any type of data transform on the 
> way in. This would reduce overhead on the NiFi node and move the incoming 
> data directly to the datanode via short-circuit reads/writes. I currently 
> have these two applications running as MR jobs, and would like to be able to 
> do this from within NiFi, pointing at HDFS/YARN.
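Pending a dedicated processor, the MR-based copy jobs described above could be launched from NiFi with ExecuteProcess or ExecuteStreamCommand wrapping a `hadoop jar` invocation. A sketch of assembling that command line follows; the jar location and flag names are hypothetical illustrations of the shape of the call, not the actual ftp2hdfs options:

```python
# Hedged sketch: the kind of `hadoop jar` argv a NiFi ExecuteStreamCommand
# processor might run to launch an MR-based FTP-to-HDFS copy. The jar path
# and all flag names below are hypothetical, for illustration only.
import subprocess

def ftp2hdfs_command(src_host, src_dir, hdfs_dest):
    """Assemble the argv for launching the copy job on YARN."""
    return [
        "hadoop", "jar", "/opt/jobs/ftp2hdfs.jar",  # hypothetical jar location
        "--ftp_server", src_host,
        "--ftp_srcdir", src_dir,
        "--hdfs_outdir", hdfs_dest,
    ]

cmd = ftp2hdfs_command("mainframe.example.com", "/outbound", "/data/raw")
# On an edge node with a Hadoop client installed, this would be executed as:
# subprocess.run(cmd, check=True)
```

The trade-off the issue describes still holds: the data path goes straight from the MR tasks to the datanodes, with NiFi acting only as the orchestrator.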



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
