I want to run a flow like this:

1. Notice a file in a directory
2. Call a script, passing it the path to the file
3. The script kicks off a MapReduce job
4. Take the output of the MapReduce job (files) and move it to a new HDFS folder (rough sketch below)
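
For that last step I'm picturing something like this (Python, with made-up 
paths), just shelling out to the standard hdfs CLI:

#!/usr/bin/env python3
"""Sketch of step 4: move the MapReduce output files to a new HDFS folder."""
import subprocess

SRC = "/jobs/output/run-001"    # made-up MapReduce output directory
DEST = "/archive/processed"     # made-up destination folder

# "hdfs dfs -mv" renames within HDFS without copying the data.
subprocess.run(["hdfs", "dfs", "-mv", SRC, DEST], check=True)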

I see there is an ExecuteStreamCommand processor, which passes a FlowFile to 
stdin and then captures stdout as the outgoing FlowFile.  But in my case, the 
script reads from and writes to files based on a path.  I wanted to be able to 
create these steps and connect them in the UI, but I'm thinking that what I 
need instead is:
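
One idea I had is a small wrapper so ExecuteStreamCommand could still drive the 
script: NiFi would pass the file's path as an argument (the argument convention, 
paths, and job command below are all made up), and the wrapper prints the job's 
output directory to stdout so it becomes the outgoing FlowFile:

#!/usr/bin/env python3
"""Rough wrapper so ExecuteStreamCommand can drive a path-based script."""
import subprocess
import sys

def main() -> int:
    input_path = sys.argv[1]              # path handed over by NiFi
    output_dir = "/jobs/output/run-001"   # made-up output location
    # "hadoop jar job.jar" is a placeholder for the real MapReduce job.
    result = subprocess.run(["hadoop", "jar", "job.jar", input_path, output_dir])
    if result.returncode != 0:
        return result.returncode          # I think NiFi can route on the exit code
    print(output_dir)                     # stdout becomes the FlowFile content
    return 0

if __name__ == "__main__":
    sys.exit(main())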


1. A file watcher on the original directory that then calls the script (see the sketch after this list)

2. A file watcher on the script's output directory
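
If I go the watcher route, I'm imagining each watcher as a dumb polling loop 
along these lines (directory and script paths made up); #2 would be the same 
loop pointed at the script's output directory:

#!/usr/bin/env python3
"""Minimal sketch of watcher #1: poll a directory, hand new files to the script."""
import subprocess
import time
from pathlib import Path

WATCH_DIR = Path("/data/incoming")   # made-up landing directory
SCRIPT = "/opt/jobs/run_job.sh"      # made-up path-based script

seen = set()
while True:
    for path in WATCH_DIR.iterdir():
        if path.is_file() and path not in seen:
            seen.add(path)
            # Call the script with the new file's path as its argument.
            subprocess.run([SCRIPT, str(path)])
    time.sleep(5)                    # poll interval in seconds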

Does that make sense?

-Dave

Dave Tauzell | Senior Software Engineer | Surescripts
O: 651.855.3042 | www.surescripts.com | [email protected]