I would agree with Joe: NiFi is not primarily an orchestration tool, so it
may not offer the experience of a full-fledged orchestration tool. Having
said that, we have been using NiFi to launch Spark jobs in our HDP and
Azure HDInsight clusters. We leverage the Livy service available in
the clusters, using the InvokeHTTP processor to submit the
job through Livy.
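
As a rough sketch of what that InvokeHTTP call amounts to: Livy accepts
batch submissions as a JSON POST to its /batches endpoint. The Python below
only builds the request and does not send it. The host name, jar path, class
name and arguments are made-up placeholders, and the helper function is
hypothetical, not part of NiFi or Livy. Authentication and any extra headers
your cluster requires are left out.

```python
import json
import urllib.request


def build_livy_batch_request(livy_url, app_file, class_name=None, args=None):
    """Build (but do not send) a POST request for Livy's /batches endpoint.

    Hypothetical helper for illustration; the payload keys "file",
    "className" and "args" are fields of the Livy batch API.
    """
    payload = {"file": app_file}
    if class_name:
        payload["className"] = class_name
    if args:
        payload["args"] = list(args)

    return urllib.request.Request(
        url=livy_url.rstrip("/") + "/batches",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values; substitute your own Livy endpoint and Spark job.
    req = build_livy_batch_request(
        "http://livy.example.com:8998",
        "hdfs:///jobs/example-job.jar",
        class_name="com.example.ExampleJob",
        args=["2019-01-12"],
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually submit: urllib.request.urlopen(req)
```

In NiFi terms, the URL would go into InvokeHTTP's Remote URL property with
the HTTP Method set to POST, and the JSON payload would be the content of
the incoming flowfile.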

-
Sivaprasanna

On Sat, Jan 12, 2019 at 3:16 AM Otto Fowler <[email protected]> wrote:

> You may want to monitor https://issues.apache.org/jira/browse/NIFI-3698
>
>
>
> On January 11, 2019 at 14:22:24, Jonathan Meran ([email protected])
> wrote:
>
> Thanks Joe!
>
>
>
> We appreciate the kind words and are happy you enjoy our products!
>
>
>
> My thinking is aligned with yours for sure. A main driver for the
> consideration of NiFi for orchestration is that it’s a system we already
> have up and running and maintain.
>
>
>
> Thanks again,
>
> Jon
>
>
>
> *From: *Joe Witt <[email protected]>
> *Reply-To: *"[email protected]" <[email protected]>
> *Date: *Friday, January 11, 2019 at 12:28 PM
> *To: *"[email protected]" <[email protected]>
> *Subject: *Re: NiFI as Data Pipeline Orchestration Tool?
>
>
>
> Jon
>
>
>
> First things first - Sonos is awesome.
>
>
>
> Now back to the matter at hand... NiFi is quite often used for various
> forms of orchestration of other systems doing their thing.  However, I'll
> state that isn't really its primary purpose so for pure orchestration cases
> it can leave you with a less than ideal user experience.
>
>
>
> NiFi is more about managing the flow of data to and from systems and doing
> the necessary
> routing/splitting/forking/joining/merging/transforming/cajoling to make
> that work well.  We're less about telling those other systems what to do
> with the data or when to run.
>
>
>
> Now, having said this, such use is pretty common.  We have the Spark Livy
> integration for example.  I'd recommend you give tools that cater primarily
> to orchestration a first stab on this and if you find the problem looks
> more and more like I describe then NiFi is probably appropriate.
>
>
>
> Hope that helps a bit.  Talking at a terminology basis is tough as things
> like ETL, orchestration, transformation often mean wildly different things
> to different people.
>
>
>
> Thanks
>
>
>
> On Fri, Jan 11, 2019 at 12:02 PM Jonathan Meran <[email protected]>
> wrote:
>
> Hello,
>
> I am looking into the possibility of using NiFi as a Data Pipeline
> Orchestration Tool. I’m evaluating NiFi along with some other tools such as
> Airflow and AWS Step Functions/Lambdas.
>
>
>
> Has anyone used NiFi as an orchestration/scheduling tool for tasks such as
> submitting spark jobs to an EMR cluster? These are some of the requirements
> we are considering while evaluating such a tool:
>
>
>
>    1. SSH capabilities to execute remote commands
>    2. Rich scheduling (CRON)
>    3. Ability to write custom routines and import custom libraries
>    4. Event-based triggering of a pipeline
>
>
>
> Any insight would be helpful. We have used NiFi for about a year now for
> data movement and are familiar with its capabilities. My biggest worry is
> the ability to coordinate with other machines using SSH.
>
>
>
> Thanks,
>
> Jon
>
>
