My work is mostly old-world ETL. I build data pipelines in SSIS that move 
data, usually from flat files, into a data warehouse. Since I discovered the 
Hadoop ecosystem, I’ve been looking for ways to speed up data warehouse loads, 
even to the point of loading data in real time. I’ve seen Storm and Spark used 
to do that. However, I’m not a Java developer (yet), and Storm has a pretty 
steep learning curve for me. When NiFi was announced, I looked at it and said, 
“Hey, this looks like SSIS for big data.” So I’ve been looking at it through 
the lens of visual data flow development.

I’ve seen stacks where Storm is used to load data warehouses, but warehouse 
loads aren’t that complex; using Java to cleanse data seems like overkill. 
95% of my work can get done using SQL, and SSIS is just the traffic cop that 
decides which stored procs to execute. 

I guess a better question to ask is: can NiFi be used in place of SSIS 
(DataStage, Informatica, etc.)? So instead of:
SSIS –> Warehouse (batch processing paradigm)
you have
Kafka –> NiFi –> Warehouse (real-time processing)

Am I even thinking about this correctly? I know we’re talking about moving data 
between systems, but frequently the move I deal with is on the same box, and 
some other piece of software drops the files to be processed into a folder.

B.

From: Joe Percivall 
Sent: Wednesday, October 28, 2015 7:43 PM
To: [email protected] 
Subject: Re: NiFi for CEP

Hey Bob,


It really depends on your definition of CEP (complex event processing). If what 
you're trying to do is advanced processing on a single piece of data (anything 
from small sensor data to huge medical data), NiFi could potentially be a great 
candidate to replace Storm or Spark Streaming. If you're trying to do advanced 
analytics across many pieces of data, or create a stateful response to data, 
then Storm or Spark Streaming would handle that better (but NiFi would do a 
great job getting the data from the edge to the other tech!).
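
To make the distinction concrete, here is a minimal sketch in plain Python. It 
is not NiFi, Storm, or Spark code; the event fields, the enrich function, and 
the RunningAverage class are all made up for illustration. The first piece 
handles each event on its own; the second has to remember earlier events to 
answer.

```python
from collections import defaultdict

# Hypothetical sensor events; the field names are assumptions for illustration.
events = [
    {"sensor": "a", "temp_f": 68.0},
    {"sensor": "b", "temp_f": 212.0},
    {"sensor": "a", "temp_f": 50.0},
]


def enrich(event):
    """Per-event processing: the output depends only on this one event."""
    out = dict(event)
    out["temp_c"] = round((event["temp_f"] - 32.0) * 5.0 / 9.0, 2)
    return out


class RunningAverage:
    """Stateful processing: each result depends on the events seen so far."""

    def __init__(self):
        self.total = defaultdict(float)
        self.count = defaultdict(int)

    def update(self, event):
        key = event["sensor"]
        self.total[key] += event["temp_f"]
        self.count[key] += 1
        return self.total[key] / self.count[key]


# Single-event work: each record can be handled independently, in any order.
enriched = [enrich(e) for e in events]

# Stateful work: the third answer needs the first event to still be remembered.
averager = RunningAverage()
averages = [averager.update(e) for e in events]
```

The first kind of work is the single-piece-of-data case; the second, where 
state has to survive between events, is the case where a stream processing 
engine would be the better fit.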

There are others who can go into more depth, but that's it in a nutshell,
Joe
- - - - - - 
Joseph Percivall
linkedin.com/in/Percivall
e: [email protected]





On Wednesday, October 28, 2015 8:02 PM, "Adaryl "Bob" Wakefield, MBA" 
<[email protected]> wrote:




I've been studying NiFi and there is something I'm not quite understanding. 
Can NiFi be used in place of Storm or Spark Streaming to process streaming 
data? 


B. 



