I suspect Charlie's writeup is accurate from a traditional relational
DB ETL perspective.  I think you'll see CDC (change data capture)
mechanisms increasingly available through NiFi, and you'll see us add
features around these use cases, but oriented, again, toward getting
the data to systems such as those in the Hadoop ecosystem.  I think
the idea of NiFi supporting wide and shallow capabilities for these DB
use cases is probably about right even as we go forward.  That said,
if there are cases we can support well, let's discuss them.


Thanks
Joe

On Thu, Oct 29, 2015 at 2:49 AM, Charlie Frasure
<[email protected]> wrote:
> Bob,
>
> I can relate to your background and thought process for NiFi.  My limited
> experience with it has highlighted the "simple event processing" portion of
> the tool.  It can replace some of the things that SSIS, Informatica, or
> AbInitio are used for, but has a much wider and more shallow focus.  The ETL
> tools are focused on transformations like joins, aggregates, columns to
> rows, or rows to columns, change data capture etc.  NiFi could do some of
> these things, but seems to be designed more for iterative processing of file
> objects.  It's a lightweight data processing tool, or a lightweight service
> bus, depending on your perspective.
>
> You could probably use the sql processor to insert records, assign new DW
> keys etc, and if you already have your logic in stored procedures, you might
> even be able to use NiFi as a process manager, passing the return values as
> needed until you eventually use the email processor to send a notification
> that the warehouse load is complete.  But I don't think that you want to try
> to manage the tasks associated with relational and dimensional models from
> within NiFi itself.
>
> Charlie
>
>
> On Wed, Oct 28, 2015 at 10:29 PM, Adaryl "Bob" Wakefield, MBA
> <[email protected]> wrote:
>>
>> My work is mostly old-world ETL. I create data pipelines using SSIS where
>> I move data, usually from flat files, into a data warehouse. Since I
>> discovered the Hadoop ecosystem, I’ve been looking for ways to speed up data
>> warehouse loads, even to the point where I’m loading data in real time. I’ve
>> seen Storm and Spark used as a way to do that. However, I’m not a Java
>> developer (yet) and Storm has a pretty steep learning curve for me. When
>> NiFi got announced, I looked at it and said, “hey this looks like SSIS for
>> big data”. So I’ve been kind of looking at it through the lens of visual
>> data flow development.
>>
>> I’ve seen stacks where Storm is used to load data warehouses, but
>> warehouse loads aren’t that complex; using Java to cleanse data kind of
>> seems like overkill. 95% of my work can get done using SQL, and SSIS is just
>> the traffic cop that tells which stored procs to execute.
>>
>> I guess a better question to ask is, can NiFi be used in place of SSIS
>> (DataStage, Informatica, etc.)? So instead of:
>> SSIS –> Warehouse (batch processing paradigm)
>> you have
>> Kafka –> NiFi –> Warehouse (real time processing)
>>
>> Am I even thinking about this correctly? I know we’re talking about moving
>> data between systems but frequently the move I deal with is on the same box
>> and there is some other piece of software that drops the files to be
>> processed into a folder.
>>
>> B.
>>
>> From: Joe Percivall
>> Sent: Wednesday, October 28, 2015 7:43 PM
>> To: [email protected]
>> Subject: Re: NiFi for CEP
>>
>> Hey Bob,
>>
>> It really depends on your definition of CEP (complex event processing). If
>> what you're trying to do is advanced processing on a single piece of data
>> (anything from small sensor data to huge medical data), NiFi could
>> potentially be a great candidate to replace Storm or Spark Streaming. If
>> you're trying to do advanced analytics on many pieces of data or create a
>> stateful response to data then Storm or Spark Streaming would handle that
>> better (but NiFi would do a great job getting the data from the edge to the
>> other tech!).
>>
>> There are others that can go into more depth but that's it in a nutshell,
>> Joe
>> - - - - - -
>> Joseph Percivall
>> linkedin.com/in/Percivall
>> e: [email protected]
>>
>>
>>
>>
>> On Wednesday, October 28, 2015 8:02 PM, "Adaryl "Bob" Wakefield, MBA"
>> <[email protected]> wrote:
>>
>>
>> I've been studying NiFi and there is something I'm not quite
>> understanding.
>> Can NiFi be used in place of Storm or Spark Streaming to process streaming
>> data?
>>
>>
>> B.
>>
>>
>>
>
