All,

It seems like we get this sort of question a lot, and Simon's answer here
was really good.  We've had similar discussions about Kafka[1], Storm and
Spark[2]. Should we think about adding a comparison to other technologies /
applications to the FAQ?  Not in a sales sheet sort of way, but in a way
that emphasizes how these technologies complement each other.  Obviously we
don't need to go out and find every comparable technology, but having a
place to put answers like Simon's that are easier to reference than the
Apache mail archive might be beneficial.

Brandon

[1] https://groups.google.com/forum/#!topic/confluent-platform/JKeccNEhwaQ
[2] http://www.zdnet.com/article/hortonworks-cto-on-apache-nifi-what-is-it-and-why-does-it-matter-to-iot/


On Fri, May 6, 2016 at 6:09 AM Simon Ball <[email protected]> wrote:

> ExecuteSQL can certainly deal with millions of rows. Sqoop currently makes
> more sense if you want to distribute the query processing across a large
> number of nodes (if you have hundreds of millions of rows, 10-100+ GB or
> TBs of data), and write directly into Hadoop. If you’re looking for
> functionality like Sqoop’s incremental imports, then check out
> QueryDatabaseTable. As long as you set a sensible fetch size on that
> (1000ish is usually good, but it depends on row size) then I’ve seen very
> small NiFi instances (AWS t2.small) cope with a few million rows in the
> order of 10 seconds.
>
> SpringXD is really a different beast to NiFi. It’s a code->deploy pattern
> rather than a command and control of data flow pattern. Once you deploy a
> SpringXD flow, it’s fixed (more like Spark, Storm, etc.: compile, deploy,
> never change). SpringXD recently added some visual design, but Flo is
> primarily a retrospective development environment (monitor a flow, not
> design it).
>
> NiFi also runs out to the edge, and gets the data. SpringXD runs in a core
> cluster (e.g. on YARN). So in this scenario, SpringXD is more like Beam or
> Spark Streaming. NiFi, however, with site-to-site, can be used to run right
> out at the edge, secure and transport data, and track it from origin. This
> means NiFi is actually a complement to technology like SpringXD and Beam.
> NiFi feeds these heavier-weight streaming frameworks: it handles the data
> movement and simple event processing, then delivers the data for more
> complex analytics with the likes of XD.
>
> So in short, the technologies are complementary. NiFi has the edge of
> reaching out to collect data, XD may be better for complex analytics.
>
> Simon
>
>
> > On May 6, 2016, at 6:04 AM, nehakaushik86 <[email protected]> wrote:
> >
> > Hi,
> >
> > We are designing a system where we need a data ingestion framework. The
> > data will be consumed from various data systems - DB, social feeds, text
> > files, CRM etc. Can you let me know how Apache NiFi fares as compared to
> > Spring XD and what are the best use cases where it should be used?
> >
> >
> > Also, I would like to understand the difference between Apache NiFi's
> > ExecuteSQL and Apache Sqoop. We are planning to ingest a huge amount of
> > data from a DB - millions of records. Will ExecuteSQL be able to load such
> > a huge volume?
> >
> >
> >
> > --
> > View this message in context:
> > http://apache-nifi-developer-list.39713.n7.nabble.com/Apache-Nifi-Vs-Spring-XD-which-one-is-better-tp9963.html
> > Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.
> >
>
>
