ExecuteSQL can certainly deal with millions of rows. Sqoop currently makes more sense if you want to distribute the query processing across a large number of nodes (say, hundreds of millions of rows, 10-100 GB or more, or TBs of data) and write directly into Hadoop. If you're looking for functionality like Sqoop's incremental imports, then check out QueryDatabaseTable. As long as you set a sensible fetch size on that (around 1000 is usually good, but it depends on row size), I've seen very small NiFi instances (AWS t2.small) cope with a few million rows in the order of 10 seconds.
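To make the incremental import and fetch size ideas concrete, here is a rough Python sketch of the general technique: remember the largest value seen in a monotonically increasing column, and on the next run only pull rows above it, reading them in batches. This is not how QueryDatabaseTable is implemented. The table name `events`, the `id` column, and the function names are made up for illustration, and SQLite stands in for whatever database you actually use.

```python
import sqlite3

FETCH_SIZE = 1000  # rows pulled per batch; tune to row size


def import_new_rows(conn, last_max_id, fetch_size=FETCH_SIZE):
    """Read rows with id > last_max_id in batches.

    Returns (rows_imported, new_max_id). The caller stores new_max_id
    and passes it back in on the next run (hypothetical state handling).
    """
    cur = conn.cursor()
    cur.execute(
        "SELECT id, payload FROM events WHERE id > ? ORDER BY id",
        (last_max_id,),
    )
    imported = 0
    new_max = last_max_id
    while True:
        batch = cur.fetchmany(fetch_size)  # at most fetch_size rows in memory
        if not batch:
            break
        imported += len(batch)   # a real flow would hand the batch downstream
        new_max = batch[-1][0]   # rows are ordered, so the last id is the max
    return imported, new_max


def demo():
    """Run three imports against an in-memory table (illustrative only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
    conn.executemany(
        "INSERT INTO events VALUES (?, ?)",
        ((i, f"row-{i}") for i in range(1, 2501)),
    )
    first = import_new_rows(conn, 0)           # initial full load
    conn.executemany(
        "INSERT INTO events VALUES (?, ?)",
        ((i, f"row-{i}") for i in range(2501, 2601)),
    )
    second = import_new_rows(conn, first[1])   # only the 100 new rows
    third = import_new_rows(conn, second[1])   # nothing new
    conn.close()
    return first, second, third


if __name__ == "__main__":
    print(demo())
```

The batch size here plays the same role as the fetch size mentioned above: it bounds how many rows sit in memory at once, so larger rows usually call for a smaller value.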

SpringXD is really a different beast to NiFi. It's a code->deploy pattern rather than a command-and-control-of-data-flow pattern. Once you deploy a SpringXD flow, it's fixed (more like Spark, Storm etc.: compile, deploy, never change). SpringXD recently added some visual design, but Flo is primarily a retrospective development environment (monitor a flow, not design it).

NiFi also runs out to the edge and gets the data. SpringXD runs in a core cluster (e.g. on YARN). So in this scenario, SpringXD is more like Beam or Spark Streaming. NiFi, however, can use site-to-site to run right out at the edge, secure and transport data, and track it from origin.

This means NiFi is actually a complement to technology like SpringXD and Beam. NiFi feeds these heavier-weight streaming frameworks: it handles the data movement and simple event processing, then hands the data over for more complex analytics with the likes of XD.

So in short, the technologies are complementary. NiFi has the edge in reaching out to collect data; XD may be better for complex analytics.

Simon

> On May 6, 2016, at 6:04 AM, nehakaushik86 <[email protected]> wrote:
>
> Hi,
>
> We are designing a system where we need data ingestion framework. The data
> will be consumed from various data systems - DB, social feeds, text files,
> CRM etc. Can you let me know how Apache Nifi fares as compared to Spring XD
> and what are the best use cases where it should be used?
>
>
> Also, I would like to understand the difference between Apache Nifi's
> ExecuteSQL vs Apache Sqoop. We are planning to ingest huge amount of data
> from DB - millions of records. Will ExecuteSQL be able to load such huge
> volume?
>
>
>
> --
> View this message in context:
> http://apache-nifi-developer-list.39713.n7.nabble.com/Apache-Nifi-Vs-Spring-XD-which-one-is-better-tp9963.html
> Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.
>
