Matt Burgess created NIFI-2126:
----------------------------------

             Summary: Add processors to enable distributed fetching of database tables
                 Key: NIFI-2126
                 URL: https://issues.apache.org/jira/browse/NIFI-2126
             Project: Apache NiFi
          Issue Type: New Feature
            Reporter: Matt Burgess
            Assignee: Matt Burgess
             Fix For: 1.0.0


To enable NiFi to migrate data from RDBMS source tables to other target
systems (another RDBMS, HDFS, etc.), one approach is to distribute the
fetching of large tables across multiple tasks/nodes, rather than relying on a
single ExecuteSQL processor (which, for large tables, can run out of memory and
become slow).

The idea would be to generate flow files containing SQL statements that would 
fetch a portion (or "page") of a table. These flow files can be distributed in 
NiFi to many ExecuteSQL processors, each of which would grab a page and emit 
the results. The flow(s) can then continue in a parallel/distributed fashion
until the data is in the target location(s).
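
As a rough illustration of the paging idea, the sketch below generates one SQL
statement per page of a table. It is not NiFi code: the function name, the
table and column names, and the use of LIMIT/OFFSET (MySQL/PostgreSQL-style
syntax; other databases page differently) are all assumptions made for the
example. It also assumes the total row count is already known and that the
table has a column that gives a stable ordering.

```python
from typing import Iterator


def generate_page_queries(
    table: str, order_by: str, row_count: int, page_size: int
) -> Iterator[str]:
    """Yield one SELECT statement per page of `table`.

    Hypothetical helper: LIMIT/OFFSET syntax is assumed and is not
    supported by every RDBMS.
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    # One statement per page; ORDER BY keeps page boundaries stable
    # so that pages neither overlap nor skip rows.
    for offset in range(0, row_count, page_size):
        yield (
            f"SELECT * FROM {table} ORDER BY {order_by} "
            f"LIMIT {page_size} OFFSET {offset}"
        )


# Example: a 25-row table split into pages of 10 rows.
queries = list(generate_page_queries("orders", "id", row_count=25, page_size=10))
```

Each generated statement could then be carried in its own flow file and run by
a separate ExecuteSQL instance, so no single task has to hold the whole table.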



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
