[ 
https://issues.apache.org/jira/browse/NIFI-2157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-2157:
-------------------------------
    Description: This processor would presumably operate like 
QueryDatabaseTable, except that it will have a "Partition Size" property and, 
rather than executing the SQL statement(s) to fetch rows, it will generate 
flow files containing SQL statements that select rows from a table. If a 
partition size is specified, each SELECT statement will refer to a range of 
rows, so that each statement fetches only a portion of the table. If 
max-value columns are specified, only rows whose observed values for those 
columns exceed the current maximum will be fetched (i.e., as in 
QueryDatabaseTable). These flow files (due to NIFI-1973) can be passed to 
ExecuteSQL processors for the actual fetching of rows, and ExecuteSQL can be 
distributed across cluster nodes and/or multiple tasks. Together, these 
features enable distributed incremental fetching of rows from database 
table(s).  (was: This 
processor would presumably have a ListDatabaseTables in front of it and will 
use the same DatabaseConnectionPool service. It will read the aforementioned 
attributes along with an optional "Partition Size" property (which accepts 
Expression Language). The information is used to generate flow files containing 
SQL statements that will select rows from a table. If the partition size is 
indicated, then the SELECT statements will refer to a range of rows, such that 
each statement will grab only a portion of the table. These flow files (due to 
NIFI-1973) can be passed to ExecuteSQL processors for the actual fetching of 
rows.)

> Add GenerateTableFetch processor
> --------------------------------
>
>                 Key: NIFI-2157
>                 URL: https://issues.apache.org/jira/browse/NIFI-2157
>             Project: Apache NiFi
>          Issue Type: Sub-task
>            Reporter: Matt Burgess
>            Assignee: Matt Burgess
>             Fix For: 1.0.0
>
>
> This processor would presumably operate like QueryDatabaseTable, except that 
> it will have a "Partition Size" property and, rather than executing the SQL 
> statement(s) to fetch rows, it will generate flow files containing SQL 
> statements that select rows from a table. If a partition size is specified, 
> each SELECT statement will refer to a range of rows, so that each statement 
> fetches only a portion of the table. If max-value columns are specified, 
> only rows whose observed values for those columns exceed the current maximum 
> will be fetched (i.e., as in QueryDatabaseTable). These flow files (due to 
> NIFI-1973) can be passed to ExecuteSQL processors for the actual fetching of 
> rows, and ExecuteSQL can be distributed across cluster nodes and/or multiple 
> tasks. Together, these features enable distributed incremental fetching of 
> rows from database table(s).
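
To make the partitioning idea in the description concrete, here is a minimal Python sketch of how such statements might be generated. It is not the NiFi implementation. The function name, its parameters, the pre-computed row count, and the LIMIT/OFFSET paging syntax (which varies by database) are all assumptions made for illustration.

```python
from typing import List, Optional


def generate_fetch_statements(
    table: str,
    row_count: int,
    partition_size: int = 0,
    max_value_column: Optional[str] = None,
    last_max_value: Optional[int] = None,
) -> List[str]:
    """Build SELECT statements that together cover the rows to be fetched.

    Hypothetical helper: row_count is assumed to be the number of rows that
    match the filter, obtained beforehand (e.g. via SELECT COUNT(*)).
    """
    # Incremental fetch: only rows above the last observed maximum.
    # Values are inlined here for readability only; the sketch assumes a
    # numeric max value and trusted identifiers.
    where = ""
    if max_value_column is not None and last_max_value is not None:
        where = f" WHERE {max_value_column} > {last_max_value}"

    base = f"SELECT * FROM {table}{where}"

    # No partition size given: a single statement fetches everything.
    if partition_size <= 0:
        return [base]

    # A stable ordering keeps the row ranges from overlapping between pages.
    order_by = f" ORDER BY {max_value_column}" if max_value_column else ""

    # One statement per range of rows; zero rows yields no statements.
    # LIMIT/OFFSET is an assumed dialect, not something the issue specifies.
    return [
        f"{base}{order_by} LIMIT {partition_size} OFFSET {offset}"
        for offset in range(0, row_count, partition_size)
    ]
```

In this sketch, each returned string would become the content of one flow file, which a downstream ExecuteSQL processor could then run independently on any node or task.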



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
