[ https://issues.apache.org/jira/browse/NIFI-2157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Matt Burgess updated NIFI-2157:
-------------------------------
Description: This processor would operate like QueryDatabaseTable, except that
it would have a "Partition Size" property and, rather than executing the SQL
statement(s) to fetch rows, it would generate flow files containing SQL
statements that select rows from a table. If a partition size is specified,
each SELECT statement would refer to a range of rows, so that each statement
fetches only a portion of the table. If max-value columns are specified, only
rows whose values for those columns exceed the current observed maximum would
be fetched (i.e., as in QueryDatabaseTable). These flow files (due to
NIFI-1973) can be passed to ExecuteSQL processors for the actual fetching of
rows, and ExecuteSQL can be distributed across cluster nodes and/or multiple
tasks. Together, these features enable distributed, incremental fetching of
rows from database table(s).

(was: This processor would presumably have a ListDatabaseTables in front of it
and will use the same DatabaseConnectionPool service. It will read the
aforementioned attributes along with an optional "Partition Size" property
(which accepts Expression Language). The information is used to generate flow
files containing SQL statements that will select rows from a table. If the
partition size is indicated, then the SELECT statements will refer to a range
of rows, such that each statement will grab only a portion of the table. These
flow files (due to NIFI-1973) can be passed to ExecuteSQL processors for the
actual fetching of rows.)
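
To make the proposal concrete, here is a minimal sketch of how partitioned,
incremental SELECT statements could be generated. It is not the processor's
implementation: the function name, its parameters, and the LIMIT/OFFSET paging
syntax are all assumptions for illustration, and the paging syntax in
particular varies between databases.

```python
from typing import List, Optional


def generate_fetch_statements(
    table: str,
    row_count: int,
    partition_size: int = 0,
    max_value_column: Optional[str] = None,
    last_max_value: Optional[int] = None,
) -> List[str]:
    """Build SELECT statements that together cover `row_count` rows.

    Hypothetical helper, not a NiFi API. `row_count` is assumed to be the
    number of rows that match the max-value filter. Values are interpolated
    directly into the SQL text, which is acceptable only in a sketch.
    """
    # Incremental fetch: only rows above the last observed maximum.
    where = ""
    if max_value_column is not None and last_max_value is not None:
        where = f" WHERE {max_value_column} > {last_max_value}"

    base = f"SELECT * FROM {table}{where}"

    # No partition size: a single statement selects every matching row.
    if partition_size <= 0:
        return [base]

    # Paging needs a stable order; fall back to none if no column is given.
    order_by = f" ORDER BY {max_value_column}" if max_value_column else ""

    statements = []
    for offset in range(0, row_count, partition_size):
        # LIMIT/OFFSET is an assumed dialect; other databases page differently.
        statements.append(
            f"{base}{order_by} LIMIT {partition_size} OFFSET {offset}"
        )
    return statements


if __name__ == "__main__":
    for sql in generate_fetch_statements(
        "users", row_count=25, partition_size=10,
        max_value_column="id", last_max_value=100,
    ):
        print(sql)
```

Each generated string would become the content of one flow file, so downstream
ExecuteSQL instances could run the statements independently.
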
> Add GenerateTableFetch processor
> --------------------------------
>
> Key: NIFI-2157
> URL: https://issues.apache.org/jira/browse/NIFI-2157
> Project: Apache NiFi
> Issue Type: Sub-task
> Reporter: Matt Burgess
> Assignee: Matt Burgess
> Fix For: 1.0.0