[ https://issues.apache.org/jira/browse/NIFI-2712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15449998#comment-15449998 ]
Matt Burgess commented on NIFI-2712:
------------------------------------
The proposed solution is to replace > with >= for all but the first max-value column.
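
A minimal sketch of how the generated WHERE clause might look under this proposal, written in Python for brevity and not taken from the NiFi source. The function name `build_where_clause`, the column names, and the sample values are hypothetical, and literal formatting is simplified to plain numeric interpolation; real code would need type-aware quoting.

```python
def build_where_clause(max_values):
    """Build a WHERE clause from an ordered list of (column, last_max) pairs.

    The first column is compared with '>' and every later column with '>=',
    following the proposal. Values are interpolated as-is, which is only
    suitable for numeric literals (a simplifying assumption of this sketch).
    """
    clauses = []
    for index, (column, last_max) in enumerate(max_values):
        operator = ">" if index == 0 else ">="
        clauses.append(f"{column} {operator} {last_max}")
    return " AND ".join(clauses)


if __name__ == "__main__":
    # Hypothetical last-observed maximums for two max-value columns.
    print(build_where_clause([("date_created", 1000), ("bucket", 7)]))
```

Under this rule a row matches only if the first-listed column has advanced past its last-observed maximum, while the later columns may still equal theirs.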
> Database Fetch processors' max-value columns don't work as expected
> -------------------------------------------------------------------
>
> Key: NIFI-2712
> URL: https://issues.apache.org/jira/browse/NIFI-2712
> Project: Apache NiFi
> Issue Type: Bug
> Reporter: Matt Burgess
> Assignee: Matt Burgess
>
> Currently, for QueryDatabaseTable and GenerateTableFetch, the user can enter
> any number of maximum-value columns, which are used to generate a SQL query
> that will fetch all records whose values are greater than the last-observed
> maximum values for those columns.
> However, this makes multiple max-value columns not very useful, since they
> would all have to increase in lockstep or records will be lost/skipped. In
> such a case, using just one of the columns would suffice, making multiple
> max-value columns redundant.
> The more likely use case is that there are multiple columns whose values are
> strictly increasing, but at different rates. This is common with very large
> tables that have, for example, a "date_created" column and a "bucket number"
> column that increases once a day. Queries for a day's worth of data are
> more efficient if they can filter on "bucket" first, then on the timestamp.
> However, the generated SQL queries would have to reflect that "bucket" may
> remain the same while the timestamp increases, and that once the bucket
> value has increased, only the (new) timestamps for that bucket should be
> fetched.
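
To make the skipped-record case concrete, here is a small in-memory sketch in Python (not NiFi code). The rows, the last-observed maximums, and the helper name `fetch_strict` are made up for illustration; the filter mirrors the behavior described in the issue, where every max-value column must exceed its last-observed maximum.

```python
def fetch_strict(rows, last_max):
    """Return rows where every max-value column is strictly greater than its last max."""
    return [row for row in rows if all(row[col] > last_max[col] for col in last_max)]


# Hypothetical state after the previous fetch.
last_max = {"bucket": 7, "date_created": 1000}

rows = [
    {"bucket": 7, "date_created": 1001},  # new record, bucket unchanged
    {"bucket": 8, "date_created": 1002},  # new record, bucket advanced
]

fetched = fetch_strict(rows, last_max)
print(fetched)
```

Only the second row is returned. The first row is new (its timestamp has advanced) but is skipped because its bucket has not increased, which is the loss the issue describes.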