Matt Burgess created NIFI-2713:
----------------------------------

             Summary: Database Fetch processors' max-value columns don't work 
as expected
                 Key: NIFI-2713
                 URL: https://issues.apache.org/jira/browse/NIFI-2713
             Project: Apache NiFi
          Issue Type: Bug
            Reporter: Matt Burgess
            Assignee: Matt Burgess


Currently, for QueryDatabaseTable and GenerateTableFetch, the user can enter 
any number of maximum-value columns, which are used to generate a SQL query 
that will fetch all records whose values are greater than the last-observed 
maximum values for those columns.
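
The behavior described above can be illustrated with a minimal sketch. It
assumes the per-column conditions are combined with AND, which is an inference
from this description rather than a copy of NiFi's code. The helper names, the
integer timestamps, and the sample rows are all hypothetical.

```python
def build_conjunctive_where(max_values):
    """Build a WHERE clause requiring every column to exceed its stored maximum.

    max_values maps column name -> last-observed maximum (hypothetical state).
    Values are interpolated directly for illustration only; real code should
    bind parameters instead.
    """
    return " AND ".join(f"{col} > {val!r}" for col, val in max_values.items())


def conjunctive_filter(rows, max_values):
    """Apply the same predicate to in-memory rows."""
    return [r for r in rows if all(r[c] > v for c, v in max_values.items())]


last_seen = {"bucket": 5, "date_created": 1000}
rows = [
    {"bucket": 5, "date_created": 1001},  # new row in the same bucket
    {"bucket": 6, "date_created": 1002},  # both columns advanced
]

where_clause = build_conjunctive_where(last_seen)
fetched = conjunctive_filter(rows, last_seen)
```

Under this assumption the first row is never returned, because its "bucket"
value did not advance even though its timestamp did.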

However, this makes multiple max-value columns of little use: all of them must 
increase in lockstep, or records will be lost/skipped. In such a case, using 
any one of the columns alone would suffice, so specifying several adds 
nothing.

The more likely use case is that there are multiple columns whose values are 
increasing, but at different rates. This is common with very large tables, 
where one column might be a "date_created" timestamp and another a "bucket" 
number that increases once a day. Queries for a day's worth of data are more 
efficient if they can be filtered on "bucket" first (in this case), then on 
the timestamp. However, the generated SQL queries would have to reflect that 
"bucket" may stay the same while the timestamp increases, and that once the 
bucket value has increased, only the (new) timestamps for that bucket should 
be fetched.
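
One possible reading of this requirement is a lexicographic comparison over the
ordered max-value columns: a row is new if an earlier column has advanced, or
if the earlier columns are unchanged and a later column has advanced. The
sketch below assumes that reading. It is not a proposed patch, and the helper
names, integer timestamps, and sample rows are hypothetical.

```python
def build_lexicographic_where(max_values):
    """Build a WHERE clause comparing the columns as an ordered tuple.

    max_values maps column name -> last-observed maximum, in priority order.
    Values are interpolated directly for illustration only; real code should
    bind parameters instead.
    """
    cols = list(max_values.items())
    branches = []
    for i, (col, val) in enumerate(cols):
        # Earlier columns must equal their stored maximum in this branch.
        equal_prefix = [f"{c} = {v!r}" for c, v in cols[:i]]
        branches.append("(" + " AND ".join(equal_prefix + [f"{col} > {val!r}"]) + ")")
    return " OR ".join(branches)


def lexicographic_filter(rows, max_values):
    """Apply the same predicate to in-memory rows using tuple comparison."""
    stored = tuple(max_values.values())
    return [r for r in rows if tuple(r[c] for c in max_values) > stored]


last_seen = {"bucket": 5, "date_created": 1000}
rows = [
    {"bucket": 5, "date_created": 1001},  # same bucket, newer timestamp
    {"bucket": 6, "date_created": 1002},  # newer bucket
    {"bucket": 5, "date_created": 999},   # same bucket, already seen
    {"bucket": 4, "date_created": 2000},  # older bucket
]

where_clause = build_lexicographic_where(last_seen)
fetched = lexicographic_filter(rows, last_seen)
```

With this predicate, a row whose "bucket" is unchanged is still fetched as long
as its timestamp is newer, and any row in a later bucket is fetched regardless
of how its timestamp compares to the stored one.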



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
