Github user ijokarumawak commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/1407#discussion_r95716780

    --- Diff: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/GenerateTableFetch.java ---
    @@ -115,20 +128,36 @@ public GenerateTableFetch() {
         @OnScheduled
         public void setup(final ProcessContext context) {
    +        // The processor is invalid if there is an incoming connection and max-value columns are defined
    +        if (context.getProperty(MAX_VALUE_COLUMN_NAMES).isSet() && context.hasIncomingConnection()) {
    +            throw new ProcessException("If an incoming connection is supplied, no max-value column names may be specified");
    --- End diff --

I understand the concerns. For backward compatibility, I think we should provide that, so that existing flows can keep fetching rows based on the stored state even after an upgrade. I did a similar thing before with the [TailFile processor](https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/TailFile.java#L348): check the state key name format to determine whether it is the current or the older format, then migrate the values.

If we support max-value columns with incoming flow files, I am concerned about this statement in your previous comment:

> and just document that all the specified tables must contain the max-value columns.

The max-value column has not been required so far. I guess that is for users who want to fetch all rows periodically and do not need to track the max value, for example for things like master configuration tables. So I think we need to keep supporting an empty max-value column.

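To make the state migration idea concrete, here is a minimal sketch in plain Java. It is not the TailFile code and not the actual GenerateTableFetch implementation. It assumes that the older state format keys values by bare column name and that the newer format prefixes the table name with a `.` delimiter (as in the `USERS.LAST_UPDATED` example in this thread). The class name, method name, and delimiter are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class StateKeyMigration {

    // Assumed delimiter, taken from example keys such as USERS.LAST_UPDATED.
    static final String DELIMITER = ".";

    /**
     * Returns a copy of the state map in which every old-format key
     * (bare column name, no delimiter) is rewritten to TABLE.COLUMN.
     * Keys that already contain the delimiter are copied unchanged.
     */
    static Map<String, String> migrate(Map<String, String> oldState, String tableName) {
        Map<String, String> migrated = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : oldState.entrySet()) {
            String key = entry.getKey();
            // Format check: a key without the delimiter is treated as old format.
            String newKey = key.contains(DELIMITER) ? key : tableName + DELIMITER + key;
            migrated.put(newKey, entry.getValue());
        }
        return migrated;
    }
}
```

One caveat with this sketch: a column name that itself contains `.` would be mistaken for a new-format key, so a real implementation would probably need a delimiter that cannot appear in identifiers.
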
An example flow that I thought might be useful: use GenerateFlowFile or FetchFile to pass in a configuration text such as:

```
# Table name : MAX value column(s)
USERS:LAST_UPDATED
ITEMS
PURCHASE_HISTORIES:LAST_UPDATED
```

Then pass it to SplitText and ExtractText to generate flow files with the attributes `tableName` and `maxColumns`, and pass those to the GenerateTableFetch processor to generate the fetch SQL dynamically. This way, users can easily change which tables to fetch.

After processing these incoming flow files, GenerateTableFetch might have state like this (table `ITEMS` doesn't have a max-value column):

|KEY|VALUE|
|----|-------|
|USERS.LAST_UPDATED|2017.01.12 11:42:00|
|PURCHASE_HISTORIES.LAST_UPDATED|2017.01.12 11:59:32|

What do you think?

Thanks!
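For illustration only, the mapping from configuration lines to `tableName` / `maxColumns` and to state keys could look roughly like the following plain-Java sketch. In the proposed flow this splitting would be done by SplitText and ExtractText, not by code like this. The class and method names are hypothetical, and the comma separator for multiple max-value columns is an assumption (the example only shows a single column per table).

```java
import java.util.ArrayList;
import java.util.List;

final class TableConfigParser {

    /** One parsed configuration line: a table and its (possibly empty) max-value columns. */
    static final class TableConfig {
        final String tableName;
        final List<String> maxColumns;

        TableConfig(String tableName, List<String> maxColumns) {
            this.tableName = tableName;
            this.maxColumns = maxColumns;
        }

        /** State keys in the TABLE.COLUMN form; empty when no max-value column is set. */
        List<String> stateKeys() {
            List<String> keys = new ArrayList<>();
            for (String column : maxColumns) {
                keys.add(tableName + "." + column);
            }
            return keys;
        }
    }

    /** Parses lines of the form TABLE or TABLE:COL1,COL2; blank lines and '#' comments are skipped. */
    static List<TableConfig> parse(String text) {
        List<TableConfig> result = new ArrayList<>();
        for (String rawLine : text.split("\\R")) {
            String line = rawLine.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            String[] parts = line.split(":", 2);
            List<String> columns = new ArrayList<>();
            if (parts.length == 2) {
                // Assumed: multiple max-value columns are comma-separated.
                for (String column : parts[1].split(",")) {
                    String trimmed = column.trim();
                    if (!trimmed.isEmpty()) {
                        columns.add(trimmed);
                    }
                }
            }
            result.add(new TableConfig(parts[0].trim(), columns));
        }
        return result;
    }
}
```

With the example configuration, `ITEMS` yields an empty `maxColumns` list and therefore no state keys, which matches the state table shown earlier.
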