[ https://issues.apache.org/jira/browse/NIFI-1706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16198363#comment-16198363 ]
ASF GitHub Bot commented on NIFI-1706:
--------------------------------------

Github user patricker commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2162#discussion_r143664137

    --- Diff: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/AbstractDatabaseFetchProcessor.java ---
    @@ -240,7 +254,14 @@ public void setup(final ProcessContext context) {
                     // Try a query that returns no rows, for the purposes of getting metadata about the columns. It is possible
                     // to use DatabaseMetaData.getColumns(), but not all drivers support this, notably the schema-on-read
                     // approach as in Apache Drill
    -                String query = dbAdapter.getSelectStatement(tableName, maxValueColumnNames, "1 = 0", null, null, null);
    +                String query;
    +
    +                if(StringUtils.isEmpty(sqlQuery)) {
    +                    query = dbAdapter.getSelectStatement(tableName, maxValueColumnNames, "1 = 0", null, null, null);
    +                } else {
    +                    query=sqlQuery + " WHERE 1=0";
    --- End diff --

    @mattyb149
    > If they specify a max-value column in the other property, and it is not available in this query, then the getSelectStatement() below doesn't seem like it would work as expected

    What if I take the list of maxValueColumns and use those to build the equivalent of the "1=0" condition? For example, if our max value column names are `x` and `y`, I could build a where expression `x <> x AND y <> y`. This would ensure the column names are present in the query, and if they aren't, an exception would be thrown.

> Extend QueryDatabaseTable to support arbitrary queries
> ------------------------------------------------------
>
>                 Key: NIFI-1706
>                 URL: https://issues.apache.org/jira/browse/NIFI-1706
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Core Framework
>    Affects Versions: 1.4.0
>            Reporter: Paul Bormans
>            Assignee: Peter Wicks
>              Labels: features
>
> The QueryDatabaseTable is able to observe a configured database table for new
> rows and yield these into the flowfile. The model of an rdbms however is
> often (if not always) normalized so you would need to join various tables in
> order to "flatten" the data into useful events for a processing pipeline as
> can be built with nifi or various tools within the hadoop ecosystem.
> The request is to extend the processor to specify an arbitrary sql query
> instead of specifying the table name + columns.
> In addition (this may be another issue?) it is desired to limit the number of
> rows returned per run. Not just because of bandwidth issues from the nifi
> pipeline onwards but mainly because huge databases may not be able to return
> so many records within a reasonable time.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
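
The `x <> x AND y <> y` idea from the review comment could be sketched roughly as follows. This is only an illustration under assumptions: the class `ZeroRowCondition` and the method `buildZeroRowCondition` are hypothetical names, not part of NiFi or of the pull request, and the fallback to `1 = 0` when no max-value columns are configured is an assumed behavior.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper (not NiFi code): builds a WHERE condition that can never
// match a row but still references every max-value column, so the database
// rejects the statement if one of those columns is missing from the query.
class ZeroRowCondition {

    static String buildZeroRowCondition(List<String> maxValueColumnNames) {
        if (maxValueColumnNames == null) {
            return "1 = 0"; // assumed fallback: no columns to check
        }
        String condition = maxValueColumnNames.stream()
                .map(String::trim)
                .filter(name -> !name.isEmpty())
                // "col <> col" is never true, including for NULL values
                .map(name -> name + " <> " + name)
                .collect(Collectors.joining(" AND "));
        return condition.isEmpty() ? "1 = 0" : condition;
    }

    public static void main(String[] args) {
        // prints: x <> x AND y <> y
        System.out.println(buildZeroRowCondition(Arrays.asList("x", "y")));
    }
}
```

The resulting string would stand in for the literal `1=0` in the diff above. How it is attached to a user-supplied query (for example, one that already contains its own WHERE clause) is left open here.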