[
https://issues.apache.org/jira/browse/NIFI-15002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Bojan updated NIFI-15002:
-------------------------
Attachment: nifi-db-processors-patch.patch
> Query construction issue for MSSQL connection
> ---------------------------------------------
>
> Key: NIFI-15002
> URL: https://issues.apache.org/jira/browse/NIFI-15002
> Project: Apache NiFi
> Issue Type: Bug
> Affects Versions: 2.5.0
> Reporter: Dmitry
> Priority: Major
> Attachments: Screenshot1.png, nifi-db-processors-patch.patch
>
>
> Hello NiFi Team,
> As far as I can see, in NiFi 2.5.0 option "Custom Query" in
> QueryDatabaseTable processor does not work properly at least for MSSQL
> Server 2019 when "Maximum-value Columns" is used.
> Here is an example using MSSQL server sample database AdventureWorks2019.
> The following query was used in "Custom Query":
> {code:sql}
> SELECT ModifiedDate
> FROM [Production].[Product]
> {code}
> with field ModifiedDate as "Maximum-value Columns" property.
> !Screenshot1.png|width=605,height=499!
> In order to check queries on the server side, the following query was used:
> {code:sql}
> SELECT creation_time, last_execution_time, execution_count, text
> FROM sys.dm_exec_query_stats AS deqs
> CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest
> WHERE dest.text LIKE '%Production%'
> ORDER BY last_execution_time DESC;
> {code}
> After we ran QueryDatabaseTable first time, the output of server side queries
> check is the following:
> {code:json}
> [ {
> "creation_time" : "2025-09-23 13:17:18.367",
> "last_execution_time" : "2025-09-23 13:17:18.373",
> "execution_count" : 2,
> "text" : "SELECT * FROM (SELECT ModifiedDate \nFROM [Production].[Product])
> AS Product"
> } ]
> {code}
> So, I assume on the server side query "SELECT * FROM (SELECT ModifiedDate
> FROM [Production].[Product]) AS Product" runs twice one after another.
> When we run QueryDatabaseTable second time (now we have already state value
> from previous query and no flowfile will be generated since no new data was
> added), server side queries check gives the following output:
> {code:json}
> [ {
> "creation_time" : "2025-09-23 13:21:01.573",
> "last_execution_time" : "2025-09-23 13:21:01.583",
> "execution_count" : 1,
> "text" : "SELECT * FROM (SELECT ModifiedDate \nFROM [Production].[Product])
> AS Product WHERE ModifiedDate > '2014-02-08 10:03:55.51'"
> }, {
> "creation_time" : "2025-09-23 13:17:18.367",
> "last_execution_time" : "2025-09-23 13:21:01.57",
> "execution_count" : 3,
> "text" : "SELECT * FROM (SELECT ModifiedDate \nFROM [Production].[Product])
> AS Product"
> }, {
> "creation_time" : "2025-09-23 13:17:21.017",
> "last_execution_time" : "2025-09-23 13:17:21.02",
> "execution_count" : 1,
> "text" : "SELECT creation_time, last_execution_time, execution_count, text
> \nFROM sys.dm_exec_query_stats AS deqs\nCROSS APPLY
> sys.dm_exec_sql_text(deqs.sql_handle) AS dest\nWHERE dest.text LIKE
> '%Production%'\nORDER BY last_execution_time DESC;"
> } ]
> {code}
> So, as far as I can see, when processor QueryDatabaseTable ran the second
> time, again two queries were executed on the server side: "SELECT * FROM
> (SELECT ModifiedDate FROM [Production].[Product]) AS Product" and "SELECT *
> FROM (SELECT ModifiedDate FROM [Production].[Product]) AS Product WHERE
> ModifiedDate > '2014-02-08 10:03:55.51'" right after it. I assume, since
> processor produces valid result (in out case no flowfile), output from the
> first query is ignored and output of the second query is considered as what
> needs to be returned by processor. The issue is that first query makes SELECT
> for all rows, so if table is reasonably big, processor my run for hours (of
> fail, depending on time out configuration) in order to return maybe just a
> few rows of new data.
> In comparison how it looked in NiFi 1.27 where this issue is not observed.
> After first run, the output of server side queries check is the following:
> {code:json}
> [ {
> "creation_time" : "2025-09-23 13:34:12.757",
> "last_execution_time" : "2025-09-23 13:34:12.757",
> "execution_count" : 1,
> "text" : "SELECT * FROM (SELECT ModifiedDate \nFROM [Production].[Product])
> AS Product"
> }, {
> "creation_time" : "2025-09-23 13:34:12.653",
> "last_execution_time" : "2025-09-23 13:34:12.657",
> "execution_count" : 1,
> "text" : "SELECT * FROM (SELECT ModifiedDate \nFROM [Production].[Product])
> AS Product WHERE 1=0"
> } ]
> {code}
> So first, query "SELECT * FROM (SELECT ModifiedDate \nFROM
> [Production].[Product]) AS Product WHERE 1=0" runs and then "SELECT * FROM
> (SELECT ModifiedDate \nFROM [Production].[Product]) AS Product".
> After second run, the output of server side queries check is the following:
> {code:sql}
> [ {
> "creation_time" : "2025-09-23 13:36:01.07",
> "last_execution_time" : "2025-09-23 13:36:01.08",
> "execution_count" : 1,
> "text" : "SELECT * FROM (SELECT ModifiedDate \nFROM [Production].[Product])
> AS Product WHERE ModifiedDate > '2014-02-08 10:03:55.51'"
> }, {
> "creation_time" : "2025-09-23 13:34:12.653",
> "last_execution_time" : "2025-09-23 13:36:01.06",
> "execution_count" : 2,
> "text" : "SELECT * FROM (SELECT ModifiedDate \nFROM [Production].[Product])
> AS Product WHERE 1=0"
> }, {
> "creation_time" : "2025-09-23 13:34:22.97",
> "last_execution_time" : "2025-09-23 13:34:22.973",
> "execution_count" : 1,
> "text" : "SELECT creation_time, last_execution_time, execution_count, text
> \nFROM sys.dm_exec_query_stats AS deqs\nCROSS APPLY
> sys.dm_exec_sql_text(deqs.sql_handle) AS dest\nWHERE dest.text LIKE
> '%Production%'\nORDER BY last_execution_time DESC;"
> }, {
> "creation_time" : "2025-09-23 13:34:12.757",
> "last_execution_time" : "2025-09-23 13:34:12.757",
> "execution_count" : 1,
> "text" : "SELECT * FROM (SELECT ModifiedDate \nFROM [Production].[Product])
> AS Product"
> } ]
> {code}
> So in the second time, query "SELECT * FROM (SELECT ModifiedDate \nFROM
> [Production].[Product]) AS Product WHERE 1=0" runs and right after it "SELECT
> * FROM (SELECT ModifiedDate \nFROM [Production].[Product]) AS Product WHERE
> ModifiedDate > '2014-02-08 10:03:55.51'".
> So, I believe, condition " WHERE 1=0" does not work properly in NiFi 2.5.0.
> Driver for MSSQL connection was used from here
> [https://learn.microsoft.com/en-us/sql/connect/jdbc/download-microsoft-jdbc-driver-for-sql-server]
> and since it is only jre8 and jre11 there, just in case we tried also to use
> manually created jre21 driver ([https://github.com/Microsoft/mssql-jdbc]) but
> it did not help, behavior for NiFi 2.5.0 was the same.
> Could you please, have a look and confirm the issue?
> Best regards,
> Dmitry
--
This message was sent by Atlassian Jira
(v8.20.10#820010)