[ 
https://issues.apache.org/jira/browse/NIFI-15002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bojan updated NIFI-15002:
-------------------------
    Attachment:     (was: nifi-db-processors-patch-1.patch)

> Query construction issue for MSSQL connection
> ---------------------------------------------
>
>                 Key: NIFI-15002
>                 URL: https://issues.apache.org/jira/browse/NIFI-15002
>             Project: Apache NiFi
>          Issue Type: Bug
>    Affects Versions: 2.5.0
>            Reporter: Dmitry
>            Priority: Major
>         Attachments: Screenshot1.png, nifi-db-processors-patch.patch
>
>
> Hello NiFi Team,
> As far as I can see, in NiFi 2.5.0 option "Custom Query" in
> QueryDatabaseTable processor does not work properly at least for MSSQL
> Server 2019 when "Maximum-value Columns" is used.
> Here is an example using MSSQL server sample database AdventureWorks2019.
> The following query was used in "Custom Query":
> {code:sql}
> SELECT ModifiedDate 
> FROM [Production].[Product]
> {code}
> with field ModifiedDate as "Maximum-value Columns" property.
> !Screenshot1.png|width=605,height=499!
> In order to check queries on the server side, the following query was used:
> {code:sql}
> SELECT creation_time, last_execution_time, execution_count, text 
> FROM sys.dm_exec_query_stats AS deqs
> CROSS APPLY sys.dm_exec_sql_text(deqs.sql_handle) AS dest
> WHERE dest.text LIKE '%Production%'
> ORDER BY last_execution_time DESC;
> {code}
> After we ran QueryDatabaseTable first time, the output of server side queries 
> check is the following:
> {code:json}
> [ {
>   "creation_time" : "2025-09-23 13:17:18.367",
>   "last_execution_time" : "2025-09-23 13:17:18.373",
>   "execution_count" : 2,
>   "text" : "SELECT * FROM (SELECT ModifiedDate \nFROM [Production].[Product]) 
> AS Product"
> } ]
> {code}
> So, I assume on the server side query "SELECT * FROM (SELECT ModifiedDate 
> FROM [Production].[Product]) AS Product" runs twice one after another.
> When we run QueryDatabaseTable second time (now we have already state value 
> from previous query and no flowfile will be generated since no new data was 
> added), server side queries check gives the following output:
> {code:json}
> [ {
>   "creation_time" : "2025-09-23 13:21:01.573",
>   "last_execution_time" : "2025-09-23 13:21:01.583",
>   "execution_count" : 1,
>   "text" : "SELECT * FROM (SELECT ModifiedDate \nFROM [Production].[Product]) 
> AS Product WHERE ModifiedDate > '2014-02-08 10:03:55.51'"
> }, {
>   "creation_time" : "2025-09-23 13:17:18.367",
>   "last_execution_time" : "2025-09-23 13:21:01.57",
>   "execution_count" : 3,
>   "text" : "SELECT * FROM (SELECT ModifiedDate \nFROM [Production].[Product]) 
> AS Product"
> }, {
>   "creation_time" : "2025-09-23 13:17:21.017",
>   "last_execution_time" : "2025-09-23 13:17:21.02",
>   "execution_count" : 1,
>   "text" : "SELECT creation_time, last_execution_time, execution_count, text 
> \nFROM sys.dm_exec_query_stats AS deqs\nCROSS APPLY 
> sys.dm_exec_sql_text(deqs.sql_handle) AS dest\nWHERE dest.text LIKE 
> '%Production%'\nORDER BY last_execution_time DESC;"
> } ]
> {code}
> So, as far as I can see, when processor QueryDatabaseTable ran the second 
> time, again two queries were executed on the server side: "SELECT * FROM 
> (SELECT ModifiedDate FROM [Production].[Product]) AS Product" and "SELECT * 
> FROM (SELECT ModifiedDate FROM [Production].[Product]) AS Product WHERE 
> ModifiedDate > '2014-02-08 10:03:55.51'" right after it. I assume, since 
> processor produces valid result (in out case no flowfile), output from the 
> first query is ignored and output of the second query is considered as what 
> needs to be returned by processor. The issue is that first query makes SELECT 
> for all rows, so if table is reasonably big, processor my run for hours (of 
> fail, depending on time out configuration) in order to return maybe just a 
> few rows of new data.
> In comparison how it looked in NiFi 1.27 where this issue is not observed.
> After first run, the output of server side queries check is the following:
> {code:json}
> [ {
>   "creation_time" : "2025-09-23 13:34:12.757",
>   "last_execution_time" : "2025-09-23 13:34:12.757",
>   "execution_count" : 1,
>   "text" : "SELECT * FROM (SELECT ModifiedDate \nFROM [Production].[Product]) 
> AS Product"
> }, {
>   "creation_time" : "2025-09-23 13:34:12.653",
>   "last_execution_time" : "2025-09-23 13:34:12.657",
>   "execution_count" : 1,
>   "text" : "SELECT * FROM (SELECT ModifiedDate \nFROM [Production].[Product]) 
> AS Product WHERE 1=0"
> } ]
> {code}
> So first, query "SELECT * FROM (SELECT ModifiedDate \nFROM 
> [Production].[Product]) AS Product WHERE 1=0" runs and then "SELECT * FROM 
> (SELECT ModifiedDate \nFROM [Production].[Product]) AS Product".
> After second run, the output of server side queries check is the following:
> {code:sql}
> [ {
>   "creation_time" : "2025-09-23 13:36:01.07",
>   "last_execution_time" : "2025-09-23 13:36:01.08",
>   "execution_count" : 1,
>   "text" : "SELECT * FROM (SELECT ModifiedDate \nFROM [Production].[Product]) 
> AS Product WHERE ModifiedDate > '2014-02-08 10:03:55.51'"
> }, {
>   "creation_time" : "2025-09-23 13:34:12.653",
>   "last_execution_time" : "2025-09-23 13:36:01.06",
>   "execution_count" : 2,
>   "text" : "SELECT * FROM (SELECT ModifiedDate \nFROM [Production].[Product]) 
> AS Product WHERE 1=0"
> }, {
>   "creation_time" : "2025-09-23 13:34:22.97",
>   "last_execution_time" : "2025-09-23 13:34:22.973",
>   "execution_count" : 1,
>   "text" : "SELECT creation_time, last_execution_time, execution_count, text 
> \nFROM sys.dm_exec_query_stats AS deqs\nCROSS APPLY 
> sys.dm_exec_sql_text(deqs.sql_handle) AS dest\nWHERE dest.text LIKE 
> '%Production%'\nORDER BY last_execution_time DESC;"
> }, {
>   "creation_time" : "2025-09-23 13:34:12.757",
>   "last_execution_time" : "2025-09-23 13:34:12.757",
>   "execution_count" : 1,
>   "text" : "SELECT * FROM (SELECT ModifiedDate \nFROM [Production].[Product]) 
> AS Product"
> } ]
> {code}
> So in the second time, query "SELECT * FROM (SELECT ModifiedDate \nFROM 
> [Production].[Product]) AS Product WHERE 1=0" runs and right after it "SELECT 
> * FROM (SELECT ModifiedDate \nFROM [Production].[Product]) AS Product WHERE 
> ModifiedDate > '2014-02-08 10:03:55.51'".
> So, I believe, condition " WHERE 1=0" does not work properly in NiFi 2.5.0.
> Driver for MSSQL connection was used from here 
> [https://learn.microsoft.com/en-us/sql/connect/jdbc/download-microsoft-jdbc-driver-for-sql-server]
>  and since it is only jre8 and jre11 there, just in case we tried also to use 
> manually created jre21 driver ([https://github.com/Microsoft/mssql-jdbc]) but 
> it did not help, behavior for NiFi 2.5.0 was the same.
> Could you please, have a look and confirm the issue?
> Best regards,
> Dmitry



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to