[ 
https://issues.apache.org/jira/browse/NIFI-8249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17289340#comment-17289340
 ] 

Matt Burgess commented on NIFI-8249:
------------------------------------

Were the output flow files too large for downstream processing, or was the 
problem that all 100k rows were transferred downstream all at the same time? If 
the partition size is 100k and the fetch size is 100, you need 1000 fetches 
from the database for each partition, which could definitely slow things down. 
I'd only use something like that if there's some latency or timeout occurring 
while getting the rows from the DB. In general for this pattern I recommend the 
fetch size be set to the partition size where prudent, so all rows are returned 
in one fetch. Then if you need to break them up into smaller flowfiles (I'd be 
interested to hear about your flow) you could use SplitRecord.

In any case I will try to reproduce with Max Rows set and identify any bugs 
that exist. Thanks for the discussion and reproduction path, it definitely 
helps!

> ExecuteSQLRecord "administratively yields" and rollls back session when fetch 
> size < query partition size and an SQLException is thrown on subsequent next()
> ------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: NIFI-8249
>                 URL: https://issues.apache.org/jira/browse/NIFI-8249
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>    Affects Versions: 1.11.4
>            Reporter: David
>            Assignee: Matt Burgess
>            Priority: Major
>         Attachments: ExecuteSQLRecord.png, GenerateTableFetch.png, Screen 
> Shot 2021-02-23 at 3.00.47 PM.png, Screen Shot 2021-02-23 at 3.04.55 PM.png
>
>
> When ExecuteSQLRecord executes a query with a partition size > the fetch size 
> and one of the fetches throws an SQLException, the exception appears to get 
> "swallowed" and a more general "IOException: Could not obtain next record 
> from ResultSet: routing to failure" in addition to a "transfer relationship 
> not specified;Processor Administratively Yielded for 1 sec" errors are thrown.
> The problematic query/flowfile gets put back into the incoming queue and will 
> continue to fail indefinitely until it is manually removed from the queue.
> If the fetch size >= to the query partition size, the exact SQL error message 
> is thrown and the flowfile gets routed to the failure queue.
>  
> Steps to reliably reproduce:
>  # Create an DB table with an ID column (for partitioning queries against) 
> and a VARCHAR column
>  # Insert a row where the VARCHAR column data is a numeric value (ie '123')
>  # Insert a row where the VARCHAR column data is a string value (ie 'abc')
>  # Create a GenerateTableFetch processor where the "columns to return" 
> includes TO_NUMBER(<varchar column>)
>  # Create a ExecuteSQLRecord processor that accepts the "success" connection 
> from GenerateTableFetch
>  # Set ExecuteSQLRecord fetch size to 1
>  # Run flow
> What should happen:
>  * ExecuteSQLRecord should throw an java.sql.SQLSyntaxErrorException: 
> ORA-01722: invalid number
>  * Flowfile should get transferred to failure queue
> What happens:
>  * Nifi throws one error and prints another, then puts the flowfile back into 
> the incoming queue
>  ** ExecuteSQLRecord throws "Unable to execute SQL select query <query> ... 
> due to java.io.IOException: java.io.IOException: Could not obtain next record 
> from ResultSet"
>  ** Nifi prints errror "transfer relationship not specified; Processor 
> Administratively Yielded for 1 sec"
>  ** Flowfile is kept in current queue and penalized and re-processed again 
> after penalty duration expires
> Thus, the problematic query will get executed indefinitely.
>  
> Note: I believe the issue could be due to ResultSet .next() being called 
> first in the constructor to ResultSetRecordSetWithCallback
> Subsequent .next() is called from resultSetWriter.write(recordSet)
> [https://github.com/apache/nifi/blob/aa741cc5967f62c3c38c2a47e712b7faa6fe19ff/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/sql/RecordSqlWriter.java#L73|https://github.com/apache/nifi/blob/aa741cc5967f62c3c38c2a47e712b7faa6fe19ff/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/sql/RecordSqlWriter.java#L73]
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to