[ 
https://issues.apache.org/jira/browse/NIFI-5045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16427166#comment-16427166
 ] 

Matt Burgess commented on NIFI-5045:
------------------------------------

The current code expects a SQLNonTransientException to be thrown on parse
errors. The unit tests use Derby rather than Hive, and Derby correctly throws a
SQLNonTransientException on a parse error.

Unfortunately, the ParseException is also not available to the client code.
However, Hive populates the SQLException's "vendor code" field according to the
following guidelines:

10000 to 19999: Errors occurring during semantic analysis and compilation of 
the query.
20000 to 29999: Runtime errors where Hive believes that retries are unlikely to 
succeed.
30000 to 39999: Runtime errors which Hive thinks may be transient and retrying 
may succeed.
40000 to 49999: Errors where Hive is unable to advise about retries.

Unfortunately, there is no specific error code for parse errors, so the generic
error code 40000 is returned. However, according to the guidelines above, we can
route the flow files to retry or failure based on the error code range. In our
case (40000), since Hive is unable to advise about retries, we would route the
flow files to failure.
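
As a rough illustration of that routing idea (a sketch only, not the actual
PutHiveQL change), the vendor code from SQLException.getErrorCode() could be
mapped to a relationship as below. The class name HiveErrorRouting and the
Route enum are hypothetical. Sending the 10000 and 20000 ranges to failure is
my reading of the guidelines, and the fallback for codes outside 10000 to
49999 is an assumption that simply keeps the existing exception-type check.

```java
import java.sql.SQLException;
import java.sql.SQLNonTransientException;

// Hypothetical helper, not part of NiFi: maps a Hive vendor code to a route.
final class HiveErrorRouting {

    enum Route { FAILURE, RETRY }

    static Route route(SQLException e) {
        int code = e.getErrorCode(); // the "vendor code" field

        if (code >= 10000 && code < 20000) {
            // Semantic analysis / compilation errors: assumed not retryable.
            return Route.FAILURE;
        } else if (code >= 20000 && code < 30000) {
            // Runtime errors where Hive believes retries are unlikely to succeed.
            return Route.FAILURE;
        } else if (code >= 30000 && code < 40000) {
            // Runtime errors Hive thinks may be transient.
            return Route.RETRY;
        } else if (code >= 40000 && code < 50000) {
            // Hive cannot advise about retries (includes parse errors at 40000).
            return Route.FAILURE;
        }

        // Assumption: outside the documented ranges, fall back to the
        // exception type, as the current code does.
        return (e instanceof SQLNonTransientException) ? Route.FAILURE : Route.RETRY;
    }
}
```

With this sketch, a SQLException carrying vendor code 40000 is routed to
failure, while one carrying a code in the 30000 range is routed to retry.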

This logic may change the existing behavior for other errors as well, but I 
think that's a good thing because we'd be adhering to the error code spec, 
rather than making our own best guess.

> PutHiveQL routes parse errors to retry instead of failure
> ---------------------------------------------------------
>
>                 Key: NIFI-5045
>                 URL: https://issues.apache.org/jira/browse/NIFI-5045
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Extensions
>            Reporter: Matt Burgess
>            Priority: Minor
>
> When the input to PutHiveQL is an invalid HiveQL statement, the documentation 
> for the failure relationship should apply: "A FlowFile is routed to this 
> relationship if the database cannot be updated and retrying the operation 
> will also fail, such as an invalid query or an integrity constraint 
> violation". However, upon invalid HiveQL, the processor routes the flowfile
> to 'retry', which is neither desired nor logical (the query is bad, how could
> it ever succeed on retry?).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
