[
https://issues.apache.org/jira/browse/NIFI-3415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15979102#comment-15979102
]
ASF GitHub Bot commented on NIFI-3415:
--------------------------------------
Github user mattyb149 commented on the issue:
https://github.com/apache/nifi/pull/1658
I tried the following use case:
- 3 FlowFiles containing SQL: the first two INSERT the same value (meaning
the second will fail), and the third INSERTs a new value
- PutSQL with Supports Fragmented Transactions set to `true` (with no
fragment.* attributes set) and Batch Size 1
I was pleasantly surprised to see that even the first successful statement
was rolled back when the second one failed. This was because all 3 flow files
were already in the queue, and using Supports Fragmented Transactions without
the fragment attributes set will cause the `TransactionalFlowFileFilter` to
grab all the flow files (even though the Batch Size is 1). That is existing
behavior (although not documented). We can't count on that though, because we
don't know how many files will be in the queue when `pollFlowFiles()` is called.
However, when I set Supports Fragmented Transactions to `false` with a
Batch Size of 1, then the first and third flow files (the valid ones) were both
processed successfully, and the second flow file was retained in the queue. I
would've expected that, after the second file failed, the third one would not
be processed. What are your thoughts? Did I configure it incorrectly?
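To make the queue-timing dependence concrete, here is a rough toy model of the Supports Fragmented Transactions = `true` case described above. It is not NiFi code: `select_flow_files` and `run_once` are hypothetical names, the "table" is just a Python set standing in for a unique-key constraint, and the sketch assumes no fragment.* attributes are set.

```python
def select_flow_files(queue, batch_size, fragmented):
    # Toy stand-in for the selection step (not NiFi's real filter logic).
    # Assumes no fragment.* attributes are set on any flow file.
    if fragmented:
        return list(queue)           # observed: everything queued is taken
    return list(queue[:batch_size])  # otherwise at most batch_size files


def run_once(queue, table, batch_size, fragmented):
    # Treat the selected flow files as one transaction against `table`,
    # a set whose members model rows under a unique-key constraint.
    batch = select_flow_files(queue, batch_size, fragmented)
    staged = set()
    for value in batch:
        if value in table or value in staged:
            return False             # duplicate key: roll back, keep files queued
        staged.add(value)
    table |= staged                  # commit the whole batch
    del queue[:len(batch)]
    return True


# All three files queued before the poll: nothing is committed.
queue, table = [1, 1, 2], set()
all_queued = run_once(queue, table, batch_size=1, fragmented=True)

# Only the first file queued at the first poll: its INSERT is committed,
# and only the later batch is rolled back.
queue2, table2 = [1], set()
first_alone = run_once(queue2, table2, batch_size=1, fragmented=True)
queue2 += [1, 2]
later = run_once(queue2, table2, batch_size=1, fragmented=True)
```

In this model the same three statements leave the table empty in the first run but leave one committed row in the second, which is why the all-or-nothing result I saw can't be relied on.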
> Add "Rollback on Failure" property to PutHiveStreaming, PutHiveQL, and PutSQL
> -----------------------------------------------------------------------------
>
> Key: NIFI-3415
> URL: https://issues.apache.org/jira/browse/NIFI-3415
> Project: Apache NiFi
> Issue Type: Improvement
> Reporter: Matt Burgess
> Assignee: Koji Kawamura
>
> Many Put processors (such as PutHiveStreaming, PutHiveQL, and PutSQL) offer
> "failure" and "retry" relationships for flow files that cannot be processed,
> perhaps due to issues with the external system or other errors.
> However, there are use cases where if a Put fails, then no other flow files
> should be processed until the issue(s) have been resolved. This should be
> configurable for said processors, to enable both the current behavior and a
> "stop on failure" type of behavior.
> I propose a property be added to the Put processors (at a minimum the
> PutHiveStreaming, PutHiveQL, and PutSQL processors) called "Rollback on
> Failure", which offers true or false values. If set to true, then the
> "failure" and "retry" relationships should be removed from the processor
> instance, and if set to false, those relationships should be offered.
> If Rollback on Failure is false, then the processor should continue to behave
> as it has. If set to true, then if any error occurs while processing a flow
> file, the session should be rolled back rather than transferring the flow
> file to some error-handling relationship.
> It may also be the case that if Rollback on Failure is true, then the
> incoming connection must use a FIFO Prioritizer, but I'm not positive. The
> documentation should be updated to include any such requirements.
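The proposed semantics can be sketched as a small toy model. This is only an illustration under stated assumptions, not NiFi code: `relationships` and `process` are hypothetical names, a "success" relationship is assumed (the description only names "failure" and "retry"), "retry" routing is not modeled, and the queue is assumed to be FIFO.

```python
def relationships(rollback_on_failure):
    # Relationships offered by the processor instance under the proposal.
    if rollback_on_failure:
        return {"success"}
    return {"success", "failure", "retry"}


def process(queue, put, rollback_on_failure):
    # Process queued flow files in FIFO order; `put` raises on error.
    routed = {"success": [], "failure": []}
    while queue:
        flow_file = queue[0]
        try:
            put(flow_file)
        except Exception:
            if rollback_on_failure:
                break  # rolled back: this file and those behind it stay queued
            routed["failure"].append(queue.pop(0))
            continue
        routed["success"].append(queue.pop(0))
    return routed


def put(flow_file):
    # Hypothetical external write that rejects one particular flow file.
    if flow_file == "bad":
        raise ValueError("external system rejected the flow file")


stop_queue = ["a", "bad", "c"]
stop_result = process(stop_queue, put, rollback_on_failure=True)

route_queue = ["a", "bad", "c"]
route_result = process(route_queue, put, rollback_on_failure=False)
```

With Rollback on Failure set to true, the model stops at the failing flow file and leaves it, and everything queued behind it, in place. With it set to false, the failing flow file is routed to "failure" and processing continues.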