Hi,

I'm not sure how helpful this is, but I wanted to share it.
I've been working on a more generic error handling pattern for
NiFi processors that handles exceptions by categorizing them into
error types. This is part of the work for the following JIRA:
Add "Rollback on Failure" property to PutHiveStreaming, PutHiveQL, and PutSQL
https://issues.apache.org/jira/browse/NIFI-3415

The idea is to map an Exception to one of the following error types,
based on the procedure that caused the exception:
ConfigurationError(ProcessException, Yield),
InvalidInput(Failure, None),
TemporalFailure(Retry, Yield),
TemporalInputFailure(Retry, Penalize),
Fatal(Retry, Yield);
https://github.com/ijokarumawak/nifi/blob/nifi-3415/nifi-commons/nifi-processor-utilities/src/main/java/org/apache/nifi/processor/util/pattern/ErrorTypes.java
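To illustrate the mapping, here is a minimal standalone sketch. It is not the code in the branch: the class name ErrorTypeSketch, the Destination and Penalty enums, and the classify method are assumptions made up for this example, and the choice of which exception maps to which type is arbitrary. Only the five (destination, penalty) pairs come from the list above.

```java
import java.io.IOException;

// Hypothetical sketch, not the ErrorTypes class from the branch.
class ErrorTypeSketch {
    // Where the input should go after the error (assumed names).
    enum Destination { PROCESS_EXCEPTION, FAILURE, RETRY }

    // How the processor or the input is slowed down (assumed names).
    enum Penalty { YIELD, PENALIZE, NONE }

    // The five pairs listed in the message above.
    enum ErrorType {
        CONFIGURATION_ERROR(Destination.PROCESS_EXCEPTION, Penalty.YIELD),
        INVALID_INPUT(Destination.FAILURE, Penalty.NONE),
        TEMPORAL_FAILURE(Destination.RETRY, Penalty.YIELD),
        TEMPORAL_INPUT_FAILURE(Destination.RETRY, Penalty.PENALIZE),
        FATAL(Destination.RETRY, Penalty.YIELD);

        final Destination destination;
        final Penalty penalty;

        ErrorType(Destination destination, Penalty penalty) {
            this.destination = destination;
            this.penalty = penalty;
        }
    }

    // Example mapping only; a real processor would choose its own rules.
    static ErrorType classify(Exception e) {
        if (e instanceof IllegalArgumentException) {
            return ErrorType.INVALID_INPUT;      // bad data never succeeds on retry
        }
        if (e instanceof IOException) {
            return ErrorType.TEMPORAL_FAILURE;   // e.g. the external system is down
        }
        return ErrorType.FATAL;                  // anything unexpected
    }

    public static void main(String[] args) {
        ErrorType type = classify(new IOException("connection refused"));
        System.out.println(type + " -> " + type.destination + ", " + type.penalty);
    }
}
```
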

Then the actual execution code is wrapped with an ExceptionHandler.execute method.
This unit test class has detailed examples of how it works:
https://github.com/ijokarumawak/nifi/blob/nifi-3415/nifi-commons/nifi-processor-utilities/src/test/java/org/apache/nifi/processor/util/pattern/TestExceptionHandler.java
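As a rough picture of the wrapping idea only, the sketch below runs a procedure inside a try/catch and turns any exception into a routing decision. The names ExceptionHandlerSketch, Procedure, Route, and this execute signature are hypothetical and are not the API in the branch; please see the unit test above for the real usage.

```java
import java.io.IOException;
import java.util.function.Function;

// Hypothetical sketch, not the ExceptionHandler class from the branch.
class ExceptionHandlerSketch {
    // Simplified routing outcomes (assumed names).
    enum Route { SUCCESS, FAILURE, RETRY }

    // A unit of work that may throw a checked exception.
    @FunctionalInterface
    interface Procedure<I> {
        void apply(I input) throws Exception;
    }

    // Runs the procedure; on error, asks the mapper where the input should go.
    static <I> Route execute(I input, Procedure<I> procedure,
                             Function<Exception, Route> mapper) {
        try {
            procedure.apply(input);
            return Route.SUCCESS;
        } catch (Exception e) {
            return mapper.apply(e);
        }
    }

    public static void main(String[] args) {
        // Example mapper: I/O problems are retried, everything else fails.
        Function<Exception, Route> mapper =
                e -> e instanceof IOException ? Route.RETRY : Route.FAILURE;
        Procedure<String> parse = s -> Integer.parseInt(s);

        System.out.println(execute("42", parse, mapper));
        System.out.println(execute("not a number", parse, mapper));
    }
}
```

The point of this shape is that the processor body contains no catch blocks of its own; the decision about failure versus retry lives in one reusable mapper.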

With these classes, custom processors can easily adopt error handling
practices with less code (more reusable code provided by the NiFi
framework).
The branch is still a work in progress, but any feedback is appreciated.

Thanks,
Koji


On Sun, Mar 26, 2017 at 8:14 AM, Bryan Bende <[email protected]> wrote:
> The cases where an unexpected exception happens should usually be
> things that could eventually be recovered from. For example, maybe
> calling session.write() throws an IOException because the processor
> can't write to the content repository because a disk is full. The idea
> is that you wouldn't expect this to happen, but if it does then you
> would stop NiFi, free up disk space, start back up and resume
> processing successfully. So you are expecting to recover from this
> problem.
>
> This is very different than a processor that tries to write data to a
> database, and for a specific flow file the data is rejected because
> the database says some value is considered invalid. In this case the
> flow file will never succeed no matter how many times you retry
> (unless someone modifies the database schema) so I would expect the
> processor to be catching these kinds of exceptions and routing them to
> failure.
>
> In the event that a flow file remains in a queue, you can always
> manually intervene and remove that flow file from the queue in the UI.
>
> On Sat, Mar 25, 2017 at 2:40 PM, Brian Jeltema <[email protected]> wrote:
>> So if a flowfile is in a state where it will never be successfully processed by
>> the custom processor, it will just stay queued forever? If this is a recurring
>> problem, will the bad flowfiles eventually fill up my disk space? What’s the
>> appropriate way to recover from this problem?
>>
>>> On Mar 25, 2017, at 11:10 AM, Bryan Bende <[email protected]> wrote:
>>>
>>> Hi Brian,
>>>
>>> You are correct in your understanding about exceptions that are not
>>> caught by the processor.
>>>
>>> Part of rolling back the session is putting the flow file back in the
>>> queue where it came from, so the flow file will remain in the queue
>>> before the processor and will continue to be submitted to the
>>> processor, but of course the processor will yield which will slow it
>>> down a little bit.
>>>
>>> Generally the above behavior is what you want if something completely
>>> unexpected happens, but if there are specific error conditions that
>>> you know of then you likely want to handle these by catching those
>>> exceptions and routing to specific relationships. For example, when
>>> sending data to an external system one error might be because the data
>>> you sent is malformed and it will always fail, and another error might
>>> be because the service was down in which case it would work if you
>>> retried later. For this case you could have a failure relationship for
>>> the data failures, and connection_failure relationship for the
>>> failures when the system is down.
>>>
>>> Hope this helps.
>>>
>>> -Bryan
>>>
>>>
>>> On Sat, Mar 25, 2017 at 10:11 AM, Brian Jeltema <[email protected]> wrote:
>>>> I’m writing a custom processor, and I’m confused about error handling. As I understand it,
>>>> if an Exception is thrown that is not handled by my processor, then it is handled by the
>>>> framework by doing a session rollback and administrative yield. What I don’t understand
>>>> is what happens to the original flowfile. Does it get resubmitted to the processor? If so,
>>>> and the Exception keeps being thrown, what ultimately happens to the flowfile?
>>>>
>>>> Thanks in advance
>>>> Brian
>>
