RE: [EXT] Re: New Standard Pattern - Put Exception that caused failure in an attribute

Peter Wicks (pwicks) Tue, 30 Oct 2018 09:44:50 -0700

Sorry for the delayed response, I've been traveling.

Responses in order:


Matt, 
Right now our work around is to keep retrying errors, usually with a penalty or 
control rate processor. The problem is that we don't know why it failed, and 
thus don't know if retry is the correct option. I have not found a way, without 
code change, to be able to determine if retrying is the correct option or not.

Koji,
Detailed error handling would indeed be a good workaround to the problems 
raised by myself and Matt. I have not see this on other processors, but that 
does not mean we can't do it of course.  I agree that having some kind of 
hierarchy system for errors would be a much better solution.

Pierre,
My primary use case is as you described, a user friendly way to see what 
actually happened without going through the log files. But I while I know it's 
fragile, routing on exception text stored in an attribute still feels like a 
very legitimate use case. I know in many systems there are good exception types 
that can be used to route FlowFile's to appropriate failure relationships, but 
as Matt mentioned, JDBC has just a handful of exception types for a very large 
number of possible error types.

I think this is probably the same rational that was used to justify this 
feature for Execute Stream Command's inclusion of this feature in the past. To 
many possible failure conditions to handle with just a few failure conditions.

Uwe,
That is a fair question, but it doesn't feel like such a bad fit to me. It's 
like extra metadata on the lineage, "We followed this path through the flow 
because we had exception " .... " which caused the FlowFile to follow the 
failure route".
 
But I still prefer the attribute, it could be another option for Detailed error 
handling; instead of, or in addition to, additional relationships for failures, 
the exception text could be included in an attribute.

Thanks,
  Peter

-----Original Message-----
From: [email protected] [mailto:[email protected]] 
Sent: Saturday, October 27, 2018 10:46 AM
To: [email protected]
Subject: Re: [EXT] Re: New Standard Pattern - Put Exception that caused failure 
in an attribute

Do you really want to mix provenance and data lineage with logging/error 
information?

Writing exception information/logging information within an attribute is not a 
bad idea in my opinion.
If a user wants to use this for routing, why not ... or whatever the user wants 
to do.

I could imagine that this can be switched on and off by a property via config. 
E.g. in development on and on production off.

Regards,
Uwe

> Am 26.10.2018 um 09:26 schrieb Pierre Villard <[email protected]>:
> 
> Adding another option to the list.
> 
> Peter - if I understand correctly and based on my own experience, the 
> idea is not to have an 'exception' attribute to perform custom routing 
> after the failure relationship but rather have a more user friendly 
> way to see what happened without going through all the logs for a given flow 
> file.
> 
> If that's correct, then could we add this information somehow to the 
> provenance event generated by the processor? Ideally adding a new 
> field to a provenance event or using the existing 'details' field?
> 
> Pierre
> 
> 
> Le ven. 26 oct. 2018 à 08:40, Koji Kawamura <[email protected]> a 
> écrit :
> 
>> Hi all,
>> 
>> I'd like to add another option to Matt's list of solutions:
>> 
>> 4) Add a processor property, 'Enable detailed error handling'
>> (defaults to false), then toggle available list of relationships. 
>> This way, existing flows such as Peter's don't have to change, while 
>> he can opt-in new relationships. RouteOnAttribute can be a reference 
>> implementation.
>> 
>> I like the idea of thinking relationships as potential exceptions. It 
>> can be better if relationships have hierarchy.
>> Some users need more granular relationships while others don't.
>> For NiFi 2.0 or later, supporting relationship hierarchy at framework 
>> can mitigate having additional property at each processor.
>> 
>> Thanks,
>> Koji
>> On Fri, Oct 26, 2018 at 11:49 AM Matt Burgess <[email protected]>
>> wrote:
>>> 
>>> Peter,
>>> 
>>> Totally agree, RDBMS/JDBC is in a weird class as always, there is a 
>>> teaspoon of exception types for an ocean of causes. For NiFi 1.x, it 
>>> seems like we need to pick from a set of less-than-ideal solutions:
>>> 
>>> 1) Add new relationships, but then your (possibly hundreds of) 
>>> processors are invalid
>>> 2) Add new auto-terminated relationships, but then your 
>>> previously-handled errors are "lost"
>>> 3) Add an attribute, but then each NiFi instance/release/flow is 
>>> responsible for parsing the error and handling it as desired.
>>> 
>>> We could mitigate 1-2 with a tool that updates your flow/template by 
>>> sending all new failure relationships to the same target as the 
>>> existing one, but then the tool itself suffers from maintainability 
>>> issues (as does option #3). If we could recognize that the new 
>>> relationships are self-terminated and then send the errors out to 
>>> the original failure relationship, that could be quite confusing to 
>>> the user, especially as time goes on (how to suppress the "new" 
>>> errors, e.g.).
>>> 
>>> IMHO I think we're between a rock and a hard place here, I guess 
>>> with great entropy comes great responsibility :P
>>> 
>>> P.S. For your use case, is the workaround to just keep retrying? Or 
>>> are there other constraints at play?
>>> 
>>> Regards,
>>> Matt
>>> 
>>> On Thu, Oct 25, 2018 at 10:27 PM Peter Wicks (pwicks) 
>>> <[email protected]>
>> wrote:
>>>> 
>>>> Matt,
>>>> 
>>>> If I were to split an existing failure relationship into several
>> relationships, I do not think I would want to auto-terminate in most cases.
>> Specifically, I'm interested in a failure relationship for a database 
>> disconnect during SQL execution (database was online when the 
>> connection was verified in the DBCP pool, but went down during 
>> execution). If I were to find a way to separate this into its own 
>> relationship, I do not think most users would appreciate it being a 
>> condition silently not handled by the normal failure path.
>>>> 
>>>> Thanks,
>>>>  Peter
>>>> 
>>>> -----Original Message-----
>>>> From: Matt Burgess [mailto:[email protected]]
>>>> Sent: Friday, October 26, 2018 10:18 AM
>>>> To: [email protected]
>>>> Subject: Re: [EXT] Re: New Standard Pattern - Put Exception that
>> caused failure in an attribute
>>>> 
>>>> NiFi (as of the last couple releases I think) has the ability to 
>>>> set
>> auto-terminating relationships; this IMO is one of those use cases 
>> (for NiFi 1.x). If new relationships are added, they could default to 
>> auto-terminate; then the existing processors should remain valid.
>>>> However we might want an "omnibus Jira" to capture those 
>>>> relationships
>> we'd like to remove the auto-termination from in NiFi 2.0.
>>>> 
>>>> Regards,
>>>> Matt
>>>> On Thu, Oct 25, 2018 at 10:12 PM Peter Wicks (pwicks) <
>> [email protected]> wrote:
>>>>> 
>>>>> Mark,
>>>>> 
>>>>> I agree with you that this is the best option in general terms.
>> After thinking about it some more I think the biggest use case is for 
>> troubleshooting. If a file routes to failure, you need to be watching 
>> the UI to see what the exception was. An admin may have access to the 
>> NiFi log files and could grep the error out, but a normal user who 
>> checks in on the flow and sees a FlowFile in the error queue will not 
>> know what the cause was; this is especially frustrating if retrying 
>> the file works without failure the second time... Capturing the error 
>> message in an attribute makes this easy to find.
>>>>> 
>>>>> One thing I worry about too is adding new relationships to core
>> processors. After an upgrade, won't users need to go to each instance 
>> of that processor and handle the new relationship? Right now I'd 
>> swagger we have at least five thousand ExecuteSQL processors in our 
>> environment; and while we have strong scripting skills in my NiFi 
>> team, I would not want to encounter this without that.
>>>>> 
>>>>> Thanks,
>>>>>  Peter
>>>>> 
>>>>> -----Original Message-----
>>>>> From: Mark Payne [mailto:[email protected]]
>>>>> Sent: Thursday, October 25, 2018 10:38 PM
>>>>> To: [email protected]
>>>>> Subject: [EXT] Re: New Standard Pattern - Put Exception that 
>>>>> caused failure in an attribute
>>>>> 
>>>>> I agree - the notion of adding a "failure.reason" attribute is, in
>> my opinion, an anti-pattern that should be avoided. Relationships are 
>> not a workaround but rather the preferred approach in this scenario - 
>> an attribute I would consider a workaround. This is due to the fact 
>> that not only is it brittle and complex to add processors that route 
>> on such things, but there's no reason at all to assume that from 
>> release to release (even bug fix/increment releases) that the 
>> Exception type or message will be the same, so the flow could stop working 
>> at any time after upgrading nifi.
>>>>> Relationships offer a well-defined way to explicitly indicate 
>>>>> "these
>> are the possible outcomes,"
>>>>> similar IMO to Java Exception classes vs. throwing Strings in C.
>>>>> 
>>>>> 
>>>>>> On Oct 25, 2018, at 9:47 AM, Bryan Bende <[email protected]> wrote:
>>>>>> 
>>>>>> I think processors should really have well defined relationships
>> for
>>>>>> the error scenarios that need to be handled. Having the exception 
>>>>>> message is ok for a human who wants to see it, but in order to do 
>>>>>> anything with it in the flow you will have to have a bunch of 
>>>>>> parsing/interpreting of the message with a bunch of routing 
>>>>>> processors, which seems more brittle than just having the 
>>>>>> appropriate relationships.
>>>>>> On Thu, Oct 25, 2018 at 1:36 AM Peter Wicks (pwicks) <
>> [email protected]> wrote:
>>>>>>> 
>>>>>>> When a FlowFile is routed to failure, frequently there is no
>> clear reason without looking into the actual error message.
>>>>>>> Some processors work around this by creating many different
>> relationships, but even then frequently the generic Failure 
>> relationship also provides little guidance.
>>>>>>> 
>>>>>>> I've seen a few cases recently where processors are including 
>>>>>>> the
>> exception message as an attribute on the FlowFile when routing to 
>> failure (ExecuteStreamCommand, new PR for ExecuteSQL). Should this be 
>> a standard pattern so that it's easier for users to route failures?
>>>>>>> 
>>>>>>> --Peter
>>>>> 
>>

RE: [EXT] Re: New Standard Pattern - Put Exception that caused failure in an attribute

Reply via email to