Apologies if I've missed this in the discussion so far - we use the InvokeHTTP processor a lot, and the invokehttp.java.exception.message attribute is really handy for diving into why things have failed without having to match up logs with flow files (in a system with hundreds of processors making thousands of requests). We also route on invokehttp.status.code (e.g. to retry 403s caused by a race hazard in an external system), but I don't imagine we'd route on invokehttp.java.exception.* since, as others have mentioned, it looks pretty fragile.
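As a rough illustration of that split (a sketch only, not NiFi code): FlowFile attributes are modelled as a plain dict, and the function names, route labels and set of retryable codes are all assumptions made up for the example. Only the two attribute names come from the text above. Routing looks at the status code alone, while the exception message is used purely for a human-readable note.

```python
# Hypothetical sketch: FlowFile attributes modelled as a plain dict.
# Only the two attribute names are taken from the message above; the
# function names, route labels and retryable set are assumptions.

RETRYABLE_STATUS_CODES = {"403"}  # assumption: codes considered worth retrying


def choose_route(attributes):
    """Pick a route from the status code only, never from exception text."""
    status = attributes.get("invokehttp.status.code")
    if status in RETRYABLE_STATUS_CODES:
        return "retry"
    return "failure"


def failure_summary(attributes):
    """Build a note for troubleshooting; it is not used for routing."""
    status = attributes.get("invokehttp.status.code", "no status code")
    message = attributes.get(
        "invokehttp.java.exception.message", "no exception message"
    )
    return f"{status}: {message}"


if __name__ == "__main__":
    flowfile = {"invokehttp.status.code": "403"}
    print(choose_route(flowfile))
    print(failure_summary(flowfile))
```

Keeping the exception text out of choose_route means a change in message wording between releases would only alter the note, not the path a FlowFile takes.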
-- James

On Tue, 30 Oct 2018 at 16:44, Peter Wicks (pwicks) <[email protected]> wrote:
>
> Sorry for the delayed response, I've been traveling.
>
> Responses in order:
>
> Matt,
> Right now our workaround is to keep retrying errors, usually with a penalty
> or control rate processor. The problem is that we don't know why it failed,
> and thus don't know if retry is the correct option. I have not found a way,
> without code change, to be able to determine if retrying is the correct
> option or not.
>
> Koji,
> Detailed error handling would indeed be a good workaround to the problems
> raised by myself and Matt. I have not seen this on other processors, but that
> does not mean we can't do it of course. I agree that having some kind of
> hierarchy system for errors would be a much better solution.
>
> Pierre,
> My primary use case is as you described, a user friendly way to see what
> actually happened without going through the log files. But while I know
> it's fragile, routing on exception text stored in an attribute still feels
> like a very legitimate use case. I know in many systems there are good
> exception types that can be used to route FlowFiles to appropriate failure
> relationships, but as Matt mentioned, JDBC has just a handful of exception
> types for a very large number of possible error types.
>
> I think this is probably the same rationale that was used to justify
> Execute Stream Command's inclusion of this feature in the past.
> Too many possible failure conditions to handle with just a few failure
> conditions.
>
> Uwe,
> That is a fair question, but it doesn't feel like such a bad fit to me. It's
> like extra metadata on the lineage, "We followed this path through the flow
> because we had exception " .... " which caused the FlowFile to follow the
> failure route".
>
> But I still prefer the attribute, it could be another option for Detailed
> error handling; instead of, or in addition to, additional relationships for
> failures, the exception text could be included in an attribute.
>
> Thanks,
> Peter
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]]
> Sent: Saturday, October 27, 2018 10:46 AM
> To: [email protected]
> Subject: Re: [EXT] Re: New Standard Pattern - Put Exception that caused
> failure in an attribute
>
> Do you really want to mix provenance and data lineage with logging/error
> information?
>
> Writing exception information/logging information within an attribute is not
> a bad idea in my opinion.
> If a user wants to use this for routing, why not ... or whatever the user
> wants to do.
>
> I could imagine that this can be switched on and off by a property via
> config. E.g. on in development and off in production.
>
> Regards,
> Uwe
>
> > On 26.10.2018 at 09:26, Pierre Villard <[email protected]> wrote:
> >
> > Adding another option to the list.
> >
> > Peter - if I understand correctly and based on my own experience, the
> > idea is not to have an 'exception' attribute to perform custom routing
> > after the failure relationship but rather have a more user friendly
> > way to see what happened without going through all the logs for a given
> > flow file.
> >
> > If that's correct, then could we add this information somehow to the
> > provenance event generated by the processor? Ideally adding a new
> > field to a provenance event or using the existing 'details' field?
> >
> > Pierre
> >
> >
> > On Fri, 26 Oct 2018 at 08:40, Koji Kawamura <[email protected]> wrote:
> >
> >> Hi all,
> >>
> >> I'd like to add another option to Matt's list of solutions:
> >>
> >> 4) Add a processor property, 'Enable detailed error handling'
> >> (defaults to false), then toggle available list of relationships.
> >> This way, existing flows such as Peter's don't have to change, while
> >> he can opt in to new relationships. RouteOnAttribute can be a reference
> >> implementation.
> >>
> >> I like the idea of thinking of relationships as potential exceptions. It
> >> can be better if relationships have a hierarchy.
> >> Some users need more granular relationships while others don't.
> >> For NiFi 2.0 or later, supporting a relationship hierarchy at the
> >> framework level can mitigate having an additional property at each
> >> processor.
> >>
> >> Thanks,
> >> Koji
> >> On Fri, Oct 26, 2018 at 11:49 AM Matt Burgess <[email protected]>
> >> wrote:
> >>>
> >>> Peter,
> >>>
> >>> Totally agree, RDBMS/JDBC is in a weird class as always, there is a
> >>> teaspoon of exception types for an ocean of causes. For NiFi 1.x, it
> >>> seems like we need to pick from a set of less-than-ideal solutions:
> >>>
> >>> 1) Add new relationships, but then your (possibly hundreds of)
> >>> processors are invalid
> >>> 2) Add new auto-terminated relationships, but then your
> >>> previously-handled errors are "lost"
> >>> 3) Add an attribute, but then each NiFi instance/release/flow is
> >>> responsible for parsing the error and handling it as desired.
> >>>
> >>> We could mitigate 1-2 with a tool that updates your flow/template by
> >>> sending all new failure relationships to the same target as the
> >>> existing one, but then the tool itself suffers from maintainability
> >>> issues (as does option #3). If we could recognize that the new
> >>> relationships are self-terminated and then send the errors out to
> >>> the original failure relationship, that could be quite confusing to
> >>> the user, especially as time goes on (how to suppress the "new"
> >>> errors, e.g.).
> >>>
> >>> IMHO I think we're between a rock and a hard place here, I guess
> >>> with great entropy comes great responsibility :P
> >>>
> >>> P.S. For your use case, is the workaround to just keep retrying?
> >>> Or are there other constraints at play?
> >>>
> >>> Regards,
> >>> Matt
> >>>
> >>> On Thu, Oct 25, 2018 at 10:27 PM Peter Wicks (pwicks)
> >>> <[email protected]> wrote:
> >>>>
> >>>> Matt,
> >>>>
> >>>> If I were to split an existing failure relationship into several
> >>>> relationships, I do not think I would want to auto-terminate in most
> >>>> cases. Specifically, I'm interested in a failure relationship for a
> >>>> database disconnect during SQL execution (database was online when the
> >>>> connection was verified in the DBCP pool, but went down during
> >>>> execution). If I were to find a way to separate this into its own
> >>>> relationship, I do not think most users would appreciate it being a
> >>>> condition silently not handled by the normal failure path.
> >>>>
> >>>> Thanks,
> >>>> Peter
> >>>>
> >>>> -----Original Message-----
> >>>> From: Matt Burgess [mailto:[email protected]]
> >>>> Sent: Friday, October 26, 2018 10:18 AM
> >>>> To: [email protected]
> >>>> Subject: Re: [EXT] Re: New Standard Pattern - Put Exception that
> >>>> caused failure in an attribute
> >>>>
> >>>> NiFi (as of the last couple releases I think) has the ability to
> >>>> set auto-terminating relationships; this IMO is one of those use cases
> >>>> (for NiFi 1.x). If new relationships are added, they could default to
> >>>> auto-terminate; then the existing processors should remain valid.
> >>>> However we might want an "omnibus Jira" to capture those
> >>>> relationships we'd like to remove the auto-termination from in NiFi 2.0.
> >>>>
> >>>> Regards,
> >>>> Matt
> >>>> On Thu, Oct 25, 2018 at 10:12 PM Peter Wicks (pwicks)
> >>>> <[email protected]> wrote:
> >>>>>
> >>>>> Mark,
> >>>>>
> >>>>> I agree with you that this is the best option in general terms.
> >>>>> After thinking about it some more I think the biggest use case is for
> >>>>> troubleshooting. If a file routes to failure, you need to be watching
> >>>>> the UI to see what the exception was.
> >>>>> An admin may have access to the NiFi log files and could grep the
> >>>>> error out, but a normal user who checks in on the flow and sees a
> >>>>> FlowFile in the error queue will not know what the cause was; this is
> >>>>> especially frustrating if retrying the file works without failure the
> >>>>> second time... Capturing the error message in an attribute makes this
> >>>>> easy to find.
> >>>>>
> >>>>> One thing I worry about too is adding new relationships to core
> >>>>> processors. After an upgrade, won't users need to go to each instance
> >>>>> of that processor and handle the new relationship? Right now I'd
> >>>>> wager we have at least five thousand ExecuteSQL processors in our
> >>>>> environment; and while we have strong scripting skills in my NiFi
> >>>>> team, I would not want to encounter this without that.
> >>>>>
> >>>>> Thanks,
> >>>>> Peter
> >>>>>
> >>>>> -----Original Message-----
> >>>>> From: Mark Payne [mailto:[email protected]]
> >>>>> Sent: Thursday, October 25, 2018 10:38 PM
> >>>>> To: [email protected]
> >>>>> Subject: [EXT] Re: New Standard Pattern - Put Exception that
> >>>>> caused failure in an attribute
> >>>>>
> >>>>> I agree - the notion of adding a "failure.reason" attribute is, in
> >>>>> my opinion, an anti-pattern that should be avoided. Relationships are
> >>>>> not a workaround but rather the preferred approach in this scenario -
> >>>>> an attribute I would consider a workaround. This is due to the fact
> >>>>> that not only is it brittle and complex to add processors that route
> >>>>> on such things, but there's no reason at all to assume that from
> >>>>> release to release (even bug fix/increment releases) the Exception
> >>>>> type or message will be the same, so the flow could stop working at
> >>>>> any time after upgrading NiFi.
> >>>>> Relationships offer a well-defined way to explicitly indicate "these
> >>>>> are the possible outcomes," similar IMO to Java Exception classes vs.
> >>>>> throwing Strings in C.
> >>>>>
> >>>>>
> >>>>>> On Oct 25, 2018, at 9:47 AM, Bryan Bende <[email protected]> wrote:
> >>>>>>
> >>>>>> I think processors should really have well defined relationships for
> >>>>>> the error scenarios that need to be handled. Having the exception
> >>>>>> message is ok for a human who wants to see it, but in order to do
> >>>>>> anything with it in the flow you will have to have a bunch of
> >>>>>> parsing/interpreting of the message with a bunch of routing
> >>>>>> processors, which seems more brittle than just having the
> >>>>>> appropriate relationships.
> >>>>>> On Thu, Oct 25, 2018 at 1:36 AM Peter Wicks (pwicks)
> >>>>>> <[email protected]> wrote:
> >>>>>>>
> >>>>>>> When a FlowFile is routed to failure, frequently there is no
> >>>>>>> clear reason without looking into the actual error message.
> >>>>>>> Some processors work around this by creating many different
> >>>>>>> relationships, but even then frequently the generic Failure
> >>>>>>> relationship also provides little guidance.
> >>>>>>>
> >>>>>>> I've seen a few cases recently where processors are including
> >>>>>>> the exception message as an attribute on the FlowFile when routing
> >>>>>>> to failure (ExecuteStreamCommand, new PR for ExecuteSQL). Should
> >>>>>>> this be a standard pattern so that it's easier for users to route
> >>>>>>> failures?
> >>>>>>>
> >>>>>>> --Peter
