Re: Routing to Failure relationships and Route provenance events

Andy LoPresto Thu, 10 Nov 2016 10:50:40 -0800

Michal,

A temporary solution would be to insert an UpdateAttribute processor between 
the source processor (where the failure occurred) and your general failure 
handling flow. This processor could add an attribute noting the location of the 
failure and you could quickly determine that when debugging.
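
To make the tagging idea concrete, here is a rough Python model of it. This is not NiFi code: a FlowFile is modeled as a plain dict of attributes, and the attribute name "failure.source" is a hypothetical choice, not something NiFi defines. In a real flow the same effect would come from a property set on the UpdateAttribute processor.

```python
# Toy model of the tagging idea; not NiFi code. A FlowFile is modeled as a
# plain dict of attributes, and "failure.source" is a hypothetical name.

def tag_failure_source(attributes, processor_name):
    """Return a copy of the attributes with the failing processor recorded.

    An existing tag is kept, so the first failure point wins if the FlowFile
    passes through more than one tagging step.
    """
    tagged = dict(attributes)
    tagged.setdefault("failure.source", processor_name)
    return tagged


flowfile = {"uuid": "example-uuid", "filename": "data.csv"}
flowfile = tag_failure_source(flowfile, "PostHTTP")
flowfile = tag_failure_source(flowfile, "FailureHandler")
print(flowfile["failure.source"])
```

Keeping the first tag rather than overwriting it is a design choice in this sketch, made because the question is which processor first sent the file down the failure path.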


If this seems cumbersome, you could also put a single ExecuteScript processor 
at the beginning of your failure handling flow and query the provenance events 
for the incoming flowfile, detect the last event that occurred, and then write 
out an additional, arbitrary provenance event indicating the failure.
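
The logic of that script could look roughly like the Python model below. It is only a sketch of the steps, not the NiFi provenance API: events are plain dicts, and the field names ("uuid", "timestamp", "component", "details") and the "FailureHandler" component name are assumptions made for illustration.

```python
# Toy model of the ExecuteScript idea; not the NiFi provenance API.
# Events are plain dicts; the field names are assumptions for illustration.

def last_event(events, flowfile_uuid):
    """Return the most recent event for the given FlowFile, or None."""
    matching = [e for e in events if e["uuid"] == flowfile_uuid]
    return max(matching, key=lambda e: e["timestamp"]) if matching else None


def record_failure(events, flowfile_uuid, now):
    """Append a marker event naming the component of the last known event."""
    previous = last_event(events, flowfile_uuid)
    component = previous["component"] if previous else "unknown"
    marker = {
        "uuid": flowfile_uuid,
        "timestamp": now,
        "type": "ROUTE",
        "component": "FailureHandler",
        "details": "routed to failure after " + component,
    }
    events.append(marker)
    return marker


events = [
    {"uuid": "u1", "timestamp": 1, "type": "RECEIVE", "component": "GetFile"},
    {"uuid": "u1", "timestamp": 2, "type": "CONTENT_MODIFIED", "component": "ReplaceText"},
    {"uuid": "u2", "timestamp": 3, "type": "RECEIVE", "component": "GetFile"},
]
print(record_failure(events, "u1", now=4)["details"])
```

Note that the last recorded event belongs to the last component that emitted one. If the failing processor emits nothing on failure, the marker will name the component upstream of it, which narrows the search but does not pinpoint the failure.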

Neither is an excellent solution, and Mark is right that there should be a 
better option for diagnosing this. Please submit a Jira capturing your thoughts 
and we’ll see what is possible.


Andy LoPresto
[email protected]
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

> On Nov 7, 2016, at 6:10 AM, Mark Payne <[email protected]> wrote:
> 
> Michal,
> 
> Currently, the guidance that we give is for processors not to emit any
> sort of ROUTE event for routing a FlowFile to a 'failure' relationship.
> While this may seem counter-intuitive, we do this because most of the
> time when a FlowFile is routed to 'failure', the failure relationship
> is not pointing to some sort of 'failure' flow like you describe here
> but rather the failure relationship is a self-loop so that the
> Processor tries again.
> 
> In the scenario described above, if PostHTTP were to route a FlowFile
> to failure and failure looped back to PostHTTP, we may see that the
> FlowFile was routed to failure hundreds (or more) of times. As a
> result, the Provenance lineage would not really be very easy to follow
> because it would be filled with a huge number of 'ROUTE' events.
> 
> That being said, there are things that we could do to be smart about
> this at the framework level. For instance, we could notice that the
> ROUTE event indicates that the FlowFile is being routed back to the
> same queue that it came from, so we could just discard the ROUTE event.
> 
> Unfortunately, this doesn't always solve the problem, because we also
> often see scenarios where there is perhaps a DistributeLoad processor
> that load balances between 5 different PostHTTP processors, for
> instance. If a PostHTTP fails, it routes back to the DistributeLoad.
> So we'd need to keep track of the fact that it's been to this
> connection before, even though it wasn't the last connection, and so on.
> 
> So that was a really long-winded way to say: We intentionally do not
> emit ROUTE events for 'failure' because it can create some very
> complicated, hard-to-follow lineages. But we can - and should - do
> better.
> 
> If this is something that you are interested in digging into in the
> codebase, the community would be more than happy to help guide you
> along the way!
> 
> Also, if you have other feedback about how you think we can handle
> these cases better, please feel free to elaborate on the thread.
> 
> Thanks
> -Mark
> 
> 
> 
>> On Nov 7, 2016, at 5:46 AM, Michal Klempa <[email protected]> wrote:
>> 
>> Hi,
>> I am maintaining several dataflows and I am facing this issue in practice:
>> Let's say I have several points of possible failure within the
>> dataflow (nearly every processor has a failure output). I route all of
>> these into my general failure handler subgroup, which basically does
>> some filtering and formatting before issuing a notification by email.
>> 
>> From my email notifications, I get the FlowFile UUID, and in case I am
>> curious about what happened, I go into NiFi and search provenance events
>> for this particular FlowFile.
>> And here comes the point:
>> Sometimes I find it hard to tell which processor was the first one that
>> sent the file into the 'Failure path'.
>> 
>> Shouldn't the processor that does the 'failure' routing send a
>> ProvenanceEvent with type ProvenanceEventType.ROUTE to the FlowFile
>> history, so that the Dataflow Manager knows when this unfortunate event
>> happened? Is this a guideline that processors do not obey?
>> 
>> Or maybe I am doing something wrong when searching for the
>> events/history of the FlowFile.
>> 
>> To get to a concrete example, let me point out that the PostHTTP
>> processor never issues any provenance event regarding the failure (nor
>> does it fill any execution details into attributes, as
>> ExecuteStreamCommand does, for example; there you have execution.error,
>> which contains the stderr). So locating the error in PostHTTP is
>> just a heuristic on my part, and I cannot find any verbose HTTP output
>> (like in curl -v, for example), with headers, the response from the
>> server, or at least 'connection timeout' if that is the case...
>> 
>> Thanks for suggestions and opinions.
>> Michal Klempa
>
