Michal,

Currently, the guidance that we give is for processors not to emit any sort of
ROUTE event when routing a FlowFile to a 'failure' relationship. While this may
seem counter-intuitive, we do this because, most of the time, the 'failure'
relationship does not point to some sort of 'failure' flow like the one you
describe here. Rather, it is a self-loop so that the Processor tries again.

In the scenario you describe, if PostHTTP were to route a FlowFile to failure
and failure looped back to PostHTTP, we might see that the FlowFile was routed
to failure hundreds (or more) of times. As a result, the Provenance lineage
would be hard to follow because it would be filled with a huge number of
'ROUTE' events.

That being said, there are things that we could do to be smart about this at
the framework level. For instance, we could notice that the ROUTE event
indicates that the FlowFile is being routed back to the same queue that it came
from, so we could just discard the ROUTE event.
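
To make the self-loop idea concrete, here is a rough sketch. It is written in
Python for brevity rather than in NiFi's Java, and every name in it
(RouteEvent, filter_self_loops, the queue names) is hypothetical; none of it
is NiFi's actual API.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RouteEvent:
    """Hypothetical stand-in for a ROUTE provenance event (not NiFi's API)."""
    flowfile_uuid: str
    source_queue: str       # queue the FlowFile was pulled from
    destination_queue: str  # queue the FlowFile is being routed to


def filter_self_loops(events: List[RouteEvent]) -> List[RouteEvent]:
    """Drop ROUTE events that send a FlowFile back to the queue it came from."""
    return [e for e in events if e.source_queue != e.destination_queue]


events = [
    RouteEvent("ff-1", "to-PostHTTP", "to-PostHTTP"),      # failure self-loop
    RouteEvent("ff-1", "to-PostHTTP", "to-PostHTTP"),      # another retry
    RouteEvent("ff-1", "to-PostHTTP", "failure-handler"),  # a real route
]
kept = filter_self_loops(events)  # only the route to "failure-handler" remains
```

With a check like this, the retries never reach the lineage, while a route into
a separate failure-handling flow is still recorded.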

Unfortunately, this doesn't always solve the problem, because we also often see
scenarios where, for instance, a DistributeLoad processor load balances between
5 different PostHTTP processors. If a PostHTTP fails, it routes back to the
DistributeLoad. So we'd need to keep track of the fact that the FlowFile has
been to this connection before, even though it wasn't the last connection, and
so on.
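
One possible shape for that bookkeeping, again as a Python sketch with
hypothetical names (VisitedConnectionFilter, should_emit, and the connection
names are all made up for illustration and are not part of NiFi): remember
every connection a FlowFile has been in, and only emit a ROUTE event when the
destination is new for that FlowFile.

```python
from collections import defaultdict
from typing import DefaultDict, Set


class VisitedConnectionFilter:
    """Hypothetical: tracks which connections each FlowFile has been in."""

    def __init__(self) -> None:
        # FlowFile UUID -> names of connections it has already visited
        self._visited: DefaultDict[str, Set[str]] = defaultdict(set)

    def should_emit(self, flowfile_uuid: str, source: str, destination: str) -> bool:
        """Return True only if the destination is new for this FlowFile."""
        visited = self._visited[flowfile_uuid]
        visited.add(source)  # the FlowFile was just pulled from this connection
        if destination in visited:
            return False     # routed back somewhere it has already been
        visited.add(destination)
        return True


route_filter = VisitedConnectionFilter()
decisions = [
    # DistributeLoad hands the FlowFile to the first PostHTTP: new connection
    route_filter.should_emit("ff-1", "to-DistributeLoad", "to-PostHTTP-1"),
    # PostHTTP fails and routes back to DistributeLoad: already visited
    route_filter.should_emit("ff-1", "to-PostHTTP-1", "to-DistributeLoad"),
    # DistributeLoad tries a different PostHTTP: new connection
    route_filter.should_emit("ff-1", "to-DistributeLoad", "to-PostHTTP-2"),
    # That PostHTTP routes to a separate failure flow: new connection
    route_filter.should_emit("ff-1", "to-PostHTTP-2", "failure-handler"),
]
```

Note the trade-offs in this sketch: it also suppresses a legitimate second
visit to a connection, and the per-FlowFile state would have to be kept (and
eventually cleaned up) somewhere, which is part of why this gets complicated.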

So that was a really long-winded way to say: We intentionally do not emit ROUTE
events for 'failure' because it can create some very complicated,
hard-to-follow lineages. But we can - and should - do better.

If this is something that you are interested in digging into in the codebase,
the community would be more than happy to help guide you along the way!

Also, if you have other feedback about how you think we can handle these cases
better, please feel free to elaborate on the thread.

Thanks
-Mark



> On Nov 7, 2016, at 5:46 AM, Michal Klempa <[email protected]> wrote:
> 
> Hi,
> I am maintaining several dataflows and I am facing this issue in practice:
> Let's say I have several points of possible failure within the
> dataflow (nearly every processor has a failure output). I route all of
> these into my general failure handler subgroup, which basically does
> some filtering and formatting before issuing a notification by email.
> 
> From my email notifications, I get the FlowFile UUID and, in case I am
> curious about what happened, I go into NiFi and search provenance events
> for this particular FlowFile.
> And here comes the point:
> Sometimes I find it hard to determine which processor was the first one that
> sent the file into the 'Failure path'.
> 
> Shouldn't the processor that does the 'failure' routing send a
> ProvenanceEvent with type
> ProvenanceEventType.ROUTE to the FlowFile history, so that the Dataflow manager
> knows when this unfortunate event happened? Is this a guideline
> that Processors do not obey?
> 
> Or maybe I am doing something wrong when searching for events/history of the FlowFile.
> 
> To get into a concrete example, let me point out that the PostHTTP
> processor never issues any provenance event regarding the failure (nor
> does it write any execution details into attributes, as
> ExecuteStreamCommand does, for example; there you have execution.error,
> which contains the stderr). So locating the error in PostHTTP is
> just a heuristic on my side, and I cannot find any verbose HTTP output
> (like in curl -v, for example), with headers, the response from the server, or
> at least a 'connection timeout' if that is the case...
> 
> Thanks for suggestions and opinions.
> Michal Klempa
