Hi, thank you both for your responses. I understand that rerouting back to the processor would produce an infinite provenance history. It can also cause an infinite loop when the destination system is offline, so I am not using that approach in this case.
Generally, I have a problem identifying the last processor which routed the FlowFile to failure before it entered failure handling. And yes, I was thinking of attaching an UpdateAttribute right after each failure connection I need to handle and distinguish, but this would be really messy, which is why I suspected I was doing something wrong in general. My thought was: if I can identify where the FlowFile escaped standard execution through a failure relationship, I can save the FlowFile somewhere (e.g. HDFS) together with its metadata (attributes), keep it for future inspection, and, most importantly, manually re-enter the flow from the point of failure. Is this a bad approach? How do you design flows, then? Is it possible to programmatically inspect a FlowFile to find the last processor in the chain that touched it (even if that processor did not emit any provenance event at all)? If so, tell me; I can afford coding my own processor to accomplish this task.

Thanks,
Michal

On Thu, Nov 10, 2016 at 7:50 PM, Andy LoPresto <[email protected]> wrote:
> Michael,
>
> A temporary solution would be to insert an UpdateAttribute processor between the source processor (where the failure occurred) and your general failure handling flow. This processor could add an attribute noting the location of the failure and you could quickly determine that when debugging.
>
> If this seems cumbersome, you could also put a single ExecuteScript processor at the beginning of your failure handling flow and query the provenance events for the incoming flowfile, detect the last event that occurred, and then write out an additional, arbitrary provenance event indicating the failure.
>
> Neither are excellent solutions, and Mark is right that there should be a better option for diagnosing this. Please submit a Jira capturing your thoughts and we'll see what is possible.
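For reference, a minimal sketch of the UpdateAttribute approach Andy describes above. The attribute names and values are purely illustrative (there is no established convention); each failure connection gets its own UpdateAttribute with a static processor name:

```
UpdateAttribute  (wired to the failure relationship of each processor of interest)
    failure.processor = PostHTTP                                  (static per connection)
    failure.group     = delivery                                  (optional: flow area)
    failure.timestamp = ${now():format("yyyy-MM-dd'T'HH:mm:ss")}  (Expression Language)
```

The attributes then travel with the FlowFile into the failure-handling group and can be included in the notification email or written alongside the content when archiving to HDFS.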
>
> Andy LoPresto
> [email protected]
> [email protected]
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4 BACE 3C6E F65B 2F7D EF69
>
> On Nov 7, 2016, at 6:10 AM, Mark Payne <[email protected]> wrote:
>
>> Michal,
>>
>> Currently, the guidance that we give is for processors not to emit any sort of ROUTE event for routing a FlowFile to a 'failure' relationship. While this may seem counter-intuitive, we do this because most of the time when a FlowFile is routed to 'failure', the failure relationship is not pointing to some sort of 'failure' flow like you describe here but rather the failure relationship is a self-loop so that the Processor tries again.
>>
>> In the scenario described above, if PostHTTP were to route a FlowFile to failure and failure looped back to PostHTTP, we may see that the FlowFile was routed to failure hundreds (or more) of times. As a result, the Provenance lineage would not really be very easy to follow because it would be filled with a huge number of 'ROUTE' events.
>>
>> That being said, there are things that we could do to be smart about this at the framework level. For instance, we could notice that the ROUTE event indicates that the FlowFile is being routed back to the same queue that it came from, so we could just discard the ROUTE event.
>>
>> Unfortunately, this doesn't always solve the problem, because we also often see scenarios where there is perhaps a DistributeLoad processor that load balances between 5 different PostHTTP processors, for instance. If a PostHTTP fails, it routes back to the DistributeLoad. So we'd need to keep track of the fact that it's been to this connection before, even though it wasn't the last connection, and so on.
>>
>> So that was a really long-winded way to say: we intentionally do not emit ROUTE events for 'failure' because it can create some very complicated, hard-to-follow lineages. But we can - and should - do better.
>> If this is something that you are interested in digging into, in the codebase, the community would be more than happy to help guide you along the way!
>>
>> Also, if you have other feedback about how you think we can handle these cases better, please feel free to elaborate on the thread.
>>
>> Thanks
>> -Mark
>>
>> On Nov 7, 2016, at 5:46 AM, Michal Klempa <[email protected]> wrote:
>>
>>> Hi,
>>> I am maintaining several dataflows and I am facing this issue in practice. Let's say I have several points of possible failure within the dataflow (nearly every processor has a failure output). I route all of these into my general failure-handler subgroup, which basically does some filtering and formatting before issuing a notification by email.
>>>
>>> From my email notifications, I get the FlowFile UUID, and in case I am curious about what happened, I go into NiFi and search the provenance events for this particular FlowFile. And here comes the point: sometimes I find it hard to tell which processor was the first one that sent the file into the 'failure path'.
>>>
>>> Shouldn't the processor which does the 'failure' routing send a provenance event of type ProvenanceEventType.ROUTE to the FlowFile history, so the dataflow manager knows when this unfortunate event happened? Is this a guideline which processors do not obey?
>>>
>>> Or maybe I do something wrong when searching the events/history of the FlowFile.
>>>
>>> To get to a concrete example, let me point out that the PostHTTP processor never issues any provenance event regarding the failure (nor does it fill any execution details into attributes, like ExecuteStreamCommand does, for example, where you have execution.error containing the stderr). So locating the error in PostHTTP is just a heuristic on my side, and I cannot find any verbose HTTP output (like curl -v, for example) with headers, the response from the server, or at least 'connection timeout' if that is the case...
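To make Mark's self-loop point from earlier in the thread concrete, this is the retry pattern he refers to (a generic sketch, not a flow taken from this thread):

```
GetFile ──→ PostHTTP ── success ──→ downstream
               ▲   │
               └───┘ failure (self-loop: the FlowFile is typically penalized
                     and retried; emitting one ROUTE event per attempt would
                     bury the lineage under hundreds of identical events)
```

This is why the current guidance is to stay silent on failure routing: the common wiring is a retry loop, not a dedicated failure-handling path like the one described in this thread.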
>>> Thanks for suggestions and opinions.
>>> Michal Klempa
