Bryan, you managed to explain it in three sentences! :) Now I get it! Thanks a bunch.

On Tue, Mar 6, 2018 at 4:23 PM, Bryan Bende <[email protected]> wrote:
> Boris,
>
> "Penalty Duration" is per flow file, and "Yield" is for the processor.
>
> If the processor penalizes a flow file and transfers it to a queue,
> whatever is processing that queue won't take that flow file from the
> queue until the penalty duration has passed.
>
> If a processor yields, then the framework won't execute that processor
> for up to the yield duration, which would mean it won't process any
> flow files during that time, which would include new incoming flow
> files as well as penalized flow files that might be in a self-loop.
>
> -Bryan
>
> On Tue, Mar 6, 2018 at 4:01 PM, Boris Tyukin <[email protected]> wrote:
> > Hi Mark,
> >
> > thanks for your response! especially because I saw your name in that
> > Jira :)
> >
> > I think it makes sense to "keep trying until you're successful".
> >
> > I am a bit confused by "yield" and "penalize" parameters. Can you give
> > me an example how they are used? Let's say, I use ExecuteSQL processor
> > and route failure rel to itself. Then I set Penalty duration to 5
> > minutes and Yield duration to 1 minute and my source database is down
> > for maintenance at the time. I am playing with different scenarios
> > here but not sure I understand what I am seeing.
> >
> > I've read the docs 5 times now and still confused :)
> >
> > And what do you think about that solution, proposed by Alessio? looks
> > simple and efficient to me and uses only one extra processor
> >
> > thanks,
> > Boris
> >
> > On Tue, Mar 6, 2018 at 3:28 PM, Mark Payne <[email protected]> wrote:
> >>
> >> Hey Boris,
> >>
> >> Using the UpdateAttribute and RouteOnAttribute approach is only
> >> necessary when you want to retry N number of times (or for some time
> >> period) and after that elapses to treat the data differently. Most of
> >> the time, though, what is used is to simply loop the 'failure'
> >> relationship back to the processor itself. So failures would simply
> >> remain in the flow, trying indefinitely. When a processor is unable
> >> to communicate with some external service due to some intermittent
> >> issue, that processor generally should "yield", meaning that the
> >> processor will not be triggered for some amount of time (by default
> >> it is 1 second).
> >>
> >> So in this way, it's very simple to just say "keep trying until
> >> you're successful." You could also set "age-off" to occur so that if
> >> the data is more than say 1 hour old you can have nifi automatically
> >> just discard the data.
> >>
> >> There are some situations, though, in which users will need to try
> >> for say 10 times and then route the data differently. We could
> >> definitely improve that experience instead of having to use
> >> UpdateAttribute / RouteOnAttribute. But from my experience simply
> >> looping until successful is the most common scenario and so that's
> >> probably why we've not really seen much traction there.
> >>
> >> Thanks
> >> -Mark
> >>
> >> On Mar 6, 2018, at 3:02 PM, Boris Tyukin <[email protected]> wrote:
> >>
> >> Just found this Jira
> >> https://issues.apache.org/jira/browse/NIFI-90
> >>
> >> I am surprised it has not got any traction after 3 years...Having
> >> used Apache Airflow for a while, I am looking to retry capabilities
> >> in NiFi and it seems it comes down to "build your own" flow approach,
> >> that would handle retries in a loop and then sleeping for some time.
> >> The best alternative solution I found was suggested by Alessio Palma
> >> https://community.hortonworks.com/questions/56167/is-there-wait-processor-in-nifi.html
> >>
> >> IMHO it still would be nice to have retry capabilities like with
> >> Apache Airflow. You can specify a global retry behavior for a flow or
> >> specify retry options per task/processor. This helps a lot to deal
> >> with intermittent issues, like losing network connection or source
> >> database system, being down for maintenance. Airflow can also send an
> >> email on retry and supports a bunch of other parameters around
> >> retries:
> >>
> >> https://airflow.apache.org/code.html#baseoperator
> >>
> >> retries (int) – the number of retries that should be performed before
> >> failing the task
> >> retry_delay (timedelta) – delay between retries
> >> retry_exponential_backoff (bool) – allow progressive longer waits
> >> between retries by using exponential backoff algorithm on retry delay
> >> (delay will be converted into seconds)
> >> max_retry_delay (timedelta) – maximum delay interval between retries
> >>
> >> on_retry_callback – much like the on_failure_callback except that it
> >> is executed when retries occur.
> >>
> >> Is everyone using UpdateAttribute and RouteOnAttribute and Sleep
> >> method to implement retries?
> >>
> >> thanks,
> >> Boris
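
For anyone reading this thread later, the difference Bryan describes can be sketched as a toy model in Python. This is not NiFi code and does not use the NiFi API: the names FlowFile, Processor, next_eligible, fail and simulate are made up for illustration, time is plain seconds, and it assumes the failing processor both penalizes the flow file and yields (whether a given processor such as ExecuteSQL actually does either on failure depends on its implementation). The numbers follow Boris's scenario: a 5 minute penalty, a 1 minute yield, and the failure relationship looped back to the processor.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FlowFile:
    """Hypothetical stand-in for a flow file; only tracks its penalty."""
    name: str
    penalized_until: float = 0.0  # consumers skip this flow file before this time


@dataclass
class Processor:
    """Hypothetical processor whose 'failure' relationship loops to itself."""
    penalty_duration: float  # seconds, applies per flow file
    yield_duration: float    # seconds, applies to the whole processor
    yielded_until: float = 0.0
    queue: List[FlowFile] = field(default_factory=list)

    def next_eligible(self, now: float) -> Optional[FlowFile]:
        """Return the first flow file the processor may take at `now`."""
        if now < self.yielded_until:
            return None  # yielded: the processor is not run at all
        for ff in self.queue:
            if now >= ff.penalized_until:
                return ff  # first flow file whose penalty has passed
        return None

    def fail(self, ff: FlowFile, now: float, do_yield: bool) -> None:
        """Model a failure: penalize the flow file, optionally yield."""
        ff.penalized_until = now + self.penalty_duration
        if do_yield:
            self.yielded_until = now + self.yield_duration


def simulate() -> List[Optional[str]]:
    """Fail flow file 'a' at t=0, then check what is eligible later."""
    proc = Processor(penalty_duration=300.0, yield_duration=60.0)
    a, b = FlowFile("a"), FlowFile("b")
    proc.queue = [a, b]
    proc.fail(a, now=0.0, do_yield=True)

    picks: List[Optional[str]] = []
    for now in (30.0, 60.0, 300.0):
        ff = proc.next_eligible(now)
        picks.append(ff.name if ff else None)
    return picks


if __name__ == "__main__":
    print(simulate())
```

In this model nothing is picked at t=30 because the processor itself has yielded, even though "b" carries no penalty. At t=60 the yield is over and "b" is taken while "a" is still penalized, and "a" only becomes eligible again at t=300. The real framework waits "up to" the yield duration, so the exact timing in NiFi may differ from this sketch.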
