Re: error handling flow in DAG
This is a great use case for the all_success and one_failed trigger rules. If we have "--S-->" be a dependency where the downstream has the all_success trigger rule, and "--F-->" be a dependency where the downstream has one_failed as the trigger rule, you can do what you want with a DAG of the form:

task_1 --F--> task_1_failure_a --?--> task_1_failure_b
  \--S--> task_2
            \--S--> task_3 --F--> task_3_failure_a --?--> task_3_failure_b

(please pardon the mediocre ASCII diagram and I hope it made it through the wire correctly)

--George

On Mon, Oct 8, 2018 at 12:14 PM James Meickle wrote:
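George's "--S-->" / "--F-->" edges can be sketched as plain Python. This is a toy model of the trigger-rule semantics, not Airflow's own code; the function name `should_run` and the state strings are illustrative only:

```python
# Toy model of two Airflow trigger rules (a sketch, not Airflow's code).
def should_run(trigger_rule, upstream_states):
    """Decide whether a downstream task fires, given its trigger rule
    and the states ("success"/"failed") of its direct upstream tasks."""
    if trigger_rule == "all_success":
        return all(s == "success" for s in upstream_states)
    if trigger_rule == "one_failed":
        return any(s == "failed" for s in upstream_states)
    raise ValueError(f"unhandled trigger rule: {trigger_rule}")

# If task_1 fails: its --F--> edge fires, its --S--> edge does not.
assert should_run("one_failed", ["failed"])       # task_1_failure_a runs
assert not should_run("all_success", ["failed"])  # task_2 does not
```

The same two checks run in the opposite direction when task_1 succeeds: the failure branch stays quiet and task_2 proceeds.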
Re: error handling flow in DAG
Anthony:

Could you just have the "success" path be declared with "all_success" (the default), and the "failure" side branches be declared with "all_failed" depending on the previous task? This will have the same branching structure you want but with fewer intermediary operators.

-James M.

On Mon, Oct 1, 2018 at 1:12 PM Anthony Brown wrote:
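James's wiring (failure branches guarded by all_failed on the task they watch) can be checked end to end with a small state-propagation sketch. This is a toy model, not Airflow code: "skipped" stands in for Airflow's skipped/upstream_failed states, and `evaluate` is an illustrative helper, not an Airflow API:

```python
# Toy propagation of task states through Anthony's DAG, wired as James
# suggests. A sketch of the semantics only, not Airflow's scheduler.
def rule_met(rule, states):
    if rule == "all_success":
        return all(s == "success" for s in states)
    if rule == "all_failed":
        return bool(states) and all(s == "failed" for s in states)
    raise ValueError(rule)

def evaluate(dag, failing):
    """dag: {task: (trigger_rule, [upstream tasks])} in topological order.
    failing: set of tasks that fail if they get to run."""
    states = {}
    for task, (rule, ups) in dag.items():
        if ups and not rule_met(rule, [states[u] for u in ups]):
            states[task] = "skipped"
        else:
            states[task] = "failed" if task in failing else "success"
    return states

dag = {
    "task_1":           ("all_success", []),
    "task_1_failure_a": ("all_failed",  ["task_1"]),
    "task_1_failure_b": ("all_success", ["task_1_failure_a"]),
    "task_2":           ("all_success", ["task_1"]),
    "task_3":           ("all_success", ["task_2"]),
    "task_3_failure_a": ("all_failed",  ["task_3"]),
    "task_3_failure_b": ("all_success", ["task_3_failure_a"]),
}
```

Failing task_1 in this model runs task_1_failure_a and task_1_failure_b while task_2, task_3, and task_3's failure branch all end up skipped, which is the per-task behaviour Anthony asked for.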
error handling flow in DAG
Hi

I am coding various data flows and one of the requirements we have is to have some error tasks happen when some of the tasks fail. These error tasks are specific to the task that failed and are not generic to the whole DAG.

So for instance if I have a DAG that runs the following tasks

task_1 > task_2 > task_3

If task_1 fails, then I want to run

task_1_failure_a ---> task_1_failure_b

If task_2 fails, I do not need to do anything specific, but if task_3 fails, I need to run

task_3_failure_a ---> task_3_failure_b

I already have a generic on_failure_callback defined on all tasks that handles alerting, but am stuck on the best way of handling a failure flow for tasks.

The ways I have come up with of handling this are:

1. Have a branch operator between each task with trigger_rule set to all_done. The branch operator would then decide whether to go to the next (success) task, or to go down the failure branch.

2. Put the failure tasks in a separate DAG with no schedule. Have a different on_failure_callback for each task that would trigger the failure DAG for that task and then do my generic error handling.

Does anybody have any thoughts on which of the above two approaches would be best, or suggest an alternative way of doing this?

Thanks

--
Anthony Brown
Data Engineer BI Team - John Lewis
Tel : 0787 215 7305

** This email is confidential and may contain copyright material of the John Lewis Partnership. If you are not the intended recipient, please notify us immediately and delete all copies of this message. (Please note that it is your responsibility to scan this message for viruses). Email to and from the John Lewis Partnership is automatically monitored for operational and lawful business reasons. **

John Lewis plc
Registered in England 233462
Registered office 171 Victoria Street London SW1E 5NN

Websites: https://www.johnlewis.com http://www.waitrose.com https://www.johnlewisfinance.com http://www.johnlewispartnership.co.uk
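Anthony's option 1 hinges on the decision a branch operator's callable would make. A minimal sketch of that decision in plain Python (the state-passing is simplified; a real BranchPythonOperator callable would inspect the upstream task instance via its context, and `choose_after` is an illustrative name, not an Airflow API):

```python
# Sketch of the branching decision in option 1: given the upstream
# task's final state, return the task_id the DAG should follow next.
# The branch operator itself would need trigger_rule "all_done" so it
# runs whether the upstream task succeeded or failed.
def choose_after(upstream_state, success_task, failure_task):
    """Pick the success path or the failure branch."""
    if upstream_state == "success":
        return success_task
    return failure_task

assert choose_after("success", "task_2", "task_1_failure_a") == "task_2"
assert choose_after("failed", "task_2", "task_1_failure_a") == "task_1_failure_a"
```

One callable like this per guarded task gives the per-task failure flows Anthony describes, at the cost of an extra branch operator between every pair of tasks.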