Is there a way to stop an Airflow DAG if enough instances of a certain task fail? E.g. I have a collection of tasks that all do the same thing for different values:
for dataset in list_of_datasets:
    task_1 = BashOperator(task_id="task_1_%s" % dataset["id"], ...)
    task_2 = BashOperator(task_id="task_2_%s" % dataset["id"], ...)
    task_3 = BashOperator(task_id="task_3_%s" % dataset["id"], ...)
    task_1 >> task_2 >> task_3
If, say, any 5 instances of task_2 fail, it means something bigger is wrong with the underlying process used by task_2 (as opposed to the individual dataset being processed in a particular task instance), and that task_2 is unlikely to succeed for any other dataset either, so the whole DAG should stop or skip ahead to a later / alternative branching task.
Is there a way to enforce this by setting something in the task declarations? Are there any other common workarounds for this kind of situation (something like a "some_failed" trigger rule)?
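To make the idea concrete, here's a rough sketch of the kind of gate I've been imagining, not something I've actually run (the threshold and the "check_task_2_failures" naming are made up, and it assumes an Airflow 2-style PythonOperator): a check task placed downstream of all the task_2_* instances with trigger_rule ALL_DONE, which counts the failed ones via the DagRun and fails hard when too many have failed, so everything downstream of the gate stops.

from airflow.exceptions import AirflowFailException
from airflow.operators.python import PythonOperator
from airflow.utils.state import State
from airflow.utils.trigger_rule import TriggerRule

MAX_TASK_2_FAILURES = 5  # made-up threshold

def check_task_2_failures(**context):
    # Count the task_2_* instances that have failed in this DAG run.
    failed = [
        ti for ti in context["dag_run"].get_task_instances(state=State.FAILED)
        if ti.task_id.startswith("task_2_")
    ]
    if len(failed) >= MAX_TASK_2_FAILURES:
        raise AirflowFailException(
            "%d task_2 instances failed; aborting the rest of the run" % len(failed)
        )

check_gate = PythonOperator(
    task_id="check_task_2_failures",
    python_callable=check_task_2_failures,
    trigger_rule=TriggerRule.ALL_DONE,  # run even when upstream task_2s fail
)

Each task_2_<id> would be wired upstream of this gate and the gate upstream of whatever should be protected, but that forces everything to wait on every task_2 and feels clunky, which is why I'm hoping there's a built-in way.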