[
https://issues.apache.org/jira/browse/AIRFLOW-2279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16965639#comment-16965639
]
Qian Yu commented on AIRFLOW-2279:
----------------------------------
I'm in the process of working on something I call ClearTaskOperator. In fact, I
already have a working PR. I was using it for a different purpose (rerunning
completed tasks). However, the exact same operator can be used to achieve the
intention of [~asoni-stripe] .
[https://github.com/apache/airflow/pull/6392]
For example, if we want to make sure when task_A on dag1 is cleared, it always
clears a task_B on a dag2, we can do make the dag1 look like this:
{code:python}
task_A >> clear_Task_B
{code}
where
{code:python}
clear_Task_B = ClearTaskOperator(external_task_id="task_B",
external_dag_id="dag2")
{code}
So the change is actually very simple, just adding a new operator, without even
touching the core Airflow code.
I saw [~gsilk] has an interesting requirement that I did not consider when
coming up with the PR, that is to limit the number tasks cleared at the same
time. But adding this is trivial because it is just an additional check in the
execute() function.
And some other improvements I can think of for the PR is to make it behave like
an ExternalTaskSensor, i.e. it only clears the target tasks if they are done.
If the target tasks are still running, it'll just wait and reschedule itself
for a later time to try clearing them again.
If there are interest in this new operator or these additional features, pls
mention that on my PR and I'll continue to develop it to make it better and
hopefully get more support from upstream.
> Clearing Tasks Across DAGs
> --------------------------
>
> Key: AIRFLOW-2279
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2279
> Project: Apache Airflow
> Issue Type: Improvement
> Reporter: Achal Soni
> Priority: Major
> Attachments: cross_dag_ui_screenshot.png
>
>
> At Stripe, we commonly have discrete dags that depend on each other by
> leveraging ExternalTaskSensors. We also find ourselves routinely wanting to
> not only clear tasks and their downstream tasks in a particular dag, but also
> their downstream tasks in their dependent dags (linked by
> ExternalTaskSensors).
> We currently have extended Airflow to handle this by modifying the webapp and
> cli tool to optionally clear dependent tasks across multiple dags (see
> attached screenshot).
> We want to open the floor for discussion with the larger Airflow community
> about the usage of ExternalTaskSensors and specifically how to handle
> clearing across dags. We are interested in learning more about the accepted
> practices in this regard, and are very open/willing to contribute in this
> area if there is interest!
--
This message was sent by Atlassian Jira
(v8.3.4#803005)