XD-DENG commented on a change in pull request #3688: [AIRFLOW-2843]
ExternalTaskSensor-check if external task exists
URL: https://github.com/apache/incubator-airflow/pull/3688#discussion_r207606729
##########
File path: airflow/sensors/external_task_sensor.py
##########
@@ -70,9 +76,24 @@ def __init__(self,
         self.execution_date_fn = execution_date_fn
         self.external_dag_id = external_dag_id
         self.external_task_id = external_task_id
+        self.check_existence = check_existence
     @provide_session
     def poke(self, context, session=None):
+        TI = TaskInstance
+
+        if self.check_existence:
+            existence = session.query(TI).filter(
+                TI.dag_id == self.external_dag_id,
+                TI.task_id == self.external_task_id,
+            ).count()
+            session.commit()
+            if existence == 0:
+                raise AirflowException('The external task "' +
Review comment:
There may be a few cases:
- The external DAG ID specified is wrong (e.g. due to a typo);
- The external task specified doesn't exist in the corresponding DAG
(for a similar reason);
- ...
Starting on time or not doesn't matter much here, since only the DAG ID and
task ID are used in the query.
I consider this feature a "guard" against entering a wrong DAG ID or task ID.
Without the guard, this type of error can be hard to find: the sensor keeps
waiting and eventually leaves the impression that it failed only because the
external task had not been executed or finished yet.
Given the default value of this new argument is `False`, it will not change
the current behavior.
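
To illustrate the guard, here is a minimal, self-contained sketch of the idea. It is not the code in this PR: it uses the standard-library `sqlite3` module instead of the Airflow session, and the `task_instance` table layout, `ExternalTaskNotFound`, `make_demo_db` and this standalone `poke` are all hypothetical stand-ins. It also ignores the execution-date filtering that the real sensor performs.

```python
import sqlite3


class ExternalTaskNotFound(Exception):
    """Hypothetical stand-in for AirflowException."""


def poke(conn, external_dag_id, external_task_id, check_existence=False):
    """Return True once the external task has a 'success' row, else False.

    With check_existence=True, fail fast when no row at all matches the
    (dag_id, task_id) pair, regardless of its state.
    """
    if check_existence:
        (existence,) = conn.execute(
            "SELECT COUNT(*) FROM task_instance "
            "WHERE dag_id = ? AND task_id = ?",
            (external_dag_id, external_task_id),
        ).fetchone()
        if existence == 0:
            raise ExternalTaskNotFound(
                'The external task "%s" in DAG "%s" does not exist.'
                % (external_task_id, external_dag_id)
            )

    # Simplified "is it done yet?" check (assumption: state only, no dates).
    (done,) = conn.execute(
        "SELECT COUNT(*) FROM task_instance "
        "WHERE dag_id = ? AND task_id = ? AND state = 'success'",
        (external_dag_id, external_task_id),
    ).fetchone()
    return done > 0


def make_demo_db():
    """Build an in-memory table with one known, still-running task."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE task_instance (dag_id TEXT, task_id TEXT, state TEXT)"
    )
    conn.execute("INSERT INTO task_instance VALUES ('etl', 'load', 'running')")
    return conn
```

With the default `check_existence=False`, a misspelled task ID such as `'laod'` simply makes `poke` return `False` on every call, which looks the same as a task that has not finished yet. With `check_existence=True`, the same typo raises immediately.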