XD-DENG commented on a change in pull request #3688: [AIRFLOW-2843] ExternalTaskSensor-check if external task exists
URL: https://github.com/apache/incubator-airflow/pull/3688#discussion_r207606729
##########
File path: airflow/sensors/external_task_sensor.py
##########
@@ -70,9 +76,24 @@ def __init__(self,
         self.execution_date_fn = execution_date_fn
         self.external_dag_id = external_dag_id
         self.external_task_id = external_task_id
+        self.check_existence = check_existence
 
     @provide_session
     def poke(self, context, session=None):
+        TI = TaskInstance
+
+        if self.check_existence:
+            existence = session.query(TI).filter(
+                TI.dag_id == self.external_dag_id,
+                TI.task_id == self.external_task_id,
+            ).count()
+            session.commit()
+            if existence == 0:
+                raise AirflowException('The external task "' +

Review comment:
There may be a few cases:
- The external DAG ID specified is wrong (for example, because of a typo);
- The external task specified doesn't exist in the corresponding DAG (for a similar reason);
- ...

Whether the external task starts on time doesn't matter much here, since only the DAG ID and task ID are used in the query.

I consider this feature a "guard" against entering a wrong DAG ID or task ID. Without this guard, this type of error can be hard to find: the sensor keeps waiting and eventually leaves the impression that it failed only because the external task had not been executed or had not finished yet.

Given that the default value of this new argument is `False`, it will not change the current behavior.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services
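
To illustrate the guard described in the review comment above, here is a minimal, self-contained sketch. It is not the code from the pull request: it assumes a simplified, hypothetical `task_instance` table queried through Python's standard `sqlite3` module rather than Airflow's `TaskInstance` model and SQLAlchemy session, and the names `ExternalTaskNotFound`, `count_task_instances`, `poke`, and `make_demo_db` are invented for this example.

```python
import sqlite3


class ExternalTaskNotFound(Exception):
    """Hypothetical stand-in for AirflowException in this sketch."""


def count_task_instances(conn, dag_id, task_id):
    # Count rows for the (dag_id, task_id) pair, ignoring execution date and
    # state. This mirrors the idea that only the two IDs are used in the check.
    row = conn.execute(
        "SELECT COUNT(*) FROM task_instance WHERE dag_id = ? AND task_id = ?",
        (dag_id, task_id),
    ).fetchone()
    return row[0]


def poke(conn, external_dag_id, external_task_id, execution_date,
         check_existence=False):
    # Optional guard: fail fast if the pair has never been seen at all,
    # instead of waiting for a task that may simply be misspelled.
    if check_existence:
        if count_task_instances(conn, external_dag_id, external_task_id) == 0:
            raise ExternalTaskNotFound(
                'The external task "%s" in DAG "%s" does not exist.'
                % (external_task_id, external_dag_id)
            )

    # Regular sensor condition: has the external task succeeded for this date?
    row = conn.execute(
        "SELECT COUNT(*) FROM task_instance "
        "WHERE dag_id = ? AND task_id = ? AND execution_date = ? "
        "AND state = 'success'",
        (external_dag_id, external_task_id, execution_date),
    ).fetchone()
    return row[0] > 0


def make_demo_db():
    # In-memory table with one successful run of task "load" in DAG "etl".
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE task_instance "
        "(dag_id TEXT, task_id TEXT, execution_date TEXT, state TEXT)"
    )
    conn.execute(
        "INSERT INTO task_instance VALUES ('etl', 'load', '2018-08-01', 'success')"
    )
    return conn
```

With the guard off (the default), a misspelled task ID such as `laod` just makes `poke` return `False` on every call, which looks the same as a task that has not finished yet. With `check_existence=True`, the same call raises immediately. Note that in this sketch a correctly named task with no recorded task instance at all would also raise.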