eladkal commented on code in PR #51719:
URL: https://github.com/apache/airflow/pull/51719#discussion_r2147527825
##########
providers/amazon/src/airflow/providers/amazon/aws/triggers/mwaa.py:
##########
@@ -106,6 +106,87 @@ def hook(self) -> AwsGenericHook:
)
+class MwaaTaskCompletedTrigger(AwsBaseWaiterTrigger):
+ """
+ Trigger when an MWAA Task is complete.
+
+ :param external_env_name: The external MWAA environment name that contains the DAG Run you want to wait for
+ (templated)
+ :param external_dag_id: The DAG ID in the external MWAA environment that contains the DAG Run you want to wait for
+ (templated)
+ :param external_dag_run_id: The DAG Run ID in the external MWAA environment that you want to wait for (templated)
+ :param external_task_id: The Task ID in the external MWAA environment that you want to wait for (templated)
+ :param success_states: Collection of task instance states that would make this task marked as successful, default is
+ ``{airflow.utils.state.TaskInstanceState.SUCCESS}`` (templated)
+ :param failure_states: Collection of task instance states that would make this task marked as failed and raise an
+ AirflowException, default is ``{airflow.utils.state.TaskInstanceState.FAILED}`` (templated)
+ :param waiter_delay: The amount of time in seconds to wait between attempts. (default: 60)
+ :param waiter_max_attempts: The maximum number of attempts to be made. (default: 720)
+ :param aws_conn_id: The Airflow connection used for AWS credentials.
+ """
+
+ def __init__(
+ self,
+ *,
+ external_env_name: str,
+ external_dag_id: str,
+ external_dag_run_id: str,
+ external_task_id: str,
+ success_states: Collection[str] | None = None,
+ failure_states: Collection[str] | None = None,
+ waiter_delay: int = 60,
+ waiter_max_attempts: int = 720,
+ aws_conn_id: str | None = None,
Review Comment:
This isn't consistent with what was discussed on
https://github.com/apache/airflow/pull/51196, is it?
##########
providers/amazon/docs/operators/mwaa.rst:
##########
@@ -76,6 +76,24 @@ In the following example, the task ``wait_for_dag_run`` waits for the DAG run cr
:start-after: [START howto_sensor_mwaa_dag_run]
:end-before: [END howto_sensor_mwaa_dag_run]
+.. _howto/sensor:MwaaDagRunSensor:
+
+Wait on the state of an AWS MWAA Task
+========================================
+
+To wait for a Task Instance within a DAG Run running on Amazon MWAA until it reaches one of the given states, you can use the
Review Comment:
We should clarify that this is intended to be used for sensing a task
executed on a different MWAA environment.
Also, the title says `Task` yet the description says `DAG Run`; they're not
the same.
Does the sensor offer both capabilities?
##########
providers/amazon/src/airflow/providers/amazon/aws/triggers/mwaa.py:
##########
@@ -106,6 +106,87 @@ def hook(self) -> AwsGenericHook:
)
+class MwaaTaskCompletedTrigger(AwsBaseWaiterTrigger):
+ """
+ Trigger when an MWAA Task is complete.
+
+ :param external_env_name: The external MWAA environment name that contains the DAG Run you want to wait for
+ (templated)
+ :param external_dag_id: The DAG ID in the external MWAA environment that contains the DAG Run you want to wait for
+ (templated)
+ :param external_dag_run_id: The DAG Run ID in the external MWAA environment that you want to wait for (templated)
+ :param external_task_id: The Task ID in the external MWAA environment that you want to wait for (templated)
+ :param success_states: Collection of task instance states that would make this task marked as successful, default is
+ ``{airflow.utils.state.TaskInstanceState.SUCCESS}`` (templated)
+ :param failure_states: Collection of task instance states that would make this task marked as failed and raise an
+ AirflowException, default is ``{airflow.utils.state.TaskInstanceState.FAILED}`` (templated)
+ :param waiter_delay: The amount of time in seconds to wait between attempts. (default: 60)
+ :param waiter_max_attempts: The maximum number of attempts to be made. (default: 720)
+ :param aws_conn_id: The Airflow connection used for AWS credentials.
+ """
+
+ def __init__(
+ self,
+ *,
+ external_env_name: str,
+ external_dag_id: str,
+ external_dag_run_id: str,
+ external_task_id: str,
+ success_states: Collection[str] | None = None,
+ failure_states: Collection[str] | None = None,
+ waiter_delay: int = 60,
+ waiter_max_attempts: int = 720,
+ aws_conn_id: str | None = None,
+ ) -> None:
+ self.success_states = set(success_states) if success_states else {TaskInstanceState.SUCCESS.value}
+ self.failure_states = set(failure_states) if failure_states else {TaskInstanceState.FAILED.value}
+
+ if len(self.success_states & self.failure_states):
+ raise ValueError("success_states and failure_states must not have any values in common")
+
+ in_progress_states = {s.value for s in TaskInstanceState} - self.success_states - self.failure_states
Review Comment:
Is this logic right? If we fall back to the defaults, it will consider
`skipped` and `removed` as in-progress states.
Also, there is no protection here against a possible future addition of a new
task instance state. For example, we are discussing
https://github.com/apache/airflow/issues/12199
I suggest adding a defensive test around adding more states so we'll know to
modify the code here, or maybe we can consider adding more classes to categorize
states, similar to
https://github.com/apache/airflow/blob/083e03a909f923527d9d2f8d978962ddfb6e5b7a/task-sdk/src/airflow/sdk/api/datamodels/_generated.py#L415-L419
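For example, such a defensive test could be sketched like this (illustrative only: the enum below is an abbreviated stand-in for `airflow.utils.state.TaskInstanceState`, and the bucket names are hypothetical):

```python
from enum import Enum


# Abbreviated stand-in for airflow.utils.state.TaskInstanceState -- the real
# enum has more members, which is exactly what this test guards against.
class TaskInstanceState(str, Enum):
    SUCCESS = "success"
    FAILED = "failed"
    SKIPPED = "skipped"
    REMOVED = "removed"
    RUNNING = "running"
    QUEUED = "queued"


# Explicit categorization: every state must land in exactly one bucket.
SUCCESS_STATES = {TaskInstanceState.SUCCESS.value}
FAILURE_STATES = {TaskInstanceState.FAILED.value, TaskInstanceState.REMOVED.value}
FINISHED_NEUTRAL_STATES = {TaskInstanceState.SKIPPED.value}
IN_PROGRESS_STATES = {TaskInstanceState.RUNNING.value, TaskInstanceState.QUEUED.value}


def test_all_states_are_categorized():
    categorized = SUCCESS_STATES | FAILURE_STATES | FINISHED_NEUTRAL_STATES | IN_PROGRESS_STATES
    all_states = {s.value for s in TaskInstanceState}
    # A newly added enum member shows up here and forces a deliberate choice.
    assert all_states == categorized, f"uncategorized states: {all_states - categorized}"
```

This way, adding a state to the enum fails the test until someone explicitly decides which bucket it belongs to, instead of it silently being treated as in-progress.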
##########
providers/amazon/tests/system/amazon/aws/example_mwaa.py:
##########
@@ -66,6 +66,22 @@ def unpause_dag(env_name: str, dag_id: str):
return not response["RestApiResponse"]["is_paused"]
+# Can only be run after 'trigger_dag_run' task is run.
+@task
+def get_task_id(env_name: str, dag_id: str):
+ mwaa_hook = MwaaHook()
+ dag_runs = mwaa_hook.invoke_rest_api(env_name=env_name, path=f"/dags/{dag_id}/dagRuns", method="GET")
+ dag_run_id = dag_runs["RestApiResponse"]["dag_runs"][0]["dag_run_id"]
+
+ response = mwaa_hook.invoke_rest_api(
+ env_name=env_name, path=f"/dags/{dag_id}/dagRuns/{dag_run_id}/taskInstances", method="GET"
+ )
+ print("test debug")
+ print(response)
Review Comment:
leftovers?
##########
providers/amazon/src/airflow/providers/amazon/aws/sensors/mwaa.py:
##########
@@ -159,3 +159,137 @@ def execute(self, context: Context):
)
else:
super().execute(context=context)
+
+
+class MwaaTaskSensor(AwsBaseSensor[MwaaHook]):
+ """
+ Waits for a task in an MWAA Environment to complete.
+
+ If the task fails, an AirflowException is thrown.
+
+ .. seealso::
+ For more information on how to use this sensor, take a look at the guide:
+ :ref:`howto/sensor:MwaaTaskSensor`
+
+ :param external_env_name: The external MWAA environment name that contains the DAG Run you want to wait for
+ (templated)
+ :param external_dag_id: The DAG ID in the external MWAA environment that contains the DAG Run you want to wait for
+ (templated)
+ :param external_dag_run_id: The DAG Run ID in the external MWAA environment that you want to wait for (templated)
+ :param external_task_id: The Task ID in the external MWAA environment that you want to wait for (templated)
+ :param success_states: Collection of task instance states that would make this task marked as successful, default is
+ ``{airflow.utils.state.TaskInstanceState.SUCCESS}`` (templated)
+ :param failure_states: Collection of task instance states that would make this task marked as failed and raise an
+ AirflowException, default is ``{airflow.utils.state.TaskInstanceState.FAILED}`` (templated)
+ :param deferrable: If True, the sensor will operate in deferrable mode. This mode requires aiobotocore
+ module to be installed.
+ (default: False, but can be overridden in config file by setting default_deferrable to True)
+ :param poke_interval: Polling period in seconds to check for the status of the job. (default: 60)
+ :param max_retries: Number of times before returning the current state. (default: 720)
+ :param aws_conn_id: The Airflow connection used for AWS credentials.
+ If this is ``None`` or empty then the default boto3 behaviour is used. If
+ running Airflow in a distributed manner and aws_conn_id is None or
+ empty, then default boto3 configuration would be used (and must be
+ maintained on each worker node).
+ :param region_name: AWS region_name. If not specified then the default boto3 behaviour is used.
+ :param verify: Whether or not to verify SSL certificates. See:
+ https://boto3.amazonaws.com/v1/documentation/api/latest/reference/core/session.html
+ :param botocore_config: Configuration dictionary (key-values) for botocore client. See:
+ https://botocore.amazonaws.com/v1/documentation/api/latest/reference/config.html
+ """
+
+ aws_hook_class = MwaaHook
+ template_fields: Sequence[str] = aws_template_fields(
+ "external_env_name",
+ "external_dag_id",
+ "external_dag_run_id",
+ "external_task_id",
+ "success_states",
+ "failure_states",
+ "deferrable",
+ "max_retries",
+ "poke_interval",
+ )
+
+ def __init__(
+ self,
+ *,
+ external_env_name: str,
+ external_dag_id: str,
+ external_dag_run_id: str,
+ external_task_id: str,
+ success_states: Collection[str] | None = None,
+ failure_states: Collection[str] | None = None,
+ deferrable: bool = conf.getboolean("operators", "default_deferrable", fallback=False),
+ poke_interval: int = 60,
+ max_retries: int = 720,
+ **kwargs,
+ ):
+ super().__init__(**kwargs)
+
+ self.success_states = set(success_states) if success_states else {TaskInstanceState.SUCCESS.value}
+ self.failure_states = set(failure_states) if failure_states else {TaskInstanceState.FAILED.value}
+
+ if len(self.success_states & self.failure_states):
+ raise ValueError("success_states and failure_states must not have any values in common")
+
+ self.external_env_name = external_env_name
+ self.external_dag_id = external_dag_id
+ self.external_dag_run_id = external_dag_run_id
+ self.external_task_id = external_task_id
+ self.deferrable = deferrable
+ self.poke_interval = poke_interval
+ self.max_retries = max_retries
+
+ def poke(self, context: Context) -> bool:
+ self.log.info(
+ "Poking for task %s of DAG run %s of DAG %s in MWAA environment
%s",
+ self.external_task_id,
+ self.external_dag_run_id,
+ self.external_dag_id,
+ self.external_env_name,
+ )
+ response = self.hook.invoke_rest_api(
+ env_name=self.external_env_name,
+ path=f"/dags/{self.external_dag_id}/dagRuns/{self.external_dag_run_id}/taskInstances/{self.external_task_id}",
+ method="GET",
+ )
+
+ # If RestApiStatusCode == 200, the RestApiResponse must have the "state" key, otherwise something terrible has
+ # happened in the API and KeyError would be raised
+ # If RestApiStatusCode >= 300, a botocore exception would've already been raised during the
+ # self.hook.invoke_rest_api call
+ # The scope of this sensor is going to only be raising AirflowException due to failure of the task
+
+ state = response["RestApiResponse"]["state"]
+
+ if state in self.failure_states:
+ raise AirflowException(
+ f"The task {self.external_task_id} of DAG run
{self.external_dag_run_id} of DAG {self.external_dag_id} in MWAA environment
{self.external_env_name} "
+ f"failed with state: {state}"
+ )
+
+ return state in self.success_states
+
+ def execute_complete(self, context: Context, event: dict[str, Any] | None = None) -> None:
+ validate_execute_complete_event(event)
+
+ def execute(self, context: Context):
+ if self.deferrable:
+ self.defer(
+ trigger=MwaaTaskCompletedTrigger(
+ external_env_name=self.external_env_name,
+ external_dag_id=self.external_dag_id,
+ external_dag_run_id=self.external_dag_run_id,
+ external_task_id=self.external_task_id,
+ success_states=self.success_states,
+ failure_states=self.failure_states,
+ # somehow the type of poke_interval is derived as float ??
Review Comment:
Please clarify? We don't normally leave such TODOs in the code.
##########
providers/amazon/src/airflow/providers/amazon/aws/sensors/mwaa.py:
##########
@@ -159,3 +159,137 @@ def execute(self, context: Context):
)
else:
super().execute(context=context)
+
+
+class MwaaTaskSensor(AwsBaseSensor[MwaaHook]):
+ """
+ Waits for a task in an MWAA Environment to complete.
+
+ If the task fails, an AirflowException is thrown.
+
+ .. seealso::
+ For more information on how to use this sensor, take a look at the guide:
+ :ref:`howto/sensor:MwaaTaskSensor`
+
+ :param external_env_name: The external MWAA environment name that contains the DAG Run you want to wait for
+ (templated)
+ :param external_dag_id: The DAG ID in the external MWAA environment that contains the DAG Run you want to wait for
+ (templated)
+ :param external_dag_run_id: The DAG Run ID in the external MWAA environment that you want to wait for (templated)
+ :param external_task_id: The Task ID in the external MWAA environment that you want to wait for (templated)
+ :param success_states: Collection of task instance states that would make this task marked as successful, default is
+ ``{airflow.utils.state.TaskInstanceState.SUCCESS}`` (templated)
+ :param failure_states: Collection of task instance states that would make this task marked as failed and raise an
+ AirflowException, default is ``{airflow.utils.state.TaskInstanceState.FAILED}`` (templated)
+ :param deferrable: If True, the sensor will operate in deferrable mode. This mode requires aiobotocore
+ module to be installed.
+ (default: False, but can be overridden in config file by setting default_deferrable to True)
+ :param poke_interval: Polling period in seconds to check for the status of the job. (default: 60)
+ :param max_retries: Number of times before returning the current state. (default: 720)
+ :param aws_conn_id: The Airflow connection used for AWS credentials.
+ If this is ``None`` or empty then the default boto3 behaviour is used. If
+ running Airflow in a distributed manner and aws_conn_id is None or
+ empty, then default boto3 configuration would be used (and must be
+ maintained on each worker node).
+ :param region_name: AWS region_name. If not specified then the default boto3 behaviour is used.
+ :param verify: Whether or not to verify SSL certificates. See:
+ https://boto3.amazonaws.com/v1/documentation/api/latest/reference/core/session.html
+ :param botocore_config: Configuration dictionary (key-values) for botocore client. See:
+ https://botocore.amazonaws.com/v1/documentation/api/latest/reference/config.html
+ """
+
+ aws_hook_class = MwaaHook
+ template_fields: Sequence[str] = aws_template_fields(
+ "external_env_name",
+ "external_dag_id",
+ "external_dag_run_id",
+ "external_task_id",
+ "success_states",
+ "failure_states",
+ "deferrable",
+ "max_retries",
+ "poke_interval",
+ )
+
+ def __init__(
+ self,
+ *,
+ external_env_name: str,
+ external_dag_id: str,
+ external_dag_run_id: str,
+ external_task_id: str,
+ success_states: Collection[str] | None = None,
+ failure_states: Collection[str] | None = None,
+ deferrable: bool = conf.getboolean("operators", "default_deferrable", fallback=False),
+ poke_interval: int = 60,
+ max_retries: int = 720,
+ **kwargs,
+ ):
+ super().__init__(**kwargs)
+
+ self.success_states = set(success_states) if success_states else {TaskInstanceState.SUCCESS.value}
+ self.failure_states = set(failure_states) if failure_states else {TaskInstanceState.FAILED.value}
+
+ if len(self.success_states & self.failure_states):
+ raise ValueError("success_states and failure_states must not have any values in common")
+
+ self.external_env_name = external_env_name
+ self.external_dag_id = external_dag_id
+ self.external_dag_run_id = external_dag_run_id
+ self.external_task_id = external_task_id
+ self.deferrable = deferrable
+ self.poke_interval = poke_interval
+ self.max_retries = max_retries
+
+ def poke(self, context: Context) -> bool:
+ self.log.info(
+ "Poking for task %s of DAG run %s of DAG %s in MWAA environment
%s",
+ self.external_task_id,
+ self.external_dag_run_id,
+ self.external_dag_id,
+ self.external_env_name,
+ )
+ response = self.hook.invoke_rest_api(
+ env_name=self.external_env_name,
+ path=f"/dags/{self.external_dag_id}/dagRuns/{self.external_dag_run_id}/taskInstances/{self.external_task_id}",
+ method="GET",
+ )
+
+ # If RestApiStatusCode == 200, the RestApiResponse must have the "state" key, otherwise something terrible has
+ # happened in the API and KeyError would be raised
+ # If RestApiStatusCode >= 300, a botocore exception would've already been raised during the
+ # self.hook.invoke_rest_api call
+ # The scope of this sensor is going to only be raising AirflowException due to failure of the task
+
+ state = response["RestApiResponse"]["state"]
+
+ if state in self.failure_states:
+ raise AirflowException(
+ f"The task {self.external_task_id} of DAG run
{self.external_dag_run_id} of DAG {self.external_dag_id} in MWAA environment
{self.external_env_name} "
+ f"failed with state: {state}"
+ )
+
+ return state in self.success_states
+
+ def execute_complete(self, context: Context, event: dict[str, Any] | None = None) -> None:
+ validate_execute_complete_event(event)
+
+ def execute(self, context: Context):
+ if self.deferrable:
+ self.defer(
+ trigger=MwaaTaskCompletedTrigger(
+ external_env_name=self.external_env_name,
+ external_dag_id=self.external_dag_id,
+ external_dag_run_id=self.external_dag_run_id,
+ external_task_id=self.external_task_id,
+ success_states=self.success_states,
+ failure_states=self.failure_states,
+ # somehow the type of poke_interval is derived as float ??
+ waiter_delay=self.poke_interval, # type: ignore[arg-type]
Review Comment:
Why do we need the type ignore? It may hide an issue, and I don't see such
ignores in any of the other cases in the code.
Maybe it just requires
`waiter_delay=int(self.poke_interval)`
?
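For reference, the cast approach would look something like this (a minimal sketch with a hypothetical helper, not the provider's actual code):

```python
# Hypothetical helper illustrating the suggested fix: coerce the possibly-float
# poke_interval to the int the waiter expects, instead of a "type: ignore".
def build_waiter_kwargs(poke_interval: float, max_retries: int) -> dict:
    return {
        "waiter_delay": int(poke_interval),  # explicit cast, no ignore needed
        "waiter_max_attempts": max_retries,
    }


print(build_waiter_kwargs(60.0, 720))  # {'waiter_delay': 60, 'waiter_max_attempts': 720}
```

The explicit `int()` keeps the type checker honest and documents the narrowing at the call site.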
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]