yaningz commented on issue #43717:
URL: https://github.com/apache/airflow/issues/43717#issuecomment-2707742021
Hi @o-nikolas, I've taken a look at our other tasks and logs.
> I need to refresh my memory on how this impacts ECS tasks other than with
> our image executing DBT, but the logs in the original issue are always
> truncated, never missing from the middle or beginning of the task.
We are also seeing this log truncation with other images. For proprietary
reasons, I can't share the logs from the other task, but a simplified version
of that operator call is below. We use similar patterns for all of our calls
into `EcsRunTaskOperator` regardless of the image, so the cause may be
something in our configuration or infrastructure setup.
```
import os
from datetime import timedelta
from typing import Any, Optional

from airflow.providers.amazon.aws.operators.ecs import EcsRunTaskOperator

# DEFAULT_AIRFLOW_ENV_NAME, ECS_NAME and is_local_airflow_instance() are
# defined elsewhere in our codebase and are omitted from this simplified version.


class OtherCustomOperator(EcsRunTaskOperator):
    def __init__(
        self,
        image_tag: str,
        command: str,
        aws_account_name: str,
        airflow_env_name: str = DEFAULT_AIRFLOW_ENV_NAME,
        aws_conn_id: Optional[str] = "aws_default",
        **kwargs: Any,
    ) -> None:
        # The container expects its command as a list of arguments.
        command_list = command.split()
        super().__init__(
            task_definition=ECS_NAME,
            cluster=airflow_env_name,
            overrides={
                "containerOverrides": [
                    {
                        "name": "other-custom-task",
                        "command": command_list,
                        "environment": [
                            {
                                "name": "AIRFLOW__VAR__AWS_ACCOUNT_NAME",
                                "value": aws_account_name,
                            }
                        ],
                    },
                ],
            },
            tags={
                "Local": str(is_local_airflow_instance())
            },
            launch_type="FARGATE",
            network_configuration={
                "awsvpcConfiguration": {
                    "subnets": [
                        os.environ.get("AIRFLOW__VAR__PRIMARY_SUBNET_ID"),
                        os.environ.get("AIRFLOW__VAR__SECONDARY_SUBNET_ID"),
                        os.environ.get("AIRFLOW__VAR__TERTIARY_SUBNET_ID"),
                    ],
                    "securityGroups": [
                        os.environ.get("AIRFLOW__VAR__SECURITY_GROUP_ID")
                    ],
                    "assignPublicIp": "DISABLED",
                },
            },
            # Log settings the operator uses to read the task's CloudWatch output.
            awslogs_group=f"/ecs/{airflow_env_name}/{airflow_env_name}",
            awslogs_region="us-east-1",
            awslogs_stream_prefix=f"{airflow_env_name}/other-custom-task",
            awslogs_fetch_interval=timedelta(seconds=30),
            propagate_tags="TASK_DEFINITION",
            **kwargs,
        )
```
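For reference, here is a minimal sketch of how the log stream name should line up with the `awslogs_stream_prefix` above. It assumes the standard ECS `awslogs` driver naming of `<prefix>/<container-name>/<task-id>`; the helper name `expected_log_stream` and the sample ARN are hypothetical and not taken from our code or from the provider.

```python
def expected_log_stream(awslogs_stream_prefix: str, task_arn: str) -> str:
    """Build the CloudWatch stream name we would expect the task to write to.

    Assumes awslogs_stream_prefix already has the form
    "<task-definition-prefix>/<container-name>" and that the awslogs driver
    appends the ECS task ID as the final path segment.
    """
    # The task ID is the last path segment of the task ARN.
    task_id = task_arn.rsplit("/", 1)[-1]
    return f"{awslogs_stream_prefix}/{task_id}"


# Hypothetical values, for illustration only.
stream = expected_log_stream(
    "my-env/other-custom-task",
    "arn:aws:ecs:us-east-1:123456789012:task/my-env/0123456789abcdef",
)
print(stream)
```

If the stream that actually appears in CloudWatch differs from this shape, the operator would be polling a stream that never receives the task's output, which is a different failure from the truncation we are seeing.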
As for our infrastructure, our MWAA environment is managed by Terraform, as is
an ECS cluster specific to that MWAA environment. The `awslogs_group` that we
pass into the operator call is a log group created separately from the ECS
cluster and used only for task output logs. The target log group is specified
within our task definition (not configured on the cluster directly).
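One way to rule out a mismatch between the task definition and the operator arguments is sketched below. It assumes the input dict has the shape of the `taskDefinition` entry returned by boto3's `describe_task_definition`; the function name `log_config_mismatches` and all sample values are hypothetical.

```python
from typing import Any


def log_config_mismatches(
    task_definition: dict[str, Any],
    container_name: str,
    awslogs_group: str,
    awslogs_region: str,
    awslogs_stream_prefix: str,
) -> list[str]:
    """Return human-readable differences between the task definition's
    awslogs options and the values passed to the operator."""
    for container in task_definition.get("containerDefinitions", []):
        if container.get("name") == container_name:
            break
    else:
        return [f"container {container_name!r} not found in task definition"]

    log_config = container.get("logConfiguration", {})
    if log_config.get("logDriver") != "awslogs":
        return [f"logDriver is {log_config.get('logDriver')!r}, expected 'awslogs'"]

    options = log_config.get("options", {})
    problems = []
    expected = {"awslogs-group": awslogs_group, "awslogs-region": awslogs_region}
    for key, wanted in expected.items():
        actual = options.get(key)
        if actual != wanted:
            problems.append(f"{key}: task definition has {actual!r}, operator uses {wanted!r}")

    # The operator-side prefix is expected to be "<task-def-prefix>/<container-name>".
    derived = f"{options.get('awslogs-stream-prefix')}/{container_name}"
    if derived != awslogs_stream_prefix:
        problems.append(f"stream prefix: derived {derived!r}, operator uses {awslogs_stream_prefix!r}")
    return problems


# Hypothetical task definition fragment, for illustration only.
sample_task_definition = {
    "containerDefinitions": [
        {
            "name": "other-custom-task",
            "logConfiguration": {
                "logDriver": "awslogs",
                "options": {
                    "awslogs-group": "/ecs/my-env/my-env",
                    "awslogs-region": "us-east-1",
                    "awslogs-stream-prefix": "my-env",
                },
            },
        }
    ]
}

print(
    log_config_mismatches(
        sample_task_definition,
        "other-custom-task",
        "/ecs/my-env/my-env",
        "us-east-1",
        "my-env/other-custom-task",
    )
)
```

An empty list here would suggest the log group, region, and stream prefix agree, which would point away from configuration and toward timing of the final log fetch.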
I don't know of anything specific to our VPC setup that could be causing
this or adding delay. If you have specific concerns about our networking as a
possible cause, I would have to ask one of our infrastructure engineers to
help clarify.
Please let me know if there is any more information I can provide about our
setup that may help in debugging.