MrManicotti opened a new issue #13824:
URL: https://github.com/apache/airflow/issues/13824


   **Apache Airflow version**:
   2.0.0
   
   **Environment**:
   Docker Stack
   Celery Executor w/ Redis
   3 Workers, Scheduler + Webserver
   CloudWatch remote logging turned on
   
   **What happened**:
   After a DAG run completes with the CloudWatch integration enabled, the state of the Task Instance is reported as externally set, which causes SIGTERM/SIGKILL signals to be sent to the task process. This produces error-level entries in the Worker logs, which is a nuisance for alert monitoring.
   
   ```
   *** Reading remote log from Cloudwatch log_group: dev1-airflow-task 
log_stream: xxxx/task/2021-01-21T17_59_19.643994+00_00/1.log.
   Dependencies all met for <TaskInstance: xxxx.task  
2021-01-21T17:59:19.643994+00:00 [queued]>
   Dependencies all met for <TaskInstance: xxxx.task  
2021-01-21T17:59:19.643994+00:00 [queued]>
   
--------------------------------------------------------------------------------
   Starting attempt 1 of 1
   
--------------------------------------------------------------------------------
   Executing <Task(TaskVerificationOperator): task > on 
2021-01-21T17:59:19.643994+00:00
   Started process 654 to run task
   Running <TaskInstance: xxxx.task  2021-01-21T17:59:19.643994+00:00 
[running]> on host 88f99fbc97a8
   Exporting the following env vars:
   AIRFLOW_CTX_DAG_EMAIL=xxxxxxxx
   AIRFLOW_CTX_DAG_OWNER=xxxxxxxx
   AIRFLOW_CTX_DAG_ID=xxxxxxxx
   AIRFLOW_CTX_TASK_ID=xxxxxx
   AIRFLOW_CTX_EXECUTION_DATE=2021-01-21T17:59:19.643994+00:00
   AIRFLOW_CTX_DAG_RUN_ID=85
   Set new audit correlation_id xxxxxxxxxx-xxxxxx-xxxxxxxxx
   Using connection to: id: xxxxx. Host: xxxxxxx, Port: 5432, Schema: xxxxxx, 
Login: xxxxxx, Password: XXXXXXXX, extra: None
   Marking task as SUCCESS. dag_id=xxxxxx, task_id=xxxxxx, 
execution_date=20210121T175919, start_date=20210121T175936, 
end_date=20210121T175938
   1 downstream tasks scheduled from follow-on schedule check
   ```
   
   However, after the task completes, the following is appended to its logs:
   ```
   State of this instance has been externally set to success. Terminating 
instance.
   Sending Signals.SIGTERM to GPID 654
   process psutil.Process(pid=654, name='xxxxx', status='sleeping', 
started='17:59:36') did not respond to SIGTERM. Trying SIGKILL
   Process psutil.Process(pid=654, name='xxxxx', status='terminated', 
exitcode=<Negsignal.SIGKILL: -9>, started='17:59:36') (654) terminated with 
exit code Negsignal.SIGKILL
   Task exited with return code Negsignal.SIGKILL
   ```
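   The SIGTERM → SIGKILL escalation seen above can be reproduced in isolation. Below is a minimal, self-contained sketch (illustrative only, not Airflow's actual internals) using only the standard library, where the child process deliberately ignores SIGTERM the way the `sleeping` task process did:
   ```python
   # Sketch of the SIGTERM -> SIGKILL escalation pattern from the log above
   # (illustrative only, not Airflow's actual code). POSIX-only.
   import os
   import signal
   import subprocess
   import sys
   import time

   # Child that ignores SIGTERM, mimicking the 'sleeping' task process.
   child = subprocess.Popen([
       sys.executable, "-c",
       "import signal, time; "
       "signal.signal(signal.SIGTERM, signal.SIG_IGN); "
       "time.sleep(60)",
   ])
   time.sleep(1)  # give the child time to install its SIGTERM handler

   os.kill(child.pid, signal.SIGTERM)      # polite request first
   try:
       child.wait(timeout=2)               # grace period before escalating
   except subprocess.TimeoutExpired:
       os.kill(child.pid, signal.SIGKILL)  # 'did not respond to SIGTERM. Trying SIGKILL'
       child.wait()

   print(child.returncode)                 # -9: terminated by SIGKILL
   ```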
   
   This is a problem because it causes the following to appear in the Worker logs:
   ```
   [2021-01-21 15:00:01,102: WARNING/ForkPoolWorker-8] Running <TaskInstance: 
xxxx.task 2021-01-21T14:00:00+00:00 [queued]> on host 
ip-172-31-3-210.ec2.internal
   ...
   [2021-01-21 15:00:06,599: ERROR/ForkPoolWorker-8] Failed to execute task 
Task received SIGTERM signal.
   ```
   
   **What you expected to happen**:
   If this SIGTERM/SIGKILL is intended behavior, no errors should appear in the Worker logs.
   
   **How to reproduce it**:
   Run Airflow with the Celery Executor and CloudWatch remote logging enabled, then execute any DAG.
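   For reference, enabling this combination typically amounts to configuration along these lines (option names as documented for Airflow 2.0; the broker URL, log group ARN, and connection id are placeholders for our actual values):
   ```ini
   [core]
   executor = CeleryExecutor

   [celery]
   # Placeholder Redis broker URL
   broker_url = redis://redis:6379/0

   [logging]
   remote_logging = True
   # Placeholder log group ARN; elided values are environment-specific
   remote_base_log_folder = cloudwatch://arn:aws:logs:<region>:<account>:log-group:<log-group>
   remote_log_conn_id = aws_default
   ```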
   
   
   **Anything else we need to know**:
   This occurs every time, for every task in the DAG.
   

