[ 
https://issues.apache.org/jira/browse/AIRFLOW-5071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17220106#comment-17220106
 ] 

Richard Williamson commented on AIRFLOW-5071:
---------------------------------------------

We were seeing this issue sporadically the last few weeks and found part of the 
root cause to be excessive nbr of tcp TIME_WAITs (exceeding nbr of ports on 
server)- after tuning some kernel parameters (and reducing nbr of DB calls) we 
have reduced the occurrence of these hangs to only one since then which was 
recoverable by scheduler restart (before we did the tuning even a scheduler 
restart was not helping) - we use LocalExecuter (via modified clairvoyant HA 
solution) with MariaDB 10.3.10 Galera backend

> Thousand os Executor reports task instance X finished (success) although the 
> task says its queued. Was the task killed externally?
> ----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-5071
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-5071
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: DAG, scheduler
>    Affects Versions: 1.10.3
>            Reporter: msempere
>            Priority: Critical
>             Fix For: 1.10.12
>
>         Attachments: image-2020-01-27-18-10-29-124.png, 
> image-2020-07-08-07-58-42-972.png
>
>
> I'm opening this issue because since I update to 1.10.3 I'm seeing thousands 
> of daily messages like the following in the logs:
>  
> ```
>  {{__init__.py:1580}} ERROR - Executor reports task instance <TaskInstance: X 
> 2019-07-29 00:00:00+00:00 [queued]> finished (success) although the task says 
> its queued. Was the task killed externally?
> {{jobs.py:1484}} ERROR - Executor reports task instance <TaskInstance: X 
> 2019-07-29 00:00:00+00:00 [queued]> finished (success) although the task says 
> its queued. Was the task killed externally?
> ```
> -And looks like this is triggering also thousand of daily emails because the 
> flag to send email in case of failure is set to True.-
> I have Airflow setup to use Celery and Redis as a backend queue service.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to