[ 
https://issues.apache.org/jira/browse/AIRFLOW-5071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17153251#comment-17153251
 ] 

Szymon Grzemski commented on AIRFLOW-5071:
------------------------------------------

In my case the "killed externally" message appeared right after the Executor 
reported success for a task, even though the task instance was still marked as 
queued and the task itself was in fact still running.
{code:bash}
[2020-07-06 13:24:46,459] {base_executor.py:58} INFO - Adding to queue: 
['airflow', 'run', 'dedupe-emr-job-flow', 'check_signals_table', 
'2020-07-06T12:00:00+00:00', '--local', '--pool', 'default_pool', '-sd', 
'/var/lib/airflow/dags/etl-airflow-dags/dedupe_emr_job_flow.py']
[2020-07-06 13:24:48,181] {scheduler_job.py:1311} INFO - Executor reports 
execution of dedupe-emr-job-flow.check_signals_table execution_date=2020-07-06 
12:00:00+00:00 exited with status success for try_number 1
[2020-07-06 13:24:48,189] {scheduler_job.py:1328} ERROR - Executor reports task 
instance <TaskInstance: dedupe-emr-job-flow.check_signals_table 2020-07-06 
12:00:00+00:00 [queued]> finished (success) although the task says its queued. 
Was the task killed externally?
{code}
The executor confirmed it as a success after 2s, but in Flower I could see the 
task was still running until 13:25:06...
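
For anyone hitting the same mismatch, a minimal way to cross-check the two views 
while it happens, assuming the 1.10.x CLI and the dag/task/execution_date from the 
log above (the Celery inspect line assumes the default Celery app module shipped 
with Airflow):
{code:bash}
# What the metadata DB thinks the task instance state is,
# at the moment the executor already reports success:
airflow task_state dedupe-emr-job-flow check_signals_table 2020-07-06T12:00:00+00:00

# What the Celery workers think is still running (same view as Flower):
celery -A airflow.executors.celery_executor inspect active
{code}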

 

My WORKAROUND:

I've done a LOT of debugging of this issue, because it was impacting our core 
pipeline. After analysing the behaviour in debug mode, I checked the Dag 
Processor's logs and it turned out that the DagBag refresh takes a little over 
10s for the longest DAG. As a result, I set:
{code:bash}
min_file_process_interval = 15
{code}
under the [scheduler] section and restarted the scheduler. I haven't experienced 
this error for almost 24h now:
!image-2020-07-08-07-58-42-972.png|width=543,height=218!
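
For reference, a minimal sketch of the whole workaround, assuming a stock 
airflow.cfg and the 1.10.x CLI (the 15s value is just what worked for my DAGs; 
the rule of thumb is to keep the interval above the slowest parse time):
{code:bash}
# Measure how long DagBag parsing actually takes per DAG file (1.10.x CLI):
airflow list_dags --report

# airflow.cfg -- keep the interval above the slowest parse time:
# [scheduler]
# min_file_process_interval = 15

# Equivalent environment variable form (AIRFLOW__<SECTION>__<KEY> convention):
export AIRFLOW__SCHEDULER__MIN_FILE_PROCESS_INTERVAL=15

# Restart the scheduler afterwards so the new interval is picked up.
{code}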

Maybe it could be a hint for [~kaxilnaik], [~potiuk] or other developers to fix 
this issue :)

> Thousands of Executor reports task instance X finished (success) although the 
> task says its queued. Was the task killed externally?
> ----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-5071
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-5071
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: DAG, scheduler
>    Affects Versions: 1.10.3
>            Reporter: msempere
>            Priority: Critical
>         Attachments: image-2020-01-27-18-10-29-124.png, 
> image-2020-07-08-07-58-42-972.png
>
>
> I'm opening this issue because since I updated to 1.10.3 I'm seeing thousands 
> of daily messages like the following in the logs:
>  
> ```
>  {{__init__.py:1580}} ERROR - Executor reports task instance <TaskInstance: X 
> 2019-07-29 00:00:00+00:00 [queued]> finished (success) although the task says 
> its queued. Was the task killed externally?
> {{jobs.py:1484}} ERROR - Executor reports task instance <TaskInstance: X 
> 2019-07-29 00:00:00+00:00 [queued]> finished (success) although the task says 
> its queued. Was the task killed externally?
> ```
> -And it looks like this is also triggering thousands of daily emails because the 
> flag to send email in case of failure is set to True.-
> I have Airflow setup to use Celery and Redis as a backend queue service.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
