[ https://issues.apache.org/jira/browse/AIRFLOW-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16508405#comment-16508405 ]
Dan Fowler commented on AIRFLOW-1104:
-------------------------------------
[~saguziel] [~TaoFeng] are there any updates on this ticket? We are seeing a
lot of noise from our more highly concurrent jobs (using Airflow 1.9.0). In the
logs we see:
{code:java}
FIXME: Rescheduling due to concurrency limits reached at task runtime.
{code}
Successful jobs are also generating noise: they send out emails with the
following error message:
{code:java}
Exception:
Executor reports task instance %s finished (%s) although the task says its %s.
Was the task killed externally?
{code}
I believe both messages stem from tasks being scheduled and then un-scheduled
when they exceed the concurrency limit.
I resolved the issue by adding `State.QUEUED` to the
`states_to_count_as_running` list in `airflow/jobs.py`. [~saguziel] judging by
the PR you linked, you had concerns about making that change. Do those concerns
still apply to Airflow in its current state? If so, what other changes are
needed to resolve the issue? Thanks in advance!
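
For illustration, the effect of that change can be sketched in plain Python. This is a minimal sketch that does not import Airflow: the `TI` record, the `open_slots` helper, and the string state constants are hypothetical stand-ins, not the actual code in `airflow/jobs.py`.

```python
from collections import namedtuple

# Stand-ins for Airflow's State constants (assumption: plain strings here).
RUNNING = "running"
QUEUED = "queued"
SUCCESS = "success"

# Hypothetical, simplified task instance record; not Airflow's TaskInstance.
TI = namedtuple("TI", ["task_id", "state"])


def open_slots(task_instances, concurrency, states_to_count_as_running):
    """Return how many more tasks may start under the concurrency limit."""
    in_use = sum(
        1 for ti in task_instances if ti.state in states_to_count_as_running
    )
    return max(concurrency - in_use, 0)


tis = [TI("a", RUNNING), TI("b", QUEUED), TI("c", QUEUED), TI("d", SUCCESS)]

# Only RUNNING counts: the two queued tasks are invisible to the check.
slots_running_only = open_slots(tis, 3, [RUNNING])

# RUNNING and QUEUED both count: queued tasks already occupy slots.
slots_with_queued = open_slots(tis, 3, [RUNNING, QUEUED])

print(slots_running_only, slots_with_queued)
```

With only the running state counted, the check still sees free slots even though two tasks are already queued, so more tasks can be queued than are allowed to run. Counting queued tasks as well leaves no free slots in this example.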
> Concurrency check in scheduler should count queued tasks as well as running
> ---------------------------------------------------------------------------
>
> Key: AIRFLOW-1104
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1104
> Project: Apache Airflow
> Issue Type: Bug
> Environment: see https://github.com/apache/incubator-airflow/pull/2221
> "Tasks with the QUEUED state should also be counted below, but for now we
> cannot count them. This is because there is no guarantee that queued tasks in
> failed dagruns will or will not eventually run and queued tasks that will
> never run will consume slots and can stall a DAG. Once we can guarantee that
> all queued tasks in failed dagruns will never run (e.g. make sure that all
> running/newly queued TIs have running dagruns), then we can include QUEUED
> tasks here, with the constraint that they are in running dagruns."
> Reporter: Alex Guziel
> Assignee: Tao Feng
> Priority: Minor
>