[
https://issues.apache.org/jira/browse/AIRFLOW-5881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17278682#comment-17278682
]
xujuntao commented on AIRFLOW-5881:
-----------------------------------
We also encountered this issue when we upgrad 1.10.12,Finally solved by
increasing "dag_file_processor_timeout"
> Dag gets stuck in "Scheduled" State when scheduling a large number of tasks
> ---------------------------------------------------------------------------
>
> Key: AIRFLOW-5881
> URL: https://issues.apache.org/jira/browse/AIRFLOW-5881
> Project: Apache Airflow
> Issue Type: Bug
> Components: scheduler
> Affects Versions: 1.10.6
> Reporter: David Hartig
> Priority: Critical
> Attachments: 2 (1).log, airflow.cnf
>
>
> Running with the KubernetesExecutor in and AKS cluster, when we upgraded to
> version 1.10.6 we noticed that the all the Dags stop making progress but
> start running and immediate exiting with the following message:
> "Instance State' FAILED: Task is in the 'scheduled' state which is not a
> valid state for execution. The task must be cleared in order to be run."
> See attached log file for the worker. Nothing seems out of the ordinary in
> the Scheduler log.
> Reverting to 1.10.5 clears the problem.
> Note that at the time of the failure maybe 100 or so tasks are in this state,
> with 70 coming from one highly parallelized dag. Clearing the scheduled tasks
> just makes them reappear shortly thereafter. Marking them "up_for_retry"
> results in one being executed but then the system is stuck in the original
> zombie state.
> Attached is the also a redacted airflow config flag.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)