[ https://issues.apache.org/jira/browse/AIRFLOW-1327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16685374#comment-16685374 ]

pranav agrawal edited comment on AIRFLOW-1327 at 11/13/18 3:31 PM:
-------------------------------------------------------------------

we are also hitting this issue many times in our Airflow setup; please assist
with a workaround.
airflow version: 1.9.0

using CeleryExecutor
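
A manual workaround we are considering (untested; it assumes the stuck tasks
are the ones left with state NONE) is to clear them so the scheduler re-queues
them. The dag id, task regex, and date range below are illustrative, not from
our real setup:

{code}
# Clear the stuck task instances so the scheduler picks them up again
# (values are placeholders):
airflow clear parent_dag -t 'section.*' -s 2018-11-01 -e 2018-11-13
{code}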


was (Author: pranav.agrawal1):
we are also hitting this issue many times in our Airflow setup; please assist
with a workaround.
airflow version: 1.10.0

using CeleryExecutor

> LocalExecutor won't reschedule on concurrency limit hit
> -------------------------------------------------------
>
>                 Key: AIRFLOW-1327
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1327
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: scheduler
>    Affects Versions: 1.8.1
>         Environment: LocalExecutor
>            Reporter: Dennis Muth
>            Priority: Major
>         Attachments: Airflow_logs.png, ti_unscheduled.png
>
>
> For several days we have been trying to migrate from Airflow 1.7.1.3 to 1.8.1.
> Unfortunately, we ran into a serious issue that seems to be scheduler-related
> (we are using the LocalExecutor).
> When running a SubDag, some task instances get queued (queues are defined),
> switch to running, and some time later finish. Well, that's how it should be.
> But: some task instances get queued up, print a cryptic warning message
> (we'll get to this in a sec), and then end up with no state (NONE).
> The warning message:
> {code}
> FIXME: Rescheduling due to concurrency limits reached at task runtime. 
> Attempt 1 of 2. State set to NONE.
> {code}
> This suggests that a limit is too low and that this instance will be picked
> up later by the scheduler, once more slots are available.
> We have waited for quite some time now, but the task is never re-scheduled.
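> For reference, these are the knobs we believe govern that limit. This is a
> sketch with illustrative values; we are assuming the DAG-level concurrency is
> the limit being hit, which may not be the case:
> {code}
> # airflow.cfg, [core] section (illustrative values):
> #   parallelism = 32        # max running task instances installation-wide
> #   dag_concurrency = 16    # max running task instances per DAG
> # A DAG can also lower its own limit:
> from datetime import datetime
> from airflow import DAG
>
> dag = DAG(dag_id='parent_dag', start_date=datetime(2017, 6, 1),
>           concurrency=16, max_active_runs=1)
> {code}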
> When I rerun the SubDag, some previously failed task instances (state = None)
> now succeed, but others that previously succeeded will fail. Weird...
> I've attached some screenshots to make this more transparent to you, too.
> Is this a bug, or is it intentional? Do we need to switch to the CeleryExecutor?
> Please do not hesitate to ask if you need additional logs or other details.
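> For reference, a minimal sketch of the kind of setup that triggers this for
> us. DAG ids, the queue name, dates, and task names are illustrative, not our
> real code:
> {code}
> from datetime import datetime
> from airflow import DAG
> from airflow.operators.dummy_operator import DummyOperator
> from airflow.operators.subdag_operator import SubDagOperator
>
> DEFAULT_ARGS = {'owner': 'airflow', 'start_date': datetime(2017, 6, 1)}
>
>
> def make_subdag(parent_id, child_id, args):
>     """Build a SubDag whose tasks run on a dedicated queue."""
>     subdag = DAG(dag_id='%s.%s' % (parent_id, child_id),
>                  default_args=args, schedule_interval='@daily')
>     for i in range(5):
>         DummyOperator(task_id='work_%d' % i, queue='our_queue', dag=subdag)
>     return subdag
>
>
> parent = DAG(dag_id='parent_dag', default_args=DEFAULT_ARGS,
>              schedule_interval='@daily')
>
> section = SubDagOperator(
>     task_id='section',
>     subdag=make_subdag('parent_dag', 'section', DEFAULT_ARGS),
>     dag=parent,
> )
> {code}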


