[
https://issues.apache.org/jira/browse/AIRFLOW-470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Aizhamal Nurmamat kyzy resolved AIRFLOW-470.
--------------------------------------------
Resolution: Duplicate
> Frequent multiple dispatching of the same task to celery
> --------------------------------------------------------
>
> Key: AIRFLOW-470
> URL: https://issues.apache.org/jira/browse/AIRFLOW-470
> Project: Apache Airflow
> Issue Type: Bug
> Components: celery, scheduler
> Affects Versions: 1.7.1.3
> Reporter: Jasmine Tsai
> Priority: Critical
>
> We are seeing the same task dispatched to celery repeatedly within a very
> short time frame (the same task instance by Airflow's criteria, but with a
> different celery task uuid), which is causing a lot of unexpected behavior
> for us. Most of these duplicates are annoying but harmless: sometimes they
> clear xcom data and overwrite logs, but for the most part they rely on the
> db metadata and do not try to run multiple times. We see this behavior
> frequently; some tasks are scheduled 5 times within the span of two
> minutes. The issue seems to be exacerbated by the use of pools.
> We have even seen the same task dispatched twice less than a second apart,
> causing real race conditions because the second attempt did not yet see the
> task instance as running in the metadata db.
> From other issues submitted here, it seems that people definitely see
> problems with the same tasks running multiple times, but this problem seems
> to be getting worse for us. Is it a known issue for the multiple dispatching
> to be so frequent/severe? (Or maybe even intentional design or a side
> effect?) Are there things that we could be doing that might make this worse?
> (One of our primary suspects is the scheduler, for which we have set
> num_runs to 1.)
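
As a hedged illustration of the race described in the report (a second
dispatch that does not yet see the task instance as running), the minimal
sketch below shows one common guard: a conditional UPDATE that only one
caller can win. The table name task_instance, its columns, and the helpers
make_db and try_claim are hypothetical and use an in-memory SQLite database;
this is not Airflow's actual schema or implementation.

```python
import sqlite3


def make_db():
    """Create a toy metadata db with one queued task (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE task_instance (task_id TEXT PRIMARY KEY, state TEXT NOT NULL)"
    )
    conn.execute("INSERT INTO task_instance VALUES ('example_task', 'queued')")
    conn.commit()
    return conn


def try_claim(conn, task_id):
    """Move a task from 'queued' to 'running' in a single statement.

    The state check and the state change happen in one UPDATE, so a second
    dispatch of the same task matches zero rows instead of racing the first.
    """
    cur = conn.execute(
        "UPDATE task_instance SET state = 'running' "
        "WHERE task_id = ? AND state = 'queued'",
        (task_id,),
    )
    conn.commit()
    # rowcount is 1 only for the caller whose UPDATE actually changed the row.
    return cur.rowcount == 1


conn = make_db()
first = try_claim(conn, "example_task")   # first dispatch wins the claim
second = try_claim(conn, "example_task")  # duplicate dispatch is rejected
print(first, second)
```

A duplicate dispatch that fails the claim can exit without touching logs or
xcom data, which is the harmless outcome the report describes for most
duplicates.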