[ https://issues.apache.org/jira/browse/AIRFLOW-462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bolke de Bruin updated AIRFLOW-462:
-----------------------------------
    Priority: Major  (was: Blocker)

> Concurrent Scheduler Jobs pushing the same task to queue
> --------------------------------------------------------
>
>                 Key: AIRFLOW-462
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-462
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: scheduler
>    Affects Versions: Airflow 1.7.0
>            Reporter: Yogesh
>
> Hi,
> We are using Airflow version 1.7.0 and tried to implement high
> availability for the Airflow daemons in our production environment.
> Detailed high availability approach:
> - Airflow running on two different machines with all the
> daemons (webserver, scheduler, executor)
> - A single MySQL database shared by the two schedulers
> - DAG files replicated on both machines
> - A single RabbitMQ instance running as the message broker
> While doing so we came across the following problem:
> - A particular task was sent to the executor twice (two entries in the
> message queue) by two different schedulers. However, we see only a single
> entry for the task instance in the database, which is correct.
> We checked the code and found the following:
> - Before sending a task to the executor, the scheduler checks the task
> state in the database, and if it is not already QUEUED it pushes the task
> to the queue.
> Issue:
> As there is no locking on the task instance in the database, and both
> scheduler jobs run so close together, the second one might check the
> status in the database before the first one updates it to QUEUED.
> We are not sure whether this issue has been taken care of in a recent
> release.
> Could you please suggest an appropriate approach so that high
> availability can be achieved?
> Thanks
> Yogesh

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
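
The race described in the report is a check-then-act problem: both schedulers read the state, both see "not QUEUED", and both push. One common way to close that window is to fold the check and the update into a single conditional UPDATE and let only the scheduler whose statement actually changed a row push to the queue. The sketch below is only an illustration of that idea, not Airflow's scheduler code. It uses an in-memory SQLite database in place of MySQL, and the table layout, the state values, and the names try_claim and schedule are all hypothetical.

```python
import sqlite3


def make_db():
    # Hypothetical, simplified stand-in for the task instance table.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE task_instance (task_id TEXT PRIMARY KEY, state TEXT)"
    )
    conn.execute("INSERT INTO task_instance VALUES ('t1', 'scheduled')")
    conn.commit()
    return conn


def try_claim(conn, task_id):
    """Flip the task to 'queued' in one statement.

    Returns True only for the caller whose UPDATE changed the row, so the
    state check and the state change cannot be interleaved by another caller.
    """
    cur = conn.execute(
        "UPDATE task_instance SET state = 'queued' "
        "WHERE task_id = ? AND state != 'queued'",
        (task_id,),
    )
    conn.commit()
    return cur.rowcount == 1


def schedule(conn, task_id, queue, scheduler_name):
    # Push to the (hypothetical) message queue only after winning the claim.
    if try_claim(conn, task_id):
        queue.append((scheduler_name, task_id))


conn = make_db()
queue = []
schedule(conn, "t1", queue, "scheduler-a")
schedule(conn, "t1", queue, "scheduler-b")
print(queue)  # [('scheduler-a', 't1')]
```

With a shared MySQL database the same pattern would rely on the affected-row count of the UPDATE. An alternative is to take a row lock with SELECT ... FOR UPDATE inside a transaction before checking the state. Whether either approach fits the Airflow 1.7.0 scheduler is not something this sketch establishes.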