Ah, this is the more interesting case. Are tasks getting into SCHEDULED and then the scheduler itself gets stuck? Or do the workers no longer execute anything? How do you run your scheduler? With num_runs? A later patch checks for these “orphaned_tasks” at scheduler startup.
In other words, can you provide some more information? :-)

Bolke

> On 7 Sep 2016, at 20:08, Jeff Balogh <[email protected]> wrote:
>
> Ah yep, we're on
> https://github.com/apache/incubator-airflow/commits/54b361d2a.
>
> On Wed, Sep 7, 2016 at 10:13 AM, Bolke de Bruin <[email protected]> wrote:
>> Hi Jeff,
>>
>> That is kind of impossible for 1.7.1.3 as the SCHEDULED state was introduced
>> after release. Are you sure you are on 1.7.1.3 and not on master?
>>
>> Bolke
>>
>>> On 7 Sep 2016, at 18:37, Jeff Balogh <[email protected]> wrote:
>>>
>>> When we bumped to 1.7.1.3 we found that tasks would go into the new
>>> SCHEDULED state and get stuck there. We haven't determined why this
>>> happens.
>>>
>>> We put a hacky patch into our scheduler that sets state to None for
>>> any tasks that are SCHEDULED at the beginning of the schedule loop.
>>>
>>> Name: airflow
>>> Version: 1.7.1.3
>>> Name: celery
>>> Version: 3.1.23
>>> Name: kombu
>>> Version: 3.0.35
>>>
>>> redis_version:2.6.13
>>>
>>> On Sun, Sep 4, 2016 at 6:34 AM, Bolke de Bruin <[email protected]> wrote:
>>>> Hi All,
>>>>
>>>> We have had some reports on this list and sometimes on Jira that the
>>>> scheduler sometimes seems to get stuck. I would like to track down this
>>>> issue, but until now much of the reporting has been a bit light on the
>>>> details.
>>>>
>>>> First and foremost I am assuming that getting “stuck” is only happening
>>>> when using a CeleryExecutor. To further track down the issue I would like
>>>> to know the following
>>>>
>>>> - Airflow version (pip show airflow)
>>>> - Celery version (pip show celery)
>>>> - Kombu version (pip show kombu)
>>>>
>>>> - Redis version (if applicable)
>>>> - RabbitMQ version (if applicable)
>>>>
>>>> - Sanitized airflow configuration
>>>> - Sanitized broker configuration
>>>>
>>>> If possible supply, preferably debug, logs of broker, scheduler and worker.
>>>>
>>>> Thanks!
>>>> Bolke
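
For reference, the workaround Jeff describes in the thread (setting the state to None for any task that is SCHEDULED at the beginning of the schedule loop) could be sketched roughly as below. This is only an illustration under stated assumptions, not the actual patch and not Airflow's code: it uses the standard-library sqlite3 module, a simplified hypothetical task_instance table, and a made-up function name, reset_scheduled_tasks.

```python
import sqlite3

# Assumption: the state is stored as the lowercase string "scheduled".
SCHEDULED = "scheduled"


def reset_scheduled_tasks(conn):
    """Clear the state of every task instance that is currently SCHEDULED.

    Returns the number of rows that were reset. Hypothetical helper,
    meant to be called once at the start of each scheduler loop.
    """
    cur = conn.execute(
        "UPDATE task_instance SET state = NULL WHERE state = ?",
        (SCHEDULED,),
    )
    conn.commit()
    return cur.rowcount


def make_demo_db():
    """Build an in-memory database with a simplified, hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE task_instance (task_id TEXT, dag_id TEXT, state TEXT)"
    )
    conn.executemany(
        "INSERT INTO task_instance VALUES (?, ?, ?)",
        [
            ("extract", "demo_dag", "scheduled"),
            ("transform", "demo_dag", "scheduled"),
            ("load", "demo_dag", "running"),
        ],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    print("tasks reset:", reset_scheduled_tasks(conn))
```

Note that a blanket reset like this also touches tasks that were legitimately scheduled a moment ago, which is presumably why it was called hacky; checking for orphaned tasks only at scheduler startup is a narrower approach.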
