Alexandre talked about this being a known issue at least as far back as 10 months ago.

On Thu, 1 Sep 2016, 21:46 Bolke de Bruin, <[email protected]> wrote:

> Again please create a jira and add as much info as possible. Including
> debug logs, executor logs, broker logs. If possible database dump.
>
> Note airflow version, celery version, rabbitmq/redis etc. provide config
> details.
>
> We really need more info to hunt this down as it has been quite elusive.
> And I/we have not been able to replicate it.
>
> Bolke
>
> Sent from my iPhone
>
> > On 1 sep. 2016, at 20:45, [email protected] wrote:
> >
> > We face exactly the same issue...
> > I tried to describe it here this week,
> > But no one had a solution.
> >
> > On 1 Sep 2016, at 17:54, Sergei Iakhnin <[email protected]> wrote:
> >
> >> As far as I know even Airbnb themselves restart their schedulers
> >> every 30 minutes because of this issue. I ended up doing it as well
> >> with a cron job after giving up hope that it would be fixed in the
> >> short term.
> >>
> >>> On Thu, 1 Sep 2016, 16:03 Charalampos Paravalos, <[email protected]> wrote:
> >>>
> >>> Hi,
> >>>
> >>> I am writing to ask for advice on an issue that I have with airflow
> >>> and until now have not managed to resolve. Wondering if someone else
> >>> had something similar in the past.
> >>>
> >>> So, we use airflow to schedule DAGs that will run some jobs
> >>> periodically (every 30min/1hr). Jobs run as normal etc., but there
> >>> are some times that suddenly after DAGs are finished, the next
> >>> scheduled jobs do not start at all. It seems like the server does
> >>> not kick off the scheduled jobs at all, for any of the DAGs defined
> >>> (so no jobs are running on our server). When that happens I have to
> >>> restart the scheduler so jobs are kicked off automatically after
> >>> restart. And the jobs run until this issue appears again (I noticed
> >>> it happening every 1 or 2 days, it is quite often).
> >>>
> >>> This is very strange, tried to upgrade to 1.7.1.3 version but still
> >>> that issue is here. We use 32 concurrent jobs with celery workers,
> >>> the server is able to manage the load well.
> >>>
> >>> I believe it has to do with the scheduler, but can't understand why.
> >>> Backfilled jobs maybe? Can this be?
> >>>
> >>> I am looking forward to hearing back from someone that has any ideas.
> >>> Please let me know what information you might need about my setup
> >>> anytime.
> >>>
> >>> Thanks for your help!
> >>>
> >>> Regards,
> >>> Babis
> >>
> >> --
> >>
> >> Sergei
>
--
Sergei
