We have max runs set and still hit this. Our solution is dumber: we monitor the scheduler's log output and kill the scheduler if it stops emitting. Works like a charm.
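
A minimal sketch of a watchdog along those lines, in case it helps. It is an illustration under assumptions, not the exact setup described above: the function names, log path, pid and thresholds are all hypothetical, the log file's modification time is taken as the signal for "still emitting", and some supervisor (systemd, supervisord or similar) is assumed to restart the scheduler once it has been killed.

```python
import os
import signal
import time


def seconds_since_last_write(log_path, now=None):
    """Return how many seconds ago the log file was last modified."""
    now = time.time() if now is None else now
    return now - os.path.getmtime(log_path)


def is_stalled(log_path, max_silence_s, now=None):
    """True if the log has been silent for longer than max_silence_s."""
    return seconds_since_last_write(log_path, now) > max_silence_s


def watch(log_path, pid, max_silence_s=300, poll_s=30):
    """Poll the log and send SIGTERM to pid once the log goes quiet.

    Assumption: an external supervisor restarts the scheduler afterwards;
    this function only kills the process and then returns.
    """
    while True:
        if is_stalled(log_path, max_silence_s):
            os.kill(pid, signal.SIGTERM)
            return
        time.sleep(poll_s)
```

It would be started next to the scheduler with something like watch("/path/to/scheduler.log", scheduler_pid), where both arguments are placeholders for your own deployment. The silence threshold should be comfortably longer than the longest quiet period a healthy scheduler produces, otherwise the watchdog will kill a working process.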
> On Mar 24, 2017, at 5:50 PM, F. Hakan Koklu <fhakan.ko...@gmail.com> wrote:
>
> Some solutions to this problem is restarting the scheduler frequently or
> some sort of monitoring on the scheduler. We have set up a dag that pings
> cronitor <https://cronitor.io/> (a dead man's snitch type of service) every
> 10 minutes and the snitch pages you when the scheduler dies and does not
> send a ping to it.
>
> On Fri, Mar 24, 2017 at 1:49 PM, Andrew Phillips <aphill...@qrmedia.com>
> wrote:
>
>> We use celery and run into it from time to time.
>>>
>>
>> Bang goes my theory ;-) At least, assuming it's the same underlying
>> cause...
>>
>> Regards
>>
>> ap
>>