I also like the "stop" idea. Let me also partly answer my own question and explain the current behaviour.

We know that if you use systemd or similar (or simply run airflow in a terminal and press ^C), the webserver and scheduler are killed nicely. But I think we miss the case where you want to kill the webserver process itself using its pid (even if we handle the --pid option).

Not everyone knows this, but pressing ^C actually sends the INT signal to the whole foreground process group, not just to the main process. This surprises many people, even those who know how signals work in Unix, so I wanted to mention it here. You can read more about it here:
https://unix.stackexchange.com/questions/149741/why-is-sigint-not-propagated-to-child-process-when-sent-to-its-parent-process

Systemd uses the "control-group" KillMode, which basically does the same thing (it signals every process in the unit's control group). That is why the systemd integration works well for airflow.

But if you start the webserver/scheduler manually in -D mode, even with a --pid file specified, and then run kill -INT <webserver pid> or kill -INT <scheduler pid>, only the main process is killed (as long as we do not propagate the signal). The child processes are reparented to init and continue running.

I looked briefly at the code and, unless I missed something, it seems that in -D mode we are not setting our own signal handlers. In interactive mode we set signal handlers that simply do sys.exit(0). I wonder if others know, or have looked in the past, how this is done and have some thoughts about it.

One way we could improve it (it worked for me in the past): the webserver/scheduler could start all their processes in their own new process group and propagate all signals to that group before handling them. That would work nicely in both interactive and daemon mode. Both the systemd integration and manually sending a signal to the webserver/scheduler would then kill all the processes spawned by the webserver/scheduler.

Let me know what you think about it.

J.

On Sat, Jan 4, 2020 at 12:38 PM Kaxil Naik <[email protected]> wrote:

> That is a good idea I think.
>
> On Sat, Jan 4, 2020 at 11:33 AM Tomasz Urbaszek <[email protected]> wrote:
>
> > From some time I think about adding "stop" commands like "airflow scheduler
> > stop", "airflow celery worker stop".
> > What do you think? I have already done this in native executor POC and it's
> > helpful.
> >
> > T.
> >
> > On Sat, Jan 4, 2020 at 12:22 PM Kaxil Naik <[email protected]> wrote:
> >
> > > Systemd integrations have worked nicely for me:
> > > https://airflow.apache.org/docs/stable/howto/run-with-systemd.html
> > >
> > > On Sat, Jan 4, 2020 at 11:01 AM Jarek Potiuk <[email protected]> wrote:
> > >
> > > > I would like to bring the subject from user@ group
> > > >
> > > > https://lists.apache.org/thread.html/5add5e8a19cb86ef2141d9d0634bd01c12d74a7655c4eddfa7b8e75a%40%3Cusers.airflow.apache.org%3E
> > > >
> > > > Seems some people have problems with nicely killing airflow
> > > > scheduler/webserver with signals and I was wondering if this already
> > > > implemented/or someone has some insight/experience with it and can share
> > > > thoughts about it, before we dig deeper?
> > > >
> > > > I know Tomek had recently some experience with killing workers nicely and
> > > > is looking at it, but I think it would be great to have working and
> > > > described scheduler/webserver killing scenarios - which signals work, how
> > > > threads/processes behave when the signals are received etc.
> > > >
> > > > Does anyone have any insight into it ?
> > > >
> > > > J.
> > > > --
> > > >
> > > > Jarek Potiuk
> > > > Polidea <https://www.polidea.com/> | Principal Software Engineer
> > > >
> > > > M: +48 660 796 129 <+48660796129>
> > > > [image: Polidea] <https://www.polidea.com/>

--
Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer
M: +48 660 796 129 <+48660796129>
[image: Polidea] <https://www.polidea.com/>
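
For illustration, the process-group idea from the message above could look roughly like the Python sketch below. This is not Airflow code: the names become_group_leader, forward_to_group and supervise are hypothetical, the choice of SIGINT and SIGTERM is an assumption, and the sketch only shows the mechanism (make the main process a process group leader, then re-send any received signal to the whole group with os.killpg before finishing).

```python
import os
import signal
import subprocess
import sys

# Assumption: these are the signals worth forwarding.
FORWARDED_SIGNALS = (signal.SIGINT, signal.SIGTERM)


def become_group_leader():
    """Make this process the leader of its own process group.

    Skipped when we already lead a group (e.g. a job started from an
    interactive shell, or a daemon that already called setsid()).
    """
    if os.getpgrp() != os.getpid():
        os.setpgid(0, 0)


def forward_to_group(signum, frame):
    """Re-send a received signal to every process in our group."""
    # killpg() also delivers the signal to this process, so ignore the
    # forwarded signals here first to avoid signalling ourselves in a loop.
    for sig in FORWARDED_SIGNALS:
        signal.signal(sig, signal.SIG_IGN)
    os.killpg(os.getpgrp(), signum)


def supervise(commands):
    """Spawn the commands in our group and wait for all of them.

    Returns the list of child return codes (negative means "killed by
    that signal number").
    """
    become_group_leader()
    for sig in FORWARDED_SIGNALS:
        signal.signal(sig, forward_to_group)
    # Children inherit our process group, so killpg() reaches them.
    children = [subprocess.Popen(cmd) for cmd in commands]
    return [child.wait() for child in children]
```

With something like this, a launcher that calls supervise([...]) and later receives kill -TERM <launcher pid> should take its children down with it instead of leaving them to be reparented to init. The sketch does not cover a signal that arrives while the children are still being spawned, or children that ignore the forwarded signal; a real implementation would need to handle both.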
