> I don't think it's a problem with that particular app - it was basically
> a vanilla django install - and it works fine after the restart.
>
> the real problem is that it cascades. once one vassal starts to
> experience the problem, then any new vassals created from that point on,
> or any restarted, also start to see problems...
>
> just happened again :/

One thing I fear is not clear: when you say that every new vassal created or restarted sees the problem, do you mean that you get the kill() error, or that you simply do not see anything in the logs?

>
>
> --
> Harry Percival
> Developer
> [email protected]
>
> PythonAnywhere - a fully browser-based Python development and hosting
> environment
> <http://www.pythonanywhere.com/>
>
> PythonAnywhere LLP
> 17a Clerkenwell Road, London EC1M 5RD, UK
> VAT No.: GB 893 5643 79
> Registered in England and Wales as company number OC378414.
> Registered address: 28 Ely Place, 3rd Floor, London EC1N 6TD, UK
>
> On 10/03/14 13:56, Roberto De Ioris wrote:
>>> Hi there,
>>>
>>> Happened again today, I tried to snapshot some more debug info:
>>>
>>> here are the logs from the emperor, when i try to reload the vassal:
>>>
>>> 2014-03-10 12:19:28 +0000 EMPEROR - [emperor] kill: No such process [core/emperor.c line 1699]
>>> 2014-03-10 12:19:31 +0000 EMPEROR - emperor_respawn/write(): Broken pipe [core/emperor.c line 656]
>>> 2014-03-10 12:19:31 +0000 EMPEROR - [emperor] reload the uwsgi instance redacted.pythonanywhere.com.ini
>>> 2014-03-10 12:19:31 +0000 EMPEROR - [emperor] kill: No such process [core/emperor.c line 1699]
>>> 2014-03-10 12:19:34 +0000 EMPEROR - [emperor] kill: No such process [core/emperor.c line 1699]
>>> 2014-03-10 12:19:37 +0000 EMPEROR - [emperor] kill: No such process [core/emperor.c line 1699]
>>>
>>> You can see the "no such process" error keeps happening, every couple of
>>> seconds
>>>
>>> here are the logs from the vassal server log:
>>>
>>> 2014-03-10 11:58:51 VACUUM: unix socket /var/sockets/redacted.pythonanywhere.com/socket removed.
>>> 2014-03-10 11:58:53 *** Starting uWSGI 2.0 (64bit) on [Mon Mar 10 11:58:52 2014] ***
>>> 2014-03-10 11:58:53 compiled with version: 4.8.1 on 07 February 2014 19:06:17
>>> 2014-03-10 11:58:53 os: Linux-3.11.0-15-generic #25-Ubuntu SMP Thu Jan 30 17:22:01 UTC 2014
>>> 2014-03-10 11:58:53 nodename: giles-liveweb2
>>> 2014-03-10 11:58:53 machine: x86_64
>>> 2014-03-10 11:58:53 clock source: unix
>>> 2014-03-10 11:58:53 pcre jit disabled
>>> 2014-03-10 11:58:53 detected number of CPU cores: 4
>>> 2014-03-10 11:58:53 current working directory: /etc/uwsgi/vassals
>>> 2014-03-10 11:58:53 detected binary path: /usr/local/bin/uwsgi
>>> 2014-03-10 11:58:53 using Linux cgroup /mnt/cgroups/cpu/user_types/free with mode 700
>>> 2014-03-10 11:58:53 assigned process 16789 to cgroup /mnt/cgroups/cpu/user_types/free/tasks
>>> 2014-03-10 11:58:53 using Linux cgroup /mnt/cgroups/cpuacct/users/Redacted with mode 700
>>> 2014-03-10 11:58:53 assigned process 16789 to cgroup /mnt/cgroups/cpuacct/users/Redacted/tasks
>>> 2014-03-10 11:58:53 using Linux cgroup /mnt/cgroups/memory/user_types/free with mode 700
>>> 2014-03-10 11:58:53 assigned process 16789 to cgroup /mnt/cgroups/memory/user_types/free/tasks
>>> 2014-03-10 11:58:53 uWSGI running as root, you can use --uid/--gid/--chroot options
>>> 2014-03-10 11:58:53 chroot() to /mnt/chroots/Redacted
>>> 2014-03-10 11:58:53 setgid() to 60000
>>> 2014-03-10 11:58:53 setuid() to 231762
>>> 2014-03-10 11:58:53 limiting number of processes to 64...
>>> 2014-03-10 11:58:53 your processes number limit is 64
>>> 2014-03-10 11:58:53 your memory page size is 4096 bytes
>>> 2014-03-10 11:58:53 detected max file descriptor number: 123456
>>> 2014-03-10 11:58:53 building mime-types dictionary from file /etc/mime.types...
>>> 2014-03-10 11:58:53 536 entry found
>>> 2014-03-10 11:58:53 lock engine: pthread robust mutexes
>>> 2014-03-10 11:58:53 thunder lock: disabled (you can enable it with --thunder-lock)
>>> 2014-03-10 11:58:53 uwsgi socket 0 bound to UNIX address /var/sockets/redacted.pythonanywhere.com/socket fd 7
>>> 2014-03-10 11:58:53 Python version: 2.7.5+ (default, Sep 19 2013, 13:52:09) [GCC 4.8.1]
>>> 2014-03-10 11:58:53 *** Python threads support is disabled. You can enable it with --enable-threads ***
>>> 2014-03-10 11:58:53 Python main interpreter initialized at 0x1021bb0
>>> 2014-03-10 11:58:53 your server socket listen backlog is limited to 100 connections
>>> 2014-03-10 11:58:53 your mercy for graceful operations on workers is 60 seconds
>>> 2014-03-10 11:58:53 setting request body buffering size to 65536 bytes
>>> 2014-03-10 11:58:53 mapped 333936 bytes (326 KB) for 1 cores
>>> 2014-03-10 11:58:53 *** Operational MODE: single process ***
>>> 2014-03-10 11:58:53 WSGI app 0 (mountpoint='') ready in 1 seconds on interpreter 0x1021bb0 pid: 16789 (default app)
>>> 2014-03-10 11:58:53 *** uWSGI is running in multiple interpreter mode ***
>>> 2014-03-10 11:58:53 spawned uWSGI master process (pid: 16789)
>>> 2014-03-10 11:58:53 spawned uWSGI worker 1 (pid: 16790, cores: 1)
>>> 2014-03-10 11:58:53 spawned 2 offload threads for uWSGI worker 1
>>> 2014-03-10 11:58:57 announcing my loyalty to the Emperor...
>>> 2014-03-10 12:01:14 Mon Mar 10 12:01:14 2014 - received message 0 from emperor
>>> 2014-03-10 12:01:14 SIGINT/SIGQUIT received...killing workers...
>>> 2014-03-10 12:01:15 worker 1 buried after 1 seconds
>>> 2014-03-10 12:01:15 goodbye to uWSGI.
>>> 2014-03-10 12:01:15 chdir(): No such file or directory [core/uwsgi.c line 1472]
>>> 2014-03-10 12:01:15 VACUUM: unix socket /var/sockets/redacted.pythonanywhere.com/socket removed.
>>>
>>> You'll notice the logs are from an earlier reload.
>>> later reloads don't seem to even log any more.
>>>
>>> And here is the vassal config:
>>>
>>> [uwsgi]
>>> plugins = python27
>>> uid = 231762
>>> gid = 60000
>>>
>>> if-not-exists = /mnt/chroots/Redacted/bin/ls
>>> exec-pre-jail = python /home/anywhere/django/anywhere/jails/create.py Redacted
>>> endif =
>>> chroot = /mnt/chroots/Redacted
>>> limit-nproc = 64
>>> # shutdown app (but not master) after 26hrs of no hits
>>> idle=93600
>>> # kill any requests that take too long to process
>>> harakiri = 300
>>> buffer-size = 32768
>>> post-buffering = 65536
>>> vacuum =
>>> # chrooted master cannot reload itself, so just exit
>>> exit-on-reload = true
>>> # file lock prevents respawning vassals from racing dying ones
>>> flock = %p
>>>
>>> log-encoder = format redacted.pythonanywhere.com ${strftime:%%F %%T} ${msg}
>>> logger = rsyslog:10.124.106.197:10515,uwsgi,142
>>>
>>> workers = 1
>>> cgroup = /mnt/cgroups/cpu/user_types/free
>>> cgroup = /mnt/cgroups/cpuacct/users/Redacted
>>> cgroup = /mnt/cgroups/memory/user_types/free
>>>
>>> auto-procname
>>> procname-prefix-spaced = Redacted Redacted.pythonanywhere.com
>>> disable-logging = true
>>>
>>> check-static=/var/www/static
>>>
>>> static-map = /static/admin/=/home/Redacted/.virtualenvs/django16/lib/python2.7/site-packages/django/contrib/admin/static/admin
>>>
>>> static-index = index.html
>>> offload-threads = 2
>>>
>>> touch-reload = /var/www/redacted_pythonanywhere_com_wsgi.py
>>> socket = /var/sockets/redacted.pythonanywhere.com/socket
>>> chmod-socket = 666
>>> chdir = /var/www
>>> env = HOST_NAME=redacted.pythonanywhere.com
>>> env = WSGI_MODULE=redacted_pythonanywhere_com_wsgi
>>>
>>> env = no_proxy=localhost,127.0.0.1,localaddress,.localdomain.com
>>>
>>> env = HOME=/home/Redacted
>>>
>>> env = http_proxy=http://proxy.server:3128
>>>
>>> env = PYENCHANT_LIBRARY_PATH=/usr/lib/libenchant.so.1
>>>
>>> env = https_proxy=http://proxy.server:3128
>>>
>>> env = PATH=/home/Redacted/.local/bin:/usr/local/bin:/usr/bin:/bin
>>> unenv = UWSGI_EMPEROR_FD
>>> unenv = SHLVL
>>> unenv = SSH_TTY
>>> unenv = PWD
>>> unenv = UWSGI_RELOADS
>>> unenv = SSH_CLIENT
>>> unenv = LOGNAME
>>> unenv = UWSGI_ORIGINAL_PROC_NAME
>>> unenv = MAIL
>>> unenv = SSH_CONNECTION
>>> unenv = _
>>>
>>> file = /bin/user_wsgi_wrapper.py
>>>
>>>
>>> I've checked the stats server, there aren't any vassals in the
>>> blacklist.
>>>
>>>
>>> Bouncing UWSGI fixes the problem, but obviously it involves downtime, so
>>> we'd rather avoid it if poss.
>>>
>>>
>>
>> Hi Harry, do not do it, basically your vassal is not removed from the
>> linked list as the process mapped to it is no longer available (hard to say
>> the reason). Removing the file (well rename it to .off) from the vassal
>> dir should be enough.
>>
>> By the way, latest code improved that corner case too:
>>
>> https://github.com/unbit/uwsgi/commit/c118c75bfe5ed6b26668aa48ae076dddcf31a5b9
>>
>>
>> basically if killing the process is not possible the memory area is
>> removed from the list (so it can be restarted). If for some reason the pid
>> is changed, you will get a zombie, but the master will clear it sooner or
>> later.
>>
>> If you use the pid namespace (this is very easy, just add
>> emperor-use-clone = pid in your emperor config) you can be sure that once
>> the vassal master is dead no more user processes (even the daemons
>> eventually spawned by your customers) are left (as the master is the new
>> init for the vassal)
>>
>>
>> Let me know
>>
>>
>
> _______________________________________________
> uWSGI mailing list
> [email protected]
> http://lists.unbit.it/cgi-bin/mailman/listinfo/uwsgi
>

-- 
Roberto De Ioris
http://unbit.it
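
For reference, the workaround quoted above (rename the vassal's config to .off instead of bouncing the whole Emperor) could be scripted roughly as follows. This is only a sketch under assumptions: the helper names park_vassal, unpark_vassal and bounce_vassal are made up for illustration, renaming the file back afterwards is assumed to make the Emperor spawn the vassal again, and the wait time is a guess that should be longer than the Emperor's scan interval.

```python
import os
import time


def park_vassal(config_path, suffix=".off"):
    """Rename a vassal config so it no longer looks like a vassal file.

    Returns the new (parked) path. The ".off" suffix follows the
    suggestion in the thread; any non-vassal extension should do.
    """
    parked = config_path + suffix
    os.rename(config_path, parked)
    return parked


def unpark_vassal(parked_path, suffix=".off"):
    """Rename a parked config back to its original name and return it."""
    if not parked_path.endswith(suffix):
        raise ValueError("not a parked vassal config: %r" % parked_path)
    original = parked_path[: -len(suffix)]
    os.rename(parked_path, original)
    return original


def bounce_vassal(config_path, wait_seconds=5.0):
    """Park the config, give the Emperor time to notice, then restore it.

    wait_seconds is an assumption, not a documented value: it only needs
    to exceed however often the Emperor rescans its vassal directory.
    """
    parked = park_vassal(config_path)
    time.sleep(wait_seconds)
    return unpark_vassal(parked)
```

Something like bounce_vassal("/etc/uwsgi/vassals/redacted.pythonanywhere.com.ini") would then affect only the stuck vassal and leave the others running.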
