> Hi there,
>
> It happened again today, so I tried to capture some more debug info.
>
> Here are the logs from the emperor when I try to reload the vassal:
>
> 2014-03-10 12:19:28 +0000 EMPEROR - [emperor] kill: No such process
> [core/emperor.c line 1699]
> 2014-03-10 12:19:31 +0000 EMPEROR - emperor_respawn/write(): Broken
> pipe [core/emperor.c line 656]
> 2014-03-10 12:19:31 +0000 EMPEROR - [emperor] reload the uwsgi
> instance redacted.pythonanywhere.com.ini
> 2014-03-10 12:19:31 +0000 EMPEROR - [emperor] kill: No such process
> [core/emperor.c line 1699]
> 2014-03-10 12:19:34 +0000 EMPEROR - [emperor] kill: No such process
> [core/emperor.c line 1699]
> 2014-03-10 12:19:37 +0000 EMPEROR - [emperor] kill: No such process
> [core/emperor.c line 1699]
>
> You can see the "No such process" error keeps recurring, every three
> seconds.
>
> Here are the logs from the vassal's server log:
>
> 2014-03-10 11:58:51 VACUUM: unix socket
> /var/sockets/redacted.pythonanywhere.com/socket removed.
> 2014-03-10 11:58:53 *** Starting uWSGI 2.0 (64bit) on [Mon Mar 10
> 11:58:52 2014] ***
> 2014-03-10 11:58:53 compiled with version: 4.8.1 on 07 February 2014
> 19:06:17
> 2014-03-10 11:58:53 os: Linux-3.11.0-15-generic #25-Ubuntu SMP Thu
> Jan 30 17:22:01 UTC 2014
> 2014-03-10 11:58:53 nodename: giles-liveweb2
> 2014-03-10 11:58:53 machine: x86_64
> 2014-03-10 11:58:53 clock source: unix
> 2014-03-10 11:58:53 pcre jit disabled
> 2014-03-10 11:58:53 detected number of CPU cores: 4
> 2014-03-10 11:58:53 current working directory: /etc/uwsgi/vassals
> 2014-03-10 11:58:53 detected binary path: /usr/local/bin/uwsgi
> 2014-03-10 11:58:53 using Linux cgroup
> /mnt/cgroups/cpu/user_types/free with mode 700
> 2014-03-10 11:58:53 assigned process 16789 to cgroup
> /mnt/cgroups/cpu/user_types/free/tasks
> 2014-03-10 11:58:53 using Linux cgroup
> /mnt/cgroups/cpuacct/users/Redacted with mode 700
> 2014-03-10 11:58:53 assigned process 16789 to cgroup
> /mnt/cgroups/cpuacct/users/Redacted/tasks
> 2014-03-10 11:58:53 using Linux cgroup
> /mnt/cgroups/memory/user_types/free with mode 700
> 2014-03-10 11:58:53 assigned process 16789 to cgroup
> /mnt/cgroups/memory/user_types/free/tasks
> 2014-03-10 11:58:53 uWSGI running as root, you can use
> --uid/--gid/--chroot options
> 2014-03-10 11:58:53 chroot() to /mnt/chroots/Redacted
> 2014-03-10 11:58:53 setgid() to 60000
> 2014-03-10 11:58:53 setuid() to 231762
> 2014-03-10 11:58:53 limiting number of processes to 64...
> 2014-03-10 11:58:53 your processes number limit is 64
> 2014-03-10 11:58:53 your memory page size is 4096 bytes
> 2014-03-10 11:58:53 detected max file descriptor number: 123456
> 2014-03-10 11:58:53 building mime-types dictionary from file
> /etc/mime.types...
> 2014-03-10 11:58:53 536 entry found
> 2014-03-10 11:58:53 lock engine: pthread robust mutexes
> 2014-03-10 11:58:53 thunder lock: disabled (you can enable it with
> --thunder-lock)
> 2014-03-10 11:58:53 uwsgi socket 0 bound to UNIX address
> /var/sockets/redacted.pythonanywhere.com/socket fd 7
> 2014-03-10 11:58:53 Python version: 2.7.5+ (default, Sep 19 2013,
> 13:52:09) [GCC 4.8.1]
> 2014-03-10 11:58:53 *** Python threads support is disabled. You can
> enable it with --enable-threads ***
> 2014-03-10 11:58:53 Python main interpreter initialized at 0x1021bb0
> 2014-03-10 11:58:53 your server socket listen backlog is limited to
> 100 connections
> 2014-03-10 11:58:53 your mercy for graceful operations on workers is
> 60 seconds
> 2014-03-10 11:58:53 setting request body buffering size to 65536 bytes
> 2014-03-10 11:58:53 mapped 333936 bytes (326 KB) for 1 cores
> 2014-03-10 11:58:53 *** Operational MODE: single process ***
> 2014-03-10 11:58:53 WSGI app 0 (mountpoint='') ready in 1 seconds on
> interpreter 0x1021bb0 pid: 16789 (default app)
> 2014-03-10 11:58:53 *** uWSGI is running in multiple interpreter
> mode ***
> 2014-03-10 11:58:53 spawned uWSGI master process (pid: 16789)
> 2014-03-10 11:58:53 spawned uWSGI worker 1 (pid: 16790, cores: 1)
> 2014-03-10 11:58:53 spawned 2 offload threads for uWSGI worker 1
> 2014-03-10 11:58:57 announcing my loyalty to the Emperor...
> 2014-03-10 12:01:14 Mon Mar 10 12:01:14 2014 - received message 0
> from emperor
> 2014-03-10 12:01:14 SIGINT/SIGQUIT received...killing workers...
> 2014-03-10 12:01:15 worker 1 buried after 1 seconds
> 2014-03-10 12:01:15 goodbye to uWSGI.
> 2014-03-10 12:01:15 chdir(): No such file or directory [core/uwsgi.c
> line 1472]
> 2014-03-10 12:01:15 VACUUM: unix socket
> /var/sockets/redacted.pythonanywhere.com/socket removed.
>
> You'll notice these logs are from an earlier reload. Later reloads don't
> seem to log anything at all.
>
> And here is the vassal config:
>
> [uwsgi]
> plugins = python27
> uid = 231762
> gid = 60000
>
> if-not-exists = /mnt/chroots/Redacted/bin/ls
> exec-pre-jail = python /home/anywhere/django/anywhere/jails/create.py Redacted
> endif =
> chroot = /mnt/chroots/Redacted
> limit-nproc = 64
> # shutdown app (but not master) after 26hrs of no hits
> idle=93600
> # kill any requests that take too long to process
> harakiri = 300
> buffer-size = 32768
> post-buffering = 65536
> vacuum =
> # chrooted master cannot reload itself, so just exit
> exit-on-reload = true
> # file lock prevents respawning vassals from racing dying ones
> flock = %p
>
> log-encoder = format redacted.pythonanywhere.com ${strftime:%%F %%T} ${msg}
> logger = rsyslog:10.124.106.197:10515,uwsgi,142
>
> workers = 1
> cgroup = /mnt/cgroups/cpu/user_types/free
> cgroup = /mnt/cgroups/cpuacct/users/Redacted
> cgroup = /mnt/cgroups/memory/user_types/free
>
> auto-procname
> procname-prefix-spaced = Redacted Redacted.pythonanywhere.com
> disable-logging = true
>
> check-static=/var/www/static
>
> static-map = /static/admin/=/home/Redacted/.virtualenvs/django16/lib/python2.7/site-packages/django/contrib/admin/static/admin
>
> static-index = index.html
> offload-threads = 2
>
> touch-reload = /var/www/redacted_pythonanywhere_com_wsgi.py
> socket = /var/sockets/redacted.pythonanywhere.com/socket
> chmod-socket = 666
> chdir = /var/www
> env = HOST_NAME=redacted.pythonanywhere.com
> env = WSGI_MODULE=redacted_pythonanywhere_com_wsgi
>
> env = no_proxy=localhost,127.0.0.1,localaddress,.localdomain.com
>
> env = HOME=/home/Redacted
>
> env = http_proxy=http://proxy.server:3128
>
> env = PYENCHANT_LIBRARY_PATH=/usr/lib/libenchant.so.1
>
> env = https_proxy=http://proxy.server:3128
>
> env = PATH=/home/Redacted/.local/bin:/usr/local/bin:/usr/bin:/bin
> unenv = UWSGI_EMPEROR_FD
> unenv = SHLVL
> unenv = SSH_TTY
> unenv = PWD
> unenv = UWSGI_RELOADS
> unenv = SSH_CLIENT
> unenv = LOGNAME
> unenv = UWSGI_ORIGINAL_PROC_NAME
> unenv = MAIL
> unenv = SSH_CONNECTION
> unenv = _
>
> file = /bin/user_wsgi_wrapper.py
>
>
> I've checked the stats server, there aren't any vassals in the blacklist.
>
>
> Bouncing uWSGI fixes the problem, but obviously it involves downtime, so
> we'd rather avoid it if possible.
>
>
Hi Harry, there is no need to bounce uWSGI. Basically, your vassal is not
removed from the Emperor's linked list because the process mapped to it is
no longer available (hard to say why). Removing the file from the vassal
dir (well, renaming it to .off) should be enough.
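
For illustration only, here is a minimal Python sketch of that rename
workaround. The helper names (park_vassal, unpark_vassal) are hypothetical,
and the sketch assumes that appending ".off" to the full file name is enough
for the Emperor to stop recognising the file as a vassal config:

```python
import os
from pathlib import Path


def park_vassal(config_path):
    """Rename e.g. site.ini to site.ini.off (assumed to hide it from the Emperor)."""
    src = Path(config_path)
    dst = src.with_name(src.name + ".off")
    os.rename(src, dst)
    return dst


def unpark_vassal(parked_path):
    """Undo park_vassal(): strip the trailing .off so the config is seen again."""
    src = Path(parked_path)
    if src.suffix != ".off":
        raise ValueError("not a parked vassal config: %s" % src)
    dst = src.with_suffix("")  # drops only the final ".off"
    os.rename(src, dst)
    return dst
```

Parking the file and then unparking it once the Emperor has dropped the
stale entry should have the same effect as doing the two renames by hand.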
By the way, the latest code improves that corner case too:
https://github.com/unbit/uwsgi/commit/c118c75bfe5ed6b26668aa48ae076dddcf31a5b9
Basically, if killing the process is not possible, the memory area is
removed from the list (so the vassal can be restarted). If for some reason
the pid has changed, you will get a zombie, but the master will clear it
sooner or later.
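
As a rough illustration of that behaviour (this is not the C code from the
commit, just a Python sketch of the idea), the logic amounts to "if the
signal cannot be delivered, forget the entry". The dict-based vassal table
and the function name signal_or_forget are hypothetical, and the sketch only
handles the "No such process" case seen in the log:

```python
import os
import signal


def signal_or_forget(vassals, name, sig=signal.SIGTERM, kill=os.kill):
    """Send sig to a tracked vassal; drop the entry if its pid no longer exists.

    vassals is a hypothetical {config name: pid} table standing in for the
    Emperor's linked list. Returns True if the signal was delivered.
    """
    pid = vassals[name]
    try:
        kill(pid, sig)
    except ProcessLookupError:
        # "kill: No such process": the entry is stale, so remove it,
        # which lets the vassal be spawned again from its config.
        del vassals[name]
        return False
    return True
```

The kill parameter is there only so the function can be exercised without
signalling real processes.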
If you use the pid namespace (this is very easy, just add
emperor-use-clone = pid to your emperor config), you can be sure that once
the vassal master is dead no user processes are left (not even daemons
your customers may have spawned), as the master is the new init for the
vassal.
Let me know
--
Roberto De Ioris
http://unbit.it
_______________________________________________
uWSGI mailing list
[email protected]
http://lists.unbit.it/cgi-bin/mailman/listinfo/uwsgi