It's hard to tell whether it's the kill() error, because the entries in
the main log don't say which vassal they refer to. But those vassals
then fail to start up correctly; at least, that's how it looks from
their individual logs:
2014-03-10 15:08:06 chdir(): No such file or directory [core/uwsgi.c line 1472]
2014-03-10 15:08:06 VACUUM: unix socket /var/sockets/project003.pythonanywhere.com/socket removed.
--
Harry Percival
Developer
[email protected]
PythonAnywhere - a fully browser-based Python development and hosting
environment
<http://www.pythonanywhere.com/>
PythonAnywhere LLP
17a Clerkenwell Road, London EC1M 5RD, UK
VAT No.: GB 893 5643 79
Registered in England and Wales as company number OC378414.
Registered address: 28 Ely Place, 3rd Floor, London EC1N 6TD, UK
On 10/03/14 15:14, Roberto De Ioris wrote:
I don't think it's a problem with that particular app - it was basically
a vanilla django install - and it works fine after the restart.
The real problem is that it cascades: once one vassal starts to
experience the problem, any new vassals created from that point on,
or any that are restarted, also start to see problems...
Just happened again :/
One thing I fear is not clear: when you say that every new vassal
created/restarted sees the problem, do you mean that you get the kill()
error, or that you simply do not see anything in the logs?
On 10/03/14 13:56, Roberto De Ioris wrote:
Hi there,
It happened again today, and I tried to snapshot some more debug info.
Here are the logs from the emperor, when I try to reload the vassal:
2014-03-10 12:19:28 +0000 EMPEROR - [emperor] kill: No such process [core/emperor.c line 1699]
2014-03-10 12:19:31 +0000 EMPEROR - emperor_respawn/write(): Broken pipe [core/emperor.c line 656]
2014-03-10 12:19:31 +0000 EMPEROR - [emperor] reload the uwsgi instance redacted.pythonanywhere.com.ini
2014-03-10 12:19:31 +0000 EMPEROR - [emperor] kill: No such process [core/emperor.c line 1699]
2014-03-10 12:19:34 +0000 EMPEROR - [emperor] kill: No such process [core/emperor.c line 1699]
2014-03-10 12:19:37 +0000 EMPEROR - [emperor] kill: No such process [core/emperor.c line 1699]
You can see the "No such process" error keeps recurring, roughly every
3 seconds.
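To put a number on that interval, the timestamps of the kill errors can
be pulled out of the emperor log. The following is a minimal Python
sketch, assuming the log has been unwrapped to one entry per line. The
function names and the regular expression are illustrative only; they
are not part of uWSGI.

```python
import re
from datetime import datetime

# Sample entries copied from the emperor log in this message (one entry per line).
SAMPLE = """\
2014-03-10 12:19:28 +0000 EMPEROR - [emperor] kill: No such process [core/emperor.c line 1699]
2014-03-10 12:19:31 +0000 EMPEROR - emperor_respawn/write(): Broken pipe [core/emperor.c line 656]
2014-03-10 12:19:31 +0000 EMPEROR - [emperor] reload the uwsgi instance redacted.pythonanywhere.com.ini
2014-03-10 12:19:31 +0000 EMPEROR - [emperor] kill: No such process [core/emperor.c line 1699]
2014-03-10 12:19:34 +0000 EMPEROR - [emperor] kill: No such process [core/emperor.c line 1699]
2014-03-10 12:19:37 +0000 EMPEROR - [emperor] kill: No such process [core/emperor.c line 1699]
"""

# Matches only the "kill: No such process" entries and captures their timestamp.
KILL_ERROR = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \S+ EMPEROR - "
    r"\[emperor\] kill: No such process"
)


def kill_error_times(log_text):
    """Return the timestamps of all kill errors found in log_text."""
    times = []
    for line in log_text.splitlines():
        match = KILL_ERROR.match(line)
        if match:
            times.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S"))
    return times


def intervals_seconds(times):
    """Return the gaps, in whole seconds, between consecutive timestamps."""
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


times = kill_error_times(SAMPLE)
gaps = intervals_seconds(times)
print(len(times), "kill errors, gaps in seconds:", gaps)
```

Run against a full emperor log, the same two functions would show
whether the errors stay on a steady cadence or speed up as more vassals
are affected.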
Here are the entries from the vassal's server log:
2014-03-10 11:58:51 VACUUM: unix socket /var/sockets/redacted.pythonanywhere.com/socket removed.
2014-03-10 11:58:53 *** Starting uWSGI 2.0 (64bit) on [Mon Mar 10 11:58:52 2014] ***
2014-03-10 11:58:53 compiled with version: 4.8.1 on 07 February 2014 19:06:17
2014-03-10 11:58:53 os: Linux-3.11.0-15-generic #25-Ubuntu SMP Thu Jan 30 17:22:01 UTC 2014
2014-03-10 11:58:53 nodename: giles-liveweb2
2014-03-10 11:58:53 machine: x86_64
2014-03-10 11:58:53 clock source: unix
2014-03-10 11:58:53 pcre jit disabled
2014-03-10 11:58:53 detected number of CPU cores: 4
2014-03-10 11:58:53 current working directory: /etc/uwsgi/vassals
2014-03-10 11:58:53 detected binary path: /usr/local/bin/uwsgi
2014-03-10 11:58:53 using Linux cgroup /mnt/cgroups/cpu/user_types/free with mode 700
2014-03-10 11:58:53 assigned process 16789 to cgroup /mnt/cgroups/cpu/user_types/free/tasks
2014-03-10 11:58:53 using Linux cgroup /mnt/cgroups/cpuacct/users/Redacted with mode 700
2014-03-10 11:58:53 assigned process 16789 to cgroup /mnt/cgroups/cpuacct/users/Redacted/tasks
2014-03-10 11:58:53 using Linux cgroup /mnt/cgroups/memory/user_types/free with mode 700
2014-03-10 11:58:53 assigned process 16789 to cgroup /mnt/cgroups/memory/user_types/free/tasks
2014-03-10 11:58:53 uWSGI running as root, you can use --uid/--gid/--chroot options
2014-03-10 11:58:53 chroot() to /mnt/chroots/Redacted
2014-03-10 11:58:53 setgid() to 60000
2014-03-10 11:58:53 setuid() to 231762
2014-03-10 11:58:53 limiting number of processes to 64...
2014-03-10 11:58:53 your processes number limit is 64
2014-03-10 11:58:53 your memory page size is 4096 bytes
2014-03-10 11:58:53 detected max file descriptor number: 123456
2014-03-10 11:58:53 building mime-types dictionary from file /etc/mime.types...
2014-03-10 11:58:53 536 entry found
2014-03-10 11:58:53 lock engine: pthread robust mutexes
2014-03-10 11:58:53 thunder lock: disabled (you can enable it with --thunder-lock)
2014-03-10 11:58:53 uwsgi socket 0 bound to UNIX address /var/sockets/redacted.pythonanywhere.com/socket fd 7
2014-03-10 11:58:53 Python version: 2.7.5+ (default, Sep 19 2013, 13:52:09) [GCC 4.8.1]
2014-03-10 11:58:53 *** Python threads support is disabled. You can enable it with --enable-threads ***
2014-03-10 11:58:53 Python main interpreter initialized at 0x1021bb0
2014-03-10 11:58:53 your server socket listen backlog is limited to 100 connections
2014-03-10 11:58:53 your mercy for graceful operations on workers is 60 seconds
2014-03-10 11:58:53 setting request body buffering size to 65536 bytes
2014-03-10 11:58:53 mapped 333936 bytes (326 KB) for 1 cores
2014-03-10 11:58:53 *** Operational MODE: single process ***
2014-03-10 11:58:53 WSGI app 0 (mountpoint='') ready in 1 seconds on interpreter 0x1021bb0 pid: 16789 (default app)
2014-03-10 11:58:53 *** uWSGI is running in multiple interpreter mode ***
2014-03-10 11:58:53 spawned uWSGI master process (pid: 16789)
2014-03-10 11:58:53 spawned uWSGI worker 1 (pid: 16790, cores: 1)
2014-03-10 11:58:53 spawned 2 offload threads for uWSGI worker 1
2014-03-10 11:58:57 announcing my loyalty to the Emperor...
2014-03-10 12:01:14 Mon Mar 10 12:01:14 2014 - received message 0 from emperor
2014-03-10 12:01:14 SIGINT/SIGQUIT received...killing workers...
2014-03-10 12:01:15 worker 1 buried after 1 seconds
2014-03-10 12:01:15 goodbye to uWSGI.
2014-03-10 12:01:15 chdir(): No such file or directory [core/uwsgi.c line 1472]
2014-03-10 12:01:15 VACUUM: unix socket /var/sockets/redacted.pythonanywhere.com/socket removed.
You'll notice these logs are from an earlier reload; later reloads
don't even seem to log anything any more.
And here is the vassal config:
[uwsgi]
plugins = python27
uid = 231762
gid = 60000
if-not-exists = /mnt/chroots/Redacted/bin/ls
exec-pre-jail = python /home/anywhere/django/anywhere/jails/create.py Redacted
endif =
chroot = /mnt/chroots/Redacted
limit-nproc = 64
# shutdown app (but not master) after 26hrs of no hits
idle=93600
# kill any requests that take too long to process
harakiri = 300
buffer-size = 32768
post-buffering = 65536
vacuum =
# chrooted master cannot reload itself, so just exit
exit-on-reload = true
# file lock prevents respawning vassals from racing dying ones
flock = %p
log-encoder = format redacted.pythonanywhere.com ${strftime:%%F %%T} ${msg}
logger = rsyslog:10.124.106.197:10515,uwsgi,142
workers = 1
cgroup = /mnt/cgroups/cpu/user_types/free
cgroup = /mnt/cgroups/cpuacct/users/Redacted
cgroup = /mnt/cgroups/memory/user_types/free
auto-procname
procname-prefix-spaced = Redacted Redacted.pythonanywhere.com
disable-logging = true
check-static=/var/www/static
static-map = /static/admin/=/home/Redacted/.virtualenvs/django16/lib/python2.7/site-packages/django/contrib/admin/static/admin
static-index = index.html
offload-threads = 2
touch-reload = /var/www/redacted_pythonanywhere_com_wsgi.py
socket = /var/sockets/redacted.pythonanywhere.com/socket
chmod-socket = 666
chdir = /var/www
env = HOST_NAME=redacted.pythonanywhere.com
env = WSGI_MODULE=redacted_pythonanywhere_com_wsgi
env = no_proxy=localhost,127.0.0.1,localaddress,.localdomain.com
env = HOME=/home/Redacted
env = http_proxy=http://proxy.server:3128
env = PYENCHANT_LIBRARY_PATH=/usr/lib/libenchant.so.1
env = https_proxy=http://proxy.server:3128
env = PATH=/home/Redacted/.local/bin:/usr/local/bin:/usr/bin:/bin
unenv = UWSGI_EMPEROR_FD
unenv = SHLVL
unenv = SSH_TTY
unenv = PWD
unenv = UWSGI_RELOADS
unenv = SSH_CLIENT
unenv = LOGNAME
unenv = UWSGI_ORIGINAL_PROC_NAME
unenv = MAIL
unenv = SSH_CONNECTION
unenv = _
file = /bin/user_wsgi_wrapper.py
I've checked the stats server; there aren't any vassals in the
blacklist.
Bouncing uWSGI fixes the problem, but obviously it involves downtime,
so we'd rather avoid it if possible.
Hi Harry, do not do it. Basically, your vassal is not removed from the
linked list because the process mapped to it is no longer available
(hard to say why). Removing the file from the vassal dir (well,
renaming it to .off) should be enough.
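As a rough illustration of the rename approach, here is a small Python
sketch. The helper names are hypothetical, and appending ".off" to the
full file name is an assumption about what "rename it to .off" means;
the idea is only that the file should no longer end in an extension the
Emperor scans for. The demo runs in a temporary directory rather than
the real vassal dir.

```python
import tempfile
from pathlib import Path


def disable_vassal(config_path):
    """Rename e.g. foo.ini to foo.ini.off and return the new path."""
    src = Path(config_path)
    dst = src.with_name(src.name + ".off")
    src.rename(dst)
    return dst


def enable_vassal(disabled_path):
    """Undo disable_vassal(): strip the trailing .off and return the new path."""
    src = Path(disabled_path)
    if src.suffix != ".off":
        raise ValueError("not a disabled vassal config: %s" % src)
    dst = src.with_suffix("")  # drops only the final ".off"
    src.rename(dst)
    return dst


# Demo in a throwaway directory, so no real vassal config is touched.
with tempfile.TemporaryDirectory() as tmp:
    ini = Path(tmp) / "redacted.pythonanywhere.com.ini"
    ini.write_text("[uwsgi]\n")

    off = disable_vassal(ini)
    disabled_name = off.name
    ini_gone = not ini.exists()

    back = enable_vassal(off)
    restored_name = back.name
    restored_exists = back.exists()

print(disabled_name, "->", restored_name)
```

Renaming the file back afterwards should make the Emperor treat it as a
newly added vassal, which avoids bouncing the whole Emperor.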
By the way, the latest code improves that corner case too:
https://github.com/unbit/uwsgi/commit/c118c75bfe5ed6b26668aa48ae076dddcf31a5b9
Basically, if killing the process is not possible, the memory area is
removed from the list (so the vassal can be restarted). If for some
reason the pid has changed, you will get a zombie, but the master will
clear it sooner or later.
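The behaviour described here can be modelled in a few lines of Python.
This is only a sketch of the idea, not the actual C code from that
commit: the dict standing in for the Emperor's linked list, the
function name, the injectable kill argument and the choice of SIGTERM
are all assumptions made for illustration.

```python
import errno
import os
import signal


def stop_vassal(vassals, name, kill=os.kill):
    """Signal the vassal's pid; forget the vassal if the process is gone.

    vassals maps a config file name to the pid recorded for it.
    Returns True if the entry was dropped because kill() failed with
    "No such process", False if the signal was delivered.
    """
    pid = vassals[name]
    try:
        kill(pid, signal.SIGTERM)  # SIGTERM is an assumption for this sketch
    except ProcessLookupError:
        # ESRCH: there is nothing left to kill, so drop the stale entry
        # instead of retrying forever; the vassal can then be respawned.
        del vassals[name]
        return True
    return False


def fake_kill_missing(pid, sig):
    """Stand-in for os.kill() that behaves as if the pid no longer exists."""
    raise ProcessLookupError(errno.ESRCH, "No such process")


# Demo: pid 16789 is the vassal master pid from the log earlier in the thread.
vassals = {"redacted.pythonanywhere.com.ini": 16789}
dropped = stop_vassal(vassals, "redacted.pythonanywhere.com.ini",
                      kill=fake_kill_missing)
print(dropped, vassals)
```

Without the except branch, the stale entry would stay in the table and
every later attempt would hit the same "No such process" error, which
matches the repeating log lines earlier in the thread.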
If you use the pid namespace (this is very easy: just add
emperor-use-clone = pid to your emperor config) you can be sure that
once the vassal master is dead, no user processes (even daemons
spawned by your customers) are left, as the master is the new init
for the vassal.
Let me know
_______________________________________________
uWSGI mailing list
[email protected]
http://lists.unbit.it/cgi-bin/mailman/listinfo/uwsgi