It's hard to tell whether it's the kill error, because the entries in the main log don't say which vassal they refer to. Those vassals then fail to start up correctly, though; at least that's how it looks from their individual logs:

2014-03-10 15:08:06 chdir(): No such file or directory [core/uwsgi.c line 1472]
2014-03-10 15:08:06 VACUUM: unix socket /var/sockets/project003.pythonanywhere.com/socket removed.


--
Harry Percival
Developer
[email protected]

PythonAnywhere - a fully browser-based Python development and hosting 
environment
<http://www.pythonanywhere.com/>

PythonAnywhere LLP
17a Clerkenwell Road, London EC1M 5RD, UK
VAT No.: GB 893 5643 79
Registered in England and Wales as company number OC378414.
Registered address: 28 Ely Place, 3rd Floor, London EC1N 6TD, UK

On 10/03/14 15:14, Roberto De Ioris wrote:
I don't think it's a problem with that particular app - it was basically
a vanilla django install - and it works fine after the restart.

The real problem is that it cascades. Once one vassal starts to
experience the problem, any new vassals created from that point on,
and any that are restarted, also start to see problems...

It just happened again :/

One thing I fear is not clear: when you say that every new vassal
created/restarted sees the problem, do you mean that you get the kill()
error, or that you simply do not see anything in the logs?




On 10/03/14 13:56, Roberto De Ioris wrote:
Hi there,

It happened again today, so I tried to capture some more debug info.

Here are the logs from the emperor when I try to reload the vassal:

      2014-03-10 12:19:28 +0000 EMPEROR - [emperor] kill: No such process [core/emperor.c line 1699]
      2014-03-10 12:19:31 +0000 EMPEROR - emperor_respawn/write(): Broken pipe [core/emperor.c line 656]
      2014-03-10 12:19:31 +0000 EMPEROR - [emperor] reload the uwsgi instance redacted.pythonanywhere.com.ini
      2014-03-10 12:19:31 +0000 EMPEROR - [emperor] kill: No such process [core/emperor.c line 1699]
      2014-03-10 12:19:34 +0000 EMPEROR - [emperor] kill: No such process [core/emperor.c line 1699]
      2014-03-10 12:19:37 +0000 EMPEROR - [emperor] kill: No such process [core/emperor.c line 1699]

You can see the "no such process" error keeps happening, every couple
of seconds.
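
The cadence can be read straight off the timestamps. A minimal sketch, assuming each emperor log entry sits on a single line and begins with a `YYYY-MM-DD HH:MM:SS +ZZZZ` timestamp as in the excerpt above; the function name `kill_error_intervals` is made up for illustration and is not part of uWSGI:

```python
from datetime import datetime

# Substring that identifies the error of interest in the emperor log.
KILL_ERROR = "[emperor] kill: No such process"


def kill_error_intervals(lines):
    """Return the gaps, in seconds, between consecutive kill errors."""
    stamps = []
    for line in lines:
        line = line.strip()
        if KILL_ERROR not in line:
            continue
        # The first 25 characters hold the timestamp,
        # e.g. "2014-03-10 12:19:28 +0000".
        stamps.append(datetime.strptime(line[:25], "%Y-%m-%d %H:%M:%S %z"))
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]


sample = [
    "2014-03-10 12:19:28 +0000 EMPEROR - [emperor] kill: No such process [core/emperor.c line 1699]",
    "2014-03-10 12:19:31 +0000 EMPEROR - emperor_respawn/write(): Broken pipe [core/emperor.c line 656]",
    "2014-03-10 12:19:31 +0000 EMPEROR - [emperor] kill: No such process [core/emperor.c line 1699]",
    "2014-03-10 12:19:34 +0000 EMPEROR - [emperor] kill: No such process [core/emperor.c line 1699]",
    "2014-03-10 12:19:37 +0000 EMPEROR - [emperor] kill: No such process [core/emperor.c line 1699]",
]
print(kill_error_intervals(sample))
```

Run against the excerpt, this reports a steady three-second gap between errors.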

Here are the entries from the vassal's server log:

      2014-03-10 11:58:51 VACUUM: unix socket /var/sockets/redacted.pythonanywhere.com/socket removed.
      2014-03-10 11:58:53 *** Starting uWSGI 2.0 (64bit) on [Mon Mar 10 11:58:52 2014] ***
      2014-03-10 11:58:53 compiled with version: 4.8.1 on 07 February 2014 19:06:17
      2014-03-10 11:58:53 os: Linux-3.11.0-15-generic #25-Ubuntu SMP Thu Jan 30 17:22:01 UTC 2014
      2014-03-10 11:58:53 nodename: giles-liveweb2
      2014-03-10 11:58:53 machine: x86_64
      2014-03-10 11:58:53 clock source: unix
      2014-03-10 11:58:53 pcre jit disabled
      2014-03-10 11:58:53 detected number of CPU cores: 4
      2014-03-10 11:58:53 current working directory: /etc/uwsgi/vassals
      2014-03-10 11:58:53 detected binary path: /usr/local/bin/uwsgi
      2014-03-10 11:58:53 using Linux cgroup /mnt/cgroups/cpu/user_types/free with mode 700
      2014-03-10 11:58:53 assigned process 16789 to cgroup /mnt/cgroups/cpu/user_types/free/tasks
      2014-03-10 11:58:53 using Linux cgroup /mnt/cgroups/cpuacct/users/Redacted with mode 700
      2014-03-10 11:58:53 assigned process 16789 to cgroup /mnt/cgroups/cpuacct/users/Redacted/tasks
      2014-03-10 11:58:53 using Linux cgroup /mnt/cgroups/memory/user_types/free with mode 700
      2014-03-10 11:58:53 assigned process 16789 to cgroup /mnt/cgroups/memory/user_types/free/tasks
      2014-03-10 11:58:53 uWSGI running as root, you can use --uid/--gid/--chroot options
      2014-03-10 11:58:53 chroot() to /mnt/chroots/Redacted
      2014-03-10 11:58:53 setgid() to 60000
      2014-03-10 11:58:53 setuid() to 231762
      2014-03-10 11:58:53 limiting number of processes to 64...
      2014-03-10 11:58:53 your processes number limit is 64
      2014-03-10 11:58:53 your memory page size is 4096 bytes
      2014-03-10 11:58:53 detected max file descriptor number: 123456
      2014-03-10 11:58:53 building mime-types dictionary from file /etc/mime.types...
      2014-03-10 11:58:53 536 entry found
      2014-03-10 11:58:53 lock engine: pthread robust mutexes
      2014-03-10 11:58:53 thunder lock: disabled (you can enable it with --thunder-lock)
      2014-03-10 11:58:53 uwsgi socket 0 bound to UNIX address /var/sockets/redacted.pythonanywhere.com/socket fd 7
      2014-03-10 11:58:53 Python version: 2.7.5+ (default, Sep 19 2013, 13:52:09)  [GCC 4.8.1]
      2014-03-10 11:58:53 *** Python threads support is disabled. You can enable it with --enable-threads ***
      2014-03-10 11:58:53 Python main interpreter initialized at 0x1021bb0
      2014-03-10 11:58:53 your server socket listen backlog is limited to 100 connections
      2014-03-10 11:58:53 your mercy for graceful operations on workers is 60 seconds
      2014-03-10 11:58:53 setting request body buffering size to 65536 bytes
      2014-03-10 11:58:53 mapped 333936 bytes (326 KB) for 1 cores
      2014-03-10 11:58:53 *** Operational MODE: single process ***
      2014-03-10 11:58:53 WSGI app 0 (mountpoint='') ready in 1 seconds on interpreter 0x1021bb0 pid: 16789 (default app)
      2014-03-10 11:58:53 *** uWSGI is running in multiple interpreter mode ***
      2014-03-10 11:58:53 spawned uWSGI master process (pid: 16789)
      2014-03-10 11:58:53 spawned uWSGI worker 1 (pid: 16790, cores: 1)
      2014-03-10 11:58:53 spawned 2 offload threads for uWSGI worker 1
      2014-03-10 11:58:57 announcing my loyalty to the Emperor...
      2014-03-10 12:01:14 Mon Mar 10 12:01:14 2014 - received message 0 from emperor
      2014-03-10 12:01:14 SIGINT/SIGQUIT received...killing workers...
      2014-03-10 12:01:15 worker 1 buried after 1 seconds
      2014-03-10 12:01:15 goodbye to uWSGI.
      2014-03-10 12:01:15 chdir(): No such file or directory [core/uwsgi.c line 1472]
      2014-03-10 12:01:15 VACUUM: unix socket /var/sockets/redacted.pythonanywhere.com/socket removed.

You'll notice these logs are from an earlier reload. Later reloads don't
seem to log anything at all.

And here is the vassal config:

      [uwsgi]
      plugins = python27
      uid = 231762
      gid = 60000

      if-not-exists = /mnt/chroots/Redacted/bin/ls
      exec-pre-jail = python /home/anywhere/django/anywhere/jails/create.py Redacted
      endif =
      chroot = /mnt/chroots/Redacted
      limit-nproc = 64
      # shutdown app (but not master) after 26hrs of no hits
      idle=93600
      # kill any requests that take too long to process
      harakiri = 300
      buffer-size = 32768
      post-buffering = 65536
      vacuum =
      # chrooted master cannot reload itself, so just exit
      exit-on-reload = true
      # file lock prevents respawning vassals from racing dying ones
      flock = %p

      log-encoder = format redacted.pythonanywhere.com ${strftime:%%F %%T} ${msg}
      logger = rsyslog:10.124.106.197:10515,uwsgi,142

      workers = 1
      cgroup = /mnt/cgroups/cpu/user_types/free
      cgroup = /mnt/cgroups/cpuacct/users/Redacted
      cgroup = /mnt/cgroups/memory/user_types/free

      auto-procname
      procname-prefix-spaced = Redacted Redacted.pythonanywhere.com
      disable-logging = true

      check-static=/var/www/static

      static-map = /static/admin/=/home/Redacted/.virtualenvs/django16/lib/python2.7/site-packages/django/contrib/admin/static/admin

      static-index = index.html
      offload-threads = 2

      touch-reload = /var/www/redacted_pythonanywhere_com_wsgi.py
      socket = /var/sockets/redacted.pythonanywhere.com/socket
      chmod-socket = 666
      chdir = /var/www
      env = HOST_NAME=redacted.pythonanywhere.com
      env = WSGI_MODULE=redacted_pythonanywhere_com_wsgi

      env = no_proxy=localhost,127.0.0.1,localaddress,.localdomain.com

      env = HOME=/home/Redacted

      env = http_proxy=http://proxy.server:3128

      env = PYENCHANT_LIBRARY_PATH=/usr/lib/libenchant.so.1

      env = https_proxy=http://proxy.server:3128

      env = PATH=/home/Redacted/.local/bin:/usr/local/bin:/usr/bin:/bin
      unenv = UWSGI_EMPEROR_FD
      unenv = SHLVL
      unenv = SSH_TTY
      unenv = PWD
      unenv = UWSGI_RELOADS
      unenv = SSH_CLIENT
      unenv = LOGNAME
      unenv = UWSGI_ORIGINAL_PROC_NAME
      unenv = MAIL
      unenv = SSH_CONNECTION
      unenv = _

      file = /bin/user_wsgi_wrapper.py


I've checked the stats server, and there aren't any vassals in the
blacklist.


Bouncing uWSGI fixes the problem, but obviously it involves downtime,
so we'd rather avoid it if possible.


Hi Harry, do not do that. Basically, your vassal is not being removed
from the linked list because the process mapped to it is no longer
available (it is hard to say why). Removing the file from the vassal
dir (or rather, renaming it to .off) should be enough.
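
The rename can be scripted. A minimal sketch, assuming the vassal configs live in a directory the caller can write to; `disable_vassal` and `enable_vassal` are hypothetical helper names, not part of uWSGI, and how quickly the emperor notices the change depends on its scan settings:

```python
from pathlib import Path


def disable_vassal(config_path):
    """Rename '<name>.ini' to '<name>.ini.off' so the emperor no longer sees it."""
    src = Path(config_path)
    dst = src.with_name(src.name + ".off")
    src.rename(dst)
    return dst


def enable_vassal(off_path):
    """Undo disable_vassal() by stripping the trailing '.off'."""
    src = Path(off_path)
    if src.suffix != ".off":
        raise ValueError(f"{src} does not end in .off")
    dst = src.with_suffix("")  # 'x.ini.off' -> 'x.ini'
    src.rename(dst)
    return dst
```

For example, calling `disable_vassal` on a config under the vassal directory and, once the stale entry is gone, `enable_vassal` on the returned path would avoid bouncing the whole emperor.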

By the way, the latest code improves that corner case too:

https://github.com/unbit/uwsgi/commit/c118c75bfe5ed6b26668aa48ae076dddcf31a5b9


Basically, if killing the process is not possible, its memory area is
removed from the list (so the vassal can be restarted). If for some
reason the pid has changed, you will get a zombie, but the master will
clear it sooner or later.
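
The idea can be illustrated with a toy model. This is a sketch in Python, not the emperor's actual C code: the dictionary stands in for the emperor's list of vassals, and `stop_vassals` is a made-up name. The only real API used is `os.kill`, which raises `ProcessLookupError` when the pid does not exist (the "No such process" seen in the emperor log):

```python
import os
import signal


def stop_vassals(vassals, kill=os.kill):
    """Signal every tracked vassal and forget those whose pid is gone.

    `vassals` maps a config name to the pid of its master process.
    Returns the names that were dropped from the table.
    """
    dropped = []
    for name, pid in list(vassals.items()):
        try:
            kill(pid, signal.SIGTERM)
        except ProcessLookupError:
            # ESRCH ("No such process"): there is nothing left to signal,
            # so drop the entry instead of retrying forever. The vassal
            # can then be respawned from its config file.
            del vassals[name]
            dropped.append(name)
    return dropped
```

Passing `kill` as a parameter is only there so the behaviour can be exercised without signalling real processes.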

If you use the pid namespace (this is very easy: just add
emperor-use-clone = pid to your emperor config), you can be sure that
once the vassal master is dead, no user processes are left behind (not
even daemons spawned by your customers), as the master is the new init
for the vassal.


Let me know


_______________________________________________
uWSGI mailing list
[email protected]
http://lists.unbit.it/cgi-bin/mailman/listinfo/uwsgi


