Re: [modwsgi] Timeout when reading response header from daemon process.

Graham Dumpleton Sun, 05 Feb 2023 14:32:32 -0800

Two points here to clarify.

In your case the final error is:


  Daemon process deadlock timer expired

This means that the whole Python interpreter locked up. In this case the request 
timeout may not apply, and the feature whereby stack traces can be dumped might 
not work. It depends on what led up to the issue.

By default the deadlock timeout is 300 seconds.

deadlock-timeout=sss
Defines the maximum number of seconds allowed to pass before the daemon process 
is shut down and restarted after a potential deadlock on the Python GIL has been 
detected. The default is 300 seconds.

This option exists to combat the problem of a daemon process freezing as the 
result of a rogue Python C extension module which doesn’t properly release the 
Python GIL when entering into a blocking or long running operation.
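
As a quick sanity check on the 300 second default, the gap between two entries 
in the quoted error log can be computed. This is only a sketch: the two 
timestamps are copied from the log entries for pid 885 (the "Picture saved" 
entry and the "deadlock timer expired" entry), the helper name seconds_between 
is made up for illustration, and treating the "Picture saved" entry as the last 
sign of life before the lock up is an assumption.

```python
from datetime import datetime

# Timestamp layout used inside the brackets of the Apache error log entries.
LOG_FORMAT = "%a %b %d %H:%M:%S.%f %Y"


def seconds_between(first, second):
    """Return the number of seconds from one log timestamp to another."""
    start = datetime.strptime(first, LOG_FORMAT)
    end = datetime.strptime(second, LOG_FORMAT)
    return (end - start).total_seconds()


# Copied from the quoted error log (pid 885).
last_activity = "Sun Feb 05 17:25:03.155261 2023"  # "Picture saved to ..."
timer_expired = "Sun Feb 05 17:30:03.323110 2023"  # "deadlock timer expired"

gap = seconds_between(last_activity, timer_expired)
print(f"{gap:.1f} seconds between the two entries")
```

The gap comes out at just over 300 seconds, which is consistent with the 
default deadlock timeout being what fired.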

This can occur when you are using third party Python packages which aren't 
designed to work in Python sub interpreters. More details in:

https://modwsgi.readthedocs.io/en/master/user-guides/application-issues.html#python-simplified-gil-state-api

The solution, as you found, is to set:

  WSGIApplicationGroup %{GLOBAL}

So it is likely that when you updated Python versions, some third party package 
you are using shifted to a newer version which breaks in this respect, or which 
runs code differently when using a newer Python version.
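
To confirm which interpreter an application actually ends up in, a small WSGI 
test application can report the application group. A minimal sketch, assuming 
mod_wsgi's documented mod_wsgi.application_group key in the WSGI environ, where 
an empty string indicates the main interpreter (matching the "Attach 
interpreter ''" entries in the quoted log). The message wording is made up for 
illustration.

```python
def application(environ, start_response):
    """Report whether the request runs in the main or a sub interpreter."""
    # mod_wsgi places the application group name in the WSGI environ.
    # The key is absent when the script is run by some other WSGI server.
    group = environ.get("mod_wsgi.application_group")

    if group is None:
        text = "not running under mod_wsgi\n"
    elif group == "":
        # Empty string: main interpreter, as with WSGIApplicationGroup %{GLOBAL}.
        text = "main interpreter (application group '')\n"
    else:
        text = "sub interpreter (application group %r)\n" % group

    body = text.encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

Temporarily pointing WSGIScriptAlias at a file containing this, with and 
without WSGIApplicationGroup %{GLOBAL}, should show the difference.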

As to the request timeout, if it were coming into play: with request-timeout=30 
the timeout is 30 seconds when you have daemon processes which are single 
threaded.

For a multithreaded daemon process the timeout can trigger later than 30 
seconds, as it uses an average across all active request handler threads.

request-timeout=sss
Defines the maximum number of seconds that a request is allowed to run before 
the daemon process is restarted. This can be used to recover from a scenario 
where a request blocks indefinitely, and where if all request threads were 
consumed in this way, would result in the whole WSGI application process being 
blocked.

How this option is seen to behave is different depending on whether a daemon 
process uses only one thread, or more than one thread for handling requests, as 
set by the threads option.

If there is only a single thread, and so the process can only handle one 
request at a time, as soon as the timeout has passed, a restart of the process 
will be initiated.

If there is more than one thread, the request timeout is applied to the average 
running time for any requests, across all threads. This means that a request 
can run longer than the request timeout. This is done to reduce the possibility 
of interrupting other running requests, and causing a user to see a failure. So 
where there is still capacity to handle more requests, restarting of the 
process will be delayed if possible.
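
The effect of averaging can be shown with a little arithmetic. The following is 
a rough sketch only, not mod_wsgi's actual code: the function name restart_due 
is hypothetical, and it assumes the sum of the running times of the requests in 
progress is divided by the total number of request threads, so that idle 
threads count as zero. The text above does not spell out the exact divisor, so 
treat that detail as an assumption.

```python
def restart_due(running_seconds, threads, request_timeout):
    """Return True when the average running time reaches the timeout.

    running_seconds: elapsed seconds of each request currently in progress.
    threads: number of request handler threads in the daemon process.
    request_timeout: value of the request-timeout option, in seconds.
    """
    # Assumption: idle threads contribute zero, so divide by all threads.
    average = sum(running_seconds) / threads
    return average >= request_timeout


# One thread: a single request stuck for 31 seconds triggers a restart.
print(restart_due([31], threads=1, request_timeout=30))

# Fifteen threads: the same stuck request is far below the average threshold.
print(restart_due([31], threads=15, request_timeout=30))

# Fifteen threads: two requests stuck for 225 seconds each reach the threshold.
print(restart_due([225, 225], threads=15, request_timeout=30))
```

Under that assumption, a request in a multithreaded process can run well past 
30 seconds before a restart is initiated, which matches the behaviour described 
above.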


In your case though, because you hit a full lockup of the Python interpreter, 
even though the request timeout may still have triggered, it couldn't dump 
stack traces. Doing so requires acquiring the Python global interpreter lock, 
which wasn't being released, and thus the threads meant to dump stack traces 
got stuck.
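
To illustrate why a dump needs the GIL: collecting the stack traces of all 
threads is itself work done through the interpreter. The following is a rough 
Python-level sketch of such a dump using sys._current_frames() from the 
standard library. It is not mod_wsgi's own implementation, and the function 
name format_thread_stacks is hypothetical.

```python
import sys
import threading
import traceback


def format_thread_stacks():
    """Return the current stack of every Python thread as one string."""
    # Map thread identifiers to names so the output is easier to read.
    names = {t.ident: t.name for t in threading.enumerate()}

    lines = []
    # sys._current_frames() maps each thread id to its topmost frame.
    for thread_id, frame in sys._current_frames().items():
        lines.append("Thread %s (%s):" % (thread_id, names.get(thread_id, "unknown")))
        for entry in traceback.format_stack(frame):
            lines.append(entry.rstrip("\n"))
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_thread_stacks())
```

Code like this can only run while holding the GIL, so if a C extension is 
sitting on the GIL and never releasing it, the thread trying to produce the 
dump simply blocks as well.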

To understand the cause further and identify which third party package may be 
the problem, I would need to see what extra Python packages you are installing. 
Note that numpy can cause this, so if using anything which relies on that, that 
could be the reason.

Graham

> On 6 Feb 2023, at 3:35 am, Carsten Fuchs <carsten.fu...@cafu.de> wrote:
> 
> Adding to my recent post:
> 
> It seems that the `request-timeout=30` is effective, after all: The error 
> messages have changed, but there is no stack trace.
> For completeness, here is the site config and error log excerpt:
> 
> Site config:
> 
> <VirtualHost *:80>
>     ServerAdmin webmaster@localhost
>     DocumentRoot /var/www/html
> 
>     # Available loglevels: trace8, ..., trace1, debug, info, notice, warn,
>     # error, crit, alert, emerg.
>     # It is also possible to configure the loglevel for particular
>     # modules, e.g.
>     #LogLevel info ssl:warn
>     LogLevel info
> 
>     ErrorLog ${APACHE_LOG_DIR}/error.log
>     CustomLog ${APACHE_LOG_DIR}/access.log combined
> 
>     # See https://bz.apache.org/bugzilla/show_bug.cgi?id=45023
>     # This requires mod_headers to be enabled: sudo a2enmod headers
>     RequestHeader edit "If-None-Match" '^"((.*)-gzip)"$' '"$1", "$2"'
> 
>     Alias /static/ /var/www/HallCam-static/
>     <Directory /var/www/HallCam-static>
>         Require all granted
>     </Directory>
> 
>     Alias /media/ /var/www/HallCam-media/
>     <Directory /var/www/HallCam-media>
>         Require all granted
>     </Directory>
> 
>     WSGIDaemonProcess cf-hallcam-site request-timeout=30 user=carsten \
>         group=carsten processes=2 display-name=%{GROUP} \
>         python-home=/home/carsten/.virtualenvs/HallCam-web \
>         python-path=/home/carsten/HallCam/web
>     WSGIProcessGroup cf-hallcam-site
>     # WSGIApplicationGroup %{GLOBAL}
> 
>     WSGIScriptAlias / /home/carsten/HallCam/web/HallCam/wsgi.py
>     <Directory /home/carsten/HallCam/web/HallCam>
>         <Files wsgi.py>
>             Require all granted
>         </Files>
>     </Directory>
> </VirtualHost>
> 
> 
> From `/var/log/apache2/error.log`:
> 
> [Sun Feb 05 17:18:55.470532 2023] [mpm_event:notice] [pid 635:tid 
> 140277991982976] AH00489: Apache/2.4.52 (Ubuntu) mod_wsgi/4.9.0 Python/3.10 
> configured -- resuming normal operations
> [Sun Feb 05 17:18:55.690430 2023] [core:notice] [pid 635:tid 140277991982976] 
> AH00094: Command line: '/usr/sbin/apache2'
> [Sun Feb 05 17:18:56.338443 2023] [wsgi:info] [pid 638:tid 140277991982976] 
> mod_wsgi (pid=638): Attach interpreter ''.
> [Sun Feb 05 17:18:56.342866 2023] [wsgi:info] [pid 636:tid 140277991982976] 
> mod_wsgi (pid=636): Attach interpreter ''.
> [Sun Feb 05 17:18:56.366079 2023] [wsgi:info] [pid 636:tid 140277991982976] 
> mod_wsgi (pid=636): Adding '/home/carsten/HallCam/web' to path.
> [Sun Feb 05 17:18:56.371777 2023] [wsgi:info] [pid 638:tid 140277991982976] 
> mod_wsgi (pid=638): Adding '/home/carsten/HallCam/web' to path.
> [Sun Feb 05 17:19:24.823860 2023] [wsgi:info] [pid 636:tid 140277959865920] 
> mod_wsgi (pid=636): Create interpreter '192.168.1.222:32228|'.
> [Sun Feb 05 17:19:24.837012 2023] [wsgi:info] [pid 636:tid 140277959865920] 
> mod_wsgi (pid=636): Adding '/home/carsten/HallCam/web' to path.
> [Sun Feb 05 17:19:24.837752 2023] [wsgi:info] [pid 636:tid 140277959865920] 
> [remote 88.75.25.178:50558] mod_wsgi (pid=636, process='cf-hallcam-site', 
> application='192.168.1.222:32228|'): Loading Python script file 
> '/home/carsten/HallCam/web/HallCam/wsgi.py'.
> [Sun Feb 05 17:19:43.356895 2023] [wsgi:info] [pid 638:tid 140277867546176] 
> mod_wsgi (pid=638): Create interpreter '192.168.1.222:32228|'.
> [Sun Feb 05 17:19:43.369787 2023] [wsgi:info] [pid 638:tid 140277867546176] 
> mod_wsgi (pid=638): Adding '/home/carsten/HallCam/web' to path.
> [Sun Feb 05 17:19:43.370532 2023] [wsgi:info] [pid 638:tid 140277867546176] 
> [remote 88.75.25.178:48210] mod_wsgi (pid=638, process='cf-hallcam-site', 
> application='192.168.1.222:32228|'): Loading Python script file 
> '/home/carsten/HallCam/web/HallCam/wsgi.py'.
> [Sun Feb 05 17:23:21.631286 2023] [wsgi:info] [pid 636:tid 140277976651328] 
> mod_wsgi (pid=636): Daemon process request time limit exceeded, stopping 
> process 'cf-hallcam-site'.
> [Sun Feb 05 17:23:21.631405 2023] [wsgi:info] [pid 636:tid 140277991982976] 
> mod_wsgi (pid=636): Shutdown requested 'cf-hallcam-site'.
> [Sun Feb 05 17:23:26.631650 2023] [wsgi:info] [pid 636:tid 140277591307840] 
> mod_wsgi (pid=636): Aborting process 'cf-hallcam-site'.
> [Sun Feb 05 17:23:26.631699 2023] [wsgi:info] [pid 636:tid 140277591307840] 
> mod_wsgi (pid=636): Exiting process 'cf-hallcam-site'.
> [Sun Feb 05 17:23:26.714616 2023] [wsgi:error] [pid 639:tid 140277725460032] 
> [client 88.75.25.178:48224] Truncated or oversized response headers received 
> from daemon process 'cf-hallcam-site': 
> /home/carsten/HallCam/web/HallCam/wsgi.py, referer: 
> http://vdzuggmrroo5k7e9.myfritz.net:32228/upload/
> [Sun Feb 05 17:23:26.714952 2023] [wsgi:error] [pid 639:tid 140277959865920] 
> (104)Connection reset by peer: [client 88.75.25.178:50558] mod_wsgi 
> (pid=639): Failed to proxy response from daemon., referer: 
> http://vdzuggmrroo5k7e9.myfritz.net:32228/upload/
> [Sun Feb 05 17:23:26.983714 2023] [wsgi:info] [pid 885:tid 140277991982976] 
> mod_wsgi (pid=885): Attach interpreter ''.
> [Sun Feb 05 17:23:26.994980 2023] [wsgi:info] [pid 885:tid 140277991982976] 
> mod_wsgi (pid=885): Adding '/home/carsten/HallCam/web' to path.
> [Sun Feb 05 17:23:37.641476 2023] [wsgi:info] [pid 638:tid 140277976651328] 
> mod_wsgi (pid=638): Daemon process request time limit exceeded, stopping 
> process 'cf-hallcam-site'.
> [Sun Feb 05 17:23:37.641609 2023] [wsgi:info] [pid 638:tid 140277991982976] 
> mod_wsgi (pid=638): Shutdown requested 'cf-hallcam-site'.
> [Sun Feb 05 17:23:42.641863 2023] [wsgi:info] [pid 638:tid 140277591307840] 
> mod_wsgi (pid=638): Aborting process 'cf-hallcam-site'.
> [Sun Feb 05 17:23:42.641910 2023] [wsgi:info] [pid 638:tid 140277591307840] 
> mod_wsgi (pid=638): Exiting process 'cf-hallcam-site'.
> [Sun Feb 05 17:23:42.648248 2023] [wsgi:error] [pid 639:tid 140277717067328] 
> [client 37.81.109.237:49336] Truncated or oversized response headers received 
> from daemon process 'cf-hallcam-site': 
> /home/carsten/HallCam/web/HallCam/wsgi.py
> [Sun Feb 05 17:23:42.648624 2023] [wsgi:error] [pid 639:tid 140277733852736] 
> (104)Connection reset by peer: [client 88.75.25.178:48210] mod_wsgi 
> (pid=639): Failed to proxy response from daemon., referer: 
> http://vdzuggmrroo5k7e9.myfritz.net:32228/
> [Sun Feb 05 17:23:42.997522 2023] [wsgi:info] [pid 906:tid 140277991982976] 
> mod_wsgi (pid=906): Attach interpreter ''.
> [Sun Feb 05 17:23:43.039478 2023] [wsgi:info] [pid 906:tid 140277991982976] 
> mod_wsgi (pid=906): Adding '/home/carsten/HallCam/web' to path.
> [Sun Feb 05 17:25:01.761551 2023] [wsgi:info] [pid 885:tid 140277959865920] 
> mod_wsgi (pid=885): Create interpreter '192.168.1.222:32228|'.
> [Sun Feb 05 17:25:01.774488 2023] [wsgi:info] [pid 885:tid 140277959865920] 
> mod_wsgi (pid=885): Adding '/home/carsten/HallCam/web' to path.
> [Sun Feb 05 17:25:01.775225 2023] [wsgi:info] [pid 885:tid 140277959865920] 
> [remote 37.81.109.237:49338] mod_wsgi (pid=885, process='cf-hallcam-site', 
> application='192.168.1.222:32228|'): Loading Python script file 
> '/home/carsten/HallCam/web/HallCam/wsgi.py'.
> [Sun Feb 05 17:25:03.155261 2023] [wsgi:error] [pid 885:tid 140277959865920] 
> [remote 37.81.109.237:49338] Picture saved to 
> /var/www/HallCam-media/pictures/camera-1/pic_20230205_172500_4.jpg 
> (image/jpeg, 896262 bytes, camera camera-1)
> [Sun Feb 05 17:30:03.323110 2023] [wsgi:info] [pid 885:tid 140277976651328] 
> mod_wsgi (pid=885): Daemon process deadlock timer expired, stopping process 
> 'cf-hallcam-site'.
> [Sun Feb 05 17:30:03.323253 2023] [wsgi:info] [pid 885:tid 140277991982976] 
> mod_wsgi (pid=885): Shutdown requested 'cf-hallcam-site'.
> 
> 
> Carsten Fuchs wrote on Sunday, 5 February 2023 at 17:23:29 UTC+1:
>> Dear Graham,
>> 
>> I experienced the same timeout errors after having upgraded a server from 
>> Ubuntu 20.04 LTS to Ubuntu 22.04 LTS, thereby also upgrading from Python 3.8 
>> to Python 3.10. The application is a relatively simple Django project that 
>> used to work well until the upgrade. After the upgrade, I deleted the old 
>> virtualenv and built a new one, using `pip install --no-cache-dir -r 
>> requirements.txt` to install it. However, I experience the same problem.
>> 
>> Adding `WSGIApplicationGroup %{GLOBAL}` solved the problem, but I am still 
>> concerned because the site worked well with the older Ubuntu 20.04 LTS and I 
>> would prefer to not mask a potential problem and rather find its root.
>> 
>> Therefore, I added the `request-timeout=30` option to `WSGIDaemonProcess` 
>> (and temporarily commented `WSGIApplicationGroup` out again) in order to get 
>> a stack trace, however it doesn't seem to have any effect: Requests time out 
>> only much later than 30 seconds.
>> 
>> Can you please advise what may have caused the problem when upgrading from 
>> Ubuntu 20.04 LTS to Ubuntu 22.04 LTS and why `request-timeout=30` may not 
>> have any effect?
>> 
>> Best regards,
>> Carsten
>> 
>> A wrote on Tuesday, 24 January 2023 at 10:39:40 UTC+1:
>>> Dear Graham,
>>> 
>>> It was a matter of adding that line and everything fell into place!
>>> 
>>> 
>>> Thanks a lot,
>>> A
>>> 
>>> On Monday, January 23, 2023 at 8:21:44 PM UTC+1 Graham Dumpleton wrote:
>>>> Your configuration means only a single request can be handled at a time by 
>>>> the daemon process, so if you have a very long running request, any other 
>>>> requests would block waiting.
>>>> 
>>>> Even if not a long running request, a problem may be that you are using a 
>>>> Python package that isn't designed to work in a Python sub interpreter. 
>>>> This could cause that package to hang and so everything blocks up (same as 
>>>> long running request at that point) for that request.
>>>> 
>>>> Application Issues, mod_wsgi 4.9.4 documentation:
>>>> https://modwsgi.readthedocs.io/en/master/user-guides/application-issues.html#python-simplified-gil-state-api
>>>> 
>>>> Try the solution described in the docs of forcing the use of the main 
>>>> Python interpreter by adding:
>>>> 
>>>>     WSGIApplicationGroup %{GLOBAL}
>>>> 
>>>> If that doesn't help you will need to work out where your code is 
>>>> blocking. For that try adding:
>>>> 
>>>>     request-timeout=30
>>>> 
>>>> option to WSGIDaemonProcess directive.
>>>> 
>>>> This will cause daemon process in your case to restart after 30 seconds 
>>>> when it blocks and in doing that will attempt to log to error log file the 
>>>> stack traces of where your code was blocked.
>>>> 
>>>> Graham
>>>> 
>>>> 
>>>>> On 24 Jan 2023, at 3:27 am, A <avd...@gmail.com> wrote:
>>>>> 
>>>> 
>>>>> It keeps timing out and I've been trying to solve it to no avail.
>>>>> Here my modwsgi.conf
>>>>> 
>>>>> <VirtualHost *:80>
>>>>>       ServerName localhost
>>>>>       ServerAlias ----------.com *.----------.com
>>>>>       
>>>>>       Define project_name     ----------
>>>>>       Define user             -------------
>>>>>       
>>>>>       Define project_path     /srv/http/fosware
>>>>>       Define wsgi_path        /srv/http/fosware/fosware
>>>>>       Define environment_path /srv/http/fosware/venv
>>>>>       
>>>>>       WSGIDaemonProcess ${user}-${project_name} user=${user} \
>>>>>           group=${user} processes=1 threads=1 python-eggs=/tmp/python-eggs/ \
>>>>>           python-path=${project_path}:${environment_path}/lib/python2.7/site-packages
>>>>>       WSGIProcessGroup ${user}-${project_name}
>>>>> 
>>>>>       WSGIScriptAlias / ${wsgi_path}/wsgi.py
>>>>> 
>>>>>       <Directory ${project_path}>
>>>>>         <IfVersion < 2.3 >
>>>>>         Order allow,deny
>>>>>         Allow from all
>>>>>         </IfVersion>
>>>>>         <IfVersion >= 2.3>
>>>>>         Require all granted
>>>>>         </IfVersion>
>>>>>       </Directory>        
>>>>>       
>>>>> 
>>>>>       Alias /static ${project_path}/static
>>>>>       <Directory ${project_path}/static>
>>>>>              Require all granted
>>>>>              SetHandler None
>>>>>              FileETag none
>>>>>              Options FollowSymLinks
>>>>>       </Directory>
>>>>> 
>>>>>       Alias /media ${project_path}/media
>>>>>       <Directory ${project_path}/media>
>>>>>              Require all granted
>>>>>              SetHandler None
>>>>>              FileETag none
>>>>>              Options FollowSymLinks
>>>>>              ErrorDocument 404 /error404
>>>>>       </Directory>
>>>>> 
>>>>>       ErrorLog /var/log/httpd/${user}-${project_name}-error.log
>>>>>       LogLevel info
>>>>>       CustomLog /var/log/httpd/${user}-${project_name}-access.log combined
>>>>> </VirtualHost>
>>>>> 
>>>> 
>>>>> -- 
>>>>> You received this message because you are subscribed to the Google Groups 
>>>>> "modwsgi" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send an 
>>>>> email to modwsgi+u...@googlegroups.com.
>>>>> To view this discussion on the web visit 
>>>>> https://groups.google.com/d/msgid/modwsgi/02fb2171-ca59-4053-9be3-8ff75e9cf9edn%40googlegroups.com.
>>>> 
> 
> 

