Re: [modwsgi] Concurrent requests clogs single apache process

Graham Dumpleton Thu, 17 Sep 2020 21:13:28 -0700

That version of mod_wsgi is over 3 years old. There have been almost 20 releases 
since then. I checked back regarding the buffering workaround and it predates 
that version, and actually seems to only be absolutely necessary with Apache 
2.2.


If you run a single request on the URL which is slow, how long does it take and 
how much data is it returning?
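
One way to get both numbers from a single request is sketched below. It is only an illustration: the helper names `measure` and `fetch_url` are made up for this example, and the URL in the comment is the placeholder used elsewhere in this thread.

```python
import time
from typing import Callable, Tuple
from urllib.request import urlopen


def measure(fetch: Callable[[], bytes]) -> Tuple[float, int]:
    """Return (elapsed_seconds, body_size_in_bytes) for one call to fetch()."""
    start = time.perf_counter()
    body = fetch()
    elapsed = time.perf_counter() - start
    return elapsed, len(body)


def fetch_url(url: str, timeout: float = 30.0) -> bytes:
    """Fetch a URL once and return the complete response body."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


# Example use (hypothetical URL, replace with the slow one):
#   elapsed, size = measure(lambda: fetch_url("https://mysite.com/"))
#   print(f"{elapsed:.3f} s, {size} bytes")
```

Run it once against a freshly restarted server rather than in a loop, so the result reflects a single request and not queueing behind other requests.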

If you restart Apache, how much memory are the Apache child worker processes 
using before you send any requests?

With a freshly started Apache, if you do a single request against the slow URL, 
does one of the Apache child worker processes jump up significantly in the 
amount of memory it uses? If yes, how much does it then consume and what did it 
start at?
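
A rough sketch for taking the before and after memory readings follows. It assumes a Linux `ps` (procps) and that the Apache processes are named `apache2`, as on Ubuntu; the function names are invented for this example. The `rss` column from `ps` is reported in KiB.

```python
import subprocess
from typing import List, Tuple


def parse_ps_rss(text: str) -> List[Tuple[int, int, str]]:
    """Parse `ps -o pid=,rss=,args=` output into (pid, rss_kib, command) rows."""
    rows = []
    for line in text.splitlines():
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        pid, rss_kib, command = parts
        rows.append((int(pid), int(rss_kib), command.strip()))
    return rows


def process_rss(process_name: str = "apache2") -> List[Tuple[int, int, str]]:
    """List resident memory for every process with the given command name."""
    result = subprocess.run(
        ["ps", "-C", process_name, "-o", "pid=,rss=,args="],
        capture_output=True, text=True, check=False,
    )
    return parse_ps_rss(result.stdout)


# Example use: call once after restarting Apache, send one request to the
# slow URL, call again, and compare the rss_kib values per pid.
#   for pid, rss_kib, command in process_rss():
#       print(pid, f"{rss_kib / 1024:.1f} MiB", command)
```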

Still trying to understand why the processes are using so much memory.

Graham

> On 18 Sep 2020, at 2:01 pm, Scott McConnell <[email protected]> 
> wrote:
> 
> I am using mod_wsgi/4.5.17.
> 
> I haven't made any drastic changes to the server... I hadn't touched 
> /etc/apache2/apache2.conf until talking to you. I haven't done much other 
> than what's outlined here 
> <https://www.digitalocean.com/community/tutorials/how-to-install-the-apache-web-server-on-ubuntu-18-04#step-5-%E2%80%94-setting-up-virtual-hosts-(recommended)>.
>  I spun up a new instance the same way I did originally, and got the same 
> `apache2 -V` result as I sent above (with the blank MPM).
> 
> Running the daemon vs. embedded showed that it is in fact running in daemon 
> mode (a non-blank string was returned).
> Running the second test showed that it's being run in the main interpreter (a 
> blank string was returned).
> 
> I ran a load test on a different path of my site (one that is purely html), 
> and these were the results (it happened wicked fast):
> 
> 
> $ ab -c 15 -n 500 -s 10 https://mysite.com/blog/6/
> Server Software: Apache/2.4.29 
> Server Hostname: mysite.com 
> Server Port: 443 
> SSL/TLS Protocol: TLSv1.2,ECDHE-RSA-CHACHA20-POLY1305,2048,256 
> TLS Server Name: mysite.com 
> 
> Document Path: /blog/6/ 
> Document Length: 13572 bytes 
> 
> Concurrency Level: 15 
> Time taken for tests: 5.143 seconds 
> Complete requests: 500 
> Failed requests: 0 
> Total transferred: 6910500 bytes 
> HTML transferred: 6786000 bytes 
> Requests per second: 97.22 [#/sec] (mean) 
> Time per request: 154.296 [ms] (mean) 
> Time per request: 10.286 [ms] (mean, across all concurrent requests) 
> Transfer rate: 1312.13 [Kbytes/sec] received 
> 
> 
> Connection Times (ms)
>               min  mean[+/-sd] median   max
> Connect:        2    2    1.4      2     12
> Processing:    15  152  442.3     54   2772
> Waiting:       14  137  434.2     44   2757
> Total:         18  154  443.5     56   2784
> 
> On Thursday, September 17, 2020 at 9:33:01 PM UTC-4 Graham Dumpleton wrote:
> One question. Given you are using a quite old version of Apache, what version 
> of mod_wsgi are you using?
> 
> If you are using a really old version of mod_wsgi, and are returning 
> absolutely huge responses, the large child worker process sizes could be 
> because you are triggering some behaviour in Apache when proxying, which 
> causes a blowout in memory. More recent versions of mod_wsgi have a 
> workaround to avoid triggering this issue in Apache. It likely will not help 
> with response times, but should help with memory bloat if this is the cause.
> 
> In general you should avoid using the system package for mod_wsgi as they are 
> often quite old and so missing out on improvements made since.
> 
> Graham
> 
> 
>> On 18 Sep 2020, at 10:03 am, Graham Dumpleton <[email protected]> wrote:
>> 
>> Based on what you have shown, I somewhat doubt you are using worker MPM. The 
>> fact that the process list shows a large number of Apache child worker 
>> processes would indicate that you are more likely using prefork MPM. That 
>> is, these processes:
>> 
>> <PastedGraphic-3.png>
>> 
>> If you are using worker MPM, then the MPM settings for that MPM would have 
>> to be bizarre to get that number of processes.
>> 
>> It actually worries me a bit that you claim you couldn't find any MPM 
>> settings. Usually Apache would have defaults in the config files. Usually 
>> they would only be missing if they had been explicitly removed.
>> 
>> A much bigger problem is why the Apache child worker processes are showing a 
>> resident memory size of 869M. This suggests something is severely wrong with
>> your server setup as they should not be getting to be that large.
>> 
>> The only other thing I can think of as to why you would see so many 
>> processes if using worker MPM is if for some reason htop is showing each 
>> thread as a separate entry. Linux used to do this decades ago, but these 
>> days I only ever see tools like ps and top breaking things out as processes 
>> rather than threads. That wouldn't explain the process size though, which 
>> should not be that big unless something is set up wrong.
>> 
>> The mod_wsgi daemon processes, in contrast, are only 13M.
>> 
>> <PastedGraphic-4.png>
>> 
>> How many changes have you made to the standard out-of-the-box Apache 
>> configuration? Have you wiped out all the Apache config and tried to 
>> construct your own from scratch? Can you explain the steps you took to update 
>> the Apache configuration?
>> 
>> I would also suggest you do these tests.
>> 
>> https://modwsgi.readthedocs.io/en/develop/user-guides/checking-your-installation.html#embedded-or-daemon-mode
>> https://modwsgi.readthedocs.io/en/develop/user-guides/checking-your-installation.html#sub-interpreter-being-used
>> 
>> to confirm that your WSGI application requests are actually being handled by 
>> mod_wsgi in the daemon process group.
>> 
>> Graham
>> 
>>> On 18 Sep 2020, at 4:16 am, Scott McConnell <[email protected]> wrote:
>>> 
>>> Oh, and I did not have any MPM settings set in Apache configuration. I 
>>> tried adding what you sent and it didn't have an effect. Thanks so much for 
>>> helping me out!
>>> 
>>> On Thursday, September 17, 2020 at 2:14:37 PM UTC-4 Scott McConnell wrote:
>>> My load test is pretty mellow, I thought...
>>> 
>>> Originally was doing:
>>> ab -c 15 -n 500 -s 10 https://mysite.com/
>>> 
>>> And this caused response times of ~8 sec
>>> 
>>> Trying again with:
>>> ab -c 5 -n 500 -s 10 https://mysite.com/
>>> 
>>> still leads to ~3 sec response time. My hope was for this to be able to 
>>> handle ~100 concurrent users, but I hadn't really thought about it in terms 
>>> of requests/second...
>>> 
>>> I'm primarily worried about the base URL handler, as the base URL triggers 
>>> a large GET payload from an external API. Other URLs have much faster 
>>> response times.
>>> 
>>> I am just starting to get familiar with MPMs, and I'm still not entirely 
>>> sure which MPM this server uses.
>>> 
>>> Because of this output: 
>>> 
>>> $ sudo ls /etc/apache2/mods-enabled/ 
>>> 
>>> access_compat.load authn_file.load authz_user.load dir.conf mime.conf  
>>> mpm_event.load  proxy.conf rewrite.load socache_shmcb.load wsgi.conf 
>>> 
>>> auth_basic.load authz_core.load deflate.conf dir.load mime.load 
>>> negotiation.conf proxy.load setenvif.conf ssl.conf wsgi.load 
>>> 
>>> authn_core.load authz_host.load deflate.load filter.load mpm_event.conf 
>>> negotiation.load proxy_http.load setenvif.load ssl.load
>>> 
>>> 
>>> I believe I am using worker MPM. I was also confused by the blank "Server 
>>> MPM:" line in this output:
>>> 
>>> $ apache2 -V 
>>> [Thu Sep 17 17:16:46.880034 2020] [core:warn] [pid 26387] AH00111: Config 
>>> variable ${APACHE_RUN_DIR} is not defined 
>>> apache2: Syntax error on line 80 of /etc/apache2/apache2.conf: 
>>> DefaultRuntimeDir must be a valid directory, absolute or relative to 
>>> ServerRoot 
>>> Server version: Apache/2.4.29 (Ubuntu) 
>>> Server built: 2020-08-12T21:33:25 
>>> Server's Module Magic Number: 20120211:68 
>>> Server loaded: APR 1.6.3, APR-UTIL 1.6.1 
>>> Compiled using: APR 1.6.3, APR-UTIL 1.6.1 
>>> Architecture: 64-bit 
>>> Server MPM: 
>>> Server compiled with.... 
>>> -D APR_HAS_SENDFILE 
>>> -D APR_HAS_MMAP 
>>> -D APR_HAVE_IPV6 (IPv4-mapped addresses enabled) 
>>> -D APR_USE_SYSVSEM_SERIALIZE 
>>> -D APR_USE_PTHREAD_SERIALIZE 
>>> -D SINGLE_LISTEN_UNSERIALIZED_ACCEPT 
>>> -D APR_HAS_OTHER_CHILD 
>>> -D AP_HAVE_RELIABLE_PIPED_LOGS 
>>> -D DYNAMIC_MODULE_LIMIT=256 
>>> -D HTTPD_ROOT="/etc/apache2" 
>>> -D SUEXEC_BIN="/usr/lib/apache2/suexec" 
>>> -D DEFAULT_PIDLOG="/var/run/apache2.pid" 
>>> -D DEFAULT_SCOREBOARD="logs/apache_runtime_status" 
>>> -D DEFAULT_ERRORLOG="logs/error_log" 
>>> -D AP_TYPES_CONFIG_FILE="mime.types" 
>>> -D SERVER_CONFIG_FILE="apache2.conf"
>>> 
>>> I did not have WSGIRestrictEmbedded On, but tried adding it-- no effect.
>>> 
>>> I do not have any performance monitoring or backend metrics. I've been 
>>> going purely off of htop (and other similar tools) and CloudWatch. I also 
>>> tried https://django-debug-toolbar.readthedocs.io/en/latest/ locally but it 
>>> wasn't very insightful to my issue.
>>> On Thursday, September 17, 2020 at 2:06:08 AM UTC-4 Graham Dumpleton wrote:
>>> When you say "load test", do you mean totally overload the server way 
>>> beyond the realistic amount of traffic you would ever expect to get? :-)
>>> 
>>> In other words, are you running tests like:
>>> 
>>>     ab -c 15 -n 1000000000 http://mysite
>>> 
>>> or:
>>> 
>>>     siege -c 15 -t 120s http://mysite
>>> 
>>> which is just throwing as many requests as absolutely possible at Apache?
>>> 
>>> This is likely only going to cause Apache to choke up, as you are putting it 
>>> into an overload state, made worse by the number of server processes. It is 
>>> the wrong way of evaluating how much load your server can realistically take.
>>> 
>>> What is the real number of requests/sec you would expect to ever receive?
>>> 
>>> Does every URL handler take 1 second to respond, or are response times 
>>> across the site varied?
>>> 
>>> What are the Apache MPM settings you have set? Since using prefork MPM, 
>>> what do you have set for:
>>> 
>>> <IfModule mpm_prefork_module>
>>>     StartServers             1
>>>     MinSpareServers          1
>>>     MaxSpareServers         10
>>>     MaxRequestWorkers      250
>>>     MaxConnectionsPerChild   0
>>> </IfModule>
>>> 
>>> Do you have:
>>> 
>>>     WSGIRestrictEmbedded On
>>> 
>>> set in Apache configuration (outside of VirtualHost)?
>>> 
>>> And finally, do you have any performance monitoring in use, or at least 
>>> have a backend metrics database/service where you could report metrics?
>>> 
>>> Graham
>>> 
>>> 
>>>> On 17 Sep 2020, at 3:21 pm, Scott McConnell <scott.mc...@gmail.com> wrote:
>>>> 
>>>> Hello, I am using apache/mod_wsgi to serve a processor/ajax heavy Django 
>>>> application on an Ubuntu 16.04 machine. The app is working with low 
>>>> traffic on https (~1 sec response time), but I'm running into a bottleneck 
>>>> during load testing. 
>>>> 
>>>> During a load test (15 concurrent requestors) response time becomes ~6 
>>>> sec, but CPU utilization peaks at 30% according to CloudWatch. I am using 
>>>> an EC2 instance and I've been continually increasing the size with no 
>>>> effect (now using c5.xlarge).
>>>> 
>>>> The strange part is my htop output... one process is taking the majority 
>>>> of the CPU time, whereas the wsgi daemon processes don't seem to take on 
>>>> any tasks. This culprit process starts after I restart the server and 
>>>> never dies until the server is stopped. 
>>>> 
>>>> This was the output when I tried tracing the process:
>>>> $ sudo strace -p 14645
>>>> 
>>>> strace: Process 14645 attached
>>>> 
>>>> restart_syscall(<... resuming interrupted restart_syscall ...>
>>>> 
>>>> 
>>>> Below are some htop screenshots, and I pasted my config at the bottom. 
>>>> 
>>>> When sorted by CPU utilization:
>>>> 
>>>> <Screen Shot 2020-09-17 at 12.37.42 AM.png>
>>>> 
>>>> then all the way at the bottom are my daemon processes named wsws:
>>>> 
>>>> <Screen Shot 2020-09-17 at 12.35.47 AM.png>
>>>> 
>>>> Here is my mysite.com.conf:
>>>> 
>>>> <VirtualHost *:80>
>>>> 
>>>>         ServerAdmin webmaster@localhost
>>>>         ServerName mysite.com
>>>>         ServerAlias www.mysite.com
>>>>         DocumentRoot /var/www/html
>>>>         ErrorLog ${APACHE_LOG_DIR}/error.log
>>>>         CustomLog ${APACHE_LOG_DIR}/access.log combined
>>>>         LogLevel info
>>>> 
>>>>         <Directory /home/ubuntu/myrepo/mysite/static>
>>>>                 Require all granted
>>>>         </Directory>
>>>> 
>>>>         <Directory /home/ubuntu/myrepo/mysite/mysite>
>>>>                 <Files wsgi.py>
>>>>                         Require all granted
>>>>                 </Files>
>>>>         </Directory>
>>>> 
>>>> 
>>>>         WSGIDaemonProcess mysite python-path=/home/ubuntu/myrepo/mysite \
>>>>                 python-home=/home/ubuntu/myrepo/venv processes=25 threads=1 \
>>>>                 display-name=wsws request-timeout=60 inactivity-timeout=600
>>>>         WSGIProcessGroup mysite
>>>>         WSGIScriptAlias / /home/ubuntu/myrepo/mysite/mysite/wsgi.py
>>>>         WSGIApplicationGroup %{GLOBAL}
>>>>         
>>>>         RewriteEngine on
>>>>         RewriteCond %{SERVER_NAME} =www.mysite.com [OR]
>>>>         RewriteCond %{SERVER_NAME} =mysite.com
>>>>         RewriteRule ^ https://%{SERVER_NAME}%{REQUEST_URI} [END,NE,R=permanent]
>>>> 
>>>> </VirtualHost>
>>>> 
>>>> 
>>> 
>>>> -- 
>>>> You received this message because you are subscribed to the Google Groups 
>>>> "modwsgi" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send an 
>>>> email to modwsgi+u...@googlegroups.com.
>>>> To view this discussion on the web visit 
>>>> https://groups.google.com/d/msgid/modwsgi/3622efe1-62d3-4d07-99c9-74fac83541aen%40googlegroups.com.

-- 
You received this message because you are subscribed to the Google Groups 
"modwsgi" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/modwsgi/299A2E1B-5CF5-410C-A4D5-55077AEECD5F%40gmail.com.
