Been trying to catch up on other stuff the last few days, which is why this
response is delayed.
Over the years I have seen a number of people doing exactly what you are
doing: performing image manipulation on an uploaded image and then returning
the result. For one reason or another, the outcome has almost always been that
you are better off using a backend queueing system such as Celery to handle
the image manipulation. In other words, move the processing of images out of
your web application processes.
There are a few reasons why this is the case.
The first is that images and image manipulation can use a lot of transient
memory. Especially when using multithreading in your web application with
Python, this can result in high peak memory usage for the process: a whole
bunch of requests can come in at the same time, so processing of them
overlaps, and memory consumption blows out to the maximum required to support
that many being processed at once. When done, although the memory is released
back for use by other parts of the application, the damage has already been
done and the process keeps the overall high memory reservation. The end result
is that most of the time you will have lots of unused memory held by the
process, with it only being used when you get that many concurrent requests
again.
The second problem is that image manipulation can be CPU intensive. In a
multithreaded application, depending on how well the image manipulation
library works and how it handles the global interpreter lock, in the worst
case parts of the image processing will be forced to serialise, resulting in
requests being blocked and taking longer than they would if the processes were
single threaded. In other words, image manipulation done in different threads
interferes, and all the requests suffer.
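To make that concrete, here is a toy Python sketch (not from your
application; a pure-Python loop stands in for the image work, and a C
extension that releases the GIL would fare better) showing how CPU-bound work
in threads serialises:

# Toy demonstration that CPU-bound work does not run in parallel
# across Python threads because of the global interpreter lock.
import time
from concurrent.futures import ThreadPoolExecutor

def fake_image_work(n=5_000_000):
    # Pure-Python loop: holds the GIL for the whole computation.
    total = 0
    for i in range(n):
        total += i
    return total

for workers in (1, 4):
    start = time.time()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(lambda _: fake_image_work(), range(workers)))
    print(workers, "thread(s):", round(time.time() - start, 2), "seconds")

# The 4-thread run takes roughly 4x the 1-thread run, rather than about
# the same elapsed time, because the threads serialise on the GIL.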
The third is that if you are using embedded mode of mod_wsgi, you can see
problems with the per request memory pool usage of the Apache worker processes
(in which the Python code is running) blowing out due to large response sizes.
In the old days of Apache, up to 8MB could be held in the per request thread
memory pool and only memory above that limit would actually be released. Thus
if you have a lot of threads per worker process, that means 8MB of memory
stays reserved for each worker thread. In more recent Apache versions the
sample configuration that comes with Apache drops this to 2MB, but if the
distro has removed that setting from the original Apache sample configuration,
or you remove it, then I believe it defaults back to 8MB.
Using a backend Celery task system avoids the first two issues, as the work is
done in a separate process, and that process can even be recycled after every
task, so you avoid the problem of unused memory hanging around reserved. The
Celery worker processes are also single threaded, eliminating Python global
interpreter lock issues.
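As a rough sketch of that split (the module, task name, broker URL and crop
parameters here are all illustrative, not taken from your application, and
Pillow is used as the stand-in image library):

# tasks.py -- minimal Celery sketch; names and URLs are placeholders.
from celery import Celery
from PIL import Image

app = Celery("images", broker="redis://localhost:6379/0")

@app.task
def crop_image(src_path, dst_path, box):
    # Runs in a Celery worker process, not in the Apache/mod_wsgi
    # process, so the transient memory and CPU cost of the crop
    # never touches the web tier.
    with Image.open(src_path) as img:
        img.crop(box).save(dst_path)
    return dst_path

The web application would then call crop_image.delay(src, dst, (left, upper,
right, lower)) and hand the result back once it is ready. Running the worker
with --max-tasks-per-child=1 gives you the recycle-after-every-task behaviour
mentioned above.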
The third problem above can be lessened by ensuring the Apache configuration
directive for setting the per request memory pool size is actually set, and
lowering the value if necessary. How you configure the Apache MPM settings can
also affect this.
In general though, the recommended first option is to avoid using mod_wsgi
embedded mode at all and use daemon mode instead. This avoids various problems
caused by the choice of Apache MPM and its settings.
So if you can't change to Celery in the short term, at least switch to daemon
mode.
In doing this, ensure that embedded mode is disabled completely by setting:
WSGIRestrictEmbedded On
Also reduce the per request thread pool size. Where the Apache worker
processes are only acting as a proxy to the mod_wsgi daemon processes, the
value I set in the mod_wsgi-express configuration is:
ThreadStackSize 262144
That is, 0.25MB per thread instead of 2MB or 8MB.
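Putting that together, a minimal daemon mode configuration might look like the
following (the process name, process/thread counts and script path are
placeholders you would tune for your site):

WSGIRestrictEmbedded On
ThreadStackSize 262144

WSGIDaemonProcess myapp processes=2 threads=5
WSGIProcessGroup myapp
WSGIApplicationGroup %{GLOBAL}
WSGIScriptAlias / /path/to/project/wsgi.py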
Another dangerous setting you were using, and one that would have caused lots
of problems when using embedded mode, was:
MaxKeepAliveRequests 100
This would cause Apache to restart your application processes too frequently,
resulting in higher CPU usage due to the high startup cost. In mod_wsgi-express
I don't set this at all.
The next problem is:
KeepAliveTimeout 45
In mod_wsgi-express I set this to 2 seconds. By having such a high value you
risk problems, especially when using the worker MPM, although the event MPM
can have its own issues. With a lower value, you may not need as many Apache
worker processes and threads.
The question now is why you were restarting after 100 requests. Was this an
attempt to keep memory usage down?
One consequence of this is that you would possibly see a lot of interrupted
requests, which is what those warning messages about killing off processes are
about. Apache will only wait so long for processes to shut down. Depending on
how the shutdown is managed, this can be as little as 5 seconds, but since you
have long running requests, those can prevent shutdown, so Apache kills the
processes anyway, and that is why requests can be interrupted. You really want
to avoid periodic restarts of Apache child worker processes using that option.
If you do have a growing memory problem because of issues with your
application code, there are various ways you can trigger restarts of the
mod_wsgi daemon processes, and these self-initiated restarts allow for a
graceful restart timeout. For the WSGIDaemonProcess directive you can set the
options:
maximum-requests=100 graceful-timeout=120
So when 100 requests have arrived, a restart of the process will be signalled,
but since the graceful timeout is set to 120 seconds, it will only be forcibly
restarted after 120 seconds. In the interim, if the number of active requests
being handled by the process drops to 0, the restart will be triggered at that
point. This limits the interruption of active requests. You will still have
issues if requests get blocked indefinitely, as the process then never reaches
the point of having no active requests; but if that is occurring, whatever the
reason you are restarting so frequently, you have bigger problems.
For the latter, if you are getting stuck requests, you want to look at the
request-timeout option to WSGIDaemonProcess.
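For example, combining those options on the one directive (the process name
and counts are placeholders again, and 60 seconds is just a sample timeout):

WSGIDaemonProcess myapp processes=2 threads=5 \
    maximum-requests=100 graceful-timeout=120 request-timeout=60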
Anyway, for further guidance on setting up mod_wsgi daemon mode, I would
suggest watching:
https://www.youtube.com/watch?v=H6Q3l11fjU0
The defaults for mod_wsgi daemon mode are not the best options for historical
reasons. The video talks about that and how mod_wsgi-express sets different
defaults.
To start with, that is probably all I can suggest. Giving recommendations on
tuning the Apache MPM settings and mod_wsgi daemon mode is harder to do at
this point.
Summarising things: use Celery as an out of process means to handle the image
manipulation. If you can't do that for now, try to switch to mod_wsgi daemon
mode, as that will allow memory and CPU usage to be better controlled.
Graham
> On 7 Dec 2020, at 6:39 pm, Zohaib Ahmed Hassan <[email protected]>
> wrote:
>
> I also get this issue sometimes:
> [Mon Dec 07 07:04:22.142767 2020] [core:warn] [pid 1836:tid 139752646228928]
> AH00045: child process 2807 still did not exit, sending a SIGTERM
> [Mon Dec 07 07:04:24.144831 2020] [core:warn] [pid 1836:tid 139752646228928]
> AH00045: child process 1847 still did not exit, sending a SIGTERM
> [Mon Dec 07 07:04:24.144875 2020] [core:warn] [pid 1836:tid 139752646228928]
> AH00045: child process 2807 still did not exit, sending a SIGTERM
> [Mon Dec 07 07:04:26.146928 2020] [core:warn] [pid 1836:tid 139752646228928]
> AH00045: child process 1847 still did not exit, sending a SIGTERM
> [Mon Dec 07 07:04:26.146967 2020] [core:warn] [pid 1836:tid 139752646228928]
> AH00045: child process 2807 still did not exit, sending a SIGTERM
> [Mon Dec 07 07:04:28.149026 2020] [core:error] [pid 1836:tid 139752646228928]
> AH00046: child process 1847 still did not exit, sending a SIGKILL
> [Mon Dec 07 07:04:28.149092 2020] [core:error] [pid 1836:tid 139752646228928]
> AH00046: child process 2807 still did not exit, sending a SIGKILL
>
> On Sunday, December 6, 2020 at 7:51:28 AM UTC+5 Zohaib Ahmed Hassan wrote:
> I don't know exactly what the throughput per second is. It's random; in the
> peak hours it is 5 req/sec, but the chart below can help you understand the
> request throughput as well.
> One more thing: I have tested 300 concurrent requests against it using a
> script and it works well, but after the requests completed the memory usage
> stays at its peak as before. If it was at 20 percent, and with the concurrent
> requests it went to 45 percent, it stays at 45.
>
> Zohaib Ahmed Hassan | Senior DevOps Engineer
>
> Direct: +923045060007
> [email protected]
> www.xiqinc.com
>
>
>
> On Sat, Dec 5, 2020 at 4:58 PM Graham Dumpleton <[email protected]> wrote:
> What about request throughput? That is, the requests/sec it currently
> handles, and how many concurrent requests at a time.
>
> Graham
>
>> On 5 Dec 2020, at 8:35 pm, Zohaib Ahmed Hassan <[email protected]> wrote:
>>
>> Thanks for the response. Here are the details:
>> 1. The mod_wsgi version is 4.5.7.
>> 2. It is used in embedded mode.
>> 3. Basically this app gets images in a request, crops those images, and
>> returns them; the average time taken is around 3 to 5 seconds.
>>
>> On Sat, Dec 5, 2020 at 10:22 AM Graham Dumpleton <[email protected]> wrote:
>> Also, in addition to what I already asked, what version of mod_wsgi is being
>> used?
>>
>> Graham
>>
>>> On 5 Dec 2020, at 4:18 pm, Graham Dumpleton <[email protected]> wrote:
>>>
>>> What is the mod_wsgi part of the Apache configuration?
>>>
>>> Need to know if you are using embedded mode or daemon mode and how it is
>>> set up.
>>>
>>> Also, what is the request throughput to the Django application and what is
>>> average and worst case response times?
>>>
>>> Graham
>>>
>>>> On 5 Dec 2020, at 3:19 pm, Zohaib Ahmed Hassan <[email protected]> wrote:
>>>>
>>>> We have an EC2 instance (4 vCPU and 16GB of RAM) which is running an
>>>> Apache server with the event MPM behind an AWS ELB (application load
>>>> balancer). This server serves just images requested by our other
>>>> applications; although for most of the applications we are using
>>>> CloudFront for caching, one app is sending requests directly to the
>>>> server. Now Apache memory usage reaches 70% every day and does not come
>>>> down, so we have to restart the server every time. Earlier, with the old
>>>> Apache 2.2 version and the worker MPM, without the load balancer, we did
>>>> not have this issue. I have tried different configurations for the event
>>>> MPM and Apache but it is not working. Here is apache2.conf:
>>>>
>>>>
>>>> Timeout 120 # also tried the timeout at 300
>>>> KeepAlive On
>>>> MaxKeepAliveRequests 100
>>>> KeepAliveTimeout 45 # varied this setting from 1 second to 300
>>>>
>>>>
>>>> Here are the load balancer settings:
>>>>
>>>> - HTTP and HTTPS listeners
>>>>
>>>> - Idle timeout is 30
>>>>
>>>> The event MPM configuration:
>>>>
>>>> <IfModule mpm_event_module>
>>>> StartServers 2
>>>> MinSpareThreads 50
>>>> MaxSpareThreads 75
>>>> ThreadLimit 64
>>>> #ServerLimit 400
>>>> ThreadsPerChild 25
>>>> MaxRequestWorkers 400
>>>> MaxConnectionsPerChild 10000
>>>> </IfModule>
>>>>
>>>> 1. When I change MaxRequestWorkers to 150 with MaxConnectionsPerChild 0,
>>>> once RAM usage reaches 47 percent the system health checks fail and a new
>>>> instance is launched by the auto scaling group. It seems like the worker
>>>> limit is reached, which already happened when this instance was running
>>>> with 8GB of RAM.
>>>> 2. Our other servers, which are just running a simple Django site and
>>>> Django REST Framework APIs, are working fine with the default values for
>>>> the MPM and Apache configured on installation.
>>>> 3. I have also tried the configuration with KeepAliveTimeout equal to 2,
>>>> 3 and 5 seconds as well, but it did not work.
>>>> 4. I have also followed this link [1]; it worked somewhat better, but
>>>> memory usage is not coming down.
>>>>
>>>> Here is the recent error log:
>>>>
>>>> [Fri Dec 04 07:45:21.963290 2020] [mpm_event:error] [pid 5232:tid
>>>> 139782245895104] AH03490: scoreboard is full, not at
>>>> MaxRequestWorkers.Increase ServerLimit.
>>>> [... the same AH03490 message repeated once a second through
>>>> 07:45:35.977818 ...]
>>>>
>>>> Here is the top command result:
>>>>
>>>> 3296 www-data 20 0 3300484 469824 58268 S 0.0 2.9 0:46.46 apache2
>>>> 2544 www-data 20 0 3359744 453868 58292 S 0.0 2.8 1:24.53 apache2
>>>> 1708 www-data 20 0 3357172 453524 58208 S 0.0 2.8 1:02.85 apache2
>>>> 569 www-data 20 0 3290880 444320 57644 S 0.0 2.8 0:37.53 apache2
>>>> 3655 www-data 20 0 3346908 440596 58116 S 0.0 2.7 1:03.54 apache2
>>>> 2369 www-data 20 0 3290136 428708 58236 S 0.0 2.7 0:35.74 apache2
>>>> 3589 www-data 20 0 3291032 382260 58296 S 0.0 2.4 0:50.07 apache2
>>>> 4298 www-data 20 0 3151764 372304 59160 S 0.0 2.3 0:18.95 apache2
>>>> 4523 www-data 20 0 3140640 310656 58032 S 0.0 1.9 0:07.58 apache2
>>>> 4623 www-data 20 0 3139988 242640 57332 S 3.0 1.5 0:03.51 apache2
>>>>
>>>> What is wrong in the configuration that is causing the high memory usage?
>>>>
>>>>
>>>> [1]: https://aws.amazon.com/premiumsupport/knowledge-center/apache-backend-elb/
>>>>