Blog version of the explanation for the number of threads seen.

http://blog.dscpl.com.au/2014/02/use-of-threading-in-modwsgi-daemon-mode.html

I stole your htop output. :-)

Note that the blog post explains a bit more, mentioning a transient reaper 
thread that is created at the time of shutdown.

It is possible I should create that reaper thread up front as well, making 4 
extra threads. I am wondering whether delaying its creation may be the cause 
of a rare problem with processes hanging. This could occur if resources were 
exhausted and the thread could not be created. If request threads or 
interpreter destruction then subsequently hung, the process would never exit.

That failure would produce a specific log message though, and I have never 
seen that message reported. All the same, it may be safer to create the reaper 
thread at the outset and have it wait on a thread condition variable to know 
when to activate.
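
In Python terms the pattern would look something like this minimal sketch 
(mod_wsgi itself is C code, so this is only an analogue, and the names are 
illustrative):

import threading

_shutdown_cv = threading.Condition()
_shutdown_flagged = False

def _reaper():
    # Created at process start, so thread creation cannot fail later
    # under resource exhaustion. Sleeps until told to activate, then
    # would forcibly exit the process if graceful shutdown hangs.
    with _shutdown_cv:
        while not _shutdown_flagged:
            _shutdown_cv.wait()
    # ... arm a timer and os._exit() if shutdown does not complete ...

_reaper_thread = threading.Thread(target=_reaper)
_reaper_thread.daemon = True
_reaper_thread.start()

def _trigger_reaper():
    global _shutdown_flagged
    with _shutdown_cv:
        _shutdown_flagged = True
        _shutdown_cv.notify()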

Graham

On 20/02/2014, at 2:06 PM, Graham Dumpleton <[email protected]> wrote:

> For each mod_wsgi daemon process where you have set threads=n, you will see 
> n+3 threads.
> 
> The n threads are obviously the configured number of threads to handle 
> requests.
> 
> The other three threads are as follows:
> 
> 1. The main thread, which was left running after the daemon process forked 
> from Apache. It is from this thread that the n request threads are initially 
> created. It will also create the 2 additional threads described below. After 
> it has done this, the main thread becomes a caretaker for the whole process. 
> It will wait on a special socketpair, to which a signal handler will write a 
> character as a flag that the process should shut down. In other words, this 
> main thread just sits there and stops the process from exiting until told to.
> 
> 2. The second thread is a monitor thread. It manages things like the 
> activity timeout and the shutdown timeout. If either of those timeouts 
> occurs, it will send a signal to the same process (i.e., itself) to trigger 
> shutdown of the process.
> 
> 3. The third thread is another monitoring thread, but one which specifically 
> detects whether the whole Python interpreter itself gets into a complete 
> deadlock and stops doing anything. If this is detected it will again send a 
> signal to the same process to trigger a shutdown.
> 
> So the additional threads are to manage process shutdown and ensure the 
> process is still alive and doing stuff.
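> 
> The waiting-on-a-socketpair trick in point 1 is the classic self-pipe 
> pattern. A minimal Python analogue (again, mod_wsgi does this in C, and the 
> names here are illustrative):
> 
> import signal
> import socket
> 
> # A signal handler cannot safely do much work, but it can write a single
> # byte to a socket that the main thread is blocked reading from.
> _wakeup_rd, _wakeup_wr = socket.socketpair()
> 
> def _handle_term(signum, frame):
>     _wakeup_wr.send(b'x')  # flag the main thread to begin shutdown
> 
> signal.signal(signal.SIGTERM, _handle_term)
> 
> # The main thread parks here, keeping the process alive until signalled.
> _wakeup_rd.recv(1)
> # ... proceed with orderly shutdown of the request threads ...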
> 
> As to your memory issue, the problem with web application deployments that 
> just about no one takes into consideration is that not all URLs in a web 
> application are equal. I actually proposed a talk for PyCon US this year 
> about this specific issue and how to deal with it, but the talk was rejected.
> 
> In short, because your complete web application runs in the same process 
> space, if one specific URL or a small subset of URLs has special resource 
> requirements, that subset dictates the resources required for the complete 
> application, even if those URLs are infrequently used.
> 
> As an example, the admin pages in a Django application are not frequently 
> used, but they may have a requirement to process a lot of data. This can 
> create a large transient memory requirement just for that request, but since 
> memory allocated from the operating system is generally never given back, 
> this one infrequent request will blow out memory usage for the whole 
> application. Once allocated, this memory will be retained by the process 
> until the process is subsequently restarted.
> 
> Because of this, you can end up in a silly situation whereby a request that 
> runs only once every fifteen minutes is, over the course of a few hours, 
> progressively handled by each different process in a multiprocess web server 
> configuration. Your overall memory usage will thus seem to jump up for no 
> good reason, until all processes finally hit a plateau where each has 
> allocated the maximum amount of memory it needs to handle the worst case 
> transient memory usage of individual requests.
> 
> It can get worse still if you also have multithreading in use within each 
> process. The longer the response time of a memory hungry URL, the greater 
> the odds that two such memory hungry requests will be handled concurrently 
> within the same process in different threads. What this means is that your 
> worst case memory usage isn't just the worst case memory requirement for a 
> specific URL, but that multiplied by the number of threads in the process.
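> 
> A quick back-of-envelope calculation, with entirely made-up numbers, shows 
> how fast this multiplies:
> 
> # Illustrative figures only; substitute your own measurements.
> base_rss_mb = 60           # application size after startup
> transient_peak_mb = 150    # extra memory one hungry request can need
> threads_per_process = 5
> processes = 5
> 
> # Worst case for one process: every thread handling a hungry request at once.
> worst_per_process = base_rss_mb + transient_peak_mb * threads_per_process
> print(worst_per_process)              # 810 MB per process
> print(worst_per_process * processes)  # 4050 MB across all processes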
> 
> Further examples I have seen in the past where people have been hit by this 
> are site maps, PDF generation, and even RSS feeds where a significant amount 
> of content is returned with each item rather than just a summary.
> 
> The big problem in all of this is identifying which URL has the large 
> transient memory requirement. The tools available for this aren't good and 
> you generally have to fall back on ad hoc solutions. I'll get to how you can 
> work it out later, possibly as a separate email, as I have to go find some 
> code I once wrote for someone for exactly this purpose.
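> 
> In the meantime, here is a minimal sketch of the kind of ad hoc approach I 
> mean: WSGI middleware that logs how much the process RSS grew over each 
> request. It is Linux only (it reads /proc), the middleware name is made up, 
> and in a multithreaded process the numbers will be noisy since concurrent 
> requests share the same process; it also misses memory allocated while the 
> response iterable is being consumed. It is still usually enough to make the 
> guilty URLs stand out in the logs.
> 
> import logging
> import os
> 
> def _rss_kb():
>     # Read the resident set size of this process from /proc (Linux only).
>     with open('/proc/self/status') as f:
>         for line in f:
>             if line.startswith('VmRSS:'):
>                 return int(line.split()[1])
>     return 0
> 
> class MemoryDeltaMiddleware(object):
>     def __init__(self, application):
>         self.application = application
> 
>     def __call__(self, environ, start_response):
>         before = _rss_kb()
>         try:
>             return self.application(environ, start_response)
>         finally:
>             delta = _rss_kb() - before
>             if delta > 0:
>                 logging.warning('pid %d grew %d KB on %s', os.getpid(),
>                                 delta, environ.get('PATH_INFO'))
> 
> # In your wsgi.py, wrap the existing application object:
> # application = MemoryDeltaMiddleware(application)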
> 
> As to solving the problem once you have identified which URLs are the 
> culprits, ideally you would change how the code works to avoid the large 
> transient memory requirement. If you cannot do that, or not straight away, 
> then you can fall back on a number of different techniques to at least 
> lessen the impact, by configuring the web server differently.
> 
> You have already identified two ways this can be done, which are the 
> inactivity timeout and a maximum number of requests per process before a 
> restart.
> 
> The problem with these as a solution is that the requirement for a small set 
> of URLs has dictated the configuration for the whole application. Using them 
> can therefore have an impact on other parts of the application.
> 
> In the case of setting a maximum for the number of requests handled for the 
> process, you can introduce a significant amount of process churn if this is 
> set too low relative to the overall throughput. That is, the processes will 
> get restarted on a frequent basis.
> 
> I talk about this issue of process churn in my PyCon talk from last year:
> 
> http://lanyrd.com/2013/pycon/scdyzk/
> 
> but you can also see what I mean in the attached application capacity 
> analysis report picture.
> 
> <PastedGraphic-1.png>
> 
> The better solution to this problem of not all URLs being equal and having 
> different resource requirements is to vertically partition your web 
> application and spread it across multiple processes, where each process only 
> handles a subset of the URLs. Luckily this can be easily handled by mod_wsgi 
> using multiple daemon process groups and delegating URLs to different 
> processes.
> 
> Take for example admin URLs in Django. If these are indeed infrequently used 
> but can have a large transient memory requirement, what we can do is:
> 
> WSGIDaemonProcess main processes=5 threads=5
> WSGIDaemonProcess admin threads=3 inactivity-timeout=30 maximum-requests=20
> 
> WSGIApplicationGroup %{GLOBAL}
> WSGIProcessGroup main
> 
> WSGIScriptAlias / /some/path/wsgi.py
> 
> <Location /admin>
> WSGIProcessGroup admin
> </Location>
> 
> So what we have done is create two daemon process groups and shove the 
> admin pages into a distinct one of their own, where we can be more 
> aggressive and use the inactivity timeout and maximum requests to combat 
> excessive memory use. In doing this we have left things alone for the bulk 
> of the web application.
> 
> The end result is that we can tailor configuration settings for different 
> parts of the application. The only requirement is that we can reasonably 
> easily separate them out, with the URLs able to be matched by a 
> Location/LocationMatch directive in Apache.
> 
> In this example we have done this specifically to separate out misbehaving 
> parts of an application, but the converse can also be done.
> 
> If you think about it, most of the traffic for your site will often hit a 
> small subset of URLs. The performance of the handling of this small but very 
> frequently visited set of URLs could be impeded by having to use a more 
> general configuration for the server.
> 
> What may work better is to delegate the very highly trafficked URLs into 
> their own daemon process group, with a processes/threads mix tuned for that 
> scenario. Because that daemon process group is only going to handle a small 
> number of URLs, the actual amount of code from your application that would 
> ever be executed within those processes is much smaller. So long as your 
> code base is set up so that it only lazily imports the code for specific 
> handlers the first time it is needed, you can keep this optimised process 
> quite lean as far as memory usage goes.
> 
> So instead of every process having to be very fat, eventually loading up 
> all parts of your application code, you can leave that to a smaller number 
> of processes which, although they serve a greater number of different URLs, 
> wouldn't necessarily get much traffic and so don't need as much capacity.
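> 
> As a concrete illustration of the lazy import idea, here reportlab stands 
> in for whatever heavy dependency only a few handlers need; this is a sketch 
> of a Django view, not code from any real application:
> 
> def pdf_report(request):
>     # Deferred imports: only processes that actually serve this URL ever
>     # pay the import-time memory cost of the PDF library, so processes in
>     # the high volume daemon group stay lean.
>     from io import BytesIO
>     from reportlab.pdfgen import canvas
>     from django.http import HttpResponse
> 
>     buf = BytesIO()
>     pdf = canvas.Canvas(buf)
>     pdf.drawString(100, 750, 'Report')
>     pdf.save()
>     return HttpResponse(buf.getvalue(), content_type='application/pdf')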
> 
> You might therefore have the following:
> 
> WSGIDaemonProcess main processes=1 threads=5
> WSGIDaemonProcess volume processes=3 threads=5
> WSGIDaemonProcess admin threads=3 inactivity-timeout=30 maximum-requests=20
> 
> WSGIApplicationGroup %{GLOBAL}
> WSGIProcessGroup main
> 
> WSGIScriptAlias / /some/path/wsgi.py
> 
> <Location /publications/article/>
> WSGIProcessGroup volume
> </Location>
> 
> <Location /admin>
> WSGIProcessGroup admin
> </Location>
> 
> In your case we are therefore shoving the one URL which accounts for almost 
> 50% of your total traffic into its own daemon process group. This should 
> have a lower memory footprint, so we can afford to run it across a few 
> processes, each with a small number of threads. All other non-admin traffic, 
> where all the remaining code for your application would be loaded, can be 
> handled by one process.
> 
> So by juggling things like this, treating as special cases both the worst 
> case URLs for transient memory usage and your high traffic URLs, one can 
> often quite dramatically control the amount of memory used.
> 
> Now, what about monitoring all this so as to be able to gauge effectiveness?
> 
> Because server monitoring in New Relic can't separately identify the 
> mod_wsgi daemon process groups, even when the display-name option is used, 
> you cannot readily rely on server monitoring for things like memory 
> tracking. Everything will be lumped under Apache and you cannot tell what 
> the memory requirements of each group are.
> 
> What you have to do in this case is rely on the memory usage charts on the 
> main overview dashboard for the web application in New Relic.
> 
> <PastedGraphic-3.png>
> 
> We have a problem at this point though, and that is that everything will 
> still report under the same existing application in the New Relic UI, so we 
> still don't have separation.
> 
> What we can do here is configure things so that each daemon process group 
> reports into a separate application, while still also reporting to a 
> combined application for everything. This can be done from the Apache 
> configuration file using:
> 
> WSGIDaemonProcess main processes=1 threads=5
> WSGIDaemonProcess volume processes=3 threads=5
> WSGIDaemonProcess admin threads=3 inactivity-timeout=30 maximum-requests=20
> 
> WSGIApplicationGroup %{GLOBAL}
> WSGIProcessGroup main
> 
> SetEnv newrelic.app_name 'My Site (main);My Site'
> 
> WSGIScriptAlias / /some/path/wsgi.py
> 
> <Location /publications/article/>
> WSGIProcessGroup volume
> SetEnv newrelic.app_name 'My Site (volume);My Site'
> </Location>
> 
> <Location /admin>
> WSGIProcessGroup admin
> SetEnv newrelic.app_name 'My Site (admin);My Site'
> </Location>
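> 
> As an aside, mod_wsgi passes SetEnv values through in the per-request WSGI 
> environ, which gives a quick way to verify the per-location overrides are 
> reaching the application; a throwaway sketch:
> 
> def application(environ, start_response):
>     # Echo back which New Relic application name this process group sees.
>     app_name = environ.get('newrelic.app_name', '<not set>')
>     body = ('newrelic.app_name = %s\n' % app_name).encode('utf-8')
>     start_response('200 OK', [('Content-Type', 'text/plain'),
>                               ('Content-Length', str(len(body)))])
>     return [body]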
> 
> So we are using specialisation via the Location directive to override the 
> application name that the New Relic Python agent reports to.
> 
> We are also in this case using a semicolon separated list of names.
> 
> The result is that each daemon process group logs under a separate 
> application of the form 'My Site (XXX)' but at the same time they also all 
> report to 'My Site'.
> 
> This way you can still have a combined view, but you can also look at each 
> daemon process group in isolation.
> 
> The isolation is important, because you can then do the following 
> separately for each daemon process group:
> 
> 1. View response times.
> 2. View throughput.
> 3. View memory usage.
> 4. View CPU usage.
> 5. View the capacity analysis report.
> 6. Trigger the thread profiler.
> 
> If things were not separated and they were all reporting only to the same 
> application, the data presented would be all mixed up, and for the last 
> four items could be quite confusing.
> 
> Okay, so that is probably going to be a lot to digest, but it represents 
> just a part of what I would have presented at PyCon US if my talk had been 
> accepted.
> 
> Other things I would have talked about would have included dealing with 
> request backlog when overloaded due to increased traffic for certain URLs, 
> dealing with the danger of malicious POST requests with large content 
> sizes, etc.
> 
> Am sure the above will keep you busy for a while at least. :-)
> 
> Now that I have done all that, I should clean it up a bit and put it up in a 
> couple of blog posts.
> 
> Graham
> 
> On 20/02/2014, at 8:06 AM, scoopseven <[email protected]> wrote:
> 
>> Graham, I'm still not sure why with processes=5 threads=2 I see 5 threads 
>> for each process for mod_wsgi in htop. If you could explain that last little 
>> hanging chad it would be great. Thanks!
>> 
>> Updated SO with summary of solution: 
>> http://serverfault.com/questions/576527/apache-processes-in-top-more-than-maxclients
>> 
>> Mark
>> 
>> 
>> On Wednesday, February 19, 2014 12:05:56 PM UTC-5, scoopseven wrote:
>> This question started on SO: 
>> http://serverfault.com/questions/576527/apache-processes-in-top-more-than-maxclients/576600
>> 
>> I've updated my Apache config and mod_wsgi settings, but am still 
>> experiencing memory creep. Here's my site conf and my apache2.conf:
>> 
>> WSGIDaemonProcess mywsgi user=www-data group=www-data processes=5 threads=5 display-name=mod-wsgi python-path=/home/admin/.virtualenvs/django/lib/python2.7/site-packages
>> WSGIPythonHome /home/admin/.virtualenvs/django
>> WSGIRestrictEmbedded On
>> WSGILazyInitialization On
>> 
>> <VirtualHost 127.0.0.1:8080>
>>     ServerName www.mysite.com
>>     DocumentRoot /srv/mysite
>>     
>>     SetEnvIf X-Forwarded-Protocol https HTTPS=1
>>     WSGIScriptAlias / /srv/mysite/system/apache/django.wsgi process-group=mywsgi application-group=%{GLOBAL}
>>     RequestHeader add X-Queue-Start "%t"
>> </VirtualHost>
>> 
>> <IfModule mpm_worker_module>
>>     StartServers             1
>>     ThreadsPerChild          5
>>     MinSpareThreads          5
>>     MaxSpareThreads         10
>>     MaxClients              25
>>     ServerLimit              5
>>     MaxRequestsPerChild      0
>>     MaxMemFree            1024
>> </IfModule>
>> 
>> I'm watching apache and mod_wsgi via htop and apache seems to be playing by 
>> the rules, never loading more than 25 threads. It usually stays around 10-15 
>> threads. We average around 5-6 requests/second monitored by /server-status/. 
>> The thing that's bothering me is that I'm counting 44 mod_wsgi threads in 
>> htop. I assumed that since I had processes=5 threads=5 I would only see a 
>> maximum of 30 threads below (5 processes + 25 threads). 
>> 
>> Partial htop dump:
>> 
>>  2249 www-data   20   0  159M 65544  4676 S 26.0  0.8  2:09.93 mod-wsgi -k start
>>  2248 www-data   20   0  164M 69040  5560 S 148.  0.8  2:10.72 mod-wsgi -k start
>>  2274 www-data   20   0  159M 65544  4676 S  0.0  0.8  0:12.58 mod-wsgi -k start
>>  2250 www-data   20   0  157M 62212  5168 S 10.0  0.7  1:50.35 mod-wsgi -k start
>>  2291 www-data   20   0  164M 69040  5560 S 41.0  0.8  0:17.07 mod-wsgi -k start
>>  2251 www-data   20   0  165M 69320  4676 S  0.0  0.8  1:59.48 mod-wsgi -k start
>>  2272 www-data   20   0  159M 65544  4676 S  0.0  0.8  0:28.67 mod-wsgi -k start
>>  2282 www-data   20   0  165M 69320  4676 S  0.0  0.8  0:33.85 mod-wsgi -k start
>>  2292 www-data   20   0  164M 69040  5560 S 28.0  0.8  0:28.08 mod-wsgi -k start
>>  2298 www-data   20   0  157M 62212  5168 S  0.0  0.7  0:14.93 mod-wsgi -k start
>>  2299 www-data   20   0  157M 62212  5168 S  1.0  0.7  0:23.71 mod-wsgi -k start
>>  2358 www-data   20   0  164M 69040  5560 S  1.0  0.8  0:02.62 mod-wsgi -k start
>>  2252 www-data   20   0  165M 70468  4660 S 41.0  0.8  1:55.85 mod-wsgi -k start
>>  2273 www-data   20   0  159M 65544  4676 S 10.0  0.8  0:29.03 mod-wsgi -k start
>>  2278 www-data   20   0  159M 65544  4676 S  1.0  0.8  0:02.79 mod-wsgi -k start
>>  2264 www-data   20   0  165M 70468  4660 S  0.0  0.8  0:07.50 mod-wsgi -k start
>>  2266 www-data   20   0  165M 70468  4660 S 25.0  0.8  0:39.49 mod-wsgi -k start
>>  2300 www-data   20   0  157M 62212  5168 S  6.0  0.7  0:28.78 mod-wsgi -k start
>>  2265 www-data   20   0  165M 70468  4660 S 15.0  0.8  0:31.44 mod-wsgi -k start
>>  2294 www-data   20   0  164M 69040  5560 R 54.0  0.8  0:34.82 mod-wsgi -k start
>>  2279 www-data   20   0  165M 69320  4676 S  0.0  0.8  0:32.63 mod-wsgi -k start
>>  2297 www-data   20   0  157M 62212  5168 S  3.0  0.7  0:09.68 mod-wsgi -k start
>>  2302 www-data   20   0  157M 62212  5168 S  0.0  0.7  0:27.62 mod-wsgi -k start
>>  2323 www-data   20   0  157M 62212  5168 S  0.0  0.7  0:02.56 mod-wsgi -k start
>>  2280 www-data   20   0  165M 69320  4676 S  0.0  0.8  0:13.00 mod-wsgi -k start
>>  2263 www-data   20   0  165M 70468  4660 S  0.0  0.8  0:19.35 mod-wsgi -k start
>>  2322 www-data   20   0  165M 69320  4676 S  0.0  0.8  0:03.05 mod-wsgi -k start
>>  2275 www-data   20   0  165M 70468  4660 S  0.0  0.8  0:02.72 mod-wsgi -k start
>>  2285 www-data   20   0  164M 69040  5560 S  0.0  0.8  0:00.00 mod-wsgi -k start
>>  2288 www-data   20   0  164M 69040  5560 S  0.0  0.8  0:00.11 mod-wsgi -k start
>>  2290 www-data   20   0  164M 69040  5560 S  4.0  0.8  0:15.66 mod-wsgi -k start
>>  2293 www-data   20   0  164M 69040  5560 S 20.0  0.8  0:29.01 mod-wsgi -k start
>>  2268 www-data   20   0  159M 65544  4676 S  0.0  0.8  0:00.00 mod-wsgi -k start
>>  2269 www-data   20   0  159M 65544  4676 S  0.0  0.8  0:00.11 mod-wsgi -k start
>>  2270 www-data   20   0  159M 65544  4676 S 15.0  0.8  0:26.62 mod-wsgi -k start
>>  2271 www-data   20   0  159M 65544  4676 S  0.0  0.8  0:26.55 mod-wsgi -k start
>> 
>> Last night I had processes=3 threads=3 and my NR capacity report reported 
>> 100% usage 
>> (https://rpm.newrelic.com/accounts/67402/applications/1132078/optimize/capacity_analysis),
>>  so I upped it to processes=5 threads=5 and now I have 44 threads going. 
>> Despite the instance count reported by NR staying relatively stable, memory 
>> consumption continues to increase 
>> (https://rpm.newrelic.com/accounts/67402/servers/1130000/processes#id=152494639).
>>  I realize that nobody except for Graham can see those NR reports, sorry. 
>> 
>> Has anyone dealt with this situation before?
>> 
>> Mark
>> 
>> 
> 

