Re: [modwsgi] Re: Apache memory consumption

Graham Dumpleton Mon, 04 Apr 2011 15:28:43 -0700

On 5 April 2011 00:26, Kent <[email protected]> wrote:
> First, Graham, thank you for taking some time; my responses below:
>
> On Apr 2, 6:21 pm, Graham Dumpleton <[email protected]>
> wrote:
>> On 2 April 2011 07:32, Kent <[email protected]> wrote:
>>
>> > I am hoping you might gracefully suggest what we might be able to do
>> > to improve our problem of memory usage being consumed by apache.
>>
>> > We have a turbo gears type web server with 2GB ram which is running a
>> > point of sale system for about 15 or 18 stores.
>>
>> > We are running mod_wsgi 2.5, but about to upgrade to 3.3.
>>
>> > Is it typical for one process (wsgi:rarch) to consume virtually all
>> > the CPU and memory consumption while the remaining apache children
>> > seem to not be doing much?
>>
>> With your configuration yes, that would be expected.
>>
>
> After finding your Sydney slideshow presentation, I'm understanding
> that if I were to set processes=2 threads=15, for example, I'd have 2
> (fatter) processes which actually run the python wsgi app and to which
> the other threads can delegate; is that understanding correct?


In the context of your very fat web application, I don't think the
processes would appear any 'fatter' than they were already. Since you
are restricting it to two processes, rather than whatever number the
Apache MPM allowed in embedded mode, you have at least constrained it.

In other words, using multiple threads in a process will increase the
amount of base memory used by that process, but it isn't necessarily
that much that I would be labelling it 'fatter'.

> If so, I'm not sure I understand the purpose of the threads, since
> wouldn't they need to effectively wait for a process anyway?  Earlier,
> I believed threads=15 (and processes=1) would allow me to have many
> simultaneous requests processing in parallel.  Can this one process
> accept multiple requests and multitask them, and if so, then what
> advantage is gained from processes=2 or higher (does it only make
> sense with multi-core processor)?

Despite the presence of the GIL in Python, which allows only one
thread to run Python code at a time in a process, with a
potentially I/O bound process like a web application, there is ample
opportunity for the GIL to be released while code is waiting for I/O,
such that an effective level of concurrency can still be handled with
one single multithreaded process.
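
A minimal sketch of that effect, using only the standard library. The
names handle_request and run_concurrently are made up for illustration,
and time.sleep() stands in for whatever blocking I/O (database query,
socket read) a real request would do. It is not a model of mod_wsgi
itself, just of threads overlapping while they wait:

```python
import threading
import time


def handle_request(delay, results, index):
    # time.sleep() is a stand-in for blocking I/O. CPython releases the
    # GIL while a thread is blocked here, so other threads can run.
    time.sleep(delay)
    results[index] = True


def run_concurrently(num_threads=5, delay=0.3):
    """Run num_threads fake requests at once; return (results, seconds)."""
    results = [False] * num_threads
    threads = [
        threading.Thread(target=handle_request, args=(delay, results, i))
        for i in range(num_threads)
    ]
    start = time.monotonic()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results, time.monotonic() - start


if __name__ == "__main__":
    results, elapsed = run_concurrently()
    print("all handled: %s in %.2fs" % (all(results), elapsed))
```

If the waits were serialised, five requests of 0.3s each would need
about 1.5s. Because the GIL is dropped during the wait, the total
should stay close to a single delay.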

So, although using multiple processes across multiple CPUs can allow
you to harness the CPU power of the whole system, the nature of web
applications is such that you can still achieve a lot with a single
process.

Have a read of comments I make in the following about parallelisation
in Apache/mod_wsgi.

  http://blog.dscpl.com.au/2007/09/parallel-python-discussion-and-modwsgi.html
  http://blog.dscpl.com.au/2007/07/web-hosting-landscape-and-modwsgi.html

>> TurboGears is known to have a large base memory foot print to begin
>> with. The size of your process though appears to be the result of
>> application code performing caching and not purging the cache
>> properly. Alternatively, objects in the application are creating
>> reference count cycles between objects which the Python garbage
>> collector can't break and so they hang around.
>>
>
> Yes, I cache things that make sense to cache, but cache them as part
> of 'session' objects which I believed to be being garbage collected,
> maybe that is my problem.  I wanted to see if this was typical of even
> well behaved wsgi apps running thru apache *because of an article I
> read*, which reads:

Even if not explicitly caching, object cycles can still cause problems
for transient objects.
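
A small illustration of the cycle problem, as a sketch only. Node,
make_cycle and demo are hypothetical names, not anything from
TurboGears or your application. Two objects that refer to each other
are not freed by reference counting when the last name for them goes
away. They linger until the cyclic collector runs (and in older Python
versions, cycles involving objects with __del__ methods were not
collected at all):

```python
import gc
import weakref


class Node(object):
    """Hypothetical transient object, e.g. something built per request."""

    def __init__(self):
        self.other = None


def make_cycle():
    a = Node()
    b = Node()
    a.other = b
    b.other = a  # a -> b -> a is a reference cycle
    # Return only a weak reference, so nothing outside the cycle keeps
    # the objects alive once this function returns.
    return weakref.ref(a)


def demo():
    """Return (alive before gc.collect(), alive after gc.collect())."""
    gc.disable()  # stop the cyclic collector from running on its own
    try:
        probe = make_cycle()
        alive_before = probe() is not None
        gc.collect()  # explicit pass of the cyclic collector
        alive_after = probe() is not None
    finally:
        gc.enable()
    return alive_before, alive_after


if __name__ == "__main__":
    print(demo())
```

The automatic collector is disabled here only to make the behaviour
repeatable. In a running server it is normally enabled, so such
objects are freed eventually, just later than you might expect.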

You might try seeing if you can get the following going:

http://pypi.python.org/pypi/Dozer

This will allow you to try and track where all the objects are being
created and what type they are.
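
If getting Dozer installed is a hurdle, a rough approximation of the
same idea can be had from the standard library alone. The sketch below
is not Dozer and does not use its API. type_census and census_growth
are made-up helper names that count the objects tracked by the garbage
collector by type, so two counts taken some requests apart can be
compared:

```python
import gc
from collections import Counter


def type_census():
    """Count objects tracked by the garbage collector, by type name."""
    return Counter(type(obj).__name__ for obj in gc.get_objects())


def census_growth(before, after, top=10):
    """Return (type name, increase) pairs for the fastest growing types."""
    growth = Counter()
    for name, count in after.items():
        increase = count - before.get(name, 0)
        if increase > 0:
            growth[name] = increase
    return growth.most_common(top)
```

Take one census after the application has warmed up and another after
a batch of requests. Types whose counts keep climbing between censuses
are the ones worth chasing. Note that gc.get_objects() only reports
container-like objects, so plain strings and numbers will not show up.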

> "If you serve 99% static files and 1% dynamic files with Apache, each
> httpd process will use from 3-20 megs of RAM (depending on your MOST
> complex dynamic page).

That is a motherhood statement that has no practical usefulness and
would likely be totally meaningless to anything but the specific setup
and application the person was using. At a guess I would say that the
statement wasn't even made about Python web applications. Python web
applications tend to have much larger memory requirements.

> This occurs because a process grows to accommodate whatever it is
> serving, and NEVER decreases until that process dies. Unless you have
> very few dynamic pages and major traffic fluctuation, most of your
> httpd processes will soon take up an amount of RAM equal to the
> largest dynamic script on your system. A very smart web server would
> deal with this automatically. As it is, you have a few options to
> manually improve RAM usage."
>
> http://onlamp.com/pub/a/onlamp/2004/02/05/lamp_tuning.html
>
> This article led me to hypothesize [hypothesise ;) ] that it would be
> typical for any apache/wsgi server to slowly increase in RAM
> consumption as more and more requests simultaneously requested
> processing that required some bulk of RAM... if these occurred in
> parallel, then the article suggests this RAM is NEVER returned to the
> OS.

The memory use should plateau though and shouldn't just keep growing.
If it keeps growing you have an object leak through bad caching or
object cycles.
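
On the caching side, one way to make memory plateau is to put a hard
bound on the cache. What follows is a sketch under assumptions, not a
description of how TurboGears sessions work. BoundedCache and
max_entries are hypothetical names, and the least-recently-used
eviction policy is just one reasonable choice:

```python
import threading
from collections import OrderedDict


class BoundedCache(object):
    """LRU cache that never holds more than max_entries items."""

    def __init__(self, max_entries=1000):
        self.max_entries = max_entries
        self._data = OrderedDict()
        # Request threads in one process share the cache, so guard it.
        self._lock = threading.Lock()

    def get(self, key, default=None):
        with self._lock:
            try:
                value = self._data.pop(key)
            except KeyError:
                return default
            self._data[key] = value  # re-insert as most recently used
            return value

    def put(self, key, value):
        with self._lock:
            self._data.pop(key, None)
            self._data[key] = value
            while len(self._data) > self.max_entries:
                # Evict the least recently used entry (the oldest one).
                self._data.popitem(last=False)

    def __len__(self):
        with self._lock:
            return len(self._data)
```

With a bound like this the cache's contribution to process size stops
growing once max_entries is reached, whatever the request volume.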

> I can take further inquiry to the turbogears group if I can't resolve
> my memory problems, but please first answer this:  suppose my WSGI app
> grabbed a large amount of RAM *and assume it properly disposed of it*:
> would I see the RAM returned to the OS, or would the apache process
> hold it indefinitely?

The simple answer is that no, you wouldn't see memory returned to the
OS. There are some slight exceptions to this but only with recent
Python versions (not sure which, may even only be some of the Python
3.X versions). You shouldn't count on those exceptions though, as in
most cases you aren't likely to encounter them.

Graham

>> I would probably suggest you ask about your memory problems on the
>> TurboGear mailing list.
>>
>> Other than that, the only time memory leaks usually come from
>> Apache itself is when mod_python is also loaded, but since only one
>> process has memory problems and certain other bits of configuration
>> are working, you don't appear to be doing that.
>>
>> Graham

-- 
You received this message because you are subscribed to the Google Groups 
"modwsgi" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/modwsgi?hl=en.
