2009/3/23 Graham Dumpleton <graham.dumple...@gmail.com>:
> 2009/3/23 gert <gert.cuyk...@gmail.com>:
>>
>>
>>
>> On Mar 23, 2:02 am, Graham Dumpleton <graham.dumple...@gmail.com>
>> wrote:
>>> 2009/3/23 gert <gert.cuyk...@gmail.com>:
>>>
>>> > wsgi r1232 python 3.1 apache 2.2.11
>>
>> USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
>>
>>> > www      29747  0.0  0.8 229496  4160 ?        Sl   01:35   0:00 (wsgi:site1)         -k start
>>> > www      29776  0.0  0.8   8268  4040 ?        S    01:35   0:00 /usr/httpd/bin/httpd -k start
>>> > www      29777  0.0  0.8   8268  4032 ?        S    01:35   0:00 /usr/httpd/bin/httpd -k start
>>> > www      29778  0.0  0.8   8268  4032 ?        S    01:35   0:00 /usr/httpd/bin/httpd -k start
>>> > www      29779  0.0  0.8   8268  4032 ?        S    01:35   0:00 /usr/httpd/bin/httpd -k start
>>> > www      29780  0.0  0.8   8268  4032 ?        S    01:35   0:00 /usr/httpd/bin/httpd -k start
>>>
>>> > x20 apache2ctl restart
>>>
>>> > www      30550  0.0  1.3 231432  6352 ?        Sl   01:36   0:00 (wsgi:site1)         -k start
>>> > www      30579  0.0  1.3  10204  6192 ?        S    01:36   0:00 /usr/httpd/bin/httpd -k start
>>> > www      30580  0.0  1.3  10204  6184 ?        S    01:36   0:00 /usr/httpd/bin/httpd -k start
>>> > www      30581  0.0  1.3  10204  6184 ?        S    01:36   0:00 /usr/httpd/bin/httpd -k start
>>> > www      30582  0.0  1.3  10204  6184 ?        S    01:36   0:00 /usr/httpd/bin/httpd -k start
>>> > www      30583  0.0  1.3  10204  6184 ?        S    01:36   0:00 /usr/httpd/bin/httpd -k start
>>>
>>> Looking into my crystal ball I assume that you are possibly pointing
>>> out that memory is still being leaked.
>>>
>>> Even though that issue addresses a larger source of memory leakage,
>>> the Python interpreter itself still leaks memory when Py_Finalize() is
>>> called.
>>>
>>> I actually find the comment by Mark Hammond in:
>>>
>>>  http://groups.google.com/group/comp.lang.python/browse_frm/thread/7b8...
>>>
>>> quite disturbing. Namely:
>>>
>>> """Calling
>>> Py_Initialize and Py_Finalize multiple times does leak (Python 3 has
>>> mechanisms so this need to always be true in the future, but it is true
>>> now for non-trivial apps."""
>>>
>>> Unfortunately his grammar is a bit unclear and so not 100% sure what
>>> he meant. Not sure if what he meant to say is that Python 3 will
>>> always have memory leaks, or that it shouldn't, whereas older versions
>>> of Python can.
>>>
>>> If by design Python 3.0 is now going to never properly clean up its
>>> memory on exit, then we are all screwed and embedded mode will be
>>> useless and may as well be removed, as well as mod_python also dying
>>> for good. This means that mod_wsgid as described in mod_wsgi roadmap
>>> will be the only viable way of running Python under Apache in the
>>> future.
>>>
>>> I'll see if I can get Mark to clarify what he meant.
>>
>> Note that (wsgi:site1), which is the daemon process, grows by exactly
>> the same amount as the 5 embedded processes.
>
> The important one to look at to gauge the rate of leakage is the Apache
> parent process. So, if you can enable showing of PPID as well as PID, you
> can more easily see which is the parent process of the wsgi process.
> That will be the one whose rate of growth you want to compare.
>
> Anyway, as I said, for as long as Python leaks memory on Py_Finalize()
> this is going to be an issue. Although third party C extension modules
> might leak memory as well, they aren't loaded into the Apache parent, as
> mod_wsgi doesn't provide a way of preloading additional modules into the
> parent. That leaks can occur is one of the reasons it doesn't allow it.
>
> All up, this is another reason why using daemon mode is a better default
> way of doing things, as you don't need to restart the whole of Apache just
> to restart a WSGI application.

Part of the discussion associated with:

  http://bugs.python.org/issue1856

is pertinent to this problem.

One thing that is suggested is that the underlying data which exists
to support the simplified Python GIL cannot be torn down and has to
exist for the life of the process.

This might be reasonable if Py_Initialize() is called straight away,
but when a restart in Apache occurs it will unload the mod_wsgi.so
file in the Apache parent process and thus also unload the Python
library. As a result, all the references to that preserved global data
are lost and cannot be reused. Thus when mod_wsgi.so is reloaded by
Apache the global variables are reset to null and Python thinks it
has to reinitialise the data.

This therefore is going to be one source of memory leaks that can
never be avoided. If there are similar instances where Python makes
the assumption that it can cache the data because Py_Initialize() may
be called again, and so never truly frees the memory, those will also
leak.

This is why having daemon mode as the only option is better. In that
case, and for mod_wsgi 4.0, Python would be initialised in a separate
monitor process. On a restart that whole monitor process is also
destroyed. Thus you don't have this problem with memory leaks when
calling Py_Initialize() a second time, as that would never occur.

We are therefore almost at the point where, for UNIX systems, embedded
mode should be completely done away with or has to change
significantly. On Windows it doesn't matter, as Python is only
initialised in the Apache child process anyway and there is no cycling
of Py_Initialize()/Py_Finalize().

If getting rid of embedded mode entirely, the problem is how to
support WSGIAccessScript, WSGIAuthUserScript and WSGIAuthGroupScript
if we want to keep providing that option. The only option for that would
be to do what FASTCGI does and execute the operation for that in the
daemon process as well. It would slow things down doing that, but there
is no choice.
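
As a rough sketch of what those three scripts contain, and so what
would have to run in the daemon process instead: the hook names below
follow the mod_wsgi access control documentation, but the user table,
group table and address check are invented purely for illustration.

```python
# Hypothetical data, for illustration only; a real script would consult
# a password database, LDAP, or similar.
USERS = {"alice": "wonderland"}
GROUPS = {"alice": ["admins"]}


def allow_access(environ, host):
    """WSGIAccessScript hook: True to let the client host in."""
    return host == "127.0.0.1" or host.startswith("192.168.")


def check_password(environ, user, password):
    """WSGIAuthUserScript hook: True/False for a known user, None if unknown."""
    if user not in USERS:
        return None
    return USERS[user] == password


def groups_for_user(environ, user):
    """WSGIAuthGroupScript hook: list of group names the user belongs to."""
    return GROUPS.get(user, [])
```

Each of these is called per request while Apache is doing its access,
authentication and authorisation phases, which is why moving them out
of the Apache child process adds a round trip to the daemon.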

The other thing that would be similarly a problem would be
WSGIDispatchScript. This allows user (admin) provided Python code to
select process group and application group for specific WSGI
application dynamically.
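
To make that concrete, a dispatch script might look something like the
following. Treat this as a sketch under assumptions: the function names
process_group() and application_group(), the use of SERVER_NAME from
the passed dictionary, and None meaning "leave the configured value
alone" should all be checked against the mod_wsgi version in use, and
the host and group names are made up.

```python
# Hypothetical mapping from virtual host name to daemon process group.
HOST_TO_GROUP = {
    "site1.example.com": "site1",
    "site2.example.com": "site2",
}


def process_group(environ):
    """Pick the daemon process group for this request (assumed hook name)."""
    host = environ.get("SERVER_NAME", "")
    # None is assumed to mean: keep whatever the Apache configuration says.
    return HOST_TO_GROUP.get(host)


def application_group(environ):
    """Pick the application group, i.e. the sub interpreter (assumed hook name)."""
    host = environ.get("SERVER_NAME", "")
    if host in HOST_TO_GROUP:
        return HOST_TO_GROUP[host] + "-app"
    return None
```

The point is that this code has to run before the request is handed to
any daemon process, so it cannot simply be moved into one.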

The only choice there is to not initialise Python in the Apache parent
process and instead delay initialisation until after the Apache child
server processes are created. This is actually what the experimental
WSGILazyInitialization directive in mod_wsgi 3.0 does. The difference
at the moment though is that the directive also causes initialisation
for daemon processes to be delayed. This all avoids the memory leaks
in the Apache parent process, which are in turn inherited by the Apache
child server processes and daemon processes, but from my tests it results
in all those processes then taking more memory, as some ability to
share data or rely on delayed copy on write is lost. So, you win one
way but lose in another.

Anyway, gert, I am sure you will enjoy having a play with the
WSGILazyInitialization directive and see how it affects your overall
memory usage figures, as well as verify that it does eliminate the
memory leak problems in the Apache parent process and thus all other
processes.
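
One rough way to compare the figures is to diff the RSS column between
two ps snapshots. The sketch below assumes the `ps aux` column layout
and reuses a few rows from the output posted earlier in the thread; the
helper names are made up. It does not identify the Apache parent
process, for which the PPID column would be needed.

```python
# A few rows from the snapshots posted earlier, before and after the
# 20 restarts.
BEFORE = """\
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
www      29747  0.0  0.8 229496  4160 ?        Sl   01:35   0:00 (wsgi:site1)         -k start
www      29776  0.0  0.8   8268  4040 ?        S    01:35   0:00 /usr/httpd/bin/httpd -k start
www      29777  0.0  0.8   8268  4032 ?        S    01:35   0:00 /usr/httpd/bin/httpd -k start
"""

AFTER = """\
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
www      30550  0.0  1.3 231432  6352 ?        Sl   01:36   0:00 (wsgi:site1)         -k start
www      30579  0.0  1.3  10204  6192 ?        S    01:36   0:00 /usr/httpd/bin/httpd -k start
www      30580  0.0  1.3  10204  6184 ?        S    01:36   0:00 /usr/httpd/bin/httpd -k start
"""

RESTARTS = 20  # the "x20 apache2ctl restart" between the two snapshots


def parse_ps(text):
    """Return {command: [rss_kb, ...]} from `ps aux` style rows."""
    rss = {}
    for line in text.splitlines():
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # header row or something that is not a process row
        command = " ".join(fields[10].split())  # collapse runs of spaces
        rss.setdefault(command, []).append(int(fields[5]))
    return rss


def rss_growth(before, after):
    """Growth in the largest RSS (KB) per command between two snapshots."""
    old, new = parse_ps(before), parse_ps(after)
    return {cmd: max(new[cmd]) - max(old[cmd]) for cmd in old if cmd in new}


for command, delta in sorted(rss_growth(BEFORE, AFTER).items()):
    print("%-32s +%d KB, about %.1f KB per restart"
          % (command, delta, delta / RESTARTS))
```

Running the same comparison with WSGILazyInitialization enabled should
show whether the growth across restarts goes away.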

Graham
