I forgot to mention that I changed request_tp_dealloc to:
static void request_tp_dealloc(requestobject *self)
{
    /* De-register the object from the GC before deallocating it,
     * to prevent the GC from running on a partially
     * deallocated object. */
    PyObject_GC_UnTrack(self);
    if (self->rbuff != NULL) {
        free(self->rbuff);
        self->rbuff = NULL;
    }
    request_tp_clear(self);
    PyObject_GC_Del(self);
}
I don't know if that function is the right place to free(self->rbuff),
but in the meantime there is no leak in my test.
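Another possibility might be to do the free in request_tp_clear itself, so
the buffer is released both on normal deallocation and when the cyclic GC
clears the object. A rough sketch (this placement is just an idea; tp_clear
can run more than once, so the free has to be idempotent, and the usual
Py_CLEAR calls on the object's members are elided):

static int request_tp_clear(requestobject *self)
{
    /* Hypothetical placement: free the read buffer here instead of
     * (or in addition to) request_tp_dealloc. Resetting the pointer
     * to NULL makes a second call a harmless no-op. */
    if (self->rbuff != NULL) {
        free(self->rbuff);
        self->rbuff = NULL;
    }
    /* ... Py_CLEAR the object's PyObject* members as before ... */
    return 0;
}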
Jim Gallacher wrote:
I've created a JIRA issue for the readline leaks. The one I detail is a
corner case related to what you found, but I don't think the fix below
will help. Take a look at 182 and let me know what you think.
http://issues.apache.org/jira/browse/MODPYTHON-182
I think we should check the requestobject's self->rbuff during request
cleanup and make sure it really is NULL, just as a safety check.
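Something like this, purely as an illustration (the helper name and the log
message are made up; the point is just to catch, and plug, a buffer that
should already be gone by cleanup time):

/* Hypothetical cleanup-time safety check: req_readline() should
 * already have freed and NULLed rbuff, so a non-NULL pointer here
 * means we leaked; log it and free it anyway. */
static void check_rbuff_freed(requestobject *self)
{
    if (self->rbuff != NULL) {
        fprintf(stderr, "mod_python: rbuff not freed at request cleanup\n");
        free(self->rbuff);
        self->rbuff = NULL;
    }
}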
Alexis Marrero wrote:
Jim,
I found the culprit!!!
There are two unrelated memory leaks.
The first one is in req_readline().
This code:
/* is there anything left in the rbuff from previous reads? */
if (self->rbuff_pos < self->rbuff_len) {
    /* if yes, process that first */
    while (self->rbuff_pos < self->rbuff_len) {
        buffer[copied++] = self->rbuff[self->rbuff_pos];
        if ((self->rbuff[self->rbuff_pos++] == '\n') ||
            (copied == len)) {
            /* our work is done */
            /* resize if necessary */
            if (copied < len)
                if (_PyString_Resize(&result, copied))
                    return NULL;
            return result;
        }
    }
}
should look like this:
/* is there anything left in the rbuff from previous reads? */
if (self->rbuff_pos < self->rbuff_len) {
    /* if yes, process that first */
    while (self->rbuff_pos < self->rbuff_len) {
        buffer[copied++] = self->rbuff[self->rbuff_pos];
        if ((self->rbuff[self->rbuff_pos++] == '\n') ||
            (copied == len)) {
            /* our work is done */
            /* resize if necessary */
            if (copied < len)
                if (_PyString_Resize(&result, copied))
                    return NULL;
            /* new: free rbuff now if this read consumed the last of it */
            if (self->rbuff_pos >= self->rbuff_len && self->rbuff != NULL)
            {
                free(self->rbuff);
                self->rbuff = NULL;
            }
            return result;
        }
    }
}
That solves one. As I mentioned in one of the emails to the mailing
list, the buffer was not being freed on the last readline(): the early
return inside the while loop bypassed the frees further down in the function.
But not completely - see MODPYTHON-182.
The second one, for which I don't have a fix yet, is apache.make_table()
in mod_python/util.py, line 152. If I comment out lines 152, 225, and 227,
you will see that memory doesn't grow. I will keep investigating...
As will I ...
Jim
Until the next email.
/amn
Jim Gallacher wrote:
I ran my baseline test with 500k requests, and got the following:
(Note that all the figures will have an error of +/- 0.1)
baseline 500k requests 1.7%
So it would seem that there is not a specific problem in readline, or my
test case is messed up. FYI here are my 2 handlers:
def baseline_handler(req):
    req.content_type = 'text/plain'
    req.write('ok baseline:')
    return apache.OK

def readline_handler(req):
    # the body of the request consists of
    # '\n'.join([ 'a'*10 for i in xrange(0,10) ])
    req.content_type = 'text/plain'
    count = 0
    while 1:
        line = req.readline()
        if not line:
            break
        count += 1
    req.write('ok readline: %d lines read' % count)
    return apache.OK
Jim
Jim Gallacher wrote:
I'll have some time to investigate this over the next couple of days. I
ran my leaktest script for FieldStorage and readline, and FieldStorage
certainly still leaks, but I'm not so sure about readline itself.
baseline 1k requests 1.2%
readline 500k requests 1.6%
fieldstorage 498k requests 10.1%
The memory consumption figures are for a machine with 512MB ram.
I'm running my baseline test with 500k requests right now to see if the
1.6% figure for readline represents a real leak in that function, or if
it is just mod_python itself.
My memory leak test suite is probably at the point where other people
will find it useful. Once I've written a README explaining its use I'll
commit it to the repository so everybody can play. If anyone wants to
give it a shot in the interim I can email it to you. Give me a shout
offlist.
I haven't had a chance to look at the code you highlight below, or at
least not closely. The whole req_readline function looks like it will
require a good strong cup of coffee to fully comprehend. ;)
Jim
Alexis Marrero wrote:
Experimenting on this issue, I noticed that neither of the following
"if" blocks is ever entered:
786     /* Free old rbuff as the old contents have been copied over and
787        we are about to allocate a new rbuff. Perhaps this could be reused
788        somehow? */
789     if (self->rbuff_pos >= self->rbuff_len && self->rbuff != NULL)
790     {
791         free(self->rbuff);
792         self->rbuff = NULL;
793     }
--------
846     /* Free rbuff if we're done with it */
847     if (self->rbuff_pos >= self->rbuff_len && self->rbuff != NULL)
848     {
849         free(self->rbuff);
850         self->rbuff = NULL;
851     }
I noticed this by adding statements inside those blocks that write to
the output stream; they never execute.
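For example, something along these lines dropped into the suspect block
(the message text is just for illustration; stderr normally ends up in
Apache's error log):

if (self->rbuff_pos >= self->rbuff_len && self->rbuff != NULL)
{
    /* debug instrumentation: if this branch is ever taken, a line
     * shows up in the error log */
    fprintf(stderr, "req_readline: freeing rbuff (pos=%ld len=%ld)\n",
            (long)self->rbuff_pos, (long)self->rbuff_len);
    fflush(stderr);
    free(self->rbuff);
    self->rbuff = NULL;
}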
/amn
On Aug 10, 2006, at 1:43 PM, Alexis Marrero wrote:
All,
We are trying to nail down a memory leak that happens only when
documents are POSTed to the server.
For testing we have a short script that does:
while True:
    dictionary_of_parameters = {'field1': 'a'*100000}
    post('url...', dictionary_of_parameters)
Then we run "top" on the server and watch the server memory grow
without bound. How do we know that the problem is in
request.readline()? If I go to
mod_python.util.FieldStorage.read_to_boundary() and add the following
statement as the first executable line in the function, the memory does
not grow:

def read_to_boundary(...):
    return True
    ...
I have read req_readline a thousand times and I can't figure out where
the problem is.
My config:
Python 2.4.1
mod_python 3.2.10
Our request handler does nothing other than call
util.FieldStorage(req) and req.write('hello').
I have some suspicion that it has to do with:
....
19 * requestobject.c
20 *
21 * $Id: requestobject.c 420297 2006-07-09 13:53:06Z nlehuen $
22 *
23 */
....
846     /* Free rbuff if we're done with it */
847     if (self->rbuff_pos >= self->rbuff_len && self->rbuff != NULL)
848     {
849         free(self->rbuff);
850         self->rbuff = NULL;
851     }
852
Though, I can't confirm.
/amn