Re: Segfaults in ConnectionHander (Possible Solution)
Jim Gallacher wrote:
> Volodya wrote:
> > On Mon, Jan 30, 2006 at 09:40:39PM -0500, Graham Dumpleton wrote:
> > > Graham Dumpleton wrote ..
> > > > Extending the above code as:
> > > >
> > > >   Py_BEGIN_ALLOW_THREADS;
> > > >   rc = ap_get_brigade(c->input_filters, bb, mode, APR_BLOCK_READ,
> > > >                       bufsize);
> > > >   Py_END_ALLOW_THREADS;
> > > >
> > > >   if (! APR_STATUS_IS_SUCCESS(rc)) {
> > > >       PyErr_SetObject(PyExc_IOError,
> > > >                       PyString_FromString("Connection read error"));
> > > >       return NULL;
> > > >   }
> > > >
> > > >   /* Return empty string if no buckets. Can be caused by EAGAIN. */
> > > >   if (APR_BRIGADE_EMPTY(bb)) {
> > > >       return PyString_FromString("");
> > > >   }
> > > >
> > > > seems to fix the problem. I.e., use a call to APR_BRIGADE_EMPTY(bb) to check
> > > > whether any new buckets were added, and return an empty string if not.
> > >
> > > Okay, this may work, but the EAGAIN propagating back up as an empty
> > > string to Python can cause a tight loop to occur where calls are going
> > > out and back into Python code. This will occur until something is read
> > > or an error occurs.
> > >
> > > To avoid the back and forth, another option may be:
> > >
> > >   while (APR_BRIGADE_EMPTY(bb)) {
> > >       Py_BEGIN_ALLOW_THREADS;
> > >       rc = ap_get_brigade(c->input_filters, bb, mode, APR_BLOCK_READ,
> > >                           bufsize);
> > >       Py_END_ALLOW_THREADS;
> > >
> > >       if (! APR_STATUS_IS_SUCCESS(rc)) {
> > >           PyErr_SetObject(PyExc_IOError,
> > >                           PyString_FromString("Connection read error"));
> > >           return NULL;
> > >       }
> > >   }
> >
> > Graham, this code runs smoothly, i.e. no segfaults, all tests passed:
> >
> > FreeBSD 4.9:
>
> That's good news. I still wonder why we are seeing this problem in 3.2
> and 3.1.4 though.

And what I meant to say was "and *NOT* in 3.1.4".

Jim
Re: Segfaults in ConnectionHander (Possible Solution)
Volodya wrote:
> On Mon, Jan 30, 2006 at 09:40:39PM -0500, Graham Dumpleton wrote:
> > Graham Dumpleton wrote ..
> > > Extending the above code as:
> > >
> > >   Py_BEGIN_ALLOW_THREADS;
> > >   rc = ap_get_brigade(c->input_filters, bb, mode, APR_BLOCK_READ,
> > >                       bufsize);
> > >   Py_END_ALLOW_THREADS;
> > >
> > >   if (! APR_STATUS_IS_SUCCESS(rc)) {
> > >       PyErr_SetObject(PyExc_IOError,
> > >                       PyString_FromString("Connection read error"));
> > >       return NULL;
> > >   }
> > >
> > >   /* Return empty string if no buckets. Can be caused by EAGAIN. */
> > >   if (APR_BRIGADE_EMPTY(bb)) {
> > >       return PyString_FromString("");
> > >   }
> > >
> > > seems to fix the problem. I.e., use a call to APR_BRIGADE_EMPTY(bb) to check
> > > whether any new buckets were added, and return an empty string if not.
> >
> > Okay, this may work, but the EAGAIN propagating back up as an empty
> > string to Python can cause a tight loop to occur where calls are going
> > out and back into Python code. This will occur until something is read
> > or an error occurs.
> >
> > To avoid the back and forth, another option may be:
> >
> >   while (APR_BRIGADE_EMPTY(bb)) {
> >       Py_BEGIN_ALLOW_THREADS;
> >       rc = ap_get_brigade(c->input_filters, bb, mode, APR_BLOCK_READ,
> >                           bufsize);
> >       Py_END_ALLOW_THREADS;
> >
> >       if (! APR_STATUS_IS_SUCCESS(rc)) {
> >           PyErr_SetObject(PyExc_IOError,
> >                           PyString_FromString("Connection read error"));
> >           return NULL;
> >       }
> >   }
>
> Graham, this code runs smoothly, i.e. no segfaults, all tests passed:
>
> FreeBSD 4.9:

That's good news. I still wonder why we are seeing this problem in 3.2
and 3.1.4 though.

Jim
Re: Segfaults in ConnectionHander (Possible Solution)
On Mon, Jan 30, 2006 at 09:40:39PM -0500, Graham Dumpleton wrote:
> Graham Dumpleton wrote ..
> > Extending the above code as:
> >
> >   Py_BEGIN_ALLOW_THREADS;
> >   rc = ap_get_brigade(c->input_filters, bb, mode, APR_BLOCK_READ,
> >                       bufsize);
> >   Py_END_ALLOW_THREADS;
> >
> >   if (! APR_STATUS_IS_SUCCESS(rc)) {
> >       PyErr_SetObject(PyExc_IOError,
> >                       PyString_FromString("Connection read error"));
> >       return NULL;
> >   }
> >
> >   /* Return empty string if no buckets. Can be caused by EAGAIN. */
> >   if (APR_BRIGADE_EMPTY(bb)) {
> >       return PyString_FromString("");
> >   }
> >
> > seems to fix the problem. I.e., use a call to APR_BRIGADE_EMPTY(bb) to check
> > whether any new buckets were added, and return an empty string if not.
>
> Okay, this may work, but the EAGAIN propagating back up as an empty
> string to Python can cause a tight loop to occur where calls are going
> out and back into Python code. This will occur until something is read
> or an error occurs.
>
> To avoid the back and forth, another option may be:
>
>   while (APR_BRIGADE_EMPTY(bb)) {
>       Py_BEGIN_ALLOW_THREADS;
>       rc = ap_get_brigade(c->input_filters, bb, mode, APR_BLOCK_READ,
>                           bufsize);
>       Py_END_ALLOW_THREADS;
>
>       if (! APR_STATUS_IS_SUCCESS(rc)) {
>           PyErr_SetObject(PyExc_IOError,
>                           PyString_FromString("Connection read error"));
>           return NULL;
>       }
>   }

Graham, this code runs smoothly, i.e. no segfaults, all tests passed:

FreeBSD 4.9:
  Apache/2.0.50 (prefork) Python/2.3.4
  Apache/2.0.55 (prefork) Python/2.4.2

Thanks!
Re: Segfaults in ConnectionHander (Possible Solution)
Graham Dumpleton wrote:
> Graham Dumpleton wrote ..
> > Returning back up to _conn_read() in the mod_python source code, we have
> > where core_input_filter() was called via ap_get_brigade():
> >
> >   Py_BEGIN_ALLOW_THREADS;
> >   rc = ap_get_brigade(c->input_filters, bb, mode, APR_BLOCK_READ, bufsize);
> >   Py_END_ALLOW_THREADS;
> >
> >   if (! APR_STATUS_IS_SUCCESS(rc)) {
> >       PyErr_SetObject(PyExc_IOError,
> >                       PyString_FromString("Connection read error"));
> >       return NULL;
> >   }
> >
> > Since APR_SUCCESS was returned and assigned to "rc", no problem is detected.
> >
> > The code which follows then assumes that the first bucket in the bucket
> > brigade actually contains valid data, when in fact the first bucket is
> > actually crap as nothing was done to set up a valid bucket since EAGAIN
> > was returned. As a consequence it crashes.
> >
> > Thus in summary, _conn_read() doesn't cater in any way for the possibility
> > that the initial socket read may have failed because of EAGAIN and thus
> > the bucket is bogus. The problem is, how is it meant to know this if the
> > value APR_SUCCESS is returned by ap_get_brigade()?
>
> Extending the above code as:
>
>   Py_BEGIN_ALLOW_THREADS;
>   rc = ap_get_brigade(c->input_filters, bb, mode, APR_BLOCK_READ, bufsize);
>   Py_END_ALLOW_THREADS;
>
>   if (! APR_STATUS_IS_SUCCESS(rc)) {
>       PyErr_SetObject(PyExc_IOError,
>                       PyString_FromString("Connection read error"));
>       return NULL;
>   }
>
>   /* Return empty string if no buckets. Can be caused by EAGAIN. */
>   if (APR_BRIGADE_EMPTY(bb)) {
>       return PyString_FromString("");
>   }
>
> seems to fix the problem. I.e., use a call to APR_BRIGADE_EMPTY(bb) to check
> whether any new buckets were added, and return an empty string if not.
>
> Can someone else seeing this issue try this fix and see if the tests
> then work.

Note that APR_STATUS_IS_SUCCESS has been removed from apr 1.x, which is
one of the issues in getting mod_python to run in Apache 2.2. It looks
like we should just check if rc != 0.

This is according to the discussion here featuring Greg Stein and Ryan Bloom:
http://www.mail-archive.com/dev@httpd.apache.org/msg21757.html

I'll update MODPYTHON-78 regarding Apache 2.2 with details on this and on
apr_sockaddr_port_get, which has also been removed in apr 1.x.

Jim
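
A minimal sketch of the status check described above, for readers following along. It assumes stand-in definitions so that it compiles without the APR headers: in real code apr_status_t and APR_SUCCESS come from APR itself (where APR_SUCCESS is 0). The helper names read_failed() and check_status_helpers() are hypothetical and are not part of mod_python or APR.

```c
#include <assert.h>

/* Stand-ins so the sketch builds on its own (assumption: real code would
 * get these from the APR headers instead). */
typedef int apr_status_t;
#ifndef APR_SUCCESS
#define APR_SUCCESS 0
#endif

/* The thread notes this macro is gone in apr 1.x. Define a fallback only
 * when it is missing, so the same source builds against either version. */
#ifndef APR_STATUS_IS_SUCCESS
#define APR_STATUS_IS_SUCCESS(s) ((s) == APR_SUCCESS)
#endif

/* Hypothetical helper: 1 when the status should become an IOError,
 * 0 when the read call reported success. */
int read_failed(apr_status_t rc)
{
    return rc != APR_SUCCESS;
}

/* Self-check: the fallback macro and the direct comparison must agree. */
void check_status_helpers(void)
{
    assert(APR_STATUS_IS_SUCCESS(APR_SUCCESS));
    assert(!read_failed(APR_SUCCESS));
    assert(read_failed(APR_SUCCESS + 1));
}
```

Comparing directly against APR_SUCCESS, as suggested in the message, avoids depending on the macro at all; the fallback define is only one possible way to keep older call sites compiling.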
Re: Segfaults in ConnectionHander
Jim Gallacher wrote:
> Graham Dumpleton wrote:
> > What I might speculate is that if the test in mod_python for the
> > connection handler is setup to run on a secondary listener port, but
> > with the primary still active, that it may trigger the problem on
> > other systems like Linux. Jim, you might want to try this and see if
> > you can duplicate it on Linux.
>
> I'll try it tonight.

Graham,

I am not able to reproduce the problem using the configuration and
example code you give in MODPYTHON-102. (Linux Debian 2.6.12-1-k7 kernel).

Jim
Re: Segfaults in ConnectionHander (Possible Solution)
Graham Dumpleton wrote ..
> Extending the above code as:
>
>   Py_BEGIN_ALLOW_THREADS;
>   rc = ap_get_brigade(c->input_filters, bb, mode, APR_BLOCK_READ, bufsize);
>   Py_END_ALLOW_THREADS;
>
>   if (! APR_STATUS_IS_SUCCESS(rc)) {
>       PyErr_SetObject(PyExc_IOError,
>                       PyString_FromString("Connection read error"));
>       return NULL;
>   }
>
>   /* Return empty string if no buckets. Can be caused by EAGAIN. */
>   if (APR_BRIGADE_EMPTY(bb)) {
>       return PyString_FromString("");
>   }
>
> seems to fix the problem. I.e., use a call to APR_BRIGADE_EMPTY(bb) to check
> whether any new buckets were added, and return an empty string if not.

Okay, this may work, but the EAGAIN propagating back up as an empty
string to Python can cause a tight loop to occur where calls are going
out and back into Python code. This will occur until something is read
or an error occurs.

To avoid the back and forth, another option may be:

  while (APR_BRIGADE_EMPTY(bb)) {
      Py_BEGIN_ALLOW_THREADS;
      rc = ap_get_brigade(c->input_filters, bb, mode, APR_BLOCK_READ, bufsize);
      Py_END_ALLOW_THREADS;

      if (! APR_STATUS_IS_SUCCESS(rc)) {
          PyErr_SetObject(PyExc_IOError,
                          PyString_FromString("Connection read error"));
          return NULL;
      }
  }

What doesn't make sense to me is that on my Mac OS X box, where this
problem only occurs when you have two listener ports, it tight loops with
the lowest level read always returning EAGAIN, even when you have already
read some input from the connection. I.e., it doesn't block at all. Thus
something really bad is happening on Mac OS X. Unless Apache is setting
some strange ioctl options on the socket to inadvertently cause this, it
looks to me like Mac OS X is broken in some way.

I am still on Mac OS X (10.3). I'll have to try it on my 10.4 box and see
if it makes any difference.

Graham
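
The retry loop above can be modelled without Apache to see how it behaves when EAGAIN keeps coming back. The following is only a sketch under stated assumptions: stub_socket, stub_brigade and stub_get_brigade() are hypothetical stand-ins for the real connection, bucket brigade and ap_get_brigade(), and the max_tries cap is an addition for illustration that is not in the proposed patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the APR/Apache types. */
typedef struct { size_t nbuckets; } stub_brigade;
typedef struct { int eagain_left; int calls; } stub_socket;

/* Models the behaviour reported in the thread: while the socket would
 * block, the call still returns 0 (success) but adds no bucket. */
int stub_get_brigade(stub_socket *s, stub_brigade *bb)
{
    assert(s != NULL && bb != NULL);
    s->calls++;
    if (s->eagain_left > 0) {
        s->eagain_left--;
        return 0;               /* EAGAIN swallowed, brigade unchanged */
    }
    bb->nbuckets = 1;           /* data arrived */
    return 0;
}

/* Retry on the C side instead of bouncing an empty string back to the
 * caller. Returns 1 when data arrived, 0 when the cap was reached, and
 * -1 on a read error. */
int read_with_retry(stub_socket *s, stub_brigade *bb, int max_tries)
{
    int tries = 0;

    while (bb->nbuckets == 0) {
        if (tries++ >= max_tries)
            return 0;           /* give up rather than spin forever */
        if (stub_get_brigade(s, bb) != 0)
            return -1;
    }
    return 1;
}
```

With a socket that never stops returning EAGAIN, as described for the Mac OS X box, an uncapped loop would spin; the cap only bounds the spinning and does not explain why the read fails to block.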
Re: Segfaults in ConnectionHander (Possible Solution)
Graham Dumpleton wrote ..
> Returning back up to _conn_read() in the mod_python source code, we have
> where core_input_filter() was called via ap_get_brigade():
>
>   Py_BEGIN_ALLOW_THREADS;
>   rc = ap_get_brigade(c->input_filters, bb, mode, APR_BLOCK_READ, bufsize);
>   Py_END_ALLOW_THREADS;
>
>   if (! APR_STATUS_IS_SUCCESS(rc)) {
>       PyErr_SetObject(PyExc_IOError,
>                       PyString_FromString("Connection read error"));
>       return NULL;
>   }
>
> Since APR_SUCCESS was returned and assigned to "rc", no problem is detected.
>
> The code which follows then assumes that the first bucket in the bucket
> brigade actually contains valid data, when in fact the first bucket is
> actually crap as nothing was done to set up a valid bucket since EAGAIN
> was returned. As a consequence it crashes.
>
> Thus in summary, _conn_read() doesn't cater in any way for the possibility
> that the initial socket read may have failed because of EAGAIN and thus
> the bucket is bogus. The problem is, how is it meant to know this if the
> value APR_SUCCESS is returned by ap_get_brigade()?

Extending the above code as:

  Py_BEGIN_ALLOW_THREADS;
  rc = ap_get_brigade(c->input_filters, bb, mode, APR_BLOCK_READ, bufsize);
  Py_END_ALLOW_THREADS;

  if (! APR_STATUS_IS_SUCCESS(rc)) {
      PyErr_SetObject(PyExc_IOError,
                      PyString_FromString("Connection read error"));
      return NULL;
  }

  /* Return empty string if no buckets. Can be caused by EAGAIN. */
  if (APR_BRIGADE_EMPTY(bb)) {
      return PyString_FromString("");
  }

seems to fix the problem. I.e., use a call to APR_BRIGADE_EMPTY(bb) to check
whether any new buckets were added, and return an empty string if not.

Can someone else seeing this issue try this fix and see if the tests
then work.

Graham
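
To make the control flow of the fix easier to follow outside of Apache, here is a small stand-alone model. Everything in it is hypothetical: fake_brigade, fake_get_brigade() and fake_conn_read() only imitate the behaviour described in this thread (success reported with no bucket added when the socket read hits EAGAIN) and are not the real APR or mod_python functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a bucket brigade. */
typedef struct { size_t nbuckets; const char *data; } fake_brigade;

/* Models ap_get_brigade() as described above: on EAGAIN it reports
 * success (0) but leaves the brigade empty. */
int fake_get_brigade(fake_brigade *bb, int socket_would_block)
{
    assert(bb != NULL);
    if (socket_would_block)
        return 0;               /* EAGAIN swallowed by the filter */
    bb->nbuckets = 1;
    bb->data = "hello";
    return 0;
}

/* Models the patched _conn_read(): NULL on error, "" when no bucket was
 * added, otherwise the data from the first bucket. */
const char *fake_conn_read(int socket_would_block)
{
    fake_brigade bb = { 0, NULL };

    if (fake_get_brigade(&bb, socket_would_block) != 0)
        return NULL;            /* would raise IOError in mod_python */

    /* The added check: never touch the first bucket of an empty brigade. */
    if (bb.nbuckets == 0)
        return "";

    return bb.data;
}
```

The point of the model is that the status code alone cannot distinguish "data read" from "nothing read", so the emptiness check has to happen before the first bucket is dereferenced.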
Re: Segfaults in ConnectionHander
This may be a good question to post to dev@httpd.apache.org

Grisha

On Mon, 30 Jan 2006, Graham Dumpleton wrote:

> Getting a bit closer now, have next part of puzzle worked out.
>
> Graham Dumpleton wrote ..
> > This is starting to look really ugly.
> >
> > In _conn_read(), it first creates a bucket brigade from the connection
> > object's pool object. No chance of this being destroyed prematurely
> > as a result.
> >
> >   bb = apr_brigade_create(c->pool, c->bucket_alloc);
> >
> > From what I understand, it then makes a call which links the bucket
> > brigade to the actual source of data.
> >
> >   rc = ap_get_brigade(c->input_filters, bb, mode, APR_BLOCK_READ, bufsize);
> >
> > Under normal circumstances this would also have the side effect of
> > performing the first actual read of data off the socket connection which
> > the client created to Apache.
>
> When ap_get_brigade() is called, it is actually calling through to the
> function core_input_filter() in Apache (server/core.c). In that function,
> it ultimately hits the code:
>
>   e = APR_BRIGADE_FIRST(ctx->b);
>   rv = apr_bucket_read(e, &str, &len, block);
>
>   if (APR_STATUS_IS_EAGAIN(rv)) {
>       return APR_SUCCESS;
>   }
>
> Tracking down into apr_bucket_read(), it ends up calling the function
> socket_bucket_read() containing the code:
>
>   *str = NULL;
>   *len = APR_BUCKET_BUFF_SIZE;
>   buf = apr_bucket_alloc(*len, a->list); /* XXX: check for failure? */
>
>   rv = apr_socket_recv(p, buf, len);
>
>   if (block == APR_NONBLOCK_READ) {
>       apr_socket_timeout_set(p, timeout);
>   }
>
>   if (rv != APR_SUCCESS && rv != APR_EOF) {
>       apr_bucket_free(buf);
>       return rv;
>   }
>
> The apr_socket_recv() is what is doing the initial read of data from the
> socket connection. This should block until the first data is received.
> What is happening though is that it is returning -1 with errno set to
> EAGAIN. Thus it frees the temporary bucket it created and returns EAGAIN
> as the result.
>
> If you note the code in the core_input_filter() it has:
>
>   if (APR_STATUS_IS_EAGAIN(rv)) {
>       return APR_SUCCESS;
>   }
>
> Thus, when EAGAIN is encountered, it simply returns success and does not
> do anything else.
>
> Returning back up to _conn_read() in the mod_python source code, we have
> where core_input_filter() was called via ap_get_brigade():
>
>   Py_BEGIN_ALLOW_THREADS;
>   rc = ap_get_brigade(c->input_filters, bb, mode, APR_BLOCK_READ, bufsize);
>   Py_END_ALLOW_THREADS;
>
>   if (! APR_STATUS_IS_SUCCESS(rc)) {
>       PyErr_SetObject(PyExc_IOError,
>                       PyString_FromString("Connection read error"));
>       return NULL;
>   }
>
> Since APR_SUCCESS was returned and assigned to "rc", no problem is detected.
>
> The code which follows then assumes that the first bucket in the bucket
> brigade actually contains valid data, when in fact the first bucket is
> actually crap as nothing was done to set up a valid bucket since EAGAIN
> was returned. As a consequence it crashes.
>
> Thus in summary, _conn_read() doesn't cater in any way for the possibility
> that the initial socket read may have failed because of EAGAIN and thus
> the bucket is bogus. The problem is, how is it meant to know this if the
> value APR_SUCCESS is returned by ap_get_brigade()?
>
> At this point, it seems a bit of research is needed into other examples of
> connection handlers for Apache to see how they handle the initial
> startup sequence and processing of initial data. What is in mod_python
> now does not appear to be reliable in the face of an EAGAIN error
> occurring.
>
> Graham
Re: Segfaults in ConnectionHander
Getting a bit closer now, have next part of puzzle worked out.

Graham Dumpleton wrote ..
> This is starting to look really ugly.
>
> In _conn_read(), it first creates a bucket brigade from the connection
> object's pool object. No chance of this being destroyed prematurely
> as a result.
>
>   bb = apr_brigade_create(c->pool, c->bucket_alloc);
>
> From what I understand, it then makes a call which links the bucket
> brigade to the actual source of data.
>
>   rc = ap_get_brigade(c->input_filters, bb, mode, APR_BLOCK_READ, bufsize);
>
> Under normal circumstances this would also have the side effect of
> performing the first actual read of data off the socket connection which
> the client created to Apache.

When ap_get_brigade() is called, it is actually calling through to the
function core_input_filter() in Apache (server/core.c). In that function,
it ultimately hits the code:

  e = APR_BRIGADE_FIRST(ctx->b);
  rv = apr_bucket_read(e, &str, &len, block);

  if (APR_STATUS_IS_EAGAIN(rv)) {
      return APR_SUCCESS;
  }

Tracking down into apr_bucket_read(), it ends up calling the function
socket_bucket_read() containing the code:

  *str = NULL;
  *len = APR_BUCKET_BUFF_SIZE;
  buf = apr_bucket_alloc(*len, a->list); /* XXX: check for failure? */

  rv = apr_socket_recv(p, buf, len);

  if (block == APR_NONBLOCK_READ) {
      apr_socket_timeout_set(p, timeout);
  }

  if (rv != APR_SUCCESS && rv != APR_EOF) {
      apr_bucket_free(buf);
      return rv;
  }

The apr_socket_recv() is what is doing the initial read of data from the
socket connection. This should block until the first data is received.
What is happening though is that it is returning -1 with errno set to
EAGAIN. Thus it frees the temporary bucket it created and returns EAGAIN
as the result.

If you note the code in the core_input_filter() it has:

  if (APR_STATUS_IS_EAGAIN(rv)) {
      return APR_SUCCESS;
  }

Thus, when EAGAIN is encountered, it simply returns success and does not
do anything else.

Returning back up to _conn_read() in the mod_python source code, we have
where core_input_filter() was called via ap_get_brigade():

  Py_BEGIN_ALLOW_THREADS;
  rc = ap_get_brigade(c->input_filters, bb, mode, APR_BLOCK_READ, bufsize);
  Py_END_ALLOW_THREADS;

  if (! APR_STATUS_IS_SUCCESS(rc)) {
      PyErr_SetObject(PyExc_IOError,
                      PyString_FromString("Connection read error"));
      return NULL;
  }

Since APR_SUCCESS was returned and assigned to "rc", no problem is detected.

The code which follows then assumes that the first bucket in the bucket
brigade actually contains valid data, when in fact the first bucket is
actually crap as nothing was done to set up a valid bucket since EAGAIN
was returned. As a consequence it crashes.

Thus in summary, _conn_read() doesn't cater in any way for the possibility
that the initial socket read may have failed because of EAGAIN and thus
the bucket is bogus. The problem is, how is it meant to know this if the
value APR_SUCCESS is returned by ap_get_brigade()?

At this point, it seems a bit of research is needed into other examples of
connection handlers for Apache to see how they handle the initial
startup sequence and processing of initial data. What is in mod_python
now does not appear to be reliable in the face of an EAGAIN error
occurring.

Graham
Re: Segfaults in ConnectionHander
Graham Dumpleton wrote:
> What I might speculate is that if the test in mod_python for the
> connection handler is setup to run on a secondary listener port, but
> with the primary still active, that it may trigger the problem on
> other systems like Linux. Jim, you might want to try this and see if
> you can duplicate it on Linux.

I'll try it tonight.

Jim
Re: Segfaults in ConnectionHander FreeBSD (was Re: 3.2.6 test period - how long do we wait?)
David Fraser wrote:
> Jim Gallacher wrote:
> > Barry Pederson wrote:
> > > I think this is the general kind of thing we're looking for though,
> > > with some mistaken pointer/memory operation. Too bad we can't write
> > > *everything* in python. :(
>
> You haven't been following PyPy then? :-)
>
> David

Well, sure, but I think porting will have to wait until at least
mod_python 3.4. :)

Jim
Re: Segfaults in ConnectionHander FreeBSD (was Re: 3.2.6 test period - how long do we wait?)
Jim Gallacher wrote:
> Barry Pederson wrote:
> > I think this is the general kind of thing we're looking for though,
> > with some mistaken pointer/memory operation. Too bad we can't write
> > *everything* in python. :(

You haven't been following PyPy then? :-)

David
Re: Segfaults in ConnectionHander
Changed subject heading. See more of what I have uncovered below. Not
sure where to go next.

Graham Dumpleton wrote ..
> > > Unlike suggestions by someone else that "self" seemed to be getting
> > > corrupted, it looks fine to me, and code simply crashed down in:
> > >
> > >   apr_bucket_read(b, &data, &size, APR_BLOCK_READ)
> > >
> > > on very first call to it. Thus need to start tracking into Apache
> > > itself and see what there may be about bucket structures that isn't
> > > correct. This is where I got to last time before I gave up, feeling
> > > it wasn't worth the effort at the time. I'll try and build a version
> > > of Apache with debug so I can get a better stack trace.
> >
> > The first thing I'd check is for validity of b. Buckets use reference
> > counting much like Python, so sometimes it's possible for a bucket to
> > "self-destruct".
>
> Starting to delve into the bucket now. Haven't looked at reference count
> stuff yet, but the b->type object seems to be bogus. This is where the
> read() function pointer is kept and since it is a bad value it is why it
> dies.

This is starting to look really ugly.

In _conn_read(), it first creates a bucket brigade from the connection
object's pool object. No chance of this being destroyed prematurely
as a result.

  bb = apr_brigade_create(c->pool, c->bucket_alloc);

From what I understand, it then makes a call which links the bucket
brigade to the actual source of data.

  rc = ap_get_brigade(c->input_filters, bb, mode, APR_BLOCK_READ, bufsize);

Under normal circumstances this would also have the side effect of
performing the first actual read of data off the socket connection which
the client created to Apache.

An important thing to note here is the value of:

  c->input_filters->frec->filter_func.in_func

going into the call. Not sure exactly, but I imagine that this is the first
input filter which handles reading from the socket. My logging shows the
address of the input filter in memory as 178456.

When ap_get_brigade() returns okay, the first actual bucket from the
bucket brigade is obtained:

  b = APR_BRIGADE_FIRST(bb);

There are two interesting values in the bucket worth looking at:

  b->type->name
  b->type->read

The first is the type of bucket object and the second is the pointer to
a function to read data from the bucket. My logging shows the type of
bucket as being "HEAP" and the address of the read function pointer as
1819356.

I will not go into the rest of the function except to say that as necessary
it may do additional reads using apr_bucket_read() to get more data if
required when that initially read by ap_get_brigade() isn't enough.

Anyway, the above is when it is working okay. This being when I have the
connection handler attached to my primary listener port.

As soon as I add into the main Apache configuration file an additional
socket for Apache to listen on, i.e., when I add:

  Listen 8081

it will crash in _conn_read() no matter whether I have attached the
connection handler to the primary listener port or the additional listener
port.

In contrast to the above, when it dies, the address of the input filter in
memory is still 178456, but the initial bucket in the bucket brigade as
populated by ap_get_brigade() is bogus. I.e., I get for the name crap like:

  \x01\x80b\x18\x01\x8f\xec\x18\x01\x83b\x18\x01\x80b\x1c\x01\x8f\xcc\xb8

and the address of the read function is 88.

Importantly, the ap_get_brigade() function does not block on a read
waiting for the first data coming over the socket like it did before.

With the bogus bucket returned, when apr_bucket_read() is later called,
it tries to use the read function in the initial bucket which, being bogus,
causes the crash.

Thus in summary, with a secondary listener port the ap_get_brigade()
function doesn't block on read waiting for first data, returning
immediately, but still seeming to return success. The initial bucket in the
bucket brigade then seems to be bogus.

What I might speculate is that if the test in mod_python for the connection
handler is set up to run on a secondary listener port, but with the primary
still active, that it may trigger the problem on other systems like Linux.
Jim, you might want to try this and see if you can duplicate it on Linux.

BTW, I am not saying this is the same problem on the BSD systems, but it
certainly is not correct either way.

Graham
Re: Segfaults in ConnectionHander FreeBSD (was Re: 3.2.6 test period - how long do we wait?)
Jim Gallacher wrote:
> Dang, it's frustrating not being able to reproduce this bug in Linux.

I suppose it's maybe something to do with different malloc implementations
or such. I haven't seen any +1s for OpenBSD, which would be interesting
to see since they added some stuff in 3.8 to help catch problems with this
sort of thing:

http://kerneltrap.org/node/5584

Anyone been able to use valgrind or similar with mod_python? I Googled and
found a couple old messages from '02 and '04 mentioning attempts to use
this, but doesn't sound like much came out of it. I think there's a
valgrind port on FreeBSD, so I may give that a try.

Barry
Re: Segfaults in ConnectionHander FreeBSD (was Re: 3.2.6 test period - how long do we wait?)
Barry Pederson wrote:
> > I don't know if this is the answer to the problem, but it looks like a
> > bug anyway. In connobject.c starting at line 133:
> >
> >   /* time to grow destination string? */
> >   if (len == 0 && bytes_read == bufsize) {
> >       _PyString_Resize(&result, bufsize + HUGE_STRING_LEN);
> >       buffer = PyString_AS_STRING((PyStringObject *) result);
> >       buffer += HUGE_STRING_LEN;
> >       bufsize += HUGE_STRING_LEN;
> >   }
> >
> > It looks like we've just set the buffer pointer to an address somewhere
> > inside the buffer. That can't be good. The buffer pointer should be set
> > to the bytes_read position. Perhaps one of you FreeBSD heads could try
> > the attached patch.
> >
> > Jim
> >
> > Index: src/connobject.c
> > ===
> > --- src/connobject.c (revision 369511)
> > +++ src/connobject.c (working copy)
> > @@ -135,7 +135,7 @@
> >       _PyString_Resize(&result, bufsize + HUGE_STRING_LEN);
> >       buffer = PyString_AS_STRING((PyStringObject *) result);
> > -     buffer += HUGE_STRING_LEN;
> > +     buffer += bytes_read;
> >       bufsize += HUGE_STRING_LEN;
> >   }
>
> Sorry, that doesn't seem to fix it. I did a fresh extraction of
> mod_python-3.2.6.tgz, applied the patch, did ./configure, make, su,
> make install, exit su, cd test, ran test.py - got the same result as
> before, with the same core dump apparently.

I really didn't think it would help since a buffer of HUGE_STRING_LEN
(8192) should have been created in the first place. The unit test
wouldn't be reading that many bytes, so I doubt the buffer is getting
resized. All the same it still looks like a bug.

> I think this is the general kind of thing we're looking for though, with
> some mistaken pointer/memory operation. Too bad we can't write
> *everything* in python. :(
>
> ---
>
> As I mentioned in another message, I did some experimenting with
> disabling other unittests and found if you disable just
> "test_fileupload", all the remaining tests including
> "test_connectionhandler" pass.

I hadn't forgotten. I'm just trying to understand what might be going on
in the code and I spotted the bug.

> If you disable everything except "test_fileupload" and
> "test_connectionhandler", then "test_connectionhandler" still crashes.

Dang, it's frustrating not being able to reproduce this bug in Linux.

> So I suspect that it's code involved with running "test_fileupload"
> (Testing 1 MB file upload support) that's really the source of the
> problem, and it's screwing up some part of memory that's only tripped
> over later during the connectionhandler test.

One of the things I'm trying to understand is what has changed since 3.1
that is causing this bug. The only thing different in connobject.c is a
fix to actually return local_ip and local_host (was returning remote_ip
and remote_host), plus makeipaddr and makesockaddr now call the
equivalent apr functions rather than the prior roll-our-own approach.
Neither of these changes should impact _conn_read().

Jim
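
The pointer issue behind the patch discussed above can be illustrated in isolation. This is only a sketch with a hypothetical growbuf type and plain realloc(); it is not the mod_python code and does not use the Python string API. It shows the general rule the patch is aiming for: after growing a buffer, the write position is recomputed from the new base address plus the number of bytes already stored.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable byte buffer. */
typedef struct {
    char  *base;   /* start of the allocation (may move on growth) */
    size_t size;   /* bytes allocated */
    size_t used;   /* bytes written so far */
} growbuf;

/* Append n bytes from src, growing the buffer by `step` bytes at a time.
 * Returns 0 on success, -1 if the allocation fails. */
int growbuf_append(growbuf *g, const char *src, size_t n, size_t step)
{
    assert(g != NULL && src != NULL && step > 0);

    while (g->used + n > g->size) {
        char *p = realloc(g->base, g->size + step);
        if (p == NULL)
            return -1;
        g->base = p;            /* the block may have moved */
        g->size += step;
    }

    /* Write position = current base + bytes already stored, never a
     * pointer saved from before the resize and never a fixed offset. */
    memcpy(g->base + g->used, src, n);
    g->used += n;
    return 0;
}
```

A caller starts from an empty growbuf (base NULL, size 0, used 0) and frees base when finished.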