mod_python as a mod_dav backend
Hi,

I'm not sure whether this is best posted here or to the mod_dav mailing list, but here goes.

Has anyone looked at using mod_python as a backend for mod_dav, with usage similar to FUSE's Python binding? Basically, mod_dav_python.

Thanks,
Matt

--
Matt Carpenter [EMAIL PROTECTED]
FCP Internet LTD
Unit 3, 52 Victoria Road, Aldershot, Hampshire, GU11 1SS
tel: +44 (0) 1252 333 344
fax: +44 (0) 1252 333 348
efax: +44 (0) 8704 281 008

This message is confidential; Any unauthorised disclosure, use or dissemination, either whole or partial, is prohibited. If you are not the intended recipient, please notify the sender immediately and delete all copies of this message. Any views or opinions presented are solely those of the author and do not necessarily represent those of FCP Internet or its subsidiaries.
Re: mod_python as a mod_dav backend
Graham Dumpleton wrote:

Others may know what you are talking about, but I plead ignorance. Can you perhaps describe further what you are talking about, how it would be used etc. A URL to stuff that could be read to understand similar things would also help.

Graham

What I am trying to achieve:

I'm writing a module for our system for managing documents that are attached to records in our database. Updates to these documents are recorded in the database along with which user made the edit, plus various other information depending on the type of document (the system manages templates, and mail merges with virtual .csv files that pull data from the database as well).

The directory structure is entirely virtual. The structure on the servers is just a few directories, one for each type of file, and the files are named after their record in the database.

mod_dav implements hooks (see http://mailman.lyra.org/pipermail/dav-dev/2005-April/005926.html), but I'm not a C programmer, so I'd like these hooks to be able to call Python functions instead.

Hope that makes sense.

Thanks,
Matt
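To make the request above more concrete, here is a minimal sketch of the kind of Python object such hooks might call into. Everything in it is hypothetical: no mod_dav_python binding exists in this thread, and the names DocumentRecord, VirtualDocumentStore, resolve and list_collection, as well as the storage layout, are assumptions made only for illustration. It shows the one idea Matt describes: clients see a virtual tree, while files on disk live in one directory per document type and are named after their database record.

```python
import posixpath
from dataclasses import dataclass


@dataclass(frozen=True)
class DocumentRecord:
    """Hypothetical database row describing one managed document."""
    record_id: int
    kind: str           # document type, used as the on-disk directory name
    virtual_path: str   # path a WebDAV client would see


class VirtualDocumentStore:
    """Hypothetical backend mapping virtual paths to record-named files."""

    def __init__(self, records, storage_root):
        self._by_path = {posixpath.normpath(r.virtual_path): r for r in records}
        self._storage_root = storage_root

    def resolve(self, virtual_path):
        """Return the on-disk path for a virtual path, or None if unknown."""
        record = self._by_path.get(posixpath.normpath(virtual_path))
        if record is None:
            return None
        return posixpath.join(self._storage_root, record.kind, str(record.record_id))

    def list_collection(self, virtual_dir):
        """Return the sorted names of the immediate children of a virtual directory."""
        prefix = posixpath.normpath(virtual_dir).rstrip("/") + "/"
        children = set()
        for path in self._by_path:
            if path.startswith(prefix):
                # Keep only the first path component below the directory.
                children.add(path[len(prefix):].split("/", 1)[0])
        return sorted(children)
```

A C-level hook would then only need to translate a DAV request into calls such as resolve() or list_collection(); how that translation would be wired into mod_dav is left open here.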
Re: Segfaults in ConnectionHander FreeBSD (was Re: 3.2.6 test period - how long do we wait?)
Jim Gallacher wrote:

Barry Pederson wrote:

I think this is the general kind of thing we're looking for though, with some mistaken pointer/memory operation.

Too bad we can't write *everything* in python. :(

You haven't been following PyPy then? :-)

David
Re: 3.2.6 test period - how long do we wait?
Gregory (Grisha) Trubetskoy wrote:

On Sun, 29 Jan 2006, Graham Dumpleton wrote:

    buffer += bufsize;

On a second thought - yes, you're right :-)

And if he's not then there is a bug in filter_read since that is what it does and it is very similar to _conn_read.

Jim
Re: mod_python as a mod_dav backend
On 30/01/2006, at 9:11 PM, Matt Carpenter wrote:

Hi, Not sure if this is best posted here, or to mod_dav mailing list. But here goes. Has anyone looked at using mod_python to backend mod_dav, with a similar usage to FUSE's python binding. Basically mod_dav_python.

Others may know what you are talking about, but I plead ignorance. Can you perhaps describe further what you are talking about, how it would be used etc. A URL to stuff that could be read to understand similar things would also help.

Graham
Re: Segfaults in ConnectionHander
This may be a good question to post to dev@httpd.apache.org

Grisha

On Mon, 30 Jan 2006, Graham Dumpleton wrote:

Getting a bit closer now, have next part of puzzle worked out.

Graham Dumpleton wrote ..

This is starting to look really ugly.

In _conn_read(), it first creates a bucket brigade from the connection object's pool. There is no chance of this being destroyed prematurely as a result.

    bb = apr_brigade_create(c->pool, c->bucket_alloc);

From what I understand, it then makes a call which links the bucket brigade to the actual source of data.

    rc = ap_get_brigade(c->input_filters, bb, mode, APR_BLOCK_READ, bufsize);

Under normal circumstances this would also have the side effect of performing the first actual read of data off the socket connection which the client created to Apache.

When ap_get_brigade() is called, it is actually calling through to the function core_input_filter() in Apache (server/core.c). In that function, it ultimately hits the code:

    e = APR_BRIGADE_FIRST(ctx->b);
    rv = apr_bucket_read(e, &str, &len, block);

    if (APR_STATUS_IS_EAGAIN(rv)) {
        return APR_SUCCESS;
    }

Tracking down into apr_bucket_read(), it ends up calling the function socket_bucket_read(), containing the code:

    *str = NULL;
    *len = APR_BUCKET_BUFF_SIZE;
    buf = apr_bucket_alloc(*len, a->list); /* XXX: check for failure? */

    rv = apr_socket_recv(p, buf, len);

    if (block == APR_NONBLOCK_READ) {
        apr_socket_timeout_set(p, timeout);
    }

    if (rv != APR_SUCCESS && rv != APR_EOF) {
        apr_bucket_free(buf);
        return rv;
    }

The apr_socket_recv() is what is doing the initial read of data from the socket connection. This should block until the first data is received. What is happening though is that it is returning -1 with errno set to EAGAIN. Thus it frees the temporary buffer it allocated and returns EAGAIN as the result.

If you note the code in core_input_filter(), it has:

    if (APR_STATUS_IS_EAGAIN(rv)) {
        return APR_SUCCESS;
    }

Thus, when EAGAIN is encountered, it simply returns success and does not do anything else.
Returning back up to _conn_read() in the mod_python source code, we have the point where core_input_filter() was reached through ap_get_brigade():

    Py_BEGIN_ALLOW_THREADS;
    rc = ap_get_brigade(c->input_filters, bb, mode, APR_BLOCK_READ, bufsize);
    Py_END_ALLOW_THREADS;

    if (! APR_STATUS_IS_SUCCESS(rc)) {
        PyErr_SetObject(PyExc_IOError,
                        PyString_FromString("Connection read error"));
        return NULL;
    }

Since APR_SUCCESS was returned and assigned to rc, no problem is detected. The code which follows then assumes that the first bucket in the bucket brigade actually contains valid data, when in fact the first bucket is actually crap as nothing was done to set up a valid bucket since EAGAIN was returned. As a consequence it crashes.

Thus in summary, _conn_read() doesn't cater in any way for the possibility that the initial socket read may have failed because of EAGAIN and thus the bucket is bogus. The problem is, how is it meant to know this if the value APR_SUCCESS is returned by ap_get_brigade()?

At this point, it seems a bit of research is needed into other examples of connection handlers for Apache to see how they handle the initial startup sequence and processing of initial data. What is in mod_python now does not appear to be reliable in the face of an EAGAIN error occurring.

Graham
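The failure mode described above can be modelled in a few lines of Python. This is only a toy sketch, not the Apache or mod_python code: get_brigade(), naive_read() and demo() are made-up names, a plain list stands in for the bucket brigade, and a non-blocking socket pair is assumed as a way to provoke EAGAIN on demand. It shows a lower layer that swallows EAGAIN and reports success, and a caller that trusts the status and reads a first bucket that was never added.

```python
import socket

SUCCESS = 0  # stand-in for APR_SUCCESS


def get_brigade(sock, brigade, bufsize):
    """Toy stand-in for ap_get_brigade(): append data to brigade, return a status."""
    try:
        data = sock.recv(bufsize)
    except BlockingIOError:
        # EAGAIN/EWOULDBLOCK is swallowed and reported as success,
        # so no bucket is added to the brigade.
        return SUCCESS
    brigade.append(data)
    return SUCCESS


def naive_read(sock, bufsize=4096):
    """Mirrors the logic criticised above: check the status, then assume a bucket."""
    brigade = []
    if get_brigade(sock, brigade, bufsize) != SUCCESS:
        raise IOError("Connection read error")
    return brigade[0]  # fails if the brigade is still empty


def demo():
    """Read from a non-blocking socket that has no data pending."""
    reader, writer = socket.socketpair()
    try:
        reader.setblocking(False)
        try:
            naive_read(reader)
        except IndexError:
            return "empty brigade"
        return "data"
    finally:
        reader.close()
        writer.close()
```

In Python the mistake surfaces as an IndexError; in C the equivalent access to a bucket that was never set up is what leads to the segfault.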
Re: Segfaults in ConnectionHander (Possible Solution)
Graham Dumpleton wrote ..

Returning back up to _conn_read() in the mod_python source code, we have the point where core_input_filter() was reached through ap_get_brigade():

    Py_BEGIN_ALLOW_THREADS;
    rc = ap_get_brigade(c->input_filters, bb, mode, APR_BLOCK_READ, bufsize);
    Py_END_ALLOW_THREADS;

    if (! APR_STATUS_IS_SUCCESS(rc)) {
        PyErr_SetObject(PyExc_IOError,
                        PyString_FromString("Connection read error"));
        return NULL;
    }

Since APR_SUCCESS was returned and assigned to rc, no problem is detected. The code which follows then assumes that the first bucket in the bucket brigade actually contains valid data, when in fact the first bucket is actually crap as nothing was done to set up a valid bucket since EAGAIN was returned. As a consequence it crashes.

Thus in summary, _conn_read() doesn't cater in any way for the possibility that the initial socket read may have failed because of EAGAIN and thus the bucket is bogus. The problem is, how is it meant to know this if the value APR_SUCCESS is returned by ap_get_brigade()?

Extending the above code as:

    Py_BEGIN_ALLOW_THREADS;
    rc = ap_get_brigade(c->input_filters, bb, mode, APR_BLOCK_READ, bufsize);
    Py_END_ALLOW_THREADS;

    if (! APR_STATUS_IS_SUCCESS(rc)) {
        PyErr_SetObject(PyExc_IOError,
                        PyString_FromString("Connection read error"));
        return NULL;
    }

    /* Return empty string if no buckets. Can be caused by EAGAIN. */
    if (APR_BRIGADE_EMPTY(bb)) {
        return PyString_FromString("");
    }

seems to fix the problem. I.e., use a call to APR_BRIGADE_EMPTY(bb) to check whether any new buckets were added, and return an empty string if not.

Can someone else seeing this issue try this fix and see if the tests then work?

Graham
Re: Segfaults in ConnectionHander (Possible Solution)
Graham Dumpleton wrote ..

Extending the above code as:

    Py_BEGIN_ALLOW_THREADS;
    rc = ap_get_brigade(c->input_filters, bb, mode, APR_BLOCK_READ, bufsize);
    Py_END_ALLOW_THREADS;

    if (! APR_STATUS_IS_SUCCESS(rc)) {
        PyErr_SetObject(PyExc_IOError,
                        PyString_FromString("Connection read error"));
        return NULL;
    }

    /* Return empty string if no buckets. Can be caused by EAGAIN. */
    if (APR_BRIGADE_EMPTY(bb)) {
        return PyString_FromString("");
    }

seems to fix the problem. I.e., use a call to APR_BRIGADE_EMPTY(bb) to check whether any new buckets were added, and return an empty string if not.

Okay, this may work, but the EAGAIN propagating back up as an empty string to Python can cause a tight loop to occur where calls are going out and back into Python code. This will occur until something is read or an error occurs. To avoid the back and forth, another option may be:

    while (APR_BRIGADE_EMPTY(bb)) {
        Py_BEGIN_ALLOW_THREADS;
        rc = ap_get_brigade(c->input_filters, bb, mode, APR_BLOCK_READ, bufsize);
        Py_END_ALLOW_THREADS;

        if (! APR_STATUS_IS_SUCCESS(rc)) {
            PyErr_SetObject(PyExc_IOError,
                            PyString_FromString("Connection read error"));
            return NULL;
        }
    }

What doesn't make sense to me is that on my Mac OS X box, where this problem only occurs when you have two listener ports, even when you have already read some input from the connection, it tight loops with the lowest level read always returning EAGAIN. I.e., it doesn't block at all.

Thus something really bad is happening on Mac OS X. Unless Apache is setting some strange ioctl options on the socket that inadvertently cause this, it looks to me like Mac OS X is broken in some way.

I am still on Mac OS X (10.3). I'll have to try it on my 10.4 box and see if it makes any difference.

Graham
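The trade-off between the two proposed fixes can be sketched with a small Python model. It is an illustration under stated assumptions, not the mod_python implementation: FlakySource, read_or_empty(), read_retrying() and caller_round_trips() are hypothetical names, a list stands in for the bucket brigade, and the source is assumed to report success with no data a fixed number of times before delivering a payload.

```python
SUCCESS = 0  # stand-in for APR_SUCCESS


class FlakySource:
    """Reports success without data a fixed number of times, then yields a payload."""

    def __init__(self, empty_reads, payload):
        self.empty_reads = empty_reads
        self.payload = payload
        self.calls = 0

    def get_brigade(self, brigade):
        self.calls += 1
        if self.empty_reads > 0:
            self.empty_reads -= 1
            return SUCCESS  # the swallowed-EAGAIN case: nothing appended
        brigade.append(self.payload)
        return SUCCESS


def read_or_empty(source):
    """First fix: hand an empty result back to the caller if no bucket arrived."""
    brigade = []
    if source.get_brigade(brigade) != SUCCESS:
        raise IOError("Connection read error")
    return brigade[0] if brigade else b""


def read_retrying(source):
    """Second fix: keep asking the lower layer until a bucket arrives."""
    brigade = []
    while not brigade:
        if source.get_brigade(brigade) != SUCCESS:
            raise IOError("Connection read error")
    return brigade[0]


def caller_round_trips(source):
    """Count how often Python-level code must call read_or_empty() to get data."""
    trips = 0
    while True:
        trips += 1
        data = read_or_empty(source)
        if data:
            return trips, data
```

In this model both variants make the same number of low-level calls. The difference is where the loop runs: with read_or_empty() every empty result crosses back into the calling Python code, while read_retrying() keeps the retries inside the read function. Neither removes the busy wait if the underlying read never blocks, which is the behaviour reported on Mac OS X above.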
Re: Segfaults in ConnectionHander
Jim Gallacher wrote:

Graham Dumpleton wrote:

What I might speculate is that if the test in mod_python for the connection handler is setup to run on a secondary listener port, but with the primary still active, that it may trigger the problem on other systems like Linux. Jim, you might want to try this and see if you can duplicate it on Linux.

I'll try it tonight.

Graham,

I am not able to reproduce the problem using the configuration and example code you give in MODPYTHON-102. (Linux Debian 2.6.12-1-k7 kernel).

Jim
Re: contribution to mod_python: Apache + SimpleXMLRPCServer (fwd)
An initial few comments from a first pass through.

    def _write(self, request, response, content_type='text/xml'):
        request.send_http_header()
        request.content_type = content_type
        request.write(response)

This is technically wrong, although it doesn't matter on mod_python 3.0. The issue is that send_http_header() in mod_python 2.7 should only be called after the content type is set. Here they do it before. It only works because in 3.X send_http_header() is a NOP.

    # men have been killed for less
    temp = sys.stderr
    sys.stderr = stderr_mod_python(self.request)
    ...
    sys.stderr = temp

This is not safe in a multithreaded MPM, because sys.stderr is shared by every thread in the process.

    except Exception, e:
        # report exception back to server
        response = xmlrpclib.dumps(
            xmlrpclib.Fault(1, "%s:%s" % (sys.exc_type, sys.exc_value))
        )
        # and also log it, duh
        etype, evalue, etb = sys.exc_info()
        stack = traceback.format_exception(etype, evalue, etb)
        for l in stack:
            sys.stderr.write(l)

First, it uses the fudged sys.stderr. Second, it exacerbates a problem in XML-RPC, which is that there is no concept of namespaces for error return codes. Because they have used an arbitrary return status of 1 for an internal exception or unexpected exception in user code, you can't easily distinguish a valid fault response with return status of 1 generated by user code from an unexpected exception. It may be more appropriate to generate a 500 HTTP error response in this circumstance, given that it really constitutes an internal server error rather than a valid XML-RPC fault response generated by the user.

This is an issue for debate though. It depends on whether you want to be conformant with how SimpleXMLRPCServer works, which is where this questionable code came from in the first place.

Other issues are that it doesn't check the incoming content type to validate that it is actually 'text/xml' per the specification for XML-RPC. It doesn't use the incoming content length for the read of POST data, which can be a problem in some cases.
It also doesn't set the outgoing content length as per the specification for XML-RPC.

It could also perhaps be a bit more knowledgeable about mod_python and pass through the apache.SERVER_RETURN exception so as to allow exposed methods to still generate it if need be.

This isn't the only implementation of XML-RPC support integrated with mod_python. I have an alternate take on it in Vampire which isn't bound to the SimpleXMLRPCServer base class. See:

http://svn.dscpl.com.au/vampire/trunk/software/vampire/xmlrpc.py

I can't find the others right now, but I have posted links to them before on the main mod_python list.

Overall, I'm not sure at this point that it is worthwhile putting XML-RPC support in mod_python. If it is done, I would prefer to see it done as part of a larger effort to provide a range of handler components which all work consistently together, rather than ad hoc bits and pieces that cannot be glued together easily.

Grisha wrote ..

If someone here has spare Brain/CPU cycles, could you look at the attached code and provide feedback?

Grisha

-- Forwarded message --
Date: Mon, 30 Jan 2006 18:04:42 -0800
From: Matt Chisholm [EMAIL PROTECTED]
Subject: Re: contribution to mod_python: Apache + SimpleXMLRPCServer

On Jan 30 2006, 11:42, Matt Chisholm wrote:

We've written a few classes to use the SimpleXMLRPCServer module in Python with mod_python instead of the Python CGI module. We've been using it internally for a while and we'd like to contribute it back to the mod_python project; we don't really have the time to create a separate project for it, but it seems like something that would be useful to many people.

We agree to assign copyright of this code to the mod_python project and to license it under the mod_python license. Our employer, BitTorrent Inc., also agrees.

I've attached a copy of the code. Please let me know if this is not the right channel to send contributions; also, I'm not on this list so please respond to me individually.

Matt Chisholm
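The points raised in the review above (check the incoming content type, read exactly the declared content length, set the outgoing content length, and keep user-generated faults distinct from unexpected errors) can be sketched independently of mod_python. The following is a minimal illustration in modern Python using the standard xmlrpc.client module, not the contributed code and not Vampire's implementation; handle_xmlrpc() and its (status, headers, body) return convention are assumptions chosen for the example, as are the specific 4xx status codes.

```python
import xmlrpc.client


def handle_xmlrpc(headers, stream, methods):
    """Process one XML-RPC POST and return (status, response_headers, body).

    headers is a dict of request headers, stream is a binary file-like
    object holding the request body, methods maps names to callables.
    """
    # Validate the incoming content type, ignoring any charset parameter.
    content_type = headers.get("Content-Type", "").split(";")[0].strip().lower()
    if content_type != "text/xml":
        return 415, {}, b""

    # Read exactly the declared number of bytes rather than reading to EOF.
    try:
        length = int(headers["Content-Length"])
    except (KeyError, ValueError):
        return 411, {}, b""
    payload = stream.read(length)

    try:
        params, method_name = xmlrpc.client.loads(payload)
    except Exception:
        return 400, {}, b""

    try:
        result = methods[method_name](*params)
        body = xmlrpc.client.dumps((result,), methodresponse=True)
    except xmlrpc.client.Fault as fault:
        # A fault raised deliberately by user code is a valid XML-RPC response.
        body = xmlrpc.client.dumps(fault)
    except Exception:
        # Anything unexpected (including an unknown method name here)
        # is reported as an internal server error instead.
        return 500, {}, b""

    encoded = body.encode("utf-8")
    response_headers = {
        "Content-Type": "text/xml",
        "Content-Length": str(len(encoded)),
    }
    return 200, response_headers, encoded
```

Because the function never touches sys.stderr or any other process-wide state, it also sidesteps the thread-safety concern raised about swapping sys.stderr. Whether an unexpected exception should map to a 500 or to a fault remains the open question noted in the review.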