RE: [PATCH] DAV method registration
From: Ryan Morgan [mailto:[EMAIL PROTECTED]] Sent: 01 October 2002 07:18 [...] Since Apache2 has the ability to serve multiple protocols, the number of registered methods can grow quite rapidly. Since the HTTP module cannot be removed from the server (without some serious effort.. ask rbb), the next best thing is to not register those DAV methods if they will not be used by the server. When we're talking about other protocols, it doesn't really make sense to have the methods mixed in with the HTTP methods. We should focus more on making it easier to not load the HTTP module than on a 'bandaid' fix. 62 method slots fill up extremely fast when you are going to handle multiple protocols; what is going to be the next step when you want to serve yet another protocol? Sander
Re: Keeping per-virtual host data?
Hi Steve, --On Monday, September 30, 2002 10:50 PM +0100 Steve Kemp [EMAIL PROTECTED] wrote: Hi All, I've written a module for Apache which will limit the amount of data the server may transfer in a given time period. (OK that's a simplistic overview anyway ;) To do this I make use of the information gathered by mod_status, and when the data transferred goes over a given threshold I cause the server to simply respond to all requests with a redirect. This works fine on a per-server basis. I'd like to re-code my module to work with virtual hosts; so I believe I need some way of keeping track of the size of data which has been transferred on a per-vhost basis. (Or failing that find another module, like mod_status, which contains the necessary magic which I can piggy-back upon). Is this possible, and if so does anybody have a pointer to a method I could use? www.mod-snmp.com This is an SNMP module that does per-vhost statistics. With some modifications one could use this for the purpose you want. All of the modules I've studied seem to manage to operate without having per-vhost static data, which makes me think either I'm missing something obvious, or it's not possible. Either way I'd love to be enlightened. The only problem you encounter is that the statistics exist in a side process and you have to make some connection from each Apache child to the side process. It does cross my mind that the forking may rule out persistent static data of the kind I want; but temporary files could be used I guess.. If you are willing to put all the data into temp files, that could be OK. However, in order to avoid deadlocks and a multitude of locks for access to it, you had better fork off a process that is the only writer, while all the children just read from it.
For those who might be interested my code is online at: http://www.steve.org.uk/Software/mod_curb/ (Just be gentle; it's my first module ;) Steve --- www.steve.org.uk Harrie Internet Management Consulting mailto: [EMAIL PROTECTED] http://www.lisanza.net/ Author of MOD-SNMP, enabling SNMP management the Apache HTTP server
Re: Keeping per-virtual host data?
Hi Harrie, On Tue, Oct 01, 2002 at 10:42:57AM +0200, Harrie Hazewinkel wrote: www.mod-snmp.com This is an SNMP module that does per-vhost statistics. With some modifications one could use this for the purpose you want. Thanks for the pointer. I've had a quick look at this just now and it's quite nice. (Maybe it'd be a good idea to have the source unpack into its own directory ;) The only problem you encounter is that the statistics exist in a side process and you have to make some connection from each Apache child to the side process. Sure. If you are willing to put all the data into temp files, that could be OK. However, in order to avoid deadlocks and a multitude of locks for access to it, you had better fork off a process that is the only writer, while all the children just read from it. After a bit of trial and error I've managed to get the effect I want. My big mistake was using 'per_dir_config' rather than the per-server config. Once I'd corrected that, the rest flowed fairly easily. (Reading the virtual host from mod_status is trivial, and that allows me to do everything I need to do). Steve --- # GNU MP3 Streaming http://www.gnump3d.org/
Re: providing apache2 rpms ?
On Tue, Oct 01, 2002 at 01:21:44AM +0100, Pier Fumagalli wrote: Since we're on it, and since I did that already to keep up to date our 20+ Solaris machines... Do we want to include also a proto and a script for Solaris? I think that the idea of having an easy way to roll a Solaris package would be really nice, especially because the only binaries that I can find are apache 1.3.12 on sunfreeware and a 2.0.39 in dist/httpd/binaries/solaris/. Not that I mind rolling my own packages, but we might help people move to 2.0 faster this way. vh Mads Toftum -- `Darn it, who spiked my coffee with water?!' - lwall
RE: building Release symbols for Win32
Unless someone knows a trick that I'm not aware of, debugging Win32 crash dumps (DrWatson .dmp files) can be a real pain unless you have symbols, because Frame Pointer Omission records make it hard to reconstruct the call stack from a dump. Having symbols that match the binary build handy when you examine the dump avoids this problem. Does anyone have a reason we shouldn't build .pdb files for Release builds? (We already do it for Debug builds.) This only increases the size of the DLL by a few bytes and has negligible impact on performance. The plan would be to update the Release build for every .dsp in Apache.dsw: - in the compiler settings, generate Program Database debug info (/Zi) - in the linker Debug settings, generate Microsoft Format debug info (/debug) The other thing we would need to do is save away the .pdb files when the binary install package is built (I don't think we'd want to package them in the install image since the files are quite large). Allan Edwards Sounds like a really good idea to me. Bill
Re: Keeping per-virtual host data?
On Mon, 30 Sep 2002 22:50:13 +0100, Steve Kemp wrote: Hi All, I've written a module for Apache which will limit the amount of data the server may transfer in a given time period. (OK that's a simplistic overview anyway ;) To do this I make use of the information gathered by mod_status, and when the data transferred goes over a given threshold I cause the server to simply respond to all requests with a redirect. This works fine on a per-server basis. I'd like to re-code my module to work with virtual hosts; so I believe I need some way of keeping track of the size of data which has been transferred on a per-vhost basis. (Or failing that find another module, like mod_status, which contains the necessary magic which I can piggy-back upon). Is this possible, and if so does anybody have a pointer to a method I could use? All of the modules I've studied seem to manage to operate without having per-vhost static data, which makes me think either I'm missing something obvious, or it's not possible. Either way I'd love to be enlightened. It's hard, as there is no way 'in-process' to manage that information without using a mutex lock to protect the data structure (imagine two processes/threads serving the same URL for a virtual host) or having some kind of per-thread, per-vhost structure which your module would summarize. mod_status avoids a lock as each thread writes its server stats to a thread-specific area which mod_status sums up. It does cross my mind that the forking may rule out persistent static data of the kind I want; but temporary files could be used I guess.. For those who might be interested my code is online at: http://www.steve.org.uk/Software/mod_curb/ (Just be gentle; it's my first module ;) Steve --- www.steve.org.uk
[PATCH] Threadsafety to mod_ssl dbm session cache
I am reposting a patch posted by Jeff Trawick which adds mutex protection to the retrieve operation. Before I applied this patch, I experienced the following problems with 2.0.40 on win32: - Child process termination (status 3221225477 in error_log, segfault?) - often twice or more every weekday. - Infinite loops (I traced one to dbm_close: http://nagoya.apache.org/bugzilla/show_bug.cgi?id=12705) several times a week. I applied the patch on September 17th and the server has been completely stable since. There has been no significant change in server load, nor any other changes in software or hardware. The extra mutex protection is in my experience required on win32 (and probably needed for other threaded MPMs as well), and IMO should be committed to cvs.

Index: modules/ssl/ssl_scache_dbm.c
===================================================================
RCS file: /home/cvs/httpd-2.0/modules/ssl/ssl_scache_dbm.c,v
retrieving revision 1.16
diff -u -r1.16 ssl_scache_dbm.c
--- modules/ssl/ssl_scache_dbm.c	17 May 2002 11:24:17 -0000	1.16
+++ modules/ssl/ssl_scache_dbm.c	17 Sep 2002 11:32:47 -0000
@@ -228,21 +228,25 @@
      * XXX: Should we open the dbm against r->pool so the cleanup will
      * do the apr_dbm_close? This would make the code a bit cleaner.
      */
+    ssl_mutex_on(s);
     if ((rc = apr_dbm_open(&dbm, mc->szSessionCacheDataFile,
                            APR_DBM_RWCREATE, SSL_DBM_FILE_MODE,
                            mc->pPool)) != APR_SUCCESS) {
         ap_log_error(APLOG_MARK, APLOG_ERR, rc, s,
                      "Cannot open SSLSessionCache DBM file `%s' for reading "
                      "(fetch)",
                      mc->szSessionCacheDataFile);
+        ssl_mutex_off(s);
         return NULL;
     }
     rc = apr_dbm_fetch(dbm, dbmkey, &dbmval);
     if (rc != APR_SUCCESS) {
         apr_dbm_close(dbm);
+        ssl_mutex_off(s);
         return NULL;
     }
     if (dbmval.dptr == NULL || dbmval.dsize <= sizeof(time_t)) {
         apr_dbm_close(dbm);
+        ssl_mutex_off(s);
         return NULL;
     }
@@ -251,12 +255,14 @@
     ucpData = (UCHAR *)malloc(nData);
     if (ucpData == NULL) {
         apr_dbm_close(dbm);
+        ssl_mutex_off(s);
         return NULL;
     }
     memcpy(ucpData, (char *)dbmval.dptr+sizeof(time_t), nData);
     memcpy(&expiry, dbmval.dptr, sizeof(time_t));

     apr_dbm_close(dbm);
+    ssl_mutex_off(s);

     /* make sure the stuff is still not expired */
     now = time(NULL);

- cheers, amund
Re: Build on AIX fails
Do the LDAP authentication modules build on AIX yet? At 2.0.40 I could not get the httpd-ldap sub-project to build on AIX, despite getting it to build just fine on Solaris and Windows. Correction -- the module built on AIX, but would not load on startup. -- Jess Holle P.S. It took a fair amount of help from Jeff for me to get 2.0.40 with SSL and mod_jk to build on AIX (with gcc), even without the LDAP stuff! Jeff Trawick wrote: "Bennett, Tony - CNF" [EMAIL PROTECTED] writes: I just downloaded 2.0.42 and attempted to build it on AIX 4.3.3 and it failed... it appears to be building module libraries named lib$MODULE_NAME.al (for example: modules/dav/main/.libs/libmod_dav.al) instead of lib$MODULE_NAME.a (for example: modules/dav/main/.libs/libmod_dav.a). libtool and IBM's C compiler don't get along perfectly. Luckily, those are warnings that you can ignore. These are the only errors, as far as I can see: ld: 0711-317 ERROR: Undefined symbol: .sk_new_null ld: 0711-317 ERROR: Undefined symbol: .X509_STORE_CTX_set_verify_cb ld: 0711-317 ERROR: Undefined symbol: .BIO_snprintf Aren't these all OpenSSL-related? If I get time I'll try to build it with SSL support. For background: My configure step: ./configure --prefix=/usr/local/apache2 \ --enable-dav=static \ --enable-dav_fs=static \ --enable-ssl=static \ --with-ssl=/home/dms/openssl_dir By the way, I always do CC=xlc_r ./configure --other-flags when using IBM's C compiler for AIX. xlc_r ensures that thread stuff is set up correctly.
Re: [PATCH] Threadsafety to mod_ssl dbm session cache
Amund Elstad [EMAIL PROTECTED] writes: I am reposting a patch posted by Jeff Trawick which adds mutex protection to the retrieve operation. Before I applied this patch, I experienced the following problems with 2.0.40 on win32: - Child process termination (status 3221225477 in error_log, segfault?) - often twice or more every weekday. - Infinite loops (I traced one to dbm_close: http://nagoya.apache.org/bugzilla/show_bug.cgi?id=12705) several times a week. I applied the patch on September 17th and the server has been running perfectly stable since. There has been no significant change in server load, nor any other changes in software or hardware. The extra mutex protection is in my experience required on win32 (probably needed for other threaded MPMs as well), and IMO should be commited to cvs. Thanks for the update (and for diagnosing it in the first place!). I kept the patch around but wasn't going to commit until I got some feedback. I'll commit it today. -- Jeff Trawick | [EMAIL PROTECTED] Born in Roswell... married an alien...
Re: POST
On Mon, 30 Sep 2002, Greg Stein wrote: On Mon, Sep 30, 2002 at 06:53:09PM -0400, Ryan Bloom wrote: ... The problem is that the default_handler shouldn't be involved. Because mod_dav is now replacing the r->handler field for ALL requests, things Woah! *NOT* all requests. Only those under the URL namespace which has been assigned to mod_dav. It does not just go out and blast r->handler willy-nilly. You have specifically enabled DAV for the URL space in question. ... If those two things are done, then we could have two handlers for the same resource. However, mod_dav shouldn't just be taking over all requests and assuming that they were meant for the core server. Doing so means that all generators are broken if DAV is loaded in the server. It is not just taking over all requests. It is handling requests for the space that you've assigned to mod_dav. For this particular case, the bug is in default_handler(). Plain and simple. There is no reason for a POST request to return the file contents. Yes, the system should also call the thing as a CGI script, in this case, but that doesn't excuse the default handler. No Greg, I'm sorry, but the bug has nothing to do with the default_handler. Plain and simple. If mod_dav wasn't in the server, the default_handler would never be called, because mod_cgi would have been given an opportunity to handle the request. The bug is in mod_dav, and no fix anyplace else will actually solve the problem. Ryan ___ Ryan Bloom [EMAIL PROTECTED] 550 Jean St Oakland CA 94610 ---
Re: POST
On Tue, Oct 01, 2002 at 11:03:16AM -0400, Ryan Bloom wrote: On Mon, 30 Sep 2002, Greg Stein wrote: ... For this particular case, the bug is in default_handler(). Plain and simple. There is no reason for a POST request to return the file contents. Yes, the system should also call the thing as a CGI script, in this case, but that doesn't excuse the default handler. No Greg, I'm sorry, but the bug has nothing to do with the default_handler. Plain and simple. If mod_dav wasn't in the server, the default_handler would never be called, because mod_cgi would have been given an opportunity to handle the request. The bug is in mod_dav, and no fix anyplace else will actually solve the problem. mod_dav causes the bug in default_handler to be exposed. A secondary issue is how mod_dav alters the dav-handler in a way which disables POST to a CGI. You've fixed this latter issue (altho it breaks the RFC 2518 requirement of checking locks before allowing a POST). I think we should figure out a different hook to use for that check. While the fixups hook isn't really intended for this, it would seem a good place to do the check. mod_dav already hooks it, so that should be fine. Cheers, -g -- Greg Stein, http://www.lyra.org/
Re: cvs commit: httpd-2.0 ROADMAP
On Tue, Oct 01, 2002 at 03:26:20PM -, [EMAIL PROTECTED] wrote: jerenkrantz2002/10/01 08:26:20 Modified:.ROADMAP Log: I'm borderline obsessive compulsive regarding tabs, but you knew that... (Also correct directive usage) Yah... Emacs' text mode inserts the tabs. I should just disable tabs globally in my .emacs ... -- Greg Stein, http://www.lyra.org/
Re: POST
On Tue, 1 Oct 2002, Greg Stein wrote: On Tue, Oct 01, 2002 at 11:03:16AM -0400, Ryan Bloom wrote: On Mon, 30 Sep 2002, Greg Stein wrote: ... For this particular case, the bug is in default_handler(). Plain and simple. There is no reason for a POST request to return the file contents. Yes, the system should also call the thing as a CGI script, in this case, but that doesn't excuse the default handler. No Greg, I'm sorry, but the bug has nothing to do with the default_handler. Plain and simple. If mod_dav wasn't in the server, the default_handler would never be called, because mod_cgi would have been given an opportunity to handle the request. The bug is in mod_dav, and no fix anyplace else will actually solve the problem. mod_dav causes the bug in default_handler to be exposed. Nope. The default_handler relies on other handlers to run first, so that it only gets the requests it is supposed to get. Even if we change the default_handler to only serve GET requests, the bug still exists, because the bug is in mod_dav. A secondary issue is how mod_dav alters the dav-handler in a way which disables POST to a CGI. You've fixed this latter issue (although it breaks the RFC 2518 requirement of checking locks before allowing a POST). I think we should figure out a different hook to use for that check. While the fixups hook isn't really intended for this, it would seem a good place to do the check. mod_dav already hooks it, so that should be fine. The fixups hook is definitely intended for this. The handler phase is only intended for actually generating content. mod_dav isn't generating content for a POST request, thus it shouldn't be trying to handle it in the handler phase. Ryan ___ Ryan Bloom [EMAIL PROTECTED] 550 Jean St Oakland CA 94610 ---
Re: cvs commit: httpd-2.0/server core.c
On 1 Oct 2002 [EMAIL PROTECTED] wrote: gstein 2002/10/01 09:24:41 Modified: server core.c Log: Fix bug in the default handler. POST is not allowed on regular files. The resource must be handled by something *other* than the default handler. -1. This is going to break PHP. PHP is a filter now, which means that the page is served by the default_handler. Since PHP requests are allowed to use POST, this is now broken. As I said before, the bug is in mod_dav, and must be fixed there. Ryan

Revision  Changes  Path
1.207     +8 -0    httpd-2.0/server/core.c

Index: core.c
===================================================================
RCS file: /home/cvs/httpd-2.0/server/core.c,v
retrieving revision 1.206
retrieving revision 1.207
diff -u -r1.206 -r1.207
--- core.c	30 Sep 2002 23:43:18 -0000	1.206
+++ core.c	1 Oct 2002 16:24:41 -0000	1.207
@@ -3259,6 +3259,14 @@
         return HTTP_NOT_FOUND;
     }

+    /* we understood the POST method, but it isn't legal for this
+       particular resource. */
+    if (r->method_number == M_POST) {
+        ap_log_rerror(APLOG_MARK, APLOG_ERR, 0, r,
+                      "This resource does not accept the POST method.");
+        return HTTP_METHOD_NOT_ALLOWED;
+    }
+
     if ((status = apr_file_open(&fd, r->filename, APR_READ | APR_BINARY, 0,
                                 r->pool)) != APR_SUCCESS) {
         ap_log_rerror(APLOG_MARK, APLOG_ERR, status, r,

-- ___ Ryan Bloom [EMAIL PROTECTED] 550 Jean St Oakland CA 94610 ---
[PATCH] HTTP_NOT_MODIFIED (304) and Authentication-Info(bug???)
Hi, Please refer to my earlier post regarding the 304 response and the Authentication-Info header. I am resending it in the hope of receiving an authoritative response. Is the Authentication-Info header (as defined in RFC 2617) for Digest authentication considered an entity header? When Apache returns a 304 Not Modified, it simply includes WWW-Authenticate and Proxy-Authenticate among the authentication-related headers (http_protocol.c:1609 for Apache2, and http_protocol.c:2746 for Apache 1.3.26). According to RFC 2616, a 304 should not include other entity headers. Now, if Digest authentication (or any other scheme that makes use of Authentication-Info) is enabled for a particular location, and the server has to return a 304, this header does not go across. This would break the auth info state between the client and the server. Since Digest authentication is an accepted extension to HTTP/1.1, shouldn't Authentication-Info also be sent across? If it is determined that Authentication-Info needs to be sent across for a 304 Not Modified response, I am attaching a patch that will do the needful.

--- http_protocol.c	Thu Sep  5 19:27:48 2002
+++ http_protocol.c	Tue Oct  1 10:49:33 2002
@@ -1618,6 +1618,7 @@
                     "Warning",
                     "WWW-Authenticate",
                     "Proxy-Authenticate",
+                    "Authentication-Info",
                     NULL);
     }
     else {

Thanks, -Indu
Re: POST
On Tue, Oct 01, 2002 at 12:27:25PM -0400, Ryan Bloom wrote: On Tue, 1 Oct 2002, Greg Stein wrote: ... mod_dav causes the bug in default_handler to be exposed. Nope. The default_handler relies on other handlers to run first, so that it only gets the requests it is supposed to get. Even if we change the default_handler to only serve GET requests, the bug still exists, because the bug is in mod_dav. What do you mean nope. Stop trying to be so Right. There is a bug in the default_handler. I just committed a fix. And I *already* said there is a separate issue/bug. You don't have to keep beating the damned horse and continue to disagree to try and show that you're the guy with all the answers. It's really annoying. You could just say, yah, default_handler does have a bug, but both need to be fixed [but I already fixed the mod_dav one]. A secondary issue is how mod_dav alters the dav-handler in a way which disables POST to a CGI. You've fixed this latter issue (although it breaks the RFC 2518 requirement of checking locks before allowing a POST). I think we should figure out a different hook to use for that check. While the fixups hook isn't really intended for this, it would seem a good place to do the check. mod_dav already hooks it, so that should be fine. The fixups hook is definitely intended for this. The handler phase is only intended for actually generating content. mod_dav isn't generating content for a POST request, thus it shouldn't be trying to handle it in the handler phase.

/**
 * Allows modules to perform module-specific fixing of header fields.  This
 * is invoked just before any content-handler.
 * @param r The current request
 * @return OK, DECLINED, or HTTP_...
 * @ingroup hooks
 */
AP_DECLARE_HOOK(int,fixups,(request_rec *r))

As I said: this hook isn't quite right, but it will serve our needs. I'll tweak mod_dav to fix the POST checking. Cheers, -g -- Greg Stein, http://www.lyra.org/
PHP POST handling (was: cvs commit: httpd-2.0/server core.c)
On Tue, Oct 01, 2002 at 12:29:53PM -0400, [EMAIL PROTECTED] wrote: On 1 Oct 2002 [EMAIL PROTECTED] wrote: gstein 2002/10/01 09:24:41 Modified:server core.c Log: Fix bug in the default handler. POST is not allowed on regular files. The resource must be handled by something *other* than the default handler. -1. This is going to break PHP. PHP is a filter now, which means that the page is served by the default_handler. Since PHP requests are allowed to use POST, this is now broken. As I said before, the bug is in mod_dav, and must be fixed there. That will get fixed. I keep saying that, but you keep harping on default handler isn't broken. Bunk. The default_handler is broken. If you POST to a resource, it returns the resource. That isn't right. But the PHP point is a good one. So how do we prevent a POST from returning the resource, yet make it available for PHP? I think that we have a model problem here. For a POST method, you need something to *handle* the POST. Somebody actually needs to accept the posted content, do something with it, and then *generate* a response. That response might actually have other filter processing to apply to it. It almost seems like PHP needs to act as a handler for POST requests (to deal with the posted content), and as a filter for GET requests. Does that seem right? Cheers, -g -- Greg Stein, http://www.lyra.org/
Re: cvs commit: httpd-2.0 ROADMAP
On Tue, Oct 01, 2002 at 01:55:00PM -, [EMAIL PROTECTED] wrote: ... * The translate_name hook goes away + + Wrowe altogether disagrees. translate_name today even operates + on URIs ... this mechanism needs to be preserved. Hunh? The translate_name hook is defined to translate a URI into a filename. As such, it doesn't apply to non-filename-based repositories. What am I missing? :-) Cheers, -g -- Greg Stein, http://www.lyra.org/
Re: PHP POST handling
On Tue, Oct 01, 2002 at 01:34:20PM -0400, Ryan Bloom wrote: On Tue, 1 Oct 2002, Greg Stein wrote: ... The default_handler is broken. If you POST to a resource, it returns the resource. That isn't right. But the PHP point is a good one. So how do we prevent a POST from returning the resource, yet make it available for PHP? I think that we have a model problem here. For a POST method, you need something to *handle* the POST. Somebody actually needs to accept the posted content, do something with it, and then *generate* a response. That response might actually have other filter processing to apply to it. It almost seems like PHP needs to act as a handler for POST requests (to deal with the posted content), and as a filter for GET requests. Does that seem right? No, it doesn't seem right. The default_handler is designed to deliver a file from disk down to the next filter. In the case of filters, we have extended the handler to work for POST and GET requests, because that is how filters work. If we want this design, then all right. But it only works if we have something to intercept the POST if nobody handles it. If a filter does not handle the POSTed request body, then we need to return 405 (Method Not Allowed). IOW, alter the handler as I suggested, or have another filter to figure out that the body wasn't processed and to rejigger the response into an error response. That latter design for turn-into-error seems a bit hacky, tho, which is why I suggested the handler mechanism. The assumption is that if it is a POST request, and the default_handler gets the request, then there is a filter that will actually run the request. This works, and has worked since we introduced filters, and converted some handlers to filters. Well... by this logic, we need to fix the default handler to *also* deliver files for PROPFIND, PROPPATCH, REPORT, etc. It is just as easy to argue that a PHP script can/should handle those requests, too. 
I simply don't think that a filter should read/consume a request body. The handler is responsible for handling the request, which includes processing the body. The bug is in mod_dav. It is trying to handle too many requests. There Yup. You fixed it, but that also broke the lock-checking. But no biggy -- you didn't know and that is exactly why you asked for a review :-) I'll get that extra check behavior put back in. Ideally, we should have a hook that runs around fixup time which says should this request be handled? That hook would be used for if-header processing (If:, If-Matches:, If-Not-Modified:, etc), and it can be used for checking WebDAV locks. Currently, the handlers are doing all this stuff by calling ap_meets_conditions(). That will check some of the If-* headers, but not WebDAV's If: header. And if a handler forgets to include the call to meets_conditions, then it is broken. ... Because I don't believe there was a bug in the default_handler, it was working as designed. Well, it delivers files for POST, sure, but nothing stopped that from getting to the client if nobody handled it. The default_handler assumes that it will be run last, and that other modules will have an opportunity to serve the request if they are supposed to. Right. I am not trying to show that I have all the answers, but in this case, I did the research, and I found the bug. And, I do have a VERY strong opinion about how it should be solved. Filters introduced some intricacies that weren't in the server before. If we leave the change in that was made to the default_handler, nothing will be fixed. If we add a new handler that handles POST requests for PHP, what does that get us? It wouldn't have stopped the bug that we are discussing. At best, it would have changed the response the server gave to the user. However, that is only if the request was for a CGI script. In the case of PHP, PHP requests were working before my fix, because the default_handler was doing its job.
With the fix to the default_handler that is in place now, and without the mod_dav fix, both CGI and PHP are broken. Without the default_handler fix and without the mod_dav fix, just CGI is broken. With just the mod_dav fix, and without the default_handler fix, CGI and PHP work. With both fixes, CGI works and PHP is broken. You want to fix PHP scripts now by adding a new handler for PHP. Would that be specific to PHP? If so, then there are two ways to run PHP scripts. No thanks. If not, then I fail to see what is being fixed. I think we're refining our handling. These intermediates are broken in one way or another, but we're getting there. About a month ago, before we started messing with mod_dav's decision making for handling, we had some problem or another (I forget what). Then Justin and I started monkeying with it in mod_dav, and we lost the dav-handler thing, and that threw off mod_dir because it jumped in when it shouldn't have. Then
Re: PHP POST handling
At 01:12 PM 10/1/2002, Greg Stein wrote: For PHP, we said make it a filter [so the source can come from anywhere]. I think we really should have said for GET requests, allow it to be processed by PHP. The POST, PROPFIND, COPY, etc should all be possible to handle by PHP, which means that PHP also needs a handler. Agreed, if you write a PHP script we better allow you to PROPFIND or COPY the puppy, in addition to POST. It doesn't mean we need a handler. We need to know if something is expected to be coming down the wire at one of our filters. Maybe more than one of our filters. One of my itches that I haven't had time yet to scratch is to implement the apreq filter to expose the post (propfind, copy, etc) data to one or more than one filter who -might- be interested in the client request body. Heck, PHP should also be able to handle a GET request. For example, it should be able to retrieve the content from a database, and then shove that into the filter stack. IOW, PHP is really a handler *and* a filter. It can handle the generation of content, but it can also process generated content when that content is a PHP script. That sort of misplaced design ends up exposing two fault points, one in the PHP handler, and one in the PHP filter. Better one source of pain. That said, it is -still- a handler since it just picks up the fd out of the sendfile bucket and processes that file. Until the zend mechanics are in place to slurp from brigade - output results, this never really belonged as a filter anyways. And that said, you can't break POST to the default handler, please revert that change. Bill
Re: PHP POST handling
--On Tuesday, October 1, 2002 11:12 AM -0700 Greg Stein [EMAIL PROTECTED] wrote: I simply don't think that a filter should read/consume a request body. The handler is responsible for handling the request, which includes processing the body. Well, PHP doesn't exactly do that. PHP's current strategy is to create an input filter that sets aside all input. This is triggered by the ap_discard_request_body() call in default_handler (as discard causes all data to be read). So, when data is actually pushed down into the output filter chain, PHP has a copy of the body in its private structure. And, if its script requires the body, it returns the ctx->post_data value in its callbacks. I think the biggest concern is when multiple modules want the input body. Right now, it's fairly vague what will happen (and I'm not even sure what the right answer is here). Forcing input filters and doing setasides (flat void* instead of bb's in PHP) seems a bit clunky. However, we also don't want to store the request body in memory. -- justin
Re: PHP POST handling
how many sites are you going to break with this core.c change? From what I have heard the RFC says that the server can do what it chooses on a POST to a file. we (apache 1.3 apache 2.042 and below) chose to serve a file. Greg Stein wrote: On Tue, Oct 01, 2002 at 01:34:20PM -0400, Ryan Bloom wrote: On Tue, 1 Oct 2002, Greg Stein wrote: ... The default_handler is broken. If you POST to a resource, it returns the resource. That isn't right. But the PHP point is a good one. So how do we prevent a POST from returning the resource, yet make it available for PHP? I think that we have a model problem here. For a POST method, you need something to *handle* the POST. Somebody actually needs to accept the posted content, do something with it, and then *generate* a response. That response might actually have other filter processing to apply to it. It almost seems like PHP needs to act as a handler for POST requests (to deal with the posted content), and as a filter for GET requests. Does that seem right? No, it doesn't seem right. The default_handler is designed to deliver a file from disk down to the next filter. In the case of filters, we have extended the handler to work for POST and GET requests, because that is how filters work. If we want this design, then all right. But it only works if we have something to intercept the POST if nobody handles it. If a filter does not handle the POSTed request body, then we need to return 405 (Method Not Allowed). IOW, alter the handler as I suggested, or have another filter to figure out that the body wasn't processed and to rejigger the response into an error response. That latter design for turn-into-error seems a bit hacky, tho, which is why I suggested the handler mechanism. The assumption is that if it is a POST request, and the default_handler gets the request, then there is a filter that will actually run the request. This works, and has worked since we introduced filters, and converted some handlers to filters. Well... 
by this logic, we need to fix the default handler to *also* deliver files for PROPFIND, PROPPATCH, REPORT, etc. It is just as easy to argue that a PHP script can/should handle those requests, too. I simply don't think that a filter should read/consume a request body. The handler is responsible for handling the request, which includes processing the body.

The bug is in mod_dav. It is trying to handle too many requests. There ...

Yup. You fixed it, but that also broke the lock-checking. But no biggy -- you didn't know and that is exactly why you asked for a review :-) I'll get that extra check behavior put back in.

Ideally, we should have a hook that runs around fixup time which says "should this request be handled?" That hook would be used for If header processing (If:, If-Match:, If-Not-Modified:, etc), and it can be used for checking WebDAV locks. Currently, the handlers are doing all this stuff by calling ap_meets_conditions(). That will check some of the If-* headers, but not WebDAV's If: header. And if a handler forgets to include the call to meets_conditions, then it is broken. ...

Because I don't believe there was a bug in the default_handler, it was working as designed.

Well, it delivers files for POST, sure, but nothing stopped that from getting to the client if nobody handled it.

The default_handler assumes that it will be run last, and that other modules will have an opportunity to serve the request if they are supposed to.

Right. I am not trying to show that I have all the answers, but in this case, I did the research, and I found the bug. And, I do have a VERY strong opinion about how it should be solved. Filters introduced some intricacies that weren't in the server before. If we leave the change in that was made to the default_handler, nothing will be fixed. If we add a new handler that handles POST requests for PHP, what does that get us? It wouldn't have stopped the bug that we are discussing.
At best, it would have changed the response the server gave to the user. However, that is only if the request was for a CGI script. In the case of PHP, PHP requests were working before my fix, because the default_handler was doing its job. With the fix to the default_handler that is in place now and without the mod_dav fix, both CGI and PHP are broken. Without the default_handler fix and without the mod_dav fix, just CGI is broken. With just the mod_dav fix, and without the default_handler fix, CGI and PHP work. With both fixes, CGI works and PHP is broken.

You want to fix PHP scripts now by adding a new handler for PHP. Would that be specific to PHP? If so, then there are two ways to run PHP scripts. No thanks. If not, then I fail to see what is being fixed.

I think we're refining our handling. These intermediates are broken in one way or another, but we're getting there. About a month ago, before we started messing with
Re: PHP POST handling
--On Tuesday, October 1, 2002 11:54 AM -0700 Ian Holsman [EMAIL PROTECTED] wrote: how many sites are you going to break with this core.c change? From what I have heard, the RFC says that the server can do what it chooses on a POST to a file. We (Apache 1.3, Apache 2.0.42 and below) chose to serve a file.

Nah, 1.3 returns 405 (Method Not Allowed) on a POST on a page served by the default_handler. See src/main/http_core.c:3860. -- justin
[PATCH] nudge the core output filter to allow streaming output
The content length filter is clever enough to handle sporadic output from CGIs, but just because it passes down a brigade with the output so far doesn't mean it will be sent. By adding a flush bucket to the end of the brigade, we make sure that the core output filter doesn't hold onto it for a long time in case the amount of output is too small to be sent immediately. The content length filter has already figured out that it will be a long time before more output is available.

--- server/protocol.c	19 Sep 2002 12:20:19 -	1.1.1.1
+++ server/protocol.c	1 Oct 2002 18:15:44 -
@@ -1263,6 +1263,9 @@
      */
     if (e != APR_BRIGADE_FIRST(b)) {
         apr_bucket_brigade *split = apr_brigade_split(b, e);
+        apr_bucket *flush = apr_bucket_flush_create(r->connection->bucket_alloc);
+
+        APR_BRIGADE_INSERT_TAIL(b, flush);
         rv = ap_pass_brigade(f->next, b);
         if (rv != APR_SUCCESS) {
             apr_brigade_destroy(split);

testcase: printenv with flushing enabled for STDOUT and a sleep after every output statement; use mod_cgi for handling the cgi request; mod_cgid is busted because the apr_file_t in the pipe bucket doesn't look like a pipe and the timeout tricks done by the content length filter are ineffective (the timeout calls get APR_EINVAL)

-- Jeff Trawick | [EMAIL PROTECTED] Born in Roswell... married an alien...
[PATCH] mod_cgid tells APR that an os file is a pipe
mod_cgid has an AF_UNIX socket, which it uses to create an apr_file_t, which it uses to create a pipe bucket, which it passes down the filter chain; but it gets to a filter (the content-length filter) which wants to play with the I/O timeout, and APR fails the timeout manipulation because it doesn't think the apr_file_t represents a pipe (or something close enough to a pipe). Thus the content-length filter's code to allow streaming output doesn't work with mod_cgid because of the socket/file/pipe confusion. The following patch adds a way for an APR app to say that the apr_file_t being created from an os file should be treated as a pipe.

Index: modules/generators/mod_cgid.c
===================================================================
RCS file: /cvs/phoenix/2.0.42/modules/generators/mod_cgid.c,v
retrieving revision 1.1.1.1
diff -u -r1.1.1.1 mod_cgid.c
--- modules/generators/mod_cgid.c	19 Sep 2002 12:20:17 -	1.1.1.1
+++ modules/generators/mod_cgid.c	1 Oct 2002 18:57:49 -
@@ -1101,7 +1101,7 @@
      * Note that this does not register a cleanup for the socket.  We did
      * that explicitly right after we created the socket.
      */
-    apr_os_file_put(&tempsock, &sd, 0, r->pool);
+    apr_os_file_put_ex(&tempsock, &sd, 1, 0, r->pool);
 
     if ((argv0 = strrchr(r->filename, '/')) != NULL)
         argv0++;
@@ -1422,7 +1422,7 @@
      * Note that this does not register a cleanup for the socket.  We did
      * that explicitly right after we created the socket.
      */
-    apr_os_file_put(&tempsock, &sd, 0, r->pool);
+    apr_os_file_put_ex(&tempsock, &sd, 1, 0, r->pool);
 
     if ((retval = ap_setup_client_block(r, REQUEST_CHUNKED_ERROR)))
         return retval;
Index: srclib/apr/file_io/unix/open.c
===================================================================
RCS file: /cvs/phoenix/2.0.42/srclib/apr/file_io/unix/open.c,v
retrieving revision 1.1.1.1
diff -u -r1.1.1.1 open.c
--- srclib/apr/file_io/unix/open.c	19 Sep 2002 12:20:20 -	1.1.1.1
+++ srclib/apr/file_io/unix/open.c	1 Oct 2002 18:57:49 -
@@ -219,15 +219,18 @@
     return APR_SUCCESS;
 }
 
-APR_DECLARE(apr_status_t) apr_os_file_put(apr_file_t **file,
-                                          apr_os_file_t *thefile,
-                                          apr_int32_t flags, apr_pool_t *pool)
+APR_DECLARE(apr_status_t) apr_os_file_put_ex(apr_file_t **file,
+                                             apr_os_file_t *thefile,
+                                             int is_pipe,
+                                             apr_int32_t flags,
+                                             apr_pool_t *pool)
 {
     int *dafile = thefile;
 
     (*file) = apr_pcalloc(pool, sizeof(apr_file_t));
     (*file)->pool = pool;
     (*file)->eof_hit = 0;
+    (*file)->is_pipe = is_pipe != 0;
     (*file)->blocking = BLK_UNKNOWN; /* in case it is a pipe */
     (*file)->timeout = -1;
     (*file)->ungetchar = -1; /* no char avail */
@@ -250,6 +253,14 @@
     }
     return APR_SUCCESS;
 }
+
+APR_DECLARE(apr_status_t) apr_os_file_put(apr_file_t **file,
+                                          apr_os_file_t *thefile,
+                                          apr_int32_t flags,
+                                          apr_pool_t *pool)
+{
+    return apr_os_file_put_ex(file, thefile, 0, flags, pool);
+}
 
 APR_DECLARE(apr_status_t) apr_file_eof(apr_file_t *fptr)
 {
Index: srclib/apr/include/apr_portable.h
===================================================================
RCS file: /cvs/phoenix/2.0.42/srclib/apr/include/apr_portable.h,v
retrieving revision 1.1.1.1
diff -u -r1.1.1.1 apr_portable.h
--- srclib/apr/include/apr_portable.h	19 Sep 2002 12:20:21 -	1.1.1.1
+++ srclib/apr/include/apr_portable.h	1 Oct 2002 18:57:49 -
@@ -364,6 +364,20 @@
                                           apr_int32_t flags, apr_pool_t *cont);
 
 /**
+ * convert the file from os specific type to apr type.
+ * @param file The apr file we are converting to.
+ * @param thefile The os specific file to convert
+ * @param is_pipe Whether or not the descriptor should be treated as a pipe
+ * @param flags The flags that were used to open this file.
+ * @param cont The pool to use if it is needed.
+ * @remark On Unix, it is only possible to put a file descriptor into
+ *         an apr file type.
+ */
+APR_DECLARE(apr_status_t) apr_os_file_put_ex(apr_file_t **file,
+                                             apr_os_file_t *thefile, int is_pipe,
+                                             apr_int32_t flags, apr_pool_t *cont);
+
+/**
  * convert the dir from os specific type to apr type.
  * @param dir The apr dir we are converting to.
  * @param thedir The os specific dir to convert

-- Jeff Trawick | [EMAIL PROTECTED] Born in Roswell... married an alien...
Speaking of pipes from cgis causing trouble...
I've been working on the caching code and ran across a core dump... A particular file contains an SSI call to a cgi. The cgi causes a pipe bucket to pass down the chain. cache_in_filter tries to save the bucket away and core dumps. Since a pipe bucket can be of any length, and could take any amount of time to complete, I would assume that the cache code should decide not to cache any response with a pipe bucket, correct? -- Paul J. Reder --- The strength of the Constitution lies entirely in the determination of each citizen to defend it. Only if every single citizen feels duty bound to do his share in this defense are the constitutional rights secure. -- Albert Einstein
Re: Speaking of pipes from cgis causing trouble...
On Tue, 1 Oct 2002, Paul J. Reder wrote: I've been working on the caching code and ran across a core dump... A particular file contains an SSI call to a cgi. The cgi causes a pipe bucket to pass down the chain. cache_in_filter tries to save the bucket away and core dumps. Since a pipe bucket can be of any length, and could take any amount of time to complete, I would assume that the cache code should decide not to cache any response with a pipe bucket, correct? Not necessarily. The cache code should stream the data to the cache, and allow the data to also stream to the core_output_filter. Ryan ___ Ryan Bloom [EMAIL PROTECTED] 550 Jean St Oakland CA 94610 ---
RE: [PATCH] nudge the core output filter to allow streaming output
The content length filter is clever enough to handle sporadic output from CGIs, but just because it passes down a brigade with the output so far doesn't mean it will be sent. By adding a flush bucket to the end of the brigade, we make sure that the core output filter doesn't hold onto it for a long time in case the amount of output is too small to be sent immediately. The content length filter has already figured out that it will be a long time before more output is available.

--- server/protocol.c	19 Sep 2002 12:20:19 -	1.1.1.1
+++ server/protocol.c	1 Oct 2002 18:15:44 -
@@ -1263,6 +1263,9 @@
      */
     if (e != APR_BRIGADE_FIRST(b)) {
         apr_bucket_brigade *split = apr_brigade_split(b, e);
+        apr_bucket *flush = apr_bucket_flush_create(r->connection->bucket_alloc);
+
+        APR_BRIGADE_INSERT_TAIL(b, flush);
         rv = ap_pass_brigade(f->next, b);
         if (rv != APR_SUCCESS) {
             apr_brigade_destroy(split);

testcase: printenv with flushing enabled for STDOUT and a sleep after every output statement; use mod_cgi for handling the cgi request; mod_cgid is busted because the apr_file_t in the pipe bucket doesn't look like a pipe and the timeout tricks done by the content length filter are ineffective (the timeout calls get APR_EINVAL)

-- Jeff Trawick | [EMAIL PROTECTED] Born in Roswell... married an alien...

+1. This replicates the desirable behaviour of 1.3.

Bill
Re: PHP POST handling
On Tue, 1 Oct 2002, William A. Rowe, Jr. wrote: At 01:12 PM 10/1/2002, Greg Stein wrote: For PHP, we said make it a filter [so the source can come from anywhere]. I think we really should have said for GET requests, allow it to be processed by PHP. The POST, PROPFIND, COPY, etc should all be possible to handle by PHP, which means that PHP also needs a handler. Agreed, if you write a PHP script we better allow you to PROPFIND or COPY the puppy, in addition to POST. These are two different statements, if I am reading both correctly. Please correct me if I am not. Will, you are saying that if we have a PHP script, then we need to be able to do all DAV operations on the script. Greg, you are saying that a PHP script needs to be able to satisfy a DAV request (meaning that the PHP code actually copies the resource, or generates the PROPFIND data). Assuming I am reading the two statements correctly, I agree with Will, but not with Greg. There is a major difference between satisfying a COPY or PROPFIND request and generating a page that has accepted POST data. A filter will never be able to satisfy COPY or PROPFIND, because those are actions that should be done in the handler phase. However, having two ways to read the PHP script from disk (default_handler and php_handler), and run the page through the interpreter doesn't make sense. That is why PHP was re-written as a filter, to allow it to take advantage of ANY back-end data store that we have. Heck, PHP should also be able to handle a GET request. For example, it should be able to retrieve the content from a database, and then shove that into the filter stack. IOW, PHP is really a handler *and* a filter. It can handle the generation of content, but it can also process generated content when that content is a PHP script. I think I am missing something here. PHP doesn't handle content generation. It never has. In Apache 1.3, PHP could read a script from the disk and interpret it. 
In Apache 2.0, PHP _should_ be able to read a script from a bucket and interpret it. (The fact that it doesn't right now, is not really germane to this discussion). From my reading of the statement above, you want people to be able to write handlers in PHP, which would find another page or script in a database and send it down the filter stack. That can't be done right now, PHP can't write handlers that way, at least not that I am aware of. This BTW, is why mod_perl has both concepts, handlers and filters. Handlers are used as content endpoints, they generate data. Filters are used to modify data that was already generated. Please let me know if I have misunderstood anything in this mail. Everything I have said above is based on my reading of the message, and I tried to point out where I may have not understood what the original author was saying. Ryan ___ Ryan Bloom [EMAIL PROTECTED] 550 Jean St Oakland CA 94610 ---
Re: Speaking of pipes from cgis causing trouble...
Ryan Bloom wrote: On Tue, 1 Oct 2002, Paul J. Reder wrote: I've been working on the caching code and ran across a core dump... A particular file contains an SSI call to a cgi. The cgi causes a pipe bucket to pass down the chain. cache_in_filter tries to save the bucket away and core dumps. Since a pipe bucket can be of any length, and could take any amount of time to complete, I would assume that the cache code should decide not to cache any response with a pipe bucket, correct?

Not necessarily. The cache code should stream the data to the cache, and allow the data to also stream to the core_output_filter.

Until it reaches the specified max cache size? So instead of trying to just copy and insert the pipe bucket, it should read from the pipe and add buckets to the cache (and forward copies on)? Seems like it would defeat some of the sendfile optimizations and such... Is it worth the code and lost optimizations to read/store/pass the piped data?

-- Paul J. Reder
Re: PHP and Apache 1.3 issue ...
Hi all, I really wonder if nobody here on the list is able to answer this question, although I thought most of the Apache core programmers are here. I know this issue very well and would also like to get it solved, because we on NetWare have been living with it for months now; and in addition Ramesh has only ONE day to get a fix in before his demo on Oct 3, so can someone please reply and try to answer his question below? thanks, Guenter.

I posted this in the PHP-dev mailing list, but it appeared to me that the question is more relevant to the apache-dev mailing list. I am reposting it here. Any help would be greatly appreciated. One further piece of information: We can cleanly shut down Apache 2.0 running the same PHP binaries without any issues. I just now confirmed that with Apache 2.0, the thread which is calling bc_new_num() is the *same* as the thread which calls bc_free_num(). bc_new_num() gets called 3 times by the *same* thread at PHP load time and gets called 3 times by the *same* thread at PHP unload time. This means that there must be some way to instruct Apache 1.3 to use the same thread for all related activities. Thanks, S.R.

Ramesh Shankar wrote: As you may know, we are working to get things lined up for the Oct 3 demo. We have run into a problem with shutting down Apache which is running PHP. I have provided the details from my posting on the PHP news group below. Any help you can offer in solving this issue would be greatly appreciated:

News posting -- I am trying to port PHP and I am running into a problem with Apache 1.3. I am not familiar with the PHP language by itself and I am used to working at the O.S. level, and so any help would be greatly appreciated. I tried to read through the archives, the readmes, FAQs, etc., but I couldn't find an answer for this question: When PHP is loaded with Apache 1.3 and Apache is shut down, PHP 4.2.2 crashes inside the routine _efree() called from bc_free_num().
I found the problem to be related to linked list corruption and on further investigation, suspected it to be a case of the thread calling bc_free_num() being different from the thread that called bc_new_num(). I verified this by enabling TSRM_DEBUG while building Zend, which enables this detection in _efree(). I also set the tsrm_error_level to TSRM_ERROR_LEVEL_ERROR to enable displaying of TSRM messages to the screen. And sure enough, I got an error message from _efree() from the following excerpt:

#if defined(ZTS) && TSRM_DEBUG
    if (p->thread_id != tsrm_thread_id()) {
        tsrm_error(TSRM_ERROR_LEVEL_ERROR,
                   "Memory block allocated at %s:(%d) on thread %x "
                   "freed at %s:(%d) on thread %x, ignoring",
                   p->filename, p->lineno, p->thread_id,
                   __zend_filename, __zend_lineno, tsrm_thread_id());
        return;
    }
#endif

I have ZTS (thread support) defined in Zend. I am not able to understand and appreciate all the details of the use of the macros to access the global variables via AG(), CG(), etc., and from what I could figure out, ts_resource_ex() is some kind of thread specific data mechanism and would work only if threads are dedicated to PHP. I am not able to understand who is responsible for ensuring that the same thread gets used for a complete request - whether it is something that I need to do in Apache or whether I need to enable/disable something while compiling PHP. Any help would be greatly appreciated. Please copy me on your reply as well. Thanks in advance, S.R.
Re: PHP POST handling
On Tue, Oct 01, 2002 at 04:27:15PM -0400, Ryan Bloom wrote: On Tue, 1 Oct 2002, William A. Rowe, Jr. wrote: At 01:12 PM 10/1/2002, Greg Stein wrote: For PHP, we said make it a filter [so the source can come from anywhere]. I think we really should have said for GET requests, allow it to be processed by PHP. The POST, PROPFIND, COPY, etc should all be possible to handle by PHP, which means that PHP also needs a handler. Agreed, if you write a PHP script we better allow you to PROPFIND or COPY the puppy, in addition to POST. These are two different statements, if I am reading both correctly. Please correct me if I am not. Will, you are saying that if we have a PHP script, then we need to be able to do all DAV operations on the script. Greg, you are saying that a PHP script needs to be able to satisfy a DAV request (meaning that the PHP code actually copies the resource, or generates the PROPFIND data). Assuming I am reading the two statements correctly, I agree with Will, but not with Greg. Why couldn't mod_dav be implemented in PHP? I see no particular reason why not... Currently, PHP cannot because it is a filter, not a handler. [ and yes: you should be able to manage the *scripts* using DAV operations; typically, you do that through a different vhost or path on the server so that you don't confuse GET between GET source and GET output ] There is a major difference between satisfying a COPY or PROPFIND request and generating a page that has accepted POST data. Actually, I see little difference. A POST method accepts posted data and generates a response. PROPFIND accepts data and generates a response. COPY is mostly header-based, but I certainly don't see COPY being handled by a filter :-) A filter will never be able to satisfy COPY or PROPFIND, because those are actions that should be done in the handler phase. What makes these different from POST? If you can articulate that, then I'll be able to understand your POV much better. 
However, having two ways to read the PHP script from disk (default_handler and php_handler), and run the page through the interpreter doesn't make sense. That is why PHP was re-written as a filter, to allow it to take advantage of ANY back-end data store that we have.

In this case, you're saying "content generator provides script; execute script." But how do we get PHP to be a content generator? Should PHP never be able to act as a content generator? How do we get PHP to yank a Perl script out of a database and feed it to mod_perl? Heck... we don't have enough resolution in our codebase right now. How do we write a DAV provider in PHP, and store those scripts in SVN? In other words, I want SVN to provide the raw content (but not act as a DAV provider), then have that content executed by PHP to operate as a DAV server.

There is a lot of fuzziness in the early stages of the filter stack. Where is the line between a handler and the beginning of the stack? How many stages does the server go thru before the initial content is found? For example, a disk file contains a PHP script which is a DAV server which loads the content out of a database which is a Perl script to generate content. That content is then SSI-processed, gzip'd, and then shoved onto the wire.

If there is no request body, then the line between handler and filter stack just isn't there. You just kick an EOS brigade into the filter stack and the first filter inserts content before the EOS (say, by reading it off disk). When you get a request body, then it becomes a bit more difficult as it would be neat to pass that through the filter stack, but it becomes hard to distinguish between the input and the generated content. Blah blah. I'm a bit off track there.

There is one entity in the request processing which is responsible for managing the request body. Everything else is about altering the resulting output. Who handles the request?
It can't positively be the content generator, as we said that we want to load a script, and have that script be the handler (e.g. handle PROPFIND). Heck, PHP should also be able to handle a GET request. For example, it should be able to retrieve the content from a database, and then shove that into the filter stack. IOW, PHP is really a handler *and* a filter. It can handle the generation of content, but it can also process generated content when that content is a PHP script. I think I am missing something here. PHP doesn't handle content generation. It never has. In Apache 1.3, PHP could read a script from the disk and interpret it. In Apache 2.0, PHP _should_ be able to read a script from a bucket and interpret it. In 2.0, we have a thing called content generation which means generate the script content. In 1.3, that was wrapped into the processing of that script. So yes... PHP used to handle the concept of content generation, as we know it in 2.0 today. By moving PHP to a filter, we simply allowed the
Re: PHP POST handling
On Tue, Oct 01, 2002 at 01:32:16PM -0500, William A. Rowe, Jr. wrote: At 01:12 PM 10/1/2002, Greg Stein wrote: ... One of my itches that I haven't had time yet to scratch is to implement the apreq filter to expose the post (propfind, copy, etc) data to one or more than one filter who -might- be interested in the client request body.

As long as it is understood that only *one* thing can consume the request body. Then the question arises: how do you arbitrate that? It would be nice to simply say "the handler is the consumer" but that doesn't solve the case of a PHP script stored in an SVN repository. It is almost like we need some concept of stages leading up to request [body] processing and content generation.

Heck, PHP should also be able to handle a GET request. For example, it should be able to retrieve the content from a database, and then shove that into the filter stack. IOW, PHP is really a handler *and* a filter. It can handle the generation of content, but it can also process generated content when that content is a PHP script.

That sort of misplaced design ends up exposing two fault points, one in the PHP handler, and one in the PHP filter. Better one source of pain.

But they are two modes of operation. In one, you generate the original content (e.g. a PROPFIND or GET response against a database), in the other you're filtering content.

That said, it is -still- a handler since it just picks up the fd out of the sendfile bucket and processes that file. Until the zend mechanics are in place to slurp from brigade - output results, this never really belonged as a filter anyways.

The problem is that it isn't really filtering content. The filter thing was a hack to enable us to have PHP scripts in arbitrary locations (e.g. in a DAV repository). But it isn't *filtering*. It is executing the script and producing output. The notion of take this input, munge it in some way, and produce output is *very* shaky.
And that said, you can't break POST to the default handler, please revert that change. The POST behavior before my fix was a regression against 1.3. Its existence was to support the PHP-processing-as-a-filter implementation. The impact of adding the POST feature to the default handler was never really detected until we uncovered it via a bug in mod_dav. Since this regression was found, I went in and fixed it. Unfortunately, that breaks PHP :-) But I think the PHP integration is broken and needs to be fixed. While new PHP integration is being solved, we can *temporarily* revert the change so that PHP users aren't inconvenienced while the integration is worked on. But the default_handler fix does need to go in to prevent the regression. (I term it a regression because 1.3 didn't do it, and POSTing to something which isn't going to handle the posted content is not right; we should be generating an error that states the POST is not allowed for that resource; that is what 405 is all about) Cheers, -g -- Greg Stein, http://www.lyra.org/
Re: PHP POST handling
On Tue, Oct 01, 2002 at 11:51:12AM -0700, Justin Erenkrantz wrote: --On Tuesday, October 1, 2002 11:12 AM -0700 Greg Stein [EMAIL PROTECTED] wrote: I simply don't think that a filter should read/consume a request body. The handler is responsible for handling the request, which includes processing the body.

Well, PHP doesn't exactly do that. PHP's current strategy is to create an input filter that sets aside all input. This is triggered by the ap_discard_request_body() call in default_handler (as discard causes all data to be read). So, when data is actually pushed down into the output filter chain, PHP has a copy of the body in its private structure. And, if its script requires the body, it returns the ctx->post_data value in its callbacks.

Ohmigod. I *really* didn't need to hear that. So you're saying that if I do file upload to a PHP script, and upload a 10 megabyte file, then it is going to spool that whole mother into memory?

Oh oh... even better. Let's just say that the PHP script isn't even *thinking* about handling a request body. Maybe it is only set up for a GET request. Or maybe it *is* set up for POST, but only for FORM contents. But Mr Attacker comes along and throws 1 gigabyte into the request. What then? DoS? Swap hell on the server?

I think the biggest concern is when multiple modules want the input body. Right now, it's fairly vague what will happen (and I'm not even sure what the right answer is here). Forcing input filters and doing setasides (flat void* instead of bb's in PHP) seems a bit clunky. However, we also don't want to store the request body in memory. -- justin

Agreed on all parts. I think a filter *can* read the request body (i.e. the content generator loads a PHP script, PHP runs it (as the first filter), reads the body, and loads content from a database). But that implies that the request body should not have been thrown out in the default handler.
But it almost seems cleaner to say there is a series of stages which perform the request handling: process the body and generate the (initial) content. These stages could load a script from somewhere, run it, (repeat) and generate the content into the filter stack. Right now, we are confusing a *script* with *content*. Cheers, -g -- Greg Stein, http://www.lyra.org/
Re: PHP POST handling
On Tue, Oct 01, 2002 at 04:27:15PM -0400, Ryan Bloom wrote: On Tue, 1 Oct 2002, William A. Rowe, Jr. wrote: At 01:12 PM 10/1/2002, Greg Stein wrote: For PHP, we said make it a filter [so the source can come from anywhere]. I think we really should have said for GET requests, allow it to be processed by PHP. The POST, PROPFIND, COPY, etc should all be possible to handle by PHP, which means that PHP also needs a handler. Agreed, if you write a PHP script we better allow you to PROPFIND or COPY the puppy, in addition to POST.

These are two different statements, if I am reading both correctly. Please correct me if I am not. Will, you are saying that if we have a PHP script, then we need to be able to do all DAV operations on the script. Greg, you are saying that a PHP script needs to be able to satisfy a DAV request (meaning that the PHP code actually copies the resource, or generates the PROPFIND data). Assuming I am reading the two statements correctly, I agree with Will, but not with Greg.

Why couldn't mod_dav be implemented in PHP? I see no particular reason why not... Currently, PHP cannot because it is a filter, not a handler.

We have a switch in PHP now to handle mod_dav requests actually (under 1.3.x). There is no specific DAV support in there; it's just a switch that allows PHP to be a handler for things other than GET, HEAD and POST so people can implement whatever DAV stuff they want in userspace. -Rasmus
Re: PHP POST handling
On Tue, Oct 01, 2002 at 03:30:43PM -0700, Rasmus Lerdorf wrote: ... Why couldn't mod_dav be implemented in PHP? I see no particular reason why not... Currently, PHP cannot because it is a filter, not a handler. We have a switch in PHP now to handle mod_dav requests actually (under 1.3.x) There is no specific DAV support in there, it's just a switch that allows PHP to be a handler for things other than GET, HEAD and POST so people can implement whatever DAV stuff they want in userspace. Sweet! That is really nice... Also, it would be prudent to point out that we aren't *just* talking about WebDAV methods. It should be quite possible to experiment and try out new HTTP methods. HTTP is explicitly defined to allow arbitrary methods (especially for experimenting prior to an RFC, as new RFCs are released, vendor specific, or just plain ol' having fun). Cheers, -g -- Greg Stein, http://www.lyra.org/
Re: PHP POST handling
At 03:27 PM 10/1/2002, Ryan Bloom wrote: On Tue, 1 Oct 2002, William A. Rowe, Jr. wrote: At 01:12 PM 10/1/2002, Greg Stein wrote: For PHP, we said make it a filter [so the source can come from anywhere]. I think we really should have said for GET requests, allow it to be processed by PHP. The POST, PROPFIND, COPY, etc should all be possible to handle by PHP, which means that PHP also needs a handler. Agreed, if you write a PHP script we better allow you to PROPFIND or COPY the puppy, in addition to POST. These are two different statements, if I am reading both correctly. Please correct me if I am not. Will, you are saying that if we have a PHP script, then we need to be able to do all DAV operations on the script. Greg, you are saying that a PHP script needs to be able to satisfy a DAV request (meaning that the PHP code actually copies the resource, or generates the PROPFIND data). Both Greg and I are stating that PHP should be able to serve PROPFIND, COPY, GET, POST, DELETE, or FOOBAR requests. PHP scripts can be coerced into providing all sorts of useful behaviors, not the least of which is GET. Much like a CGI script might know how to handle DAV or other sorts of requests. Assuming I am reading the two statements correctly, I agree with Will, but not with Greg. There is a major difference between satisfying a COPY or PROPFIND request and generating a page that has accepted POST data. There is a trivial difference. Both take client bodies. I'm not even suggesting that COPY or PROPFIND or FOOBAR should be allowed by default, but the administrator should be able to override the defaults if they have a script to handle such methods. A filter will never be able to satisfy COPY or PROPFIND, because those are actions that should be done in the handler phase. However, having two ways to read the PHP script from disk (default_handler and php_handler), and run the page through the interpreter doesn't make sense. 
That is why PHP was re-written as a filter: to allow it to take advantage of ANY back-end data store that we have.

That's bogus. You REALLY can't argue that FOOBAR must be a handler but POST must be a filter, since both take client bodies and provide response bodies. Pick your argument and quit waffling on the fence.

Heck, PHP should also be able to handle a GET request. For example, it should be able to retrieve the content from a database, and then shove that into the filter stack. IOW, PHP is really a handler *and* a filter. It can handle the generation of content, but it can also process generated content when that content is a PHP script.

I think I am missing something here. PHP doesn't handle content generation. It never has. In Apache 1.3, PHP could read a script from the disk and interpret it. In Apache 2.0, PHP _should_ be able to read a script from a bucket and interpret it.

That script result wasn't generated? What am I missing here?

(The fact that it doesn't right now is not really germane to this discussion.)

Agreed; we are discussing how things ought to work, not how they have been kludged to work.

From my reading of the statement above, you want people to be able to write handlers in PHP, which would find another page or script in a database and send it down the filter stack. That can't be done right now; PHP can't write handlers that way, at least not that I am aware of. This, BTW, is why mod_perl has both concepts, handlers and filters. Handlers are used as content endpoints; they generate data. Filters are used to modify data that was already generated.

Please let me know if I have misunderstood anything in this mail. Everything I have said above is based on my reading of the message, and I tried to point out where I may not have understood what the original author was saying.
Greg and I are really in the same place, but arguing opposite solutions {we agree all these cases should be consistent; Greg argues that the answer is handler+filter, while I argue the answer remains filters.} Bill
Re: Speaking of pipes from cgis causing trouble...
At 03:27 PM 10/1/2002, Paul J. Reder wrote: Ryan Bloom wrote: On Tue, 1 Oct 2002, Paul J. Reder wrote:

I've been working on the caching code and ran across a core dump... A particular file contains an SSI call to a cgi. The cgi causes a pipe bucket to pass down the chain. cache_in_filter tries to save the bucket away and core dumps. Since a pipe bucket can be of any length, and could take any amount of time to complete, I would assume that the cache code should decide not to cache any response with a pipe bucket, correct?

Not necessarily. The cache code should stream the data to the cache, and allow the data to also stream to the core_output_filter.

Until it reaches the specified max cache size? So instead of trying to just copy and insert the pipe bucket, it should read from the pipe and add buckets to the cache (and forward copies on)? Seems like it would defeat some of the sendfile optimizations and such... Is it worth the code and lost optimizations to read/store/pass the piped data?

Not if you read off the headers, determine cacheability, and then send the remaining pipe bucket down and ignore it when we can't cache the result anyways. If the resource can be cached, the penalty for reading and then storing + sending is nothing when you weigh serving out of the cache the next time around. Bill
Re: PHP POST handling
At 05:19 PM 10/1/2002, Greg Stein wrote: On Tue, Oct 01, 2002 at 01:32:16PM -0500, William A. Rowe, Jr. wrote: At 01:12 PM 10/1/2002, Greg Stein wrote:

... One of my itches that I haven't had time yet to scratch is to implement the apreq filter to expose the POST (PROPFIND, COPY, etc.) data to one or more than one filter who -might- be interested in the client request body.

As long as it is understood that only *one* thing can consume the request body. Then the question arises: how do you arbitrate that? It would be nice to simply say the handler is the consumer, but that doesn't solve the case of a PHP script stored in an SVN repository. It is almost like we need some concept of stages leading up to request [body] processing and content generation.

Wrong. Multiple things can access the properties from the request. Consider some form variables. One filter might be transforming on variable X, while another performs some transformation on variable Z. And they use the same storage for all small bits of the POST.

In the case of upload requests, one consumer must 'sign up' to stream the file, and provide a method for retrieving the contents if anyone else cares {for the duration of the request.} In theory, that consumer is the script that wants to persist the POSTed file upload. If nobody wants to capture a huge POST body, we may end up with a tmpfile method, but if the file will be persisted, there is no reason to stow it twice.

Heck, PHP should also be able to handle a GET request. For example, it should be able to retrieve the content from a database, and then shove that into the filter stack. IOW, PHP is really a handler *and* a filter. It can handle the generation of content, but it can also process generated content when that content is a PHP script.

That sort of misplaced design ends up exposing two fault points, one in the PHP handler, and one in the PHP filter. Better one source of pain.

But they are two modes of operation.
In one, you generate the original content (e.g. a PROPFIND or GET response against a database); in the other, you're filtering content.

In both cases you are transforming a PHP script into the resulting output from processing the script. No difference, truly. That said, it is -still- a handler, since it just picks up the fd out of the sendfile bucket and processes that file. Until the zend mechanics are in place to slurp from a brigade and output the results, this never really belonged as a filter anyways.

The problem is that it isn't really filtering content. The filter thing was a hack to enable us to have PHP scripts in arbitrary locations (e.g. in a DAV repository). But it isn't *filtering*. It is executing the script and producing output. The notion of take this input, munge it in some way, and produce output is *very* shaky.

And that said, you can't break POST to the default handler; please revert that change.

The POST behavior before my fix was a regression against 1.3. Its existence was to support the PHP-processing-as-a-filter implementation. The impact of adding the POST feature to the default handler was never really detected until we uncovered it via a bug in mod_dav. Since this regression was found, I went in and fixed it. Unfortunately, that breaks PHP :-) But I think the PHP integration is broken and needs to be fixed.

While new PHP integration is being solved, we can *temporarily* revert the change so that PHP users aren't inconvenienced while the integration is worked on. But the default_handler fix does need to go in to prevent the regression. (I term it a regression because 1.3 didn't do it, and POSTing to something which isn't going to handle the posted content is not right; we should be generating an error that states the POST is not allowed for that resource; that is what 405 is all about.)

We should continue this discussion after the change is reverted, and let's find a better answer.
Going back to the apreq input filter, I can see where it would share with the core if anyone registered to review the client body. If someone does, the handler could be toggled to accept POST. If nobody registers, then we could decline POST. Bill
Re: PHP POST handling
--On Tuesday, October 1, 2002 3:26 PM -0700 Greg Stein [EMAIL PROTECTED] wrote:

So you're saying that if I do a file upload to a PHP script, and upload a 10 megabyte file, then it is going to spool that whole mother into memory?

Yup.

Oh oh... even better. Let's just say that the PHP script isn't even *thinking* about handling a request body. Maybe it is only set up for a GET request. Or maybe it *is* set up for POST, but only for FORM contents. But Mr Attacker comes along and throws 1 gigabyte into the request. What then? DoS? Swap hell on the server?

The PHP input filter will always read the body and allocate the space, irrespective of what the real script desires. In fact, looking at the code, I believe PHP will only free the memory if the script reads the body (do all scripts read the entire body?). So a GET with a body (perfectly valid) may introduce memory leakage. PHP uses malloc/realloc/free because it wants the body in one contiguous chunk; therefore, our pools don't help.

I think a filter *can* read the request body (i.e. the content generator loads a PHP script, PHP runs it (as the first filter), reads the body, and loads content from a database). But that implies that the request body should not have been thrown out in the default handler.

Correct. At one point, I submitted a patch to the PHP lists to do exactly that, but once we rearranged how we discard bodies, this method couldn't work.

The problem we had was when to 'discard' the body. We originally had it discarding at the end, but in order to properly handle 413s, we have to discard the body before generating the response. That's fairly new behavior on our part, but one that I think brings us in line with the desires of the RFC. Otherwise, we could have a 200 and then find out that it really should have been a 413 (because the body is too large). Therefore, we have to process the body before generating any content.
And, since we now allow chunked encoding almost everywhere, we do have to read the entire body to know if it exceeds our limit. 1.3 chickened out on this and forbade chunked encoding on request bodies.

But it almost seems cleaner to say there is a series of stages which perform the request handling: process the body and generate the (initial) content. These stages could load a script from somewhere, run it, (repeat) and generate the content into the filter stack. Right now, we are confusing a *script* with *content*.

I think the problem is that we aren't doing a good job of getting the script the content it (may) need. While it could be interesting to try to separate reading and writing in Apache, the PHP language certainly doesn't support that (as I believe you can write and then read the body). So I'm not sure that we can split it out into multiple phases in an effective manner. Reading and writing in PHP (or any CGI script) is just too intertwined to support this. I think we're sorta stuck, but I might be wrong. -- justin
Re: Speaking of pipes from cgis causing trouble...
On Tue, 1 Oct 2002, Paul J. Reder wrote: Are there any other bucket types to worry about (i.e. that can't just be copied)?

Currently no, but your code shouldn't have that kind of knowledge. There might be third-party buckets of this sort. Indeterminate length is indicated by bkt->length == -1. Just do an rv = apr_bucket_copy(bkt, &copy); and if rv comes back as APR_ENOTIMPL, you know you have to read from the bucket and try again. --Cliff
Re: Speaking of pipes from cgis causing trouble...
William A. Rowe, Jr. wrote: At 03:27 PM 10/1/2002, Paul J. Reder wrote: Ryan Bloom wrote: On Tue, 1 Oct 2002, Paul J. Reder wrote:

I've been working on the caching code and ran across a core dump... A particular file contains an SSI call to a cgi. The cgi causes a pipe bucket to pass down the chain. cache_in_filter tries to save the bucket away and core dumps. Since a pipe bucket can be of any length, and could take any amount of time to complete, I would assume that the cache code should decide not to cache any response with a pipe bucket, correct?

Not necessarily. The cache code should stream the data to the cache, and allow the data to also stream to the core_output_filter.

Until it reaches the specified max cache size? So instead of trying to just copy and insert the pipe bucket, it should read from the pipe and add buckets to the cache (and forward copies on)? Seems like it would defeat some of the sendfile optimizations and such... Is it worth the code and lost optimizations to read/store/pass the piped data?

Not if you read off the headers, determine cacheability, and then send the remaining pipe bucket down and ignore it when we can't cache the result anyways. If the resource can be cached, the penalty for reading and then storing + sending is nothing when you weigh serving out of the cache, next time around.

Well, considering that we wouldn't even be in the core-dumping code if it hadn't passed the cacheability checks, I think *this* part is safe. My concern was more along the lines of: how much are we worried that reading and storing pipe buckets will result in wasted cycles when the cache entry has to be tossed after exceeding the specified maximum size? Is the potential cache gain worth the price of reading, storing, and passing the pipe contents, given the potential for having to toss it for growing too big?
If the probability of tossing is low, or largely under user configuration control, then I'll adjust the cache code to be able to read and store pipe and socket bucket contents. Are there any other bucket types to worry about (i.e. that can't just be copied)? -- Paul J. Reder --- The strength of the Constitution lies entirely in the determination of each citizen to defend it. Only if every single citizen feels duty bound to do his share in this defense are the constitutional rights secure. -- Albert Einstein
Re: Speaking of pipes from cgis causing trouble...
On Tue, 1 Oct 2002, Cliff Woolley wrote: Indeterminate length is indicated by bkt->length == -1.

Oops, I meant to delete this sentence along with the rest of that thought. In this context, checking ->length for -1 is not useful, because there might be known-length buckets you can't copy. So ignore this sentence and just go with checking apr_bucket_copy() for APR_ENOTIMPL. :)
Re: PHP POST handling
Both Greg and I are stating that PHP should be able to serve PROPFIND, COPY, GET, POST, DELETE, or FOOBAR requests. PHP scripts can be coerced into providing all sorts of useful behaviors, not the least of which is GET. Much like a CGI script might know how to handle DAV or other sorts of requests.

I know I'm being off-topic here, but a CGI script cannot currently provide DAV service, because mod_cgi.c does not let the script handle OPTIONS requests. So even if a CGI script knew how to handle it, it's impossible to notify the client about that capability. I've looked through past discussions, and it seems there was no negative response to making this behavior configurable. There even was a patch (not the one I posted some time ago - it existed for several years), but for some reason it never got incorporated. -- Taisuke Yamada [EMAIL PROTECTED] Internet Initiative Japan Inc., Technical Planning Division
Authentication
Currently, authentication is broken with the standard Windows config file and current HEAD. Where is the documentation on the complete mess-up of the auth modules and how to get it working again? Thanks. -- Jerry Baker
Re: Authentication
Jerry Baker says: Yet, when I access that directory, I am just given an empty directory listing. No prompt for a username or pass. Nevermind. It's just something else that DAV broke. Turning off DAV fixed the problem. -- Jerry Baker
Re: Authentication
Jerry Baker says: Jerry Baker says: Yet, when I access that directory, I am just given an empty directory listing. No prompt for a username or pass. Nevermind. It's just something else that DAV broke. Turning off DAV fixed the problem.

Please accept my apologies for the spam. The problem is not with DAV, but with LimitExcept GET HEAD POST. When I remove the LimitExcept directive, basic authentication works again. Makes no difference whether Dav On or Dav Off. -- Jerry Baker
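Jerry's symptom is consistent with how method-scoped sections work: access-control directives placed inside `<LimitExcept GET HEAD POST>` apply only to methods *other than* GET, HEAD and POST, so ordinary GETs are never challenged. A hypothetical reconstruction (these are not Jerry's actual config lines, just an illustration of the behavior he describes):

```apache
# Hypothetical illustration -- not the actual config from the report.
<Directory "D:/Web Sites/www">
    AuthType Basic
    AuthName "Protected Area"
    AuthUserFile "D:/Web Sites/www/users.pwd"

    # Require inside <LimitExcept GET HEAD POST> applies ONLY to the
    # excluded methods (PUT, PROPFIND, ...), so plain GET requests
    # produce no authentication prompt at all:
    <LimitExcept GET HEAD POST>
        Require valid-user
    </LimitExcept>

    # Removing the <LimitExcept> wrapper (or placing Require outside
    # it) makes authentication apply to every method:
    # Require valid-user
</Directory>
```

This matches Jerry's observation that dropping the LimitExcept directive restores basic authentication, independent of the Dav setting.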
Re: Authentication
Jerry Baker says: Currently, authentication is broken with the standard Windows config file and current HEAD. Where is the documentation on the complete mess-up of the auth modules and how to get it working again?

Perhaps I should be more clear. I have a directory containing an .htaccess file. The config for this directory includes AllowOverride All. The contents of the .htaccess are:

  AuthUserFile D:/Web Sites/www/users.pwd
  AuthName Protected Area
  AuthType Basic
  Require valid-user

I have the following modules loaded in the httpd.conf:

  mod_authn_anon.so
  mod_authn_dbm.so
  mod_authn_default.so
  mod_authn_file.so
  mod_authz_dbm.so
  mod_authz_default.so
  mod_authz_groupfile.so
  mod_authz_host.so
  mod_authz_user.so
  mod_auth_basic.so
  mod_auth_digest.so

Yet, when I access that directory, I am just given an empty directory listing. No prompt for a username or pass. -- Jerry Baker