Re: do we still want sendfile enabled with our default conf files?
That's fine. Pay attention to what I suggested. Default to non-native sendfile, until we know that it works. If you have an OS that you know for a fact does sendfile correctly, then that would be a case where we know that it works. Instead of:

    #if !defined(LINUX) && !defined(WIN32) && !defined(AIX) && !defined(HPUX)
        sendfile()
    #else
        fake_sendfile()
    #endif

do:

    #ifdef SomeOSThatWorks
        sendfile(...)
    #else
        fake_sendfile()
    #endif

Since the beginning of APR, we thought that we knew how to get sendfile working on most of our supported OSes. That's why we included sendfile support. It turns out, though, that for many of those platforms, sendfile just doesn't work in some situations. When we have a platform without the bugs, then yes we should use sendfile, but for all other cases I care infinitely more about correctness than speed.

Ryan

On Fri, 18 Mar 2005 08:23:18 -0800, Justin Erenkrantz [EMAIL PROTECTED] wrote:

--On Friday, March 18, 2005 11:12 AM -0500 Ryan Bloom [EMAIL PROTECTED] wrote: funny, I took the list of exceptions to be so large and hard to maintain that it made more sense to go with Jeff's original idea of just disabling sendfile by default unless a user specifically decided to enable it. I just had to debug a problem for a friend with sendfile on Linux. I don't know what caused the problem, but disabling sendfile solved it immediately. Seems to me that until our sendfile support is better, we should err on the side of always sending the data correctly instead of absolutely as fast as possible.

I absolutely refuse to punish users who are using good OSes because some OSes are brain-dead. This is exactly the role that APR is meant to fill: if we know of conditions where it is unsafe to use sendfile, we won't use it unless explicitly told so by the user. The minimal check can be:

    if (flags & APR_SENDFILE_CHECK) {
    #if defined(LINUX) || defined(WIN32) || defined(AIX) || defined(HPUX)
        return APR_ENOTIMPL;
    #endif
    }

As people determine what conditions sendfile is safe (or causes problems), then we can add those.
Feel free to advocate Linux always returning APR_ENOTIMPL for sendfile - I don't care. However, blocking sendfile on non-buggy OSes is not a solution that I am willing to sign off on. -- justin -- Ryan Bloom [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED]
Re: do we still want sendfile enabled with our default conf files?
funny, I took the list of exceptions to be so large and hard to maintain that it made more sense to go with Jeff's original idea of just disabling sendfile by default unless a user specifically decided to enable it. I just had to debug a problem for a friend with sendfile on Linux. I don't know what caused the problem, but disabling sendfile solved it immediately. Seems to me that until our sendfile support is better, we should err on the side of always sending the data correctly instead of absolutely as fast as possible. I would much rather have APR default to not using the native sendfile, and only enable native sendfile when we have a lot of evidence that it does work correctly.

Ryan

On Fri, 18 Mar 2005 08:07:01 -0800, Justin Erenkrantz [EMAIL PROTECTED] wrote:

--On Friday, March 18, 2005 5:59 AM -0500 Jeff Trawick [EMAIL PROTECTED] wrote: ...snip, snip... AIX: Doesn't really fail in the normal sense of not putting the right data on the wire, but can trigger a kernel memory issue if some kernel tuning is incorrect. So always fail if APR_SENDFILE_AUTODETECT is on. (This kernel tuning is irrelevant unless sendfile or more obscure TCP usage is actually occurring, so the tuning issue has typically been there all along without hurting anything.)

Is the kernel tuning incorrect on AIX by default? Will this be fixed in some future releases? You could do lots of things to corrupt your kernel by tuning in other ways - so unless this is by default, I can't see why we should block this. ...snip, snip...

+1 to this list of exceptions and adding a new flag called APR_SENDFILE_CHECK (or APR_SENDFILE_AUTODETECT) to apr_socket_sendfile. -- justin -- Ryan Bloom [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED]
Re: mod_actions 1.32 patch never made it to 2.0
On Tue, 14 Dec 2004 17:13:23 +0100, André Malo [EMAIL PROTECTED] wrote: * Ryan Bloom [EMAIL PROTECTED] wrote: I have a couple of pretty big issues with this response. 1) You have a configuration in Apache 1.3 that doesn't work in Apache 2.0, but the config directives don't have to be changed at all. This is something that we worked really hard not to do in 2.0. There should never be a config in 1.3 that just gives the wrong results in 2.0 without any way for the user to understand why.

Ah? That's what one would call a bug. While breaking the behaviour in 1.3.9/1.3.10 nobody even thought about this issue. It's *still not documented that it was broken*. And a lot of users suffered from it, including me.

Suffered how? How exactly did a change that made the code accept more configs break your config? Also, it isn't that nobody thought of this when making the change to 1.3. Looking through the mailing list archives, I see Ben Laurie specifically had a problem with Location and mod_actions that Manoj fixed. I haven't found the whole thread about the problem, just the RM notes about it.

So, we have a bug that was fixed in 1.3 that was reintroduced in 2.0, and 2.0 is solving the problem the completely opposite way. Instead of defaulting to doing what 1.3 does, you default to the opposite position. That is what I am saying is so wrong here. Pick the same default as 1.3, and allow the option to override that default.

2) In choosing to default to the 404, you have broken anybody who wants to share 1.3 and 2.0 config snippets. [...] see above. Additionally 1.3 and 2.0 *are* different, so this is no argument at all.

I'm sorry, but no it is not. I know something about this, and we spent a lot of time and energy trying to ensure that a config that worked in 1.3 worked the same way in 2.0. We jumped through hoops to ensure that a handler configured as it would be in 1.3 would work even if the handler was moved to a filter.
There should not be any examples of a config directive that has the exact same syntax in 1.3 and 2.0 and different behavior. That is what we are talking about here. The directive has the _same_ syntax in the two trees, thus it should act the same way. 4) This isn't documented anywhere at all. If you are going to break old configs, then please add some docs to the docs-2.0 tree. Sure. It's not incorporated yet. It was voted recently and I'm going to commit it soon. And I'm asking that you reconsider doing that so that 2.0 and 1.3 can interoperate more cleanly. You aren't going to change how 1.3 handles this situation, and 2.1 hasn't been released, and the change hasn't been back-ported to 2.0. So, the 2.x tree should do what 1.3 does, and your flag should allow you to override. Ryan -- Ryan Bloom [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED]
mod_actions 1.32 patch never made it to 2.0
A co-worker of mine pointed out that the following works in Apache 1.3, but not 2.0 if the location /foo doesn't exist on the disk:

    AddHandler foobar /cgi-bin/printenv
    <Location /foo>
    SetHandler foobar
    </Location>

This patch ports this behavior forward into 2.0.

    Index: modules/mappers/mod_actions.c
    ===================================================================
    --- modules/mappers/mod_actions.c (revision 111773)
    +++ modules/mappers/mod_actions.c (working copy)
    @@ -163,11 +163,6 @@
         if ((t = apr_table_get(conf->action_types,
                                action ? action : ap_default_type(r)))) {
             script = t;
    -        if (r->finfo.filetype == 0) {
    -            ap_log_rerror(APLOG_MARK, APLOG_ERR, 0, r,
    -                          "File does not exist: %s", r->filename);
    -            return HTTP_NOT_FOUND;
    -        }
         }

         if (script == NULL)

I'm not subscribed to the list anymore, but please let me know when this is committed. Ryan -- Ryan Bloom [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED]
Re: using APR_STATUS_IS_SUCCESS
Basically, the macro is wrong and needs to be removed. The contract that _all_ APR APIs live up to is that on a successful result, they must return APR_SUCCESS. The reason we chose to use 0 as success is simple:

1) Most platforms can check for equality with 0 faster than they can check for any other integer equality.

2) It makes checking for success _really_ easy. Because all APIs use the same value for success, there is no guessing or research needed: if the result wasn't 0, then the function didn't succeed. It didn't necessarily fail, because there are status codes that aren't full success and aren't failures, but more research is needed.

3) It provides you an opportunity to have a lot of different values for errors and statuses without having to use a separate variable.

I assumed that the original addition of the macro was so that success was handled like any other result code, i.e. you always use a macro. If the reason was so that some functions could return non-zero success codes, then the macro definitely needs to go, because that is a really bad idea.

Ryan

On Wed, 28 Jul 2004 13:47:20 -0700, Geoffrey Young [EMAIL PROTECTED] wrote: cross-posted to [EMAIL PROTECTED]

Garrett Rooney wrote: Geoffrey Young wrote: hi all I was just in garrett's APR talk here at oscon and he was mentioning the APR_STATUS_IS_SUCCESS macro, which I found interesting since httpd only uses it in a few places, opting for a direct comparison to APR_SUCCESS instead. should we move to APR_STATUS_IS_SUCCESS in all places? can someone who groks the nuances of the macro add some insight here?

This is actually something I was wondering about as I wrote the presentation. Neither Apache nor Subversion uses APR_STATUS_IS_SUCCESS everywhere, but I wonder if we should, since if you look at the definition of the macro there are cases when it's more than just (s) == APR_SUCCESS.

just another note, I grep'd the code for rc == APR_SUCCESS and it looks like not even APR is using the macro everywhere...
--Geoff -- Ryan Bloom [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED]
Re: POST
On Mon, 30 Sep 2002, Greg Stein wrote: On Mon, Sep 30, 2002 at 06:53:09PM -0400, Ryan Bloom wrote: ... The problem is that the default_handler shouldn't be involved. Because mod_dav is now replacing the r->handler field for ALL requests, things

Woah! *NOT* all requests. Only those under the URL namespace which has been assigned to mod_dav. It does not just go out and blast r->handler willy-nilly. You have specifically enabled DAV for the URL space in question.

... If those two things are done, then we could have two handlers for the same resource. However, mod_dav shouldn't just be taking over all requests and assuming that they were meant for the core server. Doing so means that all generators are broken if DAV is loaded in the server.

It is not just taking over all requests. It is handling requests for the space that you've assigned to mod_dav. For this particular case, the bug is in default_handler(). Plain and simple. There is no reason for a POST request to return the file contents. Yes, the system should also call the thing as a CGI script, in this case, but that doesn't excuse the default handler.

No Greg, I'm sorry, but the bug has nothing to do with the default_handler. Plain and simple. If mod_dav wasn't in the server, the default_handler would never be called, because mod_cgi would have been given an opportunity to handle the request. The bug is in mod_dav, and no fix anyplace else will actually solve the problem.

Ryan ___ Ryan Bloom [EMAIL PROTECTED] 550 Jean St Oakland CA 94610 ---
Re: POST
On Tue, 1 Oct 2002, Greg Stein wrote: On Tue, Oct 01, 2002 at 11:03:16AM -0400, Ryan Bloom wrote: On Mon, 30 Sep 2002, Greg Stein wrote: ... For this particular case, the bug is in default_handler(). Plain and simple. There is no reason for a POST request to return the file contents. Yes, the system should also call the thing as a CGI script, in this case, but that doesn't excuse the default handler.

No Greg, I'm sorry, but the bug has nothing to do with the default_handler. Plain and simple. If mod_dav wasn't in the server, the default_handler would never be called, because mod_cgi would have been given an opportunity to handle the request. The bug is in mod_dav, and no fix anyplace else will actually solve the problem.

mod_dav causes the bug in default_handler to be exposed.

Nope. The default_handler relies on other handlers to run first, so that it only gets the requests it is supposed to get. Even if we change the default_handler to only serve GET requests, the bug still exists, because the bug is in mod_dav.

A secondary issue is how mod_dav alters the dav-handler in a way which disables POST to a CGI. You've fixed this latter issue (although it breaks the RFC 2518 requirement of checking locks before allowing a POST). I think we should figure out a different hook to use for that check. While the fixups hook isn't really intended for this, it would seem a good place to do the check. mod_dav already hooks it, so that should be fine.

The fixups hook is definitely intended for this. The handler phase is only intended for actually generating content. mod_dav isn't generating content for a POST request, thus it shouldn't be trying to handle it in the handler phase.

Ryan ___ Ryan Bloom [EMAIL PROTECTED] 550 Jean St Oakland CA 94610 ---
Re: Speaking of pipes from cgis causing trouble...
On Tue, 1 Oct 2002, Paul J. Reder wrote: I've been working on the caching code and ran across a core dump... A particular file contains an SSI call to a cgi. The cgi causes a pipe bucket to pass down the chain. cache_in_filter tries to save the bucket away and core dumps. Since a pipe bucket can be of any length, and could take any amount of time to complete, I would assume that the cache code should decide not to cache any response with a pipe bucket, correct? Not necessarily. The cache code should stream the data to the cache, and allow the data to also stream to the core_output_filter. Ryan ___ Ryan Bloom [EMAIL PROTECTED] 550 Jean St Oakland CA 94610 ---
Re: PHP POST handling
On Tue, 1 Oct 2002, William A. Rowe, Jr. wrote: At 01:12 PM 10/1/2002, Greg Stein wrote: For PHP, we said make it a filter [so the source can come from anywhere]. I think we really should have said for GET requests, allow it to be processed by PHP. The POST, PROPFIND, COPY, etc should all be possible to handle by PHP, which means that PHP also needs a handler. Agreed, if you write a PHP script we better allow you to PROPFIND or COPY the puppy, in addition to POST. These are two different statements, if I am reading both correctly. Please correct me if I am not. Will, you are saying that if we have a PHP script, then we need to be able to do all DAV operations on the script. Greg, you are saying that a PHP script needs to be able to satisfy a DAV request (meaning that the PHP code actually copies the resource, or generates the PROPFIND data). Assuming I am reading the two statements correctly, I agree with Will, but not with Greg. There is a major difference between satisfying a COPY or PROPFIND request and generating a page that has accepted POST data. A filter will never be able to satisfy COPY or PROPFIND, because those are actions that should be done in the handler phase. However, having two ways to read the PHP script from disk (default_handler and php_handler), and run the page through the interpreter doesn't make sense. That is why PHP was re-written as a filter, to allow it to take advantage of ANY back-end data store that we have. Heck, PHP should also be able to handle a GET request. For example, it should be able to retrieve the content from a database, and then shove that into the filter stack. IOW, PHP is really a handler *and* a filter. It can handle the generation of content, but it can also process generated content when that content is a PHP script. I think I am missing something here. PHP doesn't handle content generation. It never has. In Apache 1.3, PHP could read a script from the disk and interpret it. 
In Apache 2.0, PHP _should_ be able to read a script from a bucket and interpret it. (The fact that it doesn't right now, is not really germane to this discussion). From my reading of the statement above, you want people to be able to write handlers in PHP, which would find another page or script in a database and send it down the filter stack. That can't be done right now, PHP can't write handlers that way, at least not that I am aware of. This BTW, is why mod_perl has both concepts, handlers and filters. Handlers are used as content endpoints, they generate data. Filters are used to modify data that was already generated. Please let me know if I have misunderstood anything in this mail. Everything I have said above is based on my reading of the message, and I tried to point out where I may have not understood what the original author was saying. Ryan ___ Ryan Bloom [EMAIL PROTECTED] 550 Jean St Oakland CA 94610 ---
Re: POST
On Mon, 30 Sep 2002, Greg Stein wrote: On Sun, Sep 29, 2002 at 10:33:08PM -0700, Justin Erenkrantz wrote: On Mon, Sep 30, 2002 at 01:17:55AM -0400, Ryan Bloom wrote: Because 2.0.42 always displays script source for CGI scripts that use POST, I believe that we should put that notice on our main site, and stop suggesting 2.0.42 for production use.

I could not reproduce your problem in my tests. Do you have a clear reproduction case? (POSTing to a DAV resource for me yields a 404.) Is it somehow related to mounting a DAV repository at / and the server getting confused about the /cgi-bin/ dir? -- justin

snipped

In any case, Ryan suggested that mod_dav shouldn't set r->handler for methods that it isn't going to handle.

That is arguable. The resource *is* located within the URL space designated as being under mod_dav's control. The bug here is default_handler()'s serving up POST'd documents. A secondary issue is possibly refining mod_dav's handling. But that is a *very* deep problem. The specified resource might not be in the filesystem, so it would be *invisible* to Apache if mod_dav doesn't handle it. The grey area is when you're working with mod_dav_fs.

It may be time to bite the bullet and abstract out the FS from the core server. It would be a PITA, and it would mean a major re-factoring of DAV code, because mod_dav wouldn't have the repository code anymore. Instead, it would be a part of Apache itself. I believe that is the only way that you are actually going to solve this problem cleanly, though.

Ryan ___ Ryan Bloom [EMAIL PROTECTED] 550 Jean St Oakland CA 94610 ---
Re: POST
There is already a bug filed. It works if you don't have DAV enabled for that CGI location. I am hoping to look at that today. Ryan On Sun, 29 Sep 2002, Jerry Baker wrote: For some reason POST doesn't work with the latest HEAD. I can't say when it stopped working, but I can tell you that GET works. If I have a form and set the action to GET, the script runs flawlessly. If I set the same form's action to POST, Apache returns the source of the Perl script as if it were text/plain. I was surprised that Apache was giving out the source code to my scripts, and now I'm afraid to imagine for how long it has been doing this. -- ___ Ryan Bloom [EMAIL PROTECTED] 550 Jean St Oakland CA 94610 ---
Re: POST
On Sun, 29 Sep 2002, Jerry Baker wrote: Ryan Bloom says: There is already a bug filed. It works if you don't have DAV enabled for that CGI location. I am hoping to look at that today. Is there a way to at least stop Apache from giving the script source to the viewer without disabling CGI or DAV? I haven't found one yet, but I am just now looking at code. Ryan ___ Ryan Bloom [EMAIL PROTECTED] 550 Jean St Oakland CA 94610 ---
Re: POST
Copying security, because this is a big issue.

On Sun, 29 Sep 2002, Jerry Baker wrote: Ryan Bloom says: There is already a bug filed. It works if you don't have DAV enabled for that CGI location. I am hoping to look at that today.

Is there a way to at least stop Apache from giving the script source to the viewer without disabling CGI or DAV?

According to my reading of the code, no, it isn't possible. However, I have just committed a fix for this. I am hoping that one of the DAV experts will review my fix for correctness, but it is what we did until a few weeks ago.

However, from my reading of the dav_module, I have a major concern. The module is currently trying to handle every type of request. But that is wrong; it isn't how modules are supposed to behave. mod_dav should only be setting the handler field for requests that it knows it can serve correctly.

Because 2.0.42 always displays script source for CGI scripts that use POST, I believe that we should put that notice on our main site, and stop suggesting 2.0.42 for production use. mod_dav developers, please check my commit.

Ryan ___ Ryan Bloom [EMAIL PROTECTED] 550 Jean St Oakland CA 94610 ---
Re: How Do I Create A Per-Worker Pool?
On Thu, 26 Sep 2002, Charles Reitzel wrote: Thanks for telling me what I need to know: you can't get there from here. I don't want to start a philosophical debate, but it is a common idiom (yea, verily a pattern) in multi-threaded programming to avoid contention by duplicating singletons in each thread. This can be done either by using platform-specific thread-local storage or, portably, by passing the needed singleton (or a place to put it) into the worker thread. A mutex-free, per-thread pool would serve this purpose perfectly. Change the word thread to worker in the above, and you have made it portable across MPM models. A per-thread pool would assist a great deal in bringing many Apache 1.x modules over to 2.x. In essence, you move the globals into the per-thread pool, and the remaining application semantics are unchanged. You can't just add synchronization to most applications/modules. Usually, you need to redesign much of it to make it work without killing performance. Convinced?

Unfortunately, nope. You are confusing a programming model with pools. Pools are about variable lifetime. In the case of what you are trying to do, it is possible already. All you need to do is do the initialization in the child_init phase, and save each version of the data in a table that is accessible by the whole process.

The problem with a worker_rec is that it doesn't fit with the Apache model. Yes, it would be possible to add the pointer to the request_rec, but it isn't necessary. Everything in Apache is centered around the request, not the thread/process that is serving the data. By adding a worker_rec, we are actually encouraging programming practices that we don't want, like copying a bunch of global data into a worker_pool. We would much rather that modules were written with thread-safety in mind, so while it is possible to do, it isn't the right solution. Also, we did have a thread_pool at one time that worked for threaded MPMs.
The problem was that it just wasn't ever used. This makes sense, because pools are about lifetime, and in all of the MPMs (except perchild), threads have the same lifetime as processes by definition. I hope that makes sense. Ryan
Re: [PATCH] mod_mime and virtual requests
Hmm. We might not even need the fixup. Now that *every* handler is executed nowadays, we can probably just remove the concept of dav-handler and move those checks in type_checker/fixups into dav_handler() itself. Hmm. Perhaps. I'll take a look at it. -- justin Small problem with that. Not all handlers are run, only the first handler to do something is run. Also, we shouldn't be moving stuff into handlers, we should be moving stuff out of them. The idea of handlers is to actually generate the data. If mod_dav is doing checks, those should be done before the handler, so that the dav_handler can return ASAP. Ryan ___ Ryan Bloom [EMAIL PROTECTED] 550 Jean St Oakland CA 94610 ---
Re: El-Kabong -- HTML Parser
On Thu, 29 Aug 2002, Aaron Bannert wrote: On Thu, Aug 29, 2002 at 02:24:28PM -0400, Ryan Bloom wrote: +1 from me, I prefer APR actually. I am really uncomfortable with this going under the APR project. As things stand right now, it just doesn't fit with what we have stated our goals to be. If you want to change our stated goals, then go ahead and do it. Just committing code that doesn't fit with our goals isn't the way to do that.

(I will defer answering this for an apr-only discussion.)

I will make one exception to that statement. If it lands inside of APR-util, under the XML directory, and it is made to work with the XML parser, I can accept that landing spot. As it fits in closer with our goals (I think). Jim, I can't decide if this is what you meant or not.

I'm +1 on integrating it into our XML stuff. I consider it to be equivalent to apr-util, so either we put it inside apr-util, or we create a new APR subproject or sub-library for it.

I should also mention that I completely do not see this as equivalent to apr-util. I reserve the right to ask for this project to be removed from APR after the APR project has decided on its stated goals. That does not mean that it would be removed from the ASF (assuming my request is approved), only that it would need to find a new home within the ASF umbrella.

Ryan ___ Ryan Bloom [EMAIL PROTECTED] 550 Jean St Oakland CA 94610 ---
Re: What evil lurks beneath macros... (an AIX xlc optimizer bug)
On Fri, 16 Aug 2002, Cliff Woolley wrote: On Fri, 16 Aug 2002, Ian Holsman wrote: I'm just wondering if there is any kind of measurable performance benefit in keeping these as a macro vs putting them in a function (possibly inline if the compiler can support it).

I'm quite sure there is a performance benefit, though admittedly I don't have numbers on hand to support that at the moment. It's as simple as this: many of the operations, once you get rid of the typecasts and other noise, optimize down to just a few adds/subtracts and a few assignments. Four lines of code in most cases. Adding the overhead of a function call (actually probably two or three function calls) for each operation would be substantial. It just looks messy when you expand the macros, because it all gets squished on one line and because lots of noise gets shoved in. But you could say that about most macros when read in fully expanded form.

I agree 100%. These macros are relatively simple, just doing a bunch of casting and pointer manipulation. The cost of putting those in functions would most likely be very high. A better question, IMHO, is whether any of those macros can be made less complex.

Ryan ___ Ryan Bloom [EMAIL PROTECTED] 550 Jean St Oakland CA 94610 ---
Re: What evil lurks beneath macros... (an AIX xlc optimizer bug)
On Fri, 16 Aug 2002, Cliff Woolley wrote: On Fri, 16 Aug 2002, Ryan Bloom wrote: A better question IMHO, is whether any of those macros can be made less complex. It's a good question, but IMO the answer is no. The ring macros are very tight and easy to read... like I said, they're about four lines each. The brigade macros are, for the most part, one line wrappers around those. I don't see how they could get any less complicated than they are. The problem is that many of the macros are implemented in terms of other macros. For example, the INSERT_TAIL macro seems overly complex. It may not be, but it is not the easiest macro to chase down. Ryan ___ Ryan Bloom [EMAIL PROTECTED] 550 Jean St Oakland CA 94610 ---
Re: Re[2]: perchild on FreeBSD 5?
In view of this "It's impossible to tell which libs are thread safe", wouldn't it perhaps make sense to have a perchild version that uses processes instead of threads? I sure know that this would result in an insane amount of sleeping processes in a massive virtual hosting setup, but in my tests a very low-end 500MHz FreeBSD 4.6 box with 256MB RAM had no problem whatsoever with 2000 sleeping Apache processes belonging to 250 DISTINCT instances of 1.3.26, all based upon different config files and running on different ports. Swap wasn't used, and in idle state the 2000 processes consumed less than 1.5% CPU...

While it is technically possible to create a preforking server that operates like perchild, it isn't a simple thing to do. I would even go so far as to say that it isn't something you really want. If you want that behavior, the best solution is to just run multiple Apache servers. I looked at every MPM before creating perchild, and the only one that made a lot of sense for perchild was a static number of processes with a dynamic number of threads per process. If you have a dynamic number of processes, then the parent process must always determine how many child processes with each uid/gid exist currently. If you are going to do that, you are better off to just run two or three instances of prefork with an IP alias.

Ryan ___ Ryan Bloom [EMAIL PROTECTED] 550 Jean St Oakland CA 94610 ---
Re: [PATCH] UDP support
I don't believe that this should be incorporated into the core server. The core of Apache is an HTTP server, which means that it should only know how to listen on TCP sockets. The support is there, however, for other socket types to be added to the server.

For the UDP case, your protocol module should implement its own listen directive, to allow people to add UDP sockets for your protocol module. Then, the protocol module can set the accept_function pointer in the listen_rec, which in this case should be set to unixd_udp_connect. Now, when the server returns from apr_poll, it will automatically call the correct function for the current socket type.

For the actual network I/O, your protocol module will need to have its own filter to replace the core_output and core_input filters. This makes sense, because the standard filters are very much tuned towards what we expect in the HTTP world. If you replace those functions, you can tune them to give the best possible performance for your protocols.

The two places that still need to be fixed for this to work are adding sockets to the listen_rec list and the lingering_close logic. Both of these can be fixed by adding a hook to the server. Adding listeners to the listen_rec list _may_ be possible in the open_logs phase, which is where the core server does it, but I haven't tried to implement that yet. If it can be done in the open_logs phase, then you won't need another hook for it. The lingering close logic should just be replaced by a simple hook that the core server calls. The core can add lingering_close to that hook, and everything should continue to work.

Please let me know if any of this doesn't make sense. I like the idea of allowing the core to be used for UDP-based protocols, but I want to be sure that it is done the best way possible.

Ryan

On Wed, 14 Aug 2002, Tony Toyofuku wrote: Hi, Many months ago I submitted a patch for UDP support within Apache 2.0.
This is a resubmission of that patch, which allows UDP packets to work with Unix versions of Apache 2.0. Here's what I wrote then:

This patch adds UDP support to Unix versions of Apache 2.0. This patch is set to add UDP support to the prefork MPM; however, it should be trivial to add UDP support to other MPMs (changing the MPM_ACCEPT_FUNC definition in mpm.h, and adding the UDP_LISTEN_COMMANDS line in the MPM source code). Here's how it works:

1. At configuration time, there's a new directive, UDPListen. This is just like the normal Listen directive, but it sets up a UDP listener. It sits in the httpd.conf file, and looks like this (where 8021 is the port number): UDPListen 8021

2. Since there's no notion of accept() on a UDP socket, there's a new abstraction layer around the accept system call, named unixd_pop_socket. If the incoming request is UDP, the socket gets routed to a UDP version of the unixd_accept function. If the request is TCP, it gets sent to the existing unixd_accept function.

3. The network I/O is now done using recvfrom()/sendmsg(), since UDP must know the port/address of the client. Additionally, rather than using sendfile() for the UDP requests, emulate_sendfile is used instead. This is required since sendfile() won't work with connectionless sockets.

That's pretty much it. Although the UDP transport layer will work for HTTP, for me the value of UDP is to use Apache 2.0 with its new multiple protocol support. In this way, I can write an Apache protocol module to communicate with the legacy UDP systems that I've got to support.

udp.patch httpd.conf readme.txt udpclient.tar.gz

I've included a modified version of one of the APR UDP test apps, and its Makefile to exercise the server. Tony -- ___ Ryan Bloom [EMAIL PROTECTED] 550 Jean St Oakland CA 94610 ---
Re: perchild on FreeBSD 5?
Assuming that FreeBSD 5 solves the threading problems with FreeBSD, then yes, Perchild will work on that platform. The problem right now is that Perchild doesn't work at all. I am hoping to have time to work on Perchild on Thursday or Friday to finish the work on that MPM. Ryan On Tue, 13 Aug 2002, Gabriel Ambuehl wrote: Hello, I've more or less accepted that perchild on FreeBSD 4.X isn't going to happen (as sad as it is, I always considered it to be THE feature [1] in 2.0 that would warrant an upgrade for us) but what I'd like to know is if there is any chance to see perchild on FreeBSD 5 which gets wholly new threading and SMP libs? My company would happily pay someone a few thousand US$ to come up with a working perchild implementation on FreeBSD 5 and from what I've gathered on the FreeBSD mailing lists, there might be other parties that would contribute to the funding, too. We haven't got any reasonable in-house knowhow to contribute with code but we'd surely help beta testing. [1] Name based vhosts with PHP scripts running under the UID of the user. Great for all ISPs out there. Regards, Gabriel -- ___ Ryan Bloom [EMAIL PROTECTED] 550 Jean St Oakland CA 94610 ---
Re: Re[2]: perchild on FreeBSD 5?
On Tue, 13 Aug 2002, Gabriel Ambuehl wrote: -BEGIN PGP SIGNED MESSAGE- Hello Ryan, Tuesday, August 13, 2002, 6:10:21 PM, you wrote: Assuming that FreeBSD 5 solves the threading problems with FreeBSD, then That's what we've been told by the people working on 5.0 at least. I dunno whether they actually understand the issue though as they claim that FreeBSD 4.X supports threading, too. yes, Perchild will work on that platform. The problem right now is that Perchild doesn't work at all. I am hoping to have time to work on Perchild on Thursday or Friday to finish the work on that MPM. On another note: will it already work on NetBSD or OpenBSD once it will be back in service on Linux? I don't honestly know. The problem is that there are multiple ways to pass file descriptors between processes based on the Unix that you are using. Assuming FreeBSD, NetBSD, and OpenBSD all have good thread libraries and that they all support the fd passing mechanism that Linux uses, yes, everything will work. If they don't support the same fd passing logic, then it is a simple matter of implementing the other fd passing logic, which should be easy to do. Assuming I get Perchild working this week, I will try to get it working on a BSD machine that I have as well. If I can't make it work on my BSD machine (because it is FreeBSD), then I will most likely need an account on a Net or Open BSD box. I have recently lost access to most of my different types of machines, and while I'll get some access to new boxes starting next week, it won't be specifically for Apache work. Ryan ___ Ryan Bloom [EMAIL PROTECTED] 550 Jean St Oakland CA 94610 ---
Re: user field of conn_rec in change from Apache 1.3 to 2.0
The user field is in the request_rec now, because users are a request-based entity in HTTP. Ryan On Mon, 12 Aug 2002, Harrie Hazewinkel wrote: Hi, Can someone point me to the equivalent in the Apache 2.0.x versions of the user field that was in the Apache 1.3.x conn_rec? Thanks in advance, Harrie -- ___ Ryan Bloom [EMAIL PROTECTED] 550 Jean St Oakland CA 94610 ---
RE: filename-to-filename hook
The hook was not added for Apache 2.0. If the docs were changed from 1.3 to state that it was added, then it is a bug. Ryan -- Ryan Bloom [EMAIL PROTECTED] [EMAIL PROTECTED] -Original Message- From: Werner Schalk [mailto:[EMAIL PROTECTED]] Sent: Thursday, August 08, 2002 4:11 PM To: [EMAIL PROTECTED] Subject: filename-to-filename hook Hello, I have already asked the people on the users mailing list but nobody answered my question, so I am going to ask it here: The docs say that it is not necessary to add the [PT] option to a RewriteRule when passing it through another module, because apache2 now supports a filename-to-filename hook. How can I enable it? See http://httpd.apache.org/docs-2.0/mod/mod_rewrite.html#rewritemap, where it says: For Apache hackers If the current Apache API had a filename-to-filename hook additionally to the URI-to-filename hook then we wouldn't need this flag! But without such a hook this flag is the only solution. The Apache Group has discussed this problem and will add such a hook in Apache version 2.0. Bye and thanks, Werner.
RE: cvs commit: httpd-2.0/build httpd_roll_release
Even if we don't build it, this is extremely good practice that the folks rolling and releasing the tarball TAG the apr-iconv tree in sync with the current apr and apr-util trees.. I completely disagree. The problem is that the httpd_roll_release script is for rolling httpd releases, not APR releases. This change doesn't help people realize that they have to tag APR-iconv before they can release httpd. Amazing that we tag APR at all, no? That APR gets tagged with Apache, is a side-effect of not having released APR yet, nothing more. In time, we won't tag APR with an Apache tag. I really agree with Cliff, the change to pull apr-iconv out of APR is annoying, and it is going to cause problems. I understand that it is the best solution we have right now, it is still a bad solution. Of course it is bad. That's why I suggest a separate tarball for iconv. But it doesn't matter, we need trees in-sync, so apr-iconv must be tagged with apr's tags, from here forwards. If you want to do that as an rtag, that would be fine too. The other thing, is that httpd_roll_release doesn't do the tag, it simply checks out the code that has already been tagged. Ryan
RE: quick_handler hook is completely bogus.
I agree with everything that Brian said, and he put it far better than I could have. I would be MUCH happier with the quick_handler phase if it occurred after the AAA phases. Potentially, this could be achieved, and the performance could be improved by simply having the cache module register a map_to_storage function, which checked to see if the page was in the cache. If it is, then no other map_to_storage function is required (assuming enough state is saved in the cache to re-authenticate the request). If it isn't, then the server moves on to the next function. This has the advantage of being secure, and it also removes the biggest performance problem, the directory and location walks. Ryan -- Ryan Bloom [EMAIL PROTECTED] [EMAIL PROTECTED] -Original Message- From: Eidelman, Brian [mailto:[EMAIL PROTECTED]] Sent: Wednesday, July 31, 2002 8:01 AM To: '[EMAIL PROTECTED]' Subject: RE: quick_handler hook is completely bogus. Hi All, As an access control module developer I feel compelled to weigh in here. First, I would like to point out that you cannot determine whether a page was password protected based on the Apache configuration files including .htaccess. Modules such as the one we develop, which is widely deployed, plug into the access hook and use more robust notions of access policies. The only safe way you can determine what content is safe to cache is by having a robust set of configurable rules in your caching module (or proxy) that allows a site administrator to set rules that determine what is safe to cache. These rules can be as simple as cache these URLS or as complex as cache all URLS that end with .gif when no cookies are sent inbound or set outbound and when there is no cache control header telling us not to cache. 
The caching done in mod_cache and in reverse proxy caches is different from browser caches and forward proxy caches from an access control point of view, in that it is within the scope and responsibility of the enterprise that owns the content being cached. To put it another way, mod_cache and rproxy caches are responding to real requests made back to the server, whereas browser caches and forward proxy caches are not. I agree with Graham's notion that any hook designed primarily for caching should be moved to after the access control hooks. I think the direct performance impact will be minimal and easily made up for by the fact that you will be able to cache more content, since you will no longer be constrained by whether or not the content is protected (only by whether or not it is dynamic for each user/request). Additionally, I'd like to point out that total reliance on cache control headers is not a good option. Unfortunately, the reality is that most dynamic web applications are unaware of these headers and incapable of setting them or honoring them. At the same time, an access control module may not want to set a do not cache header on every request because it might be acceptable (as described above) for a browser cache to cache a piece of content but not for mod_cache or a reverse proxy cache. Enterprises must strike a delicate balance here. This balance is usually reached by using cache control headers to communicate with the end user and using other intelligent cooperation for the enterprise's internal caching mechanisms. Stepping back a little to discuss the hook in general rather than mod_cache, I agree strongly with Ryan that it is dangerous and to some extent conceptually flawed. A well designed API that uses callbacks should allow people to plug in to do the work that was designed to happen in that hook without having to worry too much about whether or not some other module that plugs in somewhere else is going to circumvent that hook.
Even if the quick_handler hook is being used appropriately and safely by one module, that does not justify the power given to it.
RE: SSLMutex directive, PR 8122
This PR points out a problem with the current SSLMutex handling in mod_ssl and provides a patch to resolve it. I'd rather see a different solution: 1) give SSLMutex the same syntax as AcceptMutex, except that SSLMutex has an additional choice: none. 2) add SSLMutexFile which is processed like the LockFile directive This is a migration problem for people using 2.0.x with mod_ssl, but for the long term I suspect it will be beneficial not to bundle the optional lock file name with the mutex mechanism. Comments? ++1. Ryan
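A sketch of what the proposal might look like in httpd.conf; the mechanism keywords are borrowed from AcceptMutex's existing vocabulary, and SSLMutexFile is the hypothetical new directive from the proposal, so treat the exact spellings as illustrative:

```
# Mechanism chosen like AcceptMutex, plus the extra choice "none":
SSLMutex      sysvsem        # or: fcntl | flock | posixsem | none

# Lock file named separately, processed like the LockFile directive:
SSLMutexFile  /var/run/ssl_mutex
```

Separating the two keeps the mutex mechanism orthogonal to the optional lock file path, which is the long-term benefit the proposal is after.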
quick_handler hook is completely bogus.
I realize that this is a strong statement, but I believe that I can back it up. My reasons for not liking this hook at all: 1) If I have a page that I have served and it gets put in the cache, then it will be served out of the quick_handler phase. However, if I then add or modify a .htaccess file to deny access to that page, then my changes won't be honored until the page expires from the cache. This is a security hole, because I don't know of any way to invalidate cached pages. (This one is from a conversation with wrowe). [ I guess it might be possible to clear the cache with a graceful restart. ] 2) If I have a page that uses access checking to ensure that only certain people can request the page, the cache_filter will put it in the quick handler. However, the page may not be allowed to people who will request it from the cache. I may be wrong about this one, but I see how the cache disallows pages that require authentication. I do not see how it can disallow caching of pages that require access_checking. 3) It isn't possible for a module author to circumvent the quick_handler phase. If I write a module that doesn't want to allow the quick_handler phase, for security reasons, I can't enforce it. While I understand that we are giving people a lot of rope and asking them to use it wisely, this phase gives too much rope, and invites people to hang themselves. I believe that this hook should be removed, and all content should be served out of the handler phase. If we are looking to remove some request phases, then we should make it possible to avoid individual phases when serving requests, not completely skip all of them. Ryan -- Ryan Bloom [EMAIL PROTECTED] [EMAIL PROTECTED]
RE: quick_handler hook is completely bogus.
1) If I have a page that I have served and it gets put in the cache, then it will be served out of the quick_handler phase. However, if I then add or modify a .htaccess file to deny access to that page, then my changes won't be honored until the page expires from the cache. This is a security hole, because I don't know of any way to invalidate cached pages. (This one is from a conversation with wrowe). [ I guess it might be possible to clear the cache with a graceful restart. ] How does this differ from the document being cached anywhere else? Such as in squid, or a proxy, or the client's cache? Depending upon the cache-control fields in the original response header, the cache engine may not even do a conditional GET. I can accept that argument. Although, from a user's point of view, I would consider them different specifically because in the cache module, everything required to serve the page is in the same place. 2) If I have a page that uses access checking to ensure that only certain people can request the page, the cache_filter will put it in the quick handler. I thought the caching modules didn't cache anything that required either access or auth/authz checking. FirstBill? I read through the code, and I see where the auth/authz decision is made. However, I can't see where the access control decision is made. If it is there, then I would be more than happy to remove this issue. 3) It isn't possible for a module author to circumvent the quick_handler phase. If I write a module that doesn't want to allow the quick_handler phase, for security reasons, I can't enforce it. How can a module author disallow *any* phase? That's a core function, not up to modules to decide.. In every other hook in the server, I can generally return some value that makes the phase stop processing, while allowing the request to finish. For many phases, that code is DONE, for others it is OK or DECLINED. With quick_handler, there is nothing I can do. Ryan
RE: quick_handler hook is completely bogus.
From: Bill Stoddard [mailto:[EMAIL PROTECTED]] I'm approaching this from a caching perspective, so when a module uses quick_handler for non-caching mechanisms, my comments do not apply but here's an option: FWIW, quick_handler was added to the server to enable more efficient content caching. Graham modified mod_proxy to use it (and it seems a reasonable use to me). This use alone justifies the existence of the quick_handler hook IMHO. I must be missing something: C:\opensource\httpd-2.0\modules\proxy> grep quick_handler * grep: CVS: Is a directory grep: Debug: Is a directory mod_proxy doesn't use the quick_handler. Ryan
RE: ldap
Mod_proxy wasn't added back to the server until the developers had proven that there was a development community around it, and most of the bugs had been fixed. The same must be true for ldap before it can be added to the base distribution. Also, as a counter-point to this: adding a module to the core discourages other people from implementing the same basic functionality. While that is usually a good thing, there are a LOT of versions of auth_ldap for 1.3, each with its own advantages and disadvantages. I know of at least one other auth_ldap for 2.0 (proprietary, by Covalent); would any of those modules have been created if auth_ldap was in the core? Now, I am trying to stay out of this discussion, because I have an obvious conflict of interests, but I did want to give people something to think about. Ryan -- Ryan Bloom [EMAIL PROTECTED] [EMAIL PROTECTED] -Original Message- From: Brad Nicholes [mailto:[EMAIL PROTECTED]] Sent: Monday, July 29, 2002 8:21 AM To: [EMAIL PROTECTED]; [EMAIL PROTECTED] Subject: Re: ldap I see the same thing happening to LDAP. For the most part it has been ignored. If it is considered to be unstable at this point, why not put it in /experimental with the other modules that are considered to be not yet ready for prime-time but still very useful? In this way, it will get the exposure that it needs, documentation can continue (BTW where did the docs go??) and when it is stable, it can be moved into the mainstream. Brad Brad Nicholes Senior Software Engineer Novell, Inc., the leading provider of Net business solutions http://www.novell.com [EMAIL PROTECTED] Monday, July 29, 2002 6:03:26 AM [EMAIL PROTECTED] wrote: People didn't want it to be a part of the core more because of module bloat. As Aaron says, there is no reason to add all these modules to the core only to have to release them on the same schedule - I like it as a sub project. When proxy was a subproject, it received no exposure, which was detrimental to the project as a whole.
Bugs were only fixed when proxy went mainstream again. Subprojects mean more work for end-users, and avoiding end-user work is better than avoiding developer work. Regards, Graham -- - [EMAIL PROTECTED] There's a moon over Bourbon Street tonight...
RE: ldap
From: Brad Nicholes [mailto:[EMAIL PROTECTED]] Being Novell which is the leading provider of directory services, we obviously have a great interest in LDAP. What if I were able to get the Novell LDAP developers to step up and support AUTH_LDAP? Would that be enough to get AUTH_LDAP put back into the mainstream? My own opinion, no it wouldn't. Generally, we require a person who has commit access to the tree to own the module. However, like I said, I have a huge conflict of interests, so I am not standing in the way of this. Ryan
RE: daedalus is running httpd-2.0.pre40
I would be in favor of never installing them on an upgrade. They are useless on a production machine that already has a configuration. They are meant as DEFAULT values to help people get up and running. And they also provide examples of how things are done. When those things change on an upgrade -- such as adding the IfModule wrappers -- it makes excellent sense for them to be provided. Okay, I keep asking for the REASON for installing these files. I have heard a couple: 1) People want to be able to get back to the standard config files after they have modified them. -- Cool. Back up the files before you modify them. Having us re-install the -std.conf files on upgrades is not a solution to this problem, because it means that you need to re-install to get the default files. 2) People want to be able to compare previous default config files with new ones. -- Fine, but the solution being proposed doesn't allow for this, because the default config file is overwritten by the upgrade step. The only solution to this is to add the version to the end of the file, which means that we just keep on adding new files. No thanks. 3) People want to see examples of how we think the config should look. -- Fine, but that is an example, and belongs as a part of the manual. Don't install the -std.conf files in conf/ at all if the goal is for them to be examples; put them in the manual directory. If there is another goal, then let's hear it, but right now, we are doing something to solve a problem with an external shell script. With all of the input for why this would be a good thing, the solution we are proposing doesn't ACTUALLY solve any of the problems being raised. Is there ANY other software package anywhere that actually re-installs the default configuration files on an upgrade? Stop handwaving. No-one is suggesting overwriting httpd.conf. Re-read that paragraph. In no place did I say we were.
I said we were re-installing the default config files on an upgrade, which means that after I get my conf/ directory down to httpd.conf and mime.types, after the upgrade, I now have those files plus httpd-std.conf. And httpd-std.conf DOESN'T have any bearing on my site, and it will always be installed. Sorry, that is bogus. Ryan
RE: daedalus is running httpd-2.0.pre40
From: gregames [mailto:gregames] On Behalf Of Greg Ames Ryan Bloom wrote: You are working around a problem in your script in the Apache upgrade step. no sir. It would have been a piece of cake to change my perl script to ensure that conf/ has enough stuff so the server can start. But I couldn't take the easy way out with a clear conscience because other people are likely to get burned in a similar manner. Come off it. That is insane. You have a broken script, fix it. Nobody else will get burned this way; as proof, this is how 1.3 has worked for years, and there were never any complaints. Anybody who has an existing configuration who wants to copy that config to a new server MUST copy all of the config files. Your script didn't. That is broken. Period, end of story. Now, on EVERY production machine I have, I will always have *-std.conf files. Another way to word this is that whenever you do a make install your standard configuration files are updated to match the current code. Why would I possibly want that? The -std files have no relationship to my server. None. I may have started with them years ago, but my current config doesn't look anything like them. Why is that a good thing? Most of the time I use daedalus's config files or something that closely resembles them. Those are complex, but don't include every possible combination of config directives (thank goodness). Every once in a while I need to drop back to a simpler config to do benchmarking, or something with new config If you are doing benchmarking on daedalus, then we have issues. Since we are talking about your automation scripts on daedalus that is the only thing you could mean, right? features to work on a problem for instance. It's nice to be able to grab one of the -std.conf files, make minimal/no changes, and have a working server with guaranteed current config directives in short order. Fine, so grab one, but DON'T put it in MY production conf/ directory please.
I don't care where you keep a copy of those files, but they don't belong in my production conf/ directory. If you read this whole thread, you'll see that I'm not the only one who likes having current -std.conf files available. They worked this way for ages. I don't recall seeing any complaints about this behavior until yesterday. I don't care how many people like it, it is wrong (besides, if that argument held any water at all, I could point out that in the last few days, there have been at least three users disagreeing with the current code). It has only worked this way since 2.0, and we didn't spend a whole lot of time on the install phases until recently. As for complaints: if there hadn't been at least one (mine!), I wouldn't have changed the install-conf target to match 1.3 weeks ago. Ryan
RE: daedalus is running httpd-2.0.pre40
From: [EMAIL PROTECTED] [mailto:trawick@rdu88-250- Nor do I want spurious -std files copying in there to confuse matters. Some of us want the -std files though. From time to time I (and That is the difference between developers and users. I want the -std files on my DEVELOPER machines, and I have tricks to get them. I don't want them anywhere near my PRODUCTION machines, because they get in the way. potentially many other people who start with Apache's basic httpd.conf) compare httpd.conf and httpd-std.conf to see if I want to merge in some of the Apache changes. I don't think anybody would be in favor of not providing httpd-std.conf; perhaps the issue is just where to put it when the user does an install (from binbuild or from make install it should work the same). I would be in favor of never installing them on an upgrade. They are useless on a production machine that already has a configuration. They are meant as DEFAULT values to help people get up and running. Is there ANY other software package anywhere that actually re-installs the default configuration files on an upgrade? Ryan
RE: daedalus is running httpd-2.0.pre40
From: gregames [mailto:gregames] On Behalf Of Greg Ames David Shane Holden wrote: I agree with Ryan wholeheartedly here. Here's an idea... If conf/ exists, copy httpd.conf, magic, and mime.types (These are basic files that all conf/ should have, right?). If conf/ does not exist, copy everything. uhhh, that clobbers httpd.conf, and they'd tar and feather us for sure. But if we leave out that piece, it's close to what's happening now: . make a conf/ directory if it doesn't already exist . if mime.types or magic don't already exist, copy them . always copy in *-std.conf (httpd-std.conf and ssl-std.conf for now) with appropriate substitutions for paths and modules etc learned during ./configure . look for files whose names match the *-std.conf files copied above with the -std piece of the name removed (i.e., httpd.conf and ssl.conf). If they don't exist, copy them from the corresponding *-std.conf files. Yep, that is the current algorithm, and it doesn't make any sense at all. You are working around a problem in your script in the Apache upgrade step. Once the first installation is done, the conf/ directory doesn't belong to us anymore. If the directory exists, then leave it alone. Now, on EVERY production machine I have, I will always have *-std.conf files. Why is that a good thing? For people with a working server, this change doesn't make sense. Ryan
RE: [PATCH] We have sysvsem on OpenBSD
From: Aaron Bannert [mailto:[EMAIL PROTECTED]] On Wed, Jul 17, 2002 at 10:31:44AM -0400, Jeff Trawick wrote: does everybody agree that this is preferable? Why isn't this being detected by autoconf? SysV semaphore support isn't perfect yet and has some problems. Because this is Apache 1.3 which doesn't use autoconf. :-) Ryan
RE: [PATCH] 64bit compiler issues
From: Greg Stein [mailto:[EMAIL PROTECTED]] On Mon, Jul 15, 2002 at 02:19:06PM +0200, Peter Poeml wrote: Hi, building httpd-2.0.39 on x86_64 (AMD's upcoming 64 bit architecture) there are a few compiler warnings, e.g. due to misfitting type casts. While some of the warnings can be ignored, I believe that the attached patch fixes the relevant issues. Nope, sorry. All of those values (namespace IDs) are integer sized. Making them a long is just too big. What is the specific error that you were seeing? I would guess that the problem can be fixed a bit differently, without changing the type of those indexes. We could force the size by using apr_int32_t. The problem that he is having is that pointers on _most_ 64-bit machines (Windows is a notable exception, there may be others) are 64 bits long. But we are using ints, which are 32 bits, for the pointers. We have the added problem that throughout the code, we pass in integers for void *'s. :-( Just specify the exact size you want, and this problem should go away. Ryan
RE: [PATCH] 64bit compiler issues
From: William A. Rowe, Jr. [mailto:[EMAIL PROTECTED]] At 07:23 PM 7/15/2002, Ryan Bloom wrote: We could force the size, by using apr_int32_t. The problem that he is having, is that pointers on _most_ 64-bit machines (Windows is a notable exception, there may be others), are 64-bits long. But we are using int's, which are 32-bits for the pointers. We have the added problem that throughout the code, we pass in integers for void *'s. :-( Transposed that statement ;-/ Pointers on Win64 are 64 bits long, just like you expected. Both int and long remain 32 bits long, unlike what you might expect (and certainly different from Unix.) Ah. Okay, thanks for correcting me. When we mean a void*, we need to spell out void*. If we need to pass it through an integer, prove it, and we will consider an apr_intptr_t type, that could be nothing more complicated than a union of the appropriate int type and void* for a given platform. Unfortunately, it isn't something we need to do, just something we have done throughout the code for convenience.:-( Ryan
RE: cvs commit: httpd-2.0 acinclude.m4 CHANGES
From: Aaron Bannert [mailto:[EMAIL PROTECTED]] Look for OpenSSL libraries in /usr/lib64. ... for p in $ap_ssltk_base/lib /usr/local/openssl/lib \ - /usr/local/ssl/lib /usr/local/lib /usr/lib /lib; do + /usr/local/ssl/lib /usr/local/lib /usr/lib /lib /usr/lib64; do If we just told people to add the right include and lib paths to their CPPFLAGS and LDFLAGS variables, then we wouldn't have to add any special case directories to our autoconf routines... *sigh* That would only work if we looked in those variables for the files. But, we don't, so we have to special case this stuff. We should re-write big portions of the detection logic so that we don't have ANY hard-coded paths, but I am not volunteering for that. Ryan
RE: content-length filter and r-bytes_sent
From: William A. Rowe, Jr. [mailto:[EMAIL PROTECTED]] At 02:51 AM 7/14/2002, you wrote: Currently, the content-length filter attempts to compute the length of the entire response before passing any data on to the next filter, and it sets request_rec->bytes_sent to the computed content-length. * r->bytes_sent is used by mod_log_config as a count of bytes sent to the client (minus response header). But the value that's computed in the C-L filter isn't necessarily equal to the number of bytes sent, as there may be additional filters between the C-L filter and the client that affect the output (the chunking filter, for example, or mod_deflate). AFAICT, we do _not_ care about bytes manipulated after body content processing for r->bytes_sent. If you want to create a new r->bytes_transmitted field tracking the actual bytes sent over the wire, that's fine. When folks check their logs, they expect to see the number of bytes of content in that count, which provides some level of confirmation that the response was handled properly (if my index.html file is 1474 bytes, I'm checking that all 1474 bytes are sent.) r->bytes_transmitted would be very interesting to me [looking at SSL connection traffic, for example.] But it shouldn't replace r->bytes_sent. In 1.3, r->bytes_sent was the number of bytes actually sent to the client. That is what it should continue to be in 2.0. Ryan
Re: Faster time_now() [Was: Why not POSIX time_t?]
On Mon, 15 Jul 2002, William A. Rowe, Jr. wrote: At 10:03 PM 7/14/2002, Ryan Bloom wrote: BTW, this whole conversation started because we wanted to speed up Apache. Has anybody considered taking a completely different tack to solve this problem? I know there is a patent on this, but I am willing to ignore it, and I am pretty sure that we can get the patent owner to let us use it. (I won't say who owns the patent, but if the owner wants to stand up, it will be obvious why I say this). We could create a separate thread to keep track of the current time every second. That way, the computation is completely removed from request processing; it is just a memory access. On platforms without threads, we just go back to getting the time during request processing. That will hurt performance for those platforms, but I am not too concerned about that. Such code, of course, certainly doesn't belong in apr. That's a higher level construct that would fit well in http, or any other app that needs such performance tweaks. The app would have to decide how to live if it doesn't have threads available. And there is the matter of IP :-) Of course this doesn't belong in APR, but the only reason the whole apr_time_t discussion came up was to fix a performance problem in Apache. If that performance problem hadn't come up, would we have even looked at changing the current implementation? Ryan ___ Ryan Bloom [EMAIL PROTECTED] 550 Jean St Oakland CA 94610 ---
RE: cvs commit: httpd-2.0/server/mpm/beos beos.c
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] Modified:server/mpm/beos beos.c Log: Adjust the sizes of the pollsets we create/use so that we work again. With the poll change we seem to have improved performance. :) This change scares me a bit. The code is supposed to take care of adding the 1 for you. Ryan
apr_poll() done.
Thanks to Will doing the work for Windows, the apr_poll() move is done. Most of the current platforms are actually using a select() based implementation, so instead of duplicating logic again, every platform except for OS/2 is using the Unix poll.c file. The only implementation that is likely to work currently is the Unix implementation. The test suite does have a testpoll.c, which can tell you if the implementation is working, although it is hard to determine if the tests were successful. The important lines are the ones that give the socket status. There should be five of these lines, and they should look like:

    Socket 0    Socket 1    Socket 2
    No wait     No wait     No wait
    No wait     No wait     POLLIN!
    No wait     POLLIN!     No wait
    No wait     POLLIN!     POLLIN!
    POLLIN!     No wait     POLLIN!

The other test that you should run is testpipe. Remember that apr_file_read and apr_file_write use apr_wait_for_io_or_timeout, and that now uses apr_poll(). This means that if the apr_poll() implementation doesn't work on your platform, CGI scripts and any other pipe-based request will fail. Ryan -- Ryan Bloom [EMAIL PROTECTED] 645 Howard St. [EMAIL PROTECTED] San Francisco, CA
Tests failing????
I know that there has been some talk of doing a release next week. I just ran my test suite, and we seem to be failing some mod_access checks. I think it has to do with the table re-write, but I haven't looked at it in enough detail to be 100% sure. Is anybody else seeing this problem? Ryan -- Ryan Bloom [EMAIL PROTECTED] 645 Howard St. [EMAIL PROTECTED] San Francisco, CA
RE: Port 80 vs 8080 when not SU.
From: Lars Eilebrecht [mailto:[EMAIL PROTECTED]] According to Ravindra Jaju: How about an extra echo: if [ x`$aux/getuid.sh` != x0 -a x$port = x ]; then conf_port=8080 echo "Non-root process. Server will run on port $conf_port" fi +1 The problem with this is that it gets lost in the noise of autoconf. Whenever we try to put informative messages like this in the configure output, nobody seems to see it. Ryan
RE: Auth checker - long term goal..
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Wed, 10 Jul 2002, Ryan Bloom wrote: user foo checks. 'require group' can stay in mod_auth or go into a mod_auth_group. Didn't we decide to take this approach like a year ago? Hmm - been asleep as usual - if so - I'd love to make that split right away ! I feel a patch coming... I am pretty sure that splitting auth from authz is even listed in the STATUS file. Ryan
RE: Auth checker - long term goal..
I still believe that everything that is currently in ROADMAP can and should be implemented in 2.0. Ryan -- Ryan Bloom [EMAIL PROTECTED] 645 Howard St. [EMAIL PROTECTED] San Francisco, CA -Original Message- From: William A. Rowe, Jr. [mailto:[EMAIL PROTECTED]] Sent: Wednesday, July 10, 2002 10:37 AM To: [EMAIL PROTECTED] Cc: [EMAIL PROTECTED] Subject: Re: Auth checker - long term goal..

At 12:07 PM 7/10/2002, Aaron Bannert wrote: On Wed, Jul 10, 2002 at 09:39:29AM -0700, Ryan Bloom wrote: I'm sorry, but that is completely bogus. If the API needs to change to make things better, then change the API. Stop trying to open a new dev branch when the current one is still moving forward quickly. We have this discussion every few weeks now, and every few weeks the 2.1 repo gets shot down, because these changes belong in 2.0. I don't recall any strong opinions on this other than from you and OtherBill. My feeling is somewhere between. We shouldn't rush off and branch 2.1 if we don't have any specific goals to solve, nor should we be forcing major changes upon our 2.0 users. The point of inflection comes when someone produces a patch for 2.0 that we aren't quite ready to swallow. As soon as that happens we have a perfect excuse for a branch.

The list is in ROADMAP. Every item there was effectively vetoed for the current development tree as too radical an overhaul. Each was pointed to the next version, we are {too close to|already for|already have the ga} release. Improve the ROADMAP. Spell out what 2.1/3.0 will offer. Things like needing to track r->openfile instead of r->filename, needing to follow a new convention to write auth modules {splitting authn/authz into smaller useful chunks, but with no back-compat}, proving pushback as a more effective authoring and performance filtering model (that accommodates both input and output filters in the same schema), async cross-threaded requests, and so forth. So -1 for 2.1 until we have such a patch.
I agree we aren't ready for 2.1 until 2.0 is stable and relatively bug free. I thought someone a year and a half ago actually threw one out there for some of the auth, but I too want the group to stay focused on making 2.0 a serious threat to 1.3 and other servers. Without breaking existing 3rd party modules beyond rebuilding, and occasional API changes that are absolutely required. API changes that break 3rd party 2.0 modules, just because it's better|cooler|faster, are bogus now that we are GA. Bill
RE: Port 80 vs 8080 when not SU.
From: Justin Erenkrantz [mailto:[EMAIL PROTECTED]] On Wed, Jul 10, 2002 at 03:12:07PM -0400, Jim Jagielski wrote: Have there been any complaints about how 1.3 has been doing it for ages? A 'make install; foo/bin/apachectl start' no matter who does the building has always resulted in at least a somewhat functional server. I don't see the reason for stopping a traditional behavior (and a possible expectation from the community) without more compelling reasons. On the contrary, I don't believe we have an obligation to follow 1.3 here. I think what we are doing for 2.0 makes the most sense. The default is the default. Please don't muddle things based on who is compiling it. -- Justin While I agree that changing the port based on the user who is compiling the server is a bit backwards, the argument about how things are done on Nagoya is handwaving IMNSHO. We have fixed our installation step to preserve existing config files, so if you compile as a non-root user, and install over the top of an existing installation, your port won't change. This only has to do with how the server is configured the FIRST time the server is installed. Ryan
RE: Apache 2 instruction count profile (head as of ~15:00 EDT July 10)
From: Bill Stoddard [mailto:[EMAIL PROTECTED]] This is with Ryan's poll patch and some of my patches to mod_cache and mod_mem_cache (which I will publish later on). Unfortunately the results are difficult to compare with earlier results because my test tree was just too polluted with patches for Ryan's patch to cleanly apply. Still, the overall direction is good (7.5% reduction in instruction count). The difference between Jeff's wait_for_timeout and Ryan's is noise in this comparison. However, I suspect the apr_poll() is better in general and is contributing to the improvement. Based on these numbers, I would like to commit the new apr_poll() today. This is likely to break everything but Unix for a few days however. Once the code is committed, there are some obvious performance improvements that can still be made. The first is to use a static array for small poll sets, or use alloca on systems that have it. The second is to go through Apache and remove the functions that used to set up poll sets. That will actually have an impact on all of the Unix MPMs, because we will finally be able to stop re-inserting the same socket into the pollset for every request. Does anybody have a problem with this plan? I would like to commit late this evening, which will give me enough time to fix Windows tonight (hopefully). Ryan
RE: cvs commit: httpd-2.0/server protocol.c
fix folding when the continuation character is a blank Reported by: one of Jeff T's regression test cases Can this be added to the standard test case? Ryan
RE: conserving file descriptors vs. ap_save_brigade()
From: Justin Erenkrantz [mailto:[EMAIL PROTECTED]] On Sat, Jul 06, 2002 at 01:03:42AM -0700, Brian Pane wrote: As it's currently implemented, the C-L filter is trying to compute the C-L on everything by default. It only gives up in a few cases: Of course, in the common case of a static file with no filters, we already know the content-length (default handler sets it). IIRC, I've brought up skipping the C-L filter when we already know the C-L (as defined by r->headers_out), but that was not met with approval. Who didn't approve that? I was under the impression that we did skip the C-L filter if we already had a C-L, and it was the filter's responsibility to remove the C-L if it was changing it. Ryan
RE: PATH_INFO in A2?
From: Rasmus Lerdorf [mailto:[EMAIL PROTECTED]] On Sat, 6 Jul 2002, Dale Ghent wrote: On Sat, 6 Jul 2002, Rasmus Lerdorf wrote: | 2.0.40-dev built on June 23rd Make sure you have AcceptPathInfo On Argh! Why the heck is that off by default? It's on by default for dynamic pages, but there is no way that Apache can tell that a page is going to be served by PHP, so it is off for what Apache thinks are static pages. Ryan
RE: PATH_INFO in A2?
From: Rasmus Lerdorf [mailto:[EMAIL PROTECTED]] Make sure you have AcceptPathInfo On Argh! Why the heck is that off by default? It's on by default for dynamic pages, but there is no way that Apache can tell that a page is going to be served by PHP, so it is off for what Apache thinks are static pages. What is a dynamic page if not a PHP page? Like I said, Apache doesn't know if a file on disk is meant for PHP or not. The best way to fix this would be for mod_php to set the value if the filter is added for the request. I agree, it would be cool if Apache could set this correctly based on the filters that have been added for the request. Ryan
RE: conserving file descriptors vs. ap_save_brigade()
From: Justin Erenkrantz [mailto:[EMAIL PROTECTED]] On Sat, Jul 06, 2002 at 08:25:18AM -0700, Ryan Bloom wrote: From: Justin Erenkrantz [mailto:[EMAIL PROTECTED]] Of course, in the common case of a static file with no filters, we already know the content-length (default handler sets it). IIRC, I've brought up skipping the C-L filter when we already know the C-L (as defined by r->headers_out), but that was not met with approval. Who didn't approve that? I was under the impression that we did skip the C-L filter if we already had a C-L, and it was the filter's responsibility to remove the C-L if it was changing it. [EMAIL PROTECTED] I don't know who might have said that. =) -- Justin Pay attention to what that message says, please. It says "The filter should always be in the stack, and it should always collect information." It doesn't say "The filter should always be touching the C-L for the response." We use the data collected by the C-L filter in other places, so we should always try to compute the C-L in the filter. However, we should NEVER buffer unless we absolutely have to. Ryan
RE: PATH_INFO in A2?
From: Justin Erenkrantz [mailto:[EMAIL PROTECTED]] On Sat, Jul 06, 2002 at 03:11:20PM -0400, [EMAIL PROTECTED] wrote: We just added a new function for all input filters to allow this to be done (Justin referenced it in his reply). However, that function doesn't solve the problem, because there should be an ap_filter_is_dynamic(r) that hides the implementation details for all filters. I don't believe that mod_include would want AcceptPathInfo on. Only PHP would. So, I don't know what ap_filter_is_dynamic() would buy here (other than setting r->no_local_copy to 1). -- Justin ap_filter_is_dynamic wouldn't replace the init function, it would be a simple function that the init functions could call to easily set up the filters. Ryan
RE: conserving file descriptors vs. ap_save_brigade()
How big a problem is this really? Most of the time, the content-length filter isn't supposed to actually buffer the brigade. It should only be doing the buffering if we are doing a keepalive request and we can't do chunking. My own opinion is that if we are in that situation, we are potentially best served by just turning off keepalives. Ryan -- Ryan Bloom [EMAIL PROTECTED] 645 Howard St. [EMAIL PROTECTED] San Francisco, CA -Original Message- From: Brian Pane [mailto:[EMAIL PROTECTED]] Sent: Thursday, July 04, 2002 11:44 AM To: [EMAIL PROTECTED] Subject: conserving file descriptors vs. ap_save_brigade() I'm working on a fix to keep file buckets from being mmaped when they're set aside. (The motivation for this is to eliminate the bogus mmap+memcpy+munmap that happens when a client requests a file smaller than 8KB over a keep-alive connection.) The biggest problem I've found is that the scalability of the content-length filter depends on a side-effect of the current mmap-on-file-setaside semantics. Consider the case of an SSI file that includes 10 non-SSI html files of 500 bytes each. As content streams through the output filters, the content-length filter sets aside each bucket until it sees EOS. Currently, the setaside of each of the 10 file buckets turns the file bucket into an mmap bucket. The good news is that this keeps us from running out of file descriptors if a threaded MPM is handling a hundred requests like this at once. The bad news is that the mmap causes a performance degradation. The solutions I've thought of so far are: 1. Change the content-length filter to send its saved brigade on to the next filter if it contains more than one file bucket. (This basically would match the semantics of the core output filter, which starts writing to the socket if it finds any files at all in the brigade.) Pros: We can eliminate the mmap for file bucket setaside and not have to worry about running out of file descriptors. 
Cons: We lose the ability to compute the content length in this example. (I'm so accustomed to web servers not being able to report a content-length on SSI requests that this wouldn't bother me, though.) 2. Create two variants of ap_save_brigade(): one that does a setaside of the buckets for use in core_output_filter(), and another that does a read of the buckets (to force the mmap of file buckets) for use in ap_content_length_filter(). Pros: Allows us to eliminate the mmap on small keepalive requests, and doesn't limit our ability to report a content-length on SSI requests. Cons: We'll still have a performance problem for a lot of SSI requests due to the mmap in the content-length filter. I like option 1 the best, unless anyone can think of a third option that would work better. --Brian
RE: conserving file descriptors vs. ap_save_brigade()
From: Brian Pane [mailto:[EMAIL PROTECTED]] On Wed, 2002-07-03 at 15:26, Ryan Bloom wrote: How big a problem is this really? Most of the time, the content-length filter isn't supposed to actually buffer the brigade. It should only be doing the buffering if we are doing a keepalive request and we can't do chunking. I'm seeing the buffering in non-keepalive tests, though. Then we have a bug. The C-L filter should only be trying to compute the C-L if we MUST have one on the response, and we don't have to have one in the non-keepalive case. Ryan
RE: [Patch] htpasswd doesn't add a newline to the end of an entry
Committed, thanks. Ryan -- Ryan Bloom [EMAIL PROTECTED] 645 Howard St. [EMAIL PROTECTED] San Francisco, CA -Original Message- From: Thom May [mailto:[EMAIL PROTECTED]] Sent: Tuesday, July 02, 2002 10:49 AM To: HTTPD Dev List Subject: [Patch] htpasswd doesn't add a newline to the end of an entry

Uh, as the title really. I guess this is a bug - it certainly isn't the old behaviour, as far as I can tell. Cheers, -Thom -- Thom May - [EMAIL PROTECTED] aj *sigh* you'd think a distribution composed of 6000 packages distributed across 13 different architectures (in various stages between pre-alpha and release quality), maintained by 700 amateurs with often conflicting goals who're globally distributed and have rarely met each other -- you'd think a distribution like that would be simpler...

Index: htpasswd.c
===
RCS file: /home/cvspublic/httpd-2.0/support/htpasswd.c,v
retrieving revision 1.49
diff -u -u -r1.49 htpasswd.c
--- htpasswd.c 19 Jun 2002 17:31:19 - 1.49
+++ htpasswd.c 2 Jul 2002 17:51:14 -
@@ -236,6 +236,7 @@
     strcpy(record, user);
     strcat(record, ":");
     strcat(record, cpw);
+    strcat(record, "\n");
     return 0;
 }
RE: How to achieve zero-copy when reading request headers?
From: Justin Erenkrantz [mailto:[EMAIL PROTECTED]] On Fri, Jun 28, 2002 at 07:01:34PM -0700, Ryan Bloom wrote: Have you looked at which of these are causing the most performance problems? It seems to me that the easiest thing to do, would be to use a persistent brigade, which removes two steps from this. I'd really prefer that we decouple this persistent brigade from the request_rec. We used to have it coupled (req_cfg->bb) and it made the code too confusing as other places started to rely on this brigade. So, I'd like to not go back down that road and figure out a better scope for such a persistent brigade (a filter context seems about right for the scope). No. Please use the core_request_config->bb variable for this. That structure is protected by CORE_PRIVATE. The vector in the request_rec was specifically designed for this kind of thing, so not using it because people don't understand the API is completely bogus. Document the API better if that is the problem. Until we know where the performance problems are in the current model I believe that it makes little sense to redesign the whole model. I think you can stream-line this by just re-using the same brigade for every header and by moving the copy down to the HTTP_IN filter. HTTP_IN is designed to read the entity - not the headers. HTTP_IN is only active after the request has been set up and the headers have been read. At this stage, HTTP_IN isn't active, so I'm slightly confused what you'd move to HTTP_IN. Yes, you would need to move where HTTP_IN is located in order for the design to work. As far as connection level filters having access to the request, you are right that they do. However, I don't believe that the connection filters should be modifying its request_rec since it may change on each invocation of the filter and something strikes me as wrong if a connection filter plays with the request - to me, that means it is a request filter. Of course they shouldn't be modifying the request_rec directly.
I read all of the messages as though you were still going to pass the data up to ap_getline() to do anything with the data, not modify the request_rec inside the filter. -1 on using the filter like that, because it removes any chance for any other input filter to re-write the headers as they are read. Ryan
RE: How to achieve zero-copy when reading request headers?
From: Brian Pane [mailto:[EMAIL PROTECTED]] On Sun, 2002-06-30 at 20:57, Justin Erenkrantz wrote: On Fri, Jun 28, 2002 at 07:01:34PM -0700, Ryan Bloom wrote: Have you looked at which of these are causing the most performance problems? It seems to me that the easiest thing to do, would be to use a persistent brigade, which removes two steps from this. I'd really prefer that we decouple this persistent brigade from the request_rec. We used to have it coupled (req_cfg->bb) and it made the code too confusing as other places started to rely on this brigade. So, I'd like to not go back down that road and figure out a better scope for such a persistent brigade (a filter context seems about right for the scope). The brigade can't be in a filter context, because then it wouldn't be accessible from ap_rgetline_core(). The request_rec is the only place where we could easily put this brigade. We just need to name it something like temp_working_brigade and document the fact that any function that puts things in that brigade must clear them out before it exits. First, please don't put it in the request_rec, it should be placed inside of core_request_config. In fact, if you look at that structure, it already has a brigade specifically for this function. Second, please don't rename that brigade. If people can't read the docs to determine that they aren't supposed to use that brigade, then let them shoot themselves in the foot. In order to find that brigade, they have to read through http_core.h, and they have to define CORE_PRIVATE in their module. Just put a note in the docs that specifically state that 99.9% of module authors shouldn't use that brigade, and that if they do then most likely their filter won't work in all situations. Or, and this is probably better, remove the damned CORE_PRIVATE from our header files. Move those definitions into server/*.h, and stop installing those definitions on user's machines.
That will make it FAR more obvious that those structures are private to the core and shouldn't be used by modules. Ryan
RE: How to achieve zero-copy when reading request headers?
stack in ap_read_request() (and remove once we've parsed the request header): let this filter do the scanning for the various forms of HTTP header, remove the header from the brigade, and leave the rest of the brigade for subsequent input filters to handle. It'd have to be a connection-level filter to ensure that we don't lose pipelined-requests since we read too much. But, since we are a connection-level filter, that means you don't have access to the request. So, you'd have to split it out into two filters: - One connection-level filter that will split on a CRLFCRLF and stash the rest away for the next read on this connection. - One request-level filter that will expect a CRLFCRLF-segmented brigade and process the headers and then get out of the way. Connection level filters do have access to the request. I don't understand why you don't think they do. Ryan
RE: How to achieve zero-copy when reading request headers?
From: Brian Pane [mailto:[EMAIL PROTECTED]] One of the biggest remaining performance problems in the httpd is the code that scans HTTP requests. In the current implementation, read_request_line() and ap_get_mime_headers() call ap_rgetline_core(), which has to:

- create a temporary brigade
- call ap_get_brigade() to get the next line of input from the core input filter, which in turn has to scan for LF and split the bucket
- copy the content into a buffer
- destroy the temp brigade
- call itself recursively in the (rare) folding case

And all this happens for every line of the request header.

Have you looked at which of these are causing the most performance problems? It seems to me that the easiest thing to do, would be to use a persistent brigade, which removes two steps from this. The copying into a buffer is required, because you have no way of knowing how many buckets were used in getting the header line from the client. The only way to solve this, is to do the copy in the HTTP_IN filter. That would work just fine, and would remove the copy in the common case. Then, in the getline() function, you can remove the memcpy altogether. I don't really care about the recursive case, because it almost never happens.

We're creating a ton of temporary brigades and temporary buckets in this process, plus registering and unregistering pool cleanup callbacks and doing at least one memcpy operation per line. I'd like to streamline this processing so that ap_read_request() can do something like this:

- get the input brigade from the input filter stack
- loop through the brigade and parse the HTTP header
- Note: in the common case, the whole header will be in the first bucket

No, please don't do this. Currently, it is VERY easy to write a filter that operates on the headers, because you can rely on the code separating the headers into individual brigades. I personally have a filter that modifies the request line, and this model would make that filter MUCH harder to write.
- split the bucket where the request header ends, remove the bucket(s) containing the header from the brigade, and hand the remainder of the brigade back to the filter stack This requires push-back, which has been discussed and vetoed, because it adds a lot of complexity with very little gain. Until we know where the performance problems are in the current model I believe that it makes little sense to redesign the whole model. I think you can stream-line this by just re-using the same brigade for every header and by moving the copy down to the HTTP_IN filter. Ryan
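The "reuse the same brigade for every header" suggestion amounts to keeping one scratch buffer alive across header lines instead of creating and destroying a temporary one per line. A standalone sketch of that pattern follows; this is plain C, not httpd's actual ap_rgetline_core, and the struct and function names are invented:

```c
/* Sketch of a persistent scratch buffer for line reading: one growable
 * buffer is reused across calls instead of allocating per line.
 * Illustrative only; not httpd code. */
#include <stdlib.h>
#include <string.h>

struct reader {
    const char *data;   /* stands in for the input filter stack */
    size_t len, off;
    char *line;         /* persistent scratch buffer, reused per call */
    size_t cap;
};

/* Copy the next LF-terminated line into r->line, growing it as needed.
 * Returns the line length, or -1 at end of input. */
static long getline_persistent(struct reader *r)
{
    if (r->off >= r->len)
        return -1;
    const char *start = r->data + r->off;
    const char *nl = memchr(start, '\n', r->len - r->off);
    size_t n = nl ? (size_t)(nl - start) : r->len - r->off;
    if (n + 1 > r->cap) {
        r->cap = n + 1;
        r->line = realloc(r->line, r->cap);
    }
    memcpy(r->line, start, n);
    r->line[n] = '\0';
    r->off += n + (nl ? 1 : 0);
    return (long)n;
}
```

The point of the pattern is that the allocation cost is paid once for the longest line, not once per header line, which is the two-step saving the thread is discussing.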
RE: 2.0 book
It's being printed now, should be in stores in a week or two. Ryan -- Ryan Bloom [EMAIL PROTECTED] 645 Howard St. [EMAIL PROTECTED] San Francisco, CA -Original Message- From: Aryeh Katz [mailto:[EMAIL PROTECTED]] Sent: Thursday, June 27, 2002 8:18 AM To: [EMAIL PROTECTED] Subject: Re: 2.0 book On Thu, 27 Jun 2002, Aryeh Katz wrote: I saw that Ryan Bloom is (was) scheduled to release a book on the 2.0 server. Does that book have information for developers, or is it intended for sysadmins? It definitely has information for developers. Lots of it. :) Do you know when it's going to be released? Aryeh --- Aryeh Katz VASCO www.vasco.com
RE: Apache 2.0 Numbers
It would be nice if there was an apxs flag that would return the MPM type. +1 There is. -q will query for any value in config_vars.mk, and MPM_NAME is in that file. So `apxs -q MPM_NAME` will return the configured MPM type. Ryan
RE: core_output_filter buffering for keepalives? Re: Apache 2.0 Numbers
I think we should leave it alone. This is the difference between benchmarks and the real world. How often do people have 8 requests in a row that total less than 8K? As a compromise, there are two other options. You could have the core_output_filter refuse to buffer more than 2 requests, or you could have the core_output_filter not buffer if the full request is in the buffer. Removing the buffering is not the correct solution, because it does have a negative impact in the real world. Ryan -- Ryan Bloom [EMAIL PROTECTED] 645 Howard St. [EMAIL PROTECTED] San Francisco, CA -Original Message- From: Brian Pane [mailto:[EMAIL PROTECTED]] Sent: Sunday, June 23, 2002 9:38 PM To: [EMAIL PROTECTED] Subject: core_output_filter buffering for keepalives? Re: Apache 2.0 Numbers

On Sun, 2002-06-23 at 20:58, Brian Pane wrote: For what it's worth, I just tried the test case that you posted. On my test system, 2.0 is faster when I run ab without -k, and 1.3 is faster when I run with -k. I studied this test case and found out why 2.0 runs faster in the non-keepalive case and slower in the keepalive case. It's because of the logic in core_output_filter() that tries to avoid small writes when c->keepalive is set. In Rasmus's test, the file size is only 1KB, so core_output_filter reads in and buffers the contents of 8 requests before it finally reaches the 8KB threshold and starts calling writev. 1.3, in contrast, writes out each request's response immediately, with no buffering. I'm somewhat inclined to remove the buffering in this case, and let core_output_filter send the response as soon as it sees EOS for each request, even if that means sending a small packet. I just tried this in my working tree, and it does speed up 2.0 for this particular test case. On the other hand, I'm not a fan of small write calls in general. Anyone else care to offer an opinion on whether we should remove the buffering in this situation? --Brian
RE: worker MPM shutdown
From: Brian Pane [mailto:[EMAIL PROTECTED]] On Sat, 2002-06-22 at 17:01, Ryan Bloom wrote: I believe that the problem is platform specific. The reason that loop was added, was to allow for graceless shutdown on linux. On non-linux platforms, killing the main thread kills the whole process, but on linux this doesn't work. The point of closing the sockets was to force the worker threads to finish ASAP so that the process could die. Why not just pthread_kill() all the workers if we're running on Linux? That would be a much more direct way of stopping them than closing the sockets. Because asynchronous cancellation of threads sucks, and APR doesn't support it. Most OSes leak memory when you use pthread_kill(). Ryan
RE: Query: bugs 8712 and 10156
From: Justin Erenkrantz [mailto:[EMAIL PROTECTED]] At some point, Larry Rosenman [EMAIL PROTECTED] excited the electrons: I submitted 8712 a month or more ago, and have gotten NO feedback at all. FreeBSD is packaging their version with mod_ssl. We don't include mod_ssl with 1.3. We have no control over the configuration that FreeBSD decides to use. I would refer back to the FreeBSD port maintainer and help them come up with a configuration that isn't susceptible to this problem. I'm not really sure how we can fix this problem considering that the ASF has no control over mod_ssl or FreeBSD's configurations that they use. The ASF distribution does not contain <IfDefine SSL> in its default configuration. As a hint to the FreeBSD maintainer, the idea here is to always load the mod_ssl library and then conditionally execute the SSL directives. I do not believe that loading DSOs conditionally is what we intended with the <IfDefine> construct. Actually loading the module inside the <IfDefine> is exactly what was meant. The bug here is in apxs not FreeBSD. I considered answering this PR the same way you did Justin, but it is our problem. The problem is how apxs is adding modules. Currently, apxs looks for the last LoadModule in the config file, and just adds a new line after it. If the last module you load in your config is mod_ssl, then you are susceptible to this problem. The solution is to have a line like: #LoadModule goes here and always key off of that line. Ryan
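The marker-based approach suggested above can be sketched as follows. This is an illustrative C sketch only (the real apxs is a Perl script, and the function name and buffer handling here are invented): new LoadModule lines are inserted at a fixed marker comment, so a trailing <IfDefine> block can no longer swallow them.

```c
/* Sketch of marker-based insertion: add new_line right after the
 * "#LoadModule goes here" marker instead of after the last LoadModule.
 * Illustrative only; apxs itself is written in Perl. */
#include <stdio.h>
#include <string.h>

static int insert_at_marker(const char *conf, const char *new_line,
                            char *out, size_t outlen)
{
    static const char marker[] = "#LoadModule goes here";
    const char *p = strstr(conf, marker);
    if (!p)
        return -1;  /* marker missing: caller must fall back */
    const char *nl = strchr(p, '\n');
    size_t head = nl ? (size_t)(nl - conf) + 1 : strlen(conf);
    /* copy everything through the marker line, then the new directive,
     * then the rest of the config unchanged */
    int n = snprintf(out, outlen, "%.*s%s%s\n%s",
                     (int)head, conf, nl ? "" : "\n", new_line, conf + head);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

With this scheme the insertion point is fixed by the stock config file, so a mod_ssl LoadModule sitting inside an <IfDefine SSL> block at the end of the file no longer captures newly added modules.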
RE: Apache 2.0 Numbers
From: Rasmus Lerdorf [mailto:[EMAIL PROTECTED]] It would be nice if there was an apxs flag that would return the MPM type. +1 There is. -q will query for any value in config_vars.mk, and MPM_NAME is in that file. So `apxs -q MPM_NAME` will return the configured MPM type. Ah right. Is there a way to check at runtime as well? I've added a PHP configure check now to the apache2filter sapi module so it will come up non-threaded by default if it sees Apache2-prefork. Just a bit worried about someone changing their MPM after the fact, so perhaps a runtime check is needed as well. Runtime is harder, but you can just use ap_mpm_query to get the MPM's characteristics. This won't give you the MPM name, but it will let you know if the MPM is threaded or not. Ryan
RE: Apache 2.0 Numbers
My bad. Post_config is a run_all. If you return DONE the server won't start. This is what the MPMs do if the socket is already taken. Ryan -- Ryan Bloom [EMAIL PROTECTED] 645 Howard St. [EMAIL PROTECTED] San Francisco, CA -Original Message- From: Rasmus Lerdorf [mailto:[EMAIL PROTECTED]] Sent: Monday, June 24, 2002 8:34 AM To: Ryan Bloom Cc: [EMAIL PROTECTED] Subject: RE: Apache 2.0 Numbers

On Mon, 24 Jun 2002, Rasmus Lerdorf wrote: What is the correct way to fail in a filter post_config? Do I return -1 from it if my filter finds a fatal error? I can't use ap_log_rerror() at this point, right? How would I log the reason for the failure? I'm confused by the question, but I'll try to answer. If you mean the post_config phase, then you can use ap_log_error or ap_log_perror. If you want to stop the server from starting, just return DECLINED. Right, I found ap_log_error. It was the return value I was looking for. None of the example filter modules had a fatal error check at the config phase. So returning a -1 is the correct way to stop the server from starting. Thanks.

Hrm.. Nope. doing 'return DECLINED' from the post_config phase does not stop the server from starting. I have this:

static int php_apache_server_startup(apr_pool_t *pconf, apr_pool_t *plog,
                                     apr_pool_t *ptemp, server_rec *s)
{
    void *data = NULL;
    const char *userdata_key = "apache2filter_post_config";
#ifndef ZTS
    int threaded_mpm;
    ap_mpm_query(AP_MPMQ_IS_THREADED, &threaded_mpm);
    if (threaded_mpm) {
        ap_log_error(APLOG_MARK, APLOG_CRIT, 0, s,
                     "Apache is running a threaded MPM, but your PHP "
                     "Module is not compiled to be threadsafe. You need "
                     "to recompile PHP.");
        return DECLINED;
    }
#endif
    ...
}
...
ap_hook_pre_config(php_pre_config, NULL, NULL, APR_HOOK_MIDDLE);
ap_hook_post_config(php_apache_server_startup, NULL, NULL, APR_HOOK_MIDDLE);

And in my log I get:

[Mon Jun 24 08:27:23 2002] [crit] Apache is running a threaded MPM, but your PHP Module is not compiled to be threadsafe. You need to recompile PHP.
[Mon Jun 24 08:27:23 2002] [crit] Apache is running a threaded MPM, but your PHP Module is not compiled to be threadsafe. You need to recompile PHP.
[Mon Jun 24 08:27:23 2002] [notice] Apache/2.0.40-dev (Unix) configured -- resuming normal operations
RE: Apache 2.0 Numbers
From: Cliff Woolley [mailto:[EMAIL PROTECTED]] On Mon, 24 Jun 2002, Rasmus Lerdorf wrote: Hrm.. Nope. doing 'return DECLINED' from the post_config phase does not stop the server from starting. I have this: I thought you were supposed to return HTTP_INTERNAL_SERVER_ERROR. No. That implies that you have an actual HTTP error. You don't, this is during config processing, not request processing. Yes, that value will work, but it is incorrect semantically. Ryan
RE: Apache 2.0 Numbers
From: Rasmus Lerdorf [mailto:[EMAIL PROTECTED]] On Mon, 24 Jun 2002, Cliff Woolley wrote: On Mon, 24 Jun 2002, Rasmus Lerdorf wrote: Hrm.. Nope. doing 'return DECLINED' from the post_config phase does not stop the server from starting. I have this: I thought you were supposed to return HTTP_INTERNAL_SERVER_ERROR. In include/http_config.h it says:

/**
 * Run the post_config function for each module
 * @param pconf The config pool
 * @param plog The logging streams pool
 * @param ptemp The temporary pool
 * @param s The list of server_recs
 * @return OK or DECLINED on success anything else is a error
 */

So I guess I need to return 'anything else'. Trying this, i.e. returning -2, does the job. But this seems a little vague. Should we perhaps have a #define FATAL -2 or something similar so I don't get stepped on later on if someone decides to use -2 for something else? As it happens, DONE is defined to be -2. :-) Ryan
RE: Apache 2.0 Numbers
From: Rasmus Lerdorf [mailto:[EMAIL PROTECTED]] As it happens, DONE is defined to be -2. :-) Ok, I will use that, but 'DONE' doesn't really give the impression of being a fatal error return value. I know. Its original use was during request processing, when a module wanted to be sure that it was the last function run for a specific hook. Basically this value ensured that no other functions were run for that hook. Ryan
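The return-code convention discussed in this thread can be sketched as a tiny mock of a RUN_ALL hook loop. This is not the real server code, only an illustration; the OK/DECLINED/DONE values match httpd's definitions, but the loop is simplified:

```c
#include <assert.h>

/* Values as defined in httpd.h */
#define OK        0
#define DECLINED -1
#define DONE     -2   /* "stop everything" -- also stops server startup */

typedef int (*hook_fn)(void);

/* Simplified RUN_ALL semantics: keep calling hooks while they return
 * OK or DECLINED; any other value (DONE, or an HTTP status) aborts the
 * run and is propagated to the caller, which refuses to start. */
static int run_all(hook_fn *hooks, int n)
{
    for (int i = 0; i < n; i++) {
        int rv = hooks[i]();
        if (rv != OK && rv != DECLINED)
            return rv;   /* fatal: startup stops here */
    }
    return OK;
}

static int ok_hook(void)       { return OK; }
static int declined_hook(void) { return DECLINED; }
static int fatal_hook(void)    { return DONE; }
```

This is why DECLINED from post_config does not stop the server (it just means "I had nothing to do"), while DONE does.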
Re: Query: bugs 8712 and 10156
That is just Ken's way of getting stuff onto the development list and stating that he hasn't done anything with it yet. :-) Ryan On 24 Jun 2002, Larry Rosenman wrote: On Mon, 2002-06-24 at 11:36, Rodent of Unusual Size wrote: No acked. Huh? -- #kenP-)} Ken Coar, Sanagendamgagwedweinini http://Golux.Com/coar/ Author, developer, opinionist http://Apache-Server.Com/ Millennium hand and shrimp! From: Larry Rosenman [EMAIL PROTECTED] To: [EMAIL PROTECTED] Subject: Query: bugs 8712 and 10156 Date: 23 Jun 2002 15:19:31 -0500 I submitted 8712 a month or more ago, and have gotten NO feedback at all. I just submitted 10156 and wonder what it would take to get the patch into the next release. LER -- Larry Rosenman http://www.lerctr.org/~ler Phone: +1 972-414-9812 E-Mail: [EMAIL PROTECTED] US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749 - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] -- ___ Ryan Bloom [EMAIL PROTECTED] 550 Jean St Oakland CA 94610 ---
RE: state of mpm perchild
There is no date set, because this is all volunteer work. It will not be production quality for a number of months. I hope to have it working again by the end of the week. Ryan -- Ryan Bloom [EMAIL PROTECTED] 645 Howard St. [EMAIL PROTECTED] San Francisco, CA -Original Message- From: Nick De Decker [mailto:[EMAIL PROTECTED]] Sent: Monday, June 24, 2002 10:15 AM To: [EMAIL PROTECTED] Subject: state of mpm perchild Hi guys, Any progress on the mpm perchild for linux ? I'm setting up a production webserver and really would like to use this mpm. Any date set ? Regards Nick DeDecker
RE: core_output_filter buffering for keepalives? Re: Apache 2.0 Numbers
From: Brian Pane [mailto:[EMAIL PROTECTED]] Ryan Bloom wrote: I think we should leave it alone. This is the difference between benchmarks and the real world. How often do people have 8 requests in a row that total less than 8K? As a compromise, there are two other options. You could have the core_output_filter refuse to buffer more than 2 requests, or you could have the core_output_filter not buffer if the full request is in the buffer. In your second option, do you mean full response rather than full request? If so, it's equivalent to what I was proposing yesterday: send the response for each request as soon as we see EOS for that request. I like that approach a lot because it keeps us from doing an extra mmap/memcpy/memunmap write before the write in the real-world case where a client sends a non-pipelined request for a small (<8KB) file over a keepalive connection. I do mean full response, and that is NOT equivalent to sending the response as soon as you see the EOS for that request. EVERY request gets its own EOS. If you send the response when you see the EOS, then you will have removed all of the buffering for pipelined requests. You are trying to solve a problem that doesn't exist in the real world IIUIC. Think it through. The problem is that if the page is 1k and you request that page 8 times on the same connection, then Apache will buffer all of the responses. How often are those conditions going to happen in the real world? Now, take it one step further please. The real problem is how AB is measuring the results. If I send 3 requests, and Apache buffers the response, how is AB measuring the time? Does it start counting at the start of every request, or is the time just started at start of the first request? Perhaps a picture, with time in seconds running from 0 to 35: each of the four requests starts at a staggered point on the timeline, and all of the buffered responses are received together at the end. Did the 4 requests take 35 seconds total, or 85 seconds?
I believe that AB is counting this as 85 seconds for 4 requests, but Apache is only taking 35 seconds for the 4 requests. Ryan
RE: core_output_filter buffering for keepalives? Re: Apache 2.0 Numbers
From: Brian Pane [mailto:[EMAIL PROTECTED]] Ryan Bloom wrote: From: Brian Pane [mailto:[EMAIL PROTECTED]] Ryan Bloom wrote: I think we should leave it alone. This is the difference between benchmarks and the real world. How often do people have 8 requests in a row that total less than 8K? As a compromise, there are two other options. You could have the core_output_filter refuse to buffer more than 2 requests, or you could have the core_output_filter not buffer if the full request is in the buffer. In your second option, do you mean full response rather than full request? If so, it's equivalent to what I was proposing yesterday: send the response for each request as soon as we see EOS for that request. I like that approach a lot because it keeps us from doing an extra mmap/memcpy/memunmap write before the write in the real-world case where a client sends a non-pipelined request for a small (<8KB) file over a keepalive connection. I do mean full response, and that is NOT equivalent to sending the response as soon as you see the EOS for that request. EVERY request gets its own EOS. If you send the response when you see the EOS, then you will have removed all of the buffering for pipelined requests. -1 on buffering across requests, because the performance problems caused by the extra mmap+munmap will offset the gain you're trying to achieve with pipelining. Wait a second. Now you want to stop buffering to fix a completely different bug. The idea that we can't keep a file_bucket in the brigade across requests is only partially true. Take a closer look at what we are doing and why when we convert the file to an mmap. That logic was added so that we do not leak file descriptors across requests. However, there are multiple options to fix that. The first, and easiest one is to just have the core_output_filter call apr_file_close() after it has sent the file.
The second is to migrate the apr_file_t to the conn_rec's pool if and only if the file needs to survive the request's pool being killed. Because we are only migrating file descriptors in the edge case, this shouldn't cause a big enough leak to cause a problem. You are trying to solve a problem that doesn't exist in the real world IIUIC. Think it through. The problem is that if the page is 1k and you request that page 8 times on the same connection, then Apache will buffer all of the responses. How often are those conditions going to happen in the real world? That's not the problem that I care about. The problem that matters is the one that happens in the real world, as a side-effect of the core_output_filter() code trying to be too clever: - Client opens a keepalive connection to httpd - Client requests a file smaller than 8KB - core_output_filter(), upon seeing the file bucket followed by EOS, decides to buffer the output because it has less than 8KB total. - There isn't another request ready to be read (read returns EAGAIN) because the client isn't pipelining its connections. So we then do the writev of the file that we've just finished buffering. But, this case only happens if the headers + the file are less than 8k. If the file is 10k, then this problem doesn't actually exist at all. As I said above, there are better ways to fix this than removing all ability to pipeline responses. Aside from slowing things down for the user, this hurts the scalability of the httpd (mmap and munmap don't scale well on multiprocessor boxes). What we should be doing in this case is just doing the write immediately upon seeing the EOS, rather than penalizing both the client and the server. By doing that, you are removing ANY benefit to using pipelined requests when serving files. Multiple research projects have all found that pipelined requests show a performance benefit. In other words, you are fixing a performance problem by removing another performance enhancer. Ryan
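The "migrate the descriptor to the connection pool" idea above boils down to moving a cleanup registration from a short-lived pool to a longer-lived one. Here is a toy, self-contained sketch of that idea; the pool type and function names are invented for illustration (real code would use apr_pool_cleanup_register/apr_pool_cleanup_kill on an apr_file_t):

```c
#include <assert.h>

#define MAX_CLEANUPS 8

typedef void (*cleanup_fn)(void *data);

/* Toy stand-in for an APR pool: just a list of cleanup callbacks. */
typedef struct {
    cleanup_fn fn[MAX_CLEANUPS];
    void      *data[MAX_CLEANUPS];
    int        n;
} toy_pool;

static void pool_register(toy_pool *p, cleanup_fn fn, void *data)
{
    p->fn[p->n] = fn; p->data[p->n] = data; p->n++;
}

static void pool_kill(toy_pool *p, cleanup_fn fn, void *data)
{
    for (int i = 0; i < p->n; i++)
        if (p->fn[i] == fn && p->data[i] == data) {
            p->n--;                       /* remove without running it */
            p->fn[i] = p->fn[p->n];
            p->data[i] = p->data[p->n];
            return;
        }
}

static void pool_destroy(toy_pool *p)
{
    for (int i = 0; i < p->n; i++)
        p->fn[i](p->data[i]);
    p->n = 0;
}

static void close_fd(void *data) { *(int *)data = 0; /* 0 = closed */ }

/* The migration trick: move the descriptor's cleanup from the request
 * pool to the connection pool, so destroying the request pool no longer
 * closes a file we still intend to send. */
static void migrate_fd(toy_pool *req, toy_pool *conn, int *fd_open)
{
    pool_kill(req, close_fd, fd_open);
    pool_register(conn, close_fd, fd_open);
}
```

The point of the sketch is lifetime, not I/O: after migration the "file" survives the request pool's destruction and is closed only when the connection pool goes away.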
RE: core_output_filter buffering for keepalives? Re: Apache 2.0 Numbers
From: Bill Stoddard [mailto:[EMAIL PROTECTED]] From: Brian Pane [mailto:[EMAIL PROTECTED]] Ryan Bloom wrote: From: Brian Pane [mailto:[EMAIL PROTECTED]] Ryan Bloom wrote: Wait a second. Now you want to stop buffering to fix a completely different bug. The idea that we can't keep a file_bucket in the brigade across requests is only partially true. Take a closer look at what we are doing and why when we convert the file to an mmap. That logic was added so that we do not leak file descriptors across requests. However, there are multiple options to fix that. The first, and easiest one is to just have the core_output_filter call apr_file_close() after it has sent the file. The second is to migrate the apr_file_t to the conn_rec's pool if and only if the file needs to survive the request's pool being killed. Because we are only migrating file descriptors in the edge case, this shouldn't cause a big enough leak to cause a problem. You are trying to solve a problem that doesn't exist in the real world IIUIC. Think it through. The problem is that if the page is 1k and you request that page 8 times on the same connection, then Apache will buffer all of the responses. How often are those conditions going to happen in the real world? That's not the problem that I care about. The problem that matters is the one that happens in the real world, as a side-effect of the core_output_filter() code trying to be too clever: - Client opens a keepalive connection to httpd - Client requests a file smaller than 8KB - core_output_filter(), upon seeing the file bucket followed by EOS, decides to buffer the output because it has less than 8KB total. - There isn't another request ready to be read (read returns EAGAIN) because the client isn't pipelining its connections. So we then do the writev of the file that we've just finished buffering. But, this case only happens if the headers + the file are less than 8k. If the file is 10k, then this problem doesn't actually exist at all.
As I said above, there are better ways to fix this than removing all ability to pipeline responses. Aside from slowing things down for the user, this hurts the scalability of the httpd (mmap and munmap don't scale well on multiprocessor boxes). What we should be doing in this case is just doing the write immediately upon seeing the EOS, rather than penalizing both the client and the server. By doing that, you are removing ANY benefit to using pipelined requests when serving files. Multiple research projects have all found that pipelined requests show a performance benefit. In other words, you are fixing a performance problem by removing another performance enhancer. Ryan Ryan, Solving the problem to enable setting aside the open fd just long enough to check for a pipelined request will nearly completely solve the worst part (the mmap/munmap) of this problem. On systems with expensive syscalls, we can do browser detection and dynamically determine whether we should attempt the pipelined optimization or not. Not many browsers today support pipelining requests, FWIW. That would be a trivial change. I'll have a patch posted for testing later today. Ryan
RE: core_output_filter buffering for keepalives? Re: Apache 2.0 Numbers
-1 on buffering across requests, because the performance problems caused by the extra mmap+munmap will offset the gain you're trying to achieve with pipelining. Wait a second. Now you want to stop buffering to fix a completely different bug. The idea that we can't keep a file_bucket in the brigade across requests is only partially true. Take a closer look at what we are doing and why when we convert the file to an mmap. That logic was added so that we do not leak file descriptors across requests. However, there are multiple options to fix that. The first, and easiest one is to just have the core_output_filter call apr_file_close() after it has sent the file. The second is to migrate the apr_file_t to the conn_rec's pool if and only if the file needs to survive the request's pool being killed. Because we are only migrating file descriptors in the edge case, this shouldn't cause a big enough leak to cause a problem. Migrating the apr_file_t to the conn_rec's pool is an appealing solution, but it's quite dangerous. With that logic added to the current 8KB threshold, it would be too easy to make an httpd run out of file descriptors: Send a pipeline of a few hundred requests for some tiny file (e.g., a small image). They all get set aside into the conn_rec's pool. Then send a request for something that takes a long time to process, like a CGI. Run multiple copies of this against the target httpd at once, and you'll be able to exhaust the file descriptors for a threaded httpd all too easily. That is why we allow people to control how many requests can be sent on the same connection. Or, you can just have a limit on the number of file descriptors that you are willing to buffer. And, the pipe_read function should be smart enough that if we don't get any data off of the pipe, for say 30 seconds, then we flush whatever data we currently have. That's not the problem that I care about.
The problem that matters is the one that happens in the real world, as a side-effect of the core_output_filter() code trying to be too clever: - Client opens a keepalive connection to httpd - Client requests a file smaller than 8KB - core_output_filter(), upon seeing the file bucket followed by EOS, decides to buffer the output because it has less than 8KB total. - There isn't another request ready to be read (read returns EAGAIN) because the client isn't pipelining its connections. So we then do the writev of the file that we've just finished buffering. But, this case only happens if the headers + the file are less than 8k. If the file is 10k, then this problem doesn't actually exist at all. As I said above, there are better ways to fix this than removing all ability to pipeline responses. We're not removing the ability to pipeline responses. You are removing a perfectly valid optimization to stop us from sending a lot of small packets across pipelined responses. Ryan
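The "is another pipelined request already waiting?" check that this thread keeps returning to is essentially a non-blocking read that may fail with EAGAIN. A minimal POSIX sketch of that decision, using a plain pipe instead of a client socket (illustrative only; the real filter peeks at the connection, not a pipe):

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if more data is already waiting on fd, 0 if a non-blocking
 * read would block (EAGAIN) -- the case where the output filter should
 * flush its buffered response instead of waiting for a pipelined
 * request that may never come.  (Destructive read; fine for a demo.) */
static int more_data_waiting(int fd)
{
    char c;
    ssize_t n = read(fd, &c, 1);
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return 0;   /* nothing pipelined: flush now */
    return n > 0;
}
```

If the read would block, the client is not pipelining, and buffering further only delays the response it is waiting for.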
RE: worker MPM shutdown
I believe that the problem is platform specific. The reason that loop was added, was to allow for graceless shutdown on linux. On non-linux platforms, killing the main thread kills the whole process, but on linux this doesn't work. The point of closing the sockets was to force the worker threads to finish ASAP so that the process could die. Ryan From: Brian Pane [mailto:[EMAIL PROTECTED]] During the worker MPM non-graceful shutdown, the signal_threads() function attempts to close all open sockets. I have two major objections to this: 1) It's not necessarily safe to close a socket that another thread is using. Note that apr_socket_close() calls the pool cleanup on the pool from which the socket was allocated--bad news if one of the active worker threads happens to be, say, registering a new cleanup in the same pool at the same time. 2) It appears to be contributing to the fact that non-graceful shutdown doesn't work. Without the socket shutdown loop, the child processes shut down promptly. As I understand it, the motivation for closing the sockets during shutdown was to try to fix a race condition in which an active worker thread might be trying to write to a socket at the same time that the underlying pool was being destroyed (due to the recursive destruction of APR's global_pool during apr_terminate()). If so, I think the right solution is to add a way to create parentless pools that aren't implicitly added as children to the global pool, so that a worker thread's pool won't disappear before that thread does. Is there any specific reason why we're not doing this already? Thanks, --Brian
RE: CAN-2002-0392 : what about older versions of Apache?
From: Aaron Bannert [mailto:[EMAIL PROTECTED]] On Sun, Jun 23, 2002 at 05:09:05PM -0700, Roy Fielding wrote: I have re-uploaded a patch to fix the problem on all versions of httpd 1.2.0 through 1.3.22. This time I added the four lines that check for a negative return value from atol, even though there has been no evidence of any such error in the standard C libraries. To the person who deleted my prior patch: You just wasted my Sunday afternoon. Even if the patch didn't, by some stretch of your imagination, suffice for the broken atol case, you prevented people from protecting themselves against a published exploit script that doesn't even use content-length as an attack. Do not remove my patch unless you replace it with a better fix that is known to apply for that version and compile on all platforms. -1 to any additions of ap_strtol to prior versions of Apache. That introduced more problems than it fixed. There is no reason to work around the operating system when a simple fix to our own code is necessary and sufficient to solve the problem. I don't remember seeing any +1's for this patch on the list. Please remove this patch until one can be made that addresses the same issues with the proxy code (which also uses get_chunk_size()). The proxy didn't use that code until it supported HTTP 1.1, which didn't happen until 1.3.24. Roy is right, removing this patch is completely bogus. Ryan
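The atol concern in this thread is essentially "parse a length field strictly, and reject garbage and negative values." A hedged sketch of that kind of check, in plain C (this is not the actual 1.2/1.3 patch; per the thread, those used atol plus an explicit negative-value check, and later releases added ap_strtol):

```c
#include <errno.h>
#include <stdlib.h>

/* Parse a Content-Length-style value strictly: digits only, no sign,
 * no trailing junk, no overflow.  Returns -1 on any invalid input. */
static long parse_content_length(const char *s)
{
    char *end;
    long v;

    if (*s < '0' || *s > '9')   /* rejects "", "-5", "+5", " 5" */
        return -1;
    errno = 0;
    v = strtol(s, &end, 10);
    if (errno == ERANGE || *end != '\0' || v < 0)
        return -1;
    return v;
}
```

The negative check looks redundant given the leading-digit test, but it is the belt-and-suspenders guard against a broken atol/strtol that the patch discussion was about.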
RE: cvs commit: httpd-2.0/modules/generators mod_cgi.c mod_cgid.c
From: William A. Rowe, Jr. [mailto:[EMAIL PROTECTED]] Sent: Saturday, June 22, 2002 9:45 AM To: [EMAIL PROTECTED] Subject: Re: cvs commit: httpd-2.0/modules/generators mod_cgi.c mod_cgid.c Looking for both faults (passing the next filter when we are doing lookups, and passing NULL when we are going to serve the content) ... it really seems that; D:\clean\httpd-2.0\modules\mappers\mod_negotiation.c(2550): sub_req = ap_sub_req_lookup_file(variant->file_name, r, NULL); SHOULD be setting the next filter. Anyone want to dig on this? I am missing something. The third argument has nothing to do with whether the request will be served. It has to do with how the filter stack is initialized in the sub-request. If the argument is NULL, the sub-request's filters are set to the protocol filters of the original request. If the argument isn't NULL, then the sub-request's filters are initialized to the filter passed in. In either case, the data can either be sent or not. Please revert the doc portions of your last change, they are incorrect. The code is safe to leave, because the requests aren't actually served. However, mod_negotiation is 100% correct, and should NOT be changed. Ryan Bill wrowe 2002/06/22 09:32:45 Modified: include http_request.h modules/generators mod_cgi.c mod_cgid.c Log: Note the changed meaning of the NULL next_filter argument to the ap_sub_req_lookup() family, and fix a few oddball cases (those are, PATH_TRANSLATED reference issues.)
RE: 2.0.38-39 lockup problem ?
From: gregames [mailto:gregames] On Behalf Of Greg Ames Paul J. Reder wrote: This looks exactly like the problem that Allan and I ran into when you tried to send a request to http://foo.bar.org:443 (i.e. insecure request over the secure port). It tried to generate an error and went into an infinite loop. Can you try that with current HEAD and let us know what happens? While the problems were similar, they were not caused by the same code, and the SSL problem would not have been touched at all by this patch. The problem with mod_ssl was fixed a while ago, and it never touched the ap_die code. Basically, in mod_ssl's post_read_request phase, we check for a flag to determine if the request was HTTP over the HTTPS port. The problem is that if we find the flag, then we do an internal redirect, which also runs the post_read_request function. Because the flag wasn't cleared, we just returned another error, which caused another internal_redirect, which ran the same phase, etc. Ryan
Re: Linux and the worker model
They should both be there. The version in mpm_common is for MOST mpms, but worker uses a different set of functions. If your worker MPM headers are actually defining AP_MPM_USES_POD, then there is a problem with the source. The version in CVS does not define this for worker. Ryan On Thu, 20 Jun 2002, Jean-Jacques Clar wrote: I am building 2.0.39, but it was the same thing with 2.0.35. [EMAIL PROTECTED] 06/20/02 10:26AM On Thu, Jun 20, 2002 at 09:09:45AM -0600, Jean-Jacques Clar wrote: There are errors when building 2.0.x on Linux RH7.3 with the worker Which version of 2.0 exactly? -aaron -- ___ Ryan Bloom [EMAIL PROTECTED] 550 Jean St Oakland CA 94610 ---
RE: I asked this on the freebsd-general list...
To the best of my knowledge, we have told FreeBSD that their threading support doesn't work with Apache. In general, we have found inconsistent return values with a lot of the networking APIs when threading was enabled. There are rumors that these are fixed in 5.0, but I haven't tested it. Suffice to say, we worked for a long time to get apache.org to run a threaded MPM, and we always found that pages were being truncated for no reason. After digging into it, we found that the problem was the freebsd network layer when threads were enabled. Ryan -- Ryan Bloom [EMAIL PROTECTED] 645 Howard St. [EMAIL PROTECTED] San Francisco, CA -Original Message- From: Rick Kukiela [mailto:[EMAIL PROTECTED]] Sent: Tuesday, June 18, 2002 8:44 AM To: apache-dev Subject: I asked this on the freebsd-general list... In the last episode (Jun 17), Rick Kukiela said: I asked on the apache-dev list about when perchild processing will work on freebsd and this was the response i got: It will never work with FreeBSD 4.6. Perchild requires a good threading library, and FreeBSD doesn't have one. What is being done about this? THEIR REPLY: Unless they provide more details (i.e. proof that it's bad), there's not much that can be done. The userland threads system is good enough to run MySQL, mozilla, openoffice, and who knows how many other pieces of software. -- Dan Nelson [EMAIL PROTECTED] So what's the deal here? can some one tell them their threads suck so it can be fixed? Thanks Rick
RE: cvs commit: httpd-2.0/support htpasswd.c
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] rbb 2002/06/16 08:52:16 Modified: . CHANGES support htpasswd.c Log: Finish the htpasswd port to APR. This brings the file checking code to APR. I am done changing htpasswd, but I should mention that I severely dislike the current file checking code. For people who don't know, we do a series of checks at the beginning of running the code to determine if we can read/write/update the password file before we ever actually try to open it. IMHO, that is completely incorrect. We should do the checks while opening the file, and fail cleanly. As things stand now, if we want to add more file checks, that logic is just going to get uglier and uglier, and there is the real chance that people will forget to update those checks if they add features to htpasswd. Unfortunately, the code is somewhat ugly to read, and I don't have any more time this morning. If I have time when I get home tonight, I will look at refactoring the code to flow a bit cleaner. However, if anybody would like to beat me to it, feel free. Ryan
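The "check while opening, and fail cleanly" approach Ryan prefers (instead of a pile of up-front access checks that must stay in sync with every new feature) can be sketched with plain stdio; htpasswd itself uses the APR file functions, and this helper name is invented for illustration:

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Try the open and report the failure, rather than pre-checking
 * readability/writability and hoping the checks cover every case. */
static int open_password_file(const char *path, const char *mode,
                              FILE **out, char *errbuf, size_t errlen)
{
    *out = fopen(path, mode);
    if (*out == NULL) {
        snprintf(errbuf, errlen, "cannot open %s: %s",
                 path, strerror(errno));
        return -1;
    }
    return 0;
}
```

Besides being shorter, this avoids the check-then-open race: the file's permissions cannot change between the check and the open, because the open is the check.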
RE: cvs commit: httpd-2.0/modules/mappers mod_userdir.c
From: Aaron Bannert [mailto:[EMAIL PROTECTED]] On Sat, Jun 15, 2002 at 07:20:59AM -, [EMAIL PROTECTED] wrote: 1.50 +5 -5 httpd-2.0/modules/mappers/mod_userdir.c Index: mod_userdir.c === RCS file: /home/cvs/httpd-2.0/modules/mappers/mod_userdir.c,v retrieving revision 1.49 retrieving revision 1.50 diff -u -r1.49 -r1.50 --- mod_userdir.c 31 May 2002 16:59:13 - 1.49 +++ mod_userdir.c 15 Jun 2002 07:20:59 - 1.50 @@ -347,12 +347,12 @@ r->pool)) == APR_SUCCESS || rv == APR_INCOMPLETE))) { r->filename = apr_pstrcat(r->pool, filename, dname, NULL); - /* XXX: Does this walk us around FollowSymLink rules? +/* XXX: Does this walk us around FollowSymLink rules? * When statbuf contains info on r->filename we can save a syscall - * by copying it to r->finfo - */ - if (*userdirs && dname[0] == 0) - r->finfo = statbuf; + * by copying it to r->finfo + */ +if (*userdirs && dname[0] == 0) +r->finfo = statbuf; /* For use in the get_suexec_identity phase */ apr_table_setn(r->notes, "mod_userdir_user", w); Is this anything more than an indentation change, and if so did you mean to include it in the same commit with the shtml AddType fix? It is just indentation, and no I didn't mean to include it, but since it is safe, I left it there. Ryan
RE: cvs commit: httpd-2.0/docs/error/include bottom.html
From: Aaron Bannert [mailto:[EMAIL PROTECTED]] On Sat, Jun 15, 2002 at 11:02:18AM -0400, Joshua Slive wrote: [EMAIL PROTECTED] wrote: rbb 2002/06/15 00:01:25 Modified:docs/error/include bottom.html Log: Comment out the SERVER_STRING variable from our default error documents. Some people do not like having this information in their error pages, and it makes sense to not do it by default. If users want this back, they can uncomment it. PR: 9319 Personally, I think this is silly. The server signature on error pages is there for a good reason: helping people debug problems, especially with requests that pass through proxies, etc. I agree, and the same logic above applies in reverse: If an admin doesn't want to reveal the server string in the error document, they can remove that part themselves. With one major difference. We provide server configuration directives to stop this stuff from being displayed. Whether they are correct or not, many admins do believe that they are improving security by not exposing this information. The problem is that you can change the config and not affect the default error pages that we ship. If you want to get the information, then it is easy to add back. However, I would simply suggest that the default error documents should not be included in the default config. Include the files, comment the config, and this issue goes away. As things stand right now, most admins have no clue that we have replaced the default Apache error documents which is why putting information that they _may_ want to keep private in them is completely wrong. Ryan
RE: questions from my trivial patch
From: Brian Degenhardt [mailto:[EMAIL PROTECTED]] The functions for ap_hook_open_logs, ap_hook_post_config, and ap_hook_pre_config will fail if DECLINED is returned. Is that intentional? Yes. This is because those are configuration phases. If a module tries to do something in those phases and can't, then the server should not continue to run, because something wasn't configured properly. If the server did continue, then it would most likely not behave the way it was expected to. In the quick_handler, returning HTTP_mumble will result in that status being returned in a real request, but it will fall back to the real handler in a subrequest. Is that intentional? I don't know, but it sounds completely incorrect. Bill, can you shed any light here? What's the difference between adding your filters inside your ap_hook_insert_filter hook versus calling ap_register_*_filter right inside of the register_hooks callback? You are confusing registering and adding filters. You should always register filters in register_hooks, but adding a filter to the request should be done during request processing. Basically registering a filter lets the server know that the filter exists, but it doesn't enable it for any requests. Adding the filter enables it for that request. Ryan
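The register/add distinction Ryan describes can be modeled with a toy registry (invented names; in httpd the real pair is ap_register_output_filter in register_hooks versus ap_add_output_filter during request processing, e.g. from an insert_filter hook):

```c
#include <string.h>

#define MAX_FILTERS 8

/* "register" makes a filter name known to the server -- done once at
 * module load.  "add" enables a registered filter for one request --
 * done per request. */
static const char *registered[MAX_FILTERS];
static int n_registered;

typedef struct {
    const char *filters[MAX_FILTERS];
    int n;
} toy_request;

static void register_filter(const char *name)
{
    registered[n_registered++] = name;
}

static int add_filter(toy_request *r, const char *name)
{
    for (int i = 0; i < n_registered; i++)
        if (strcmp(registered[i], name) == 0) {
            r->filters[r->n++] = name;
            return 0;
        }
    return -1;   /* never registered: nothing to add */
}
```

Registering alone affects no request; adding fails for a name the server has never heard of. That is exactly why both steps exist.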
RE: 2.0.37 up on icarus
No, it won't. I've been busy, and I haven't been hacking on it recently. Hopefully soon. Ryan -- Ryan Bloom [EMAIL PROTECTED] 645 Howard St. [EMAIL PROTECTED] San Francisco, CA -Original Message- From: Nick De Decker [mailto:[EMAIL PROTECTED]] Sent: Wednesday, June 12, 2002 5:31 AM To: [EMAIL PROTECTED] Subject: Re: 2.0.37 up on icarus Hi, will the perchild MPM work in 2.0.37 (i686 linux) ? A couple of weeks ago i read that it would work in that release. Regards, Nick - Original Message - From: Greg Ames [EMAIL PROTECTED] To: [EMAIL PROTECTED] Sent: Tuesday, June 11, 2002 10:05 PM Subject: Re: 2.0.37 up on icarus Cliff Woolley wrote: 2.0.37 final is up on icarus for prerelease testing. Looks good on daedalus also, except for the libtool version. I plan to put it in production this evening. Greg
Re: nope. HEAD hangs. (was: Re: 2.0.37 ready to release?)
On Wed, 12 Jun 2002, Greg Stein wrote: On Wed, Jun 12, 2002 at 04:54:16PM -0700, Greg Stein wrote: ... $ telnet cvs.apache.org 80 HEAD /viewcvs/ HTTP/1.0 host: cvs.apache.org HANG!!! No idea what is causing it right now, but I found it because I can't make updates on freshmeat.net. It tries to do a HEAD on the URLs you give it. This was happening when icarus was on an older Apache, then it got upgraded to 2.0.3??, and I think it started working (not sure). Now that it is at 2.0.37... it definitely doesn't work. Hmm. I left the telnet running while composing this email, and eventually the server closed the connection. So I re-ran it with time to see how long the timeout was... 5 minutes. After the timeout, then a good-looking response was generated. Cheers, Every time I see stuff like this it is a module trying to get input data when there is none. I'll look at this tonight. Ryan ___ Ryan Bloom [EMAIL PROTECTED] 550 Jean St Oakland CA 94610 ---
RE: Output filter behaviour
Hi guys, How do we handle errors in an output filter if the headers have already been sent? You can't. You must buffer all data until you know whether you will have an error or not. This should make sense: once the first block of data has been sent to the next filter, you have lost all control of the data, and the headers may have been sent to the client already. Once the headers have been sent to the client, you can't return an error at all. Suppose: static apr_status_t myfilter( ... ) { if( !ctx ) { } if( ctx->state++ == 0 ) { } else { /* headers already sent??? */ ... ERROR DETECTED HERE!!! } return ap_pass_brigade( f->next, bb ); } What will happen if I 'return APR_EGENERAL' when the error is detected? What is the right thing to do? I don't think anything will happen, except that the response will be terminated early. BTW, the proper way to report an error from within a filter is to create an error bucket that specifies the HTTP error code. Also, how does the http_header filter calculate 'content-length'? Aren't output filters feeding data to the client chunk by chunk? So how does it know how big it will be? The Content-Length filter calculates the content-length, not the http_header filter. It has a lot of logic to skip calculating the C-L, unless it absolutely has to. Remember, that we can use chunked encoding or, if it isn't a keepalive request, we can just terminate the connection, so we don't always need a C-L. The handler can also specify the C-L. If a filter modifies the data, then it has the responsibility to remove the C-L from the response headers. Ryan
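The "buffer everything until you know whether you'll fail" rule can be modeled with a self-contained toy filter: nothing goes downstream (so no headers are committed) until EOS arrives without an error having been detected. This is only an illustration of the control flow; a real httpd filter would set aside brigades and, on failure, pass an error bucket (ap_bucket_error_create) down the chain instead of returning -1:

```c
#include <string.h>

typedef struct {
    char buf[256];
    size_t len;
    int error;   /* 1 once an error has been detected */
    int sent;    /* 1 once data has been committed downstream */
} toy_filter;

/* Accumulate data; on a detected error, discard the buffer. */
static void filter_data(toy_filter *f, const char *data, int bad)
{
    if (f->error) return;
    if (bad) { f->error = 1; f->len = 0; return; }
    size_t n = strlen(data);
    memcpy(f->buf + f->len, data, n);
    f->len += n;
}

/* At EOS we finally know the outcome: either commit everything
 * (headers + body go out together) or report the error -- the client
 * never saw a partial, headers-already-sent response. */
static int filter_eos(toy_filter *f)
{
    if (f->error)
        return -1;
    f->sent = 1;
    return 0;
}
```

The cost, of course, is memory: this only works for responses small enough to hold back, which is why the error-bucket mechanism exists for everything else.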
RE: cvs commit: httpd-2.0/modules/ssl ssl_engine_kernel.c
It looks like the problem is only encountered if you have an ErrorDocument that is SSI parsed. I obviously hadn't configured everything properly when I was moving configurations around. :-( Ryan -- Ryan Bloom [EMAIL PROTECTED] 645 Howard St. [EMAIL PROTECTED] San Francisco, CA -Original Message- From: Paul J. Reder [mailto:[EMAIL PROTECTED]] Sent: Tuesday, June 11, 2002 6:59 AM To: [EMAIL PROTECTED] Subject: Re: cvs commit: httpd-2.0/modules/ssl ssl_engine_kernel.c Thank you kind sir, I have confirmed that this does the trick for me. The curious question is why Ryan was never able to reproduce it... [EMAIL PROTECTED] wrote: jwoolley 2002/06/10 21:54:01 Modified: modules/ssl ssl_engine_kernel.c Log: fix the infinite recursion problem with HTTP-on-the-HTTPS port. Reported by: Paul J. Reder Submitted by: Ryan Bloom Revision Changes Path 1.76 +7 -0 httpd-2.0/modules/ssl/ssl_engine_kernel.c Index: ssl_engine_kernel.c === RCS file: /home/cvs/httpd-2.0/modules/ssl/ssl_engine_kernel.c,v retrieving revision 1.75 retrieving revision 1.76 diff -u -d -u -r1.75 -r1.76 --- ssl_engine_kernel.c 11 Jun 2002 03:45:54 - 1.75 +++ ssl_engine_kernel.c 11 Jun 2002 04:54:01 - 1.76 @@ -199,6 +199,13 @@ thisurl, thisurl); apr_table_setn(r->notes, "error-notes", errmsg); + +/* Now that we have caught this error, forget it. we are done + * with using SSL on this request. + */ +sslconn->non_ssl_request = 0; + + return HTTP_BAD_REQUEST; } -- Paul J. Reder --- The strength of the Constitution lies entirely in the determination of each citizen to defend it. Only if every single citizen feels duty bound to do his share in this defense are the constitutional rights secure. -- Albert Einstein
RE: HEAD Executes CGI on HEAD
> From: Jerry Baker [mailto:[EMAIL PROTECTED]]
>
> Is it correct for Apache to be executing includes when a HEAD request is
> issued for a document that contains includes?

Yep. Apache treats a HEAD request exactly like a GET request, except that we
don't return the body. The HTTP spec states that we have to return the same
headers as we would return in a GET request, which usually means that we
need to actually run the request.

Ryan
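A tiny self-contained sketch of the behavior described: the handler runs in
full (so the headers, including any computed Content-Length, are identical
to a GET), and only the body write is suppressed for HEAD. The names here
(struct req, run_request, header_only) are illustrative, not the httpd
API, although httpd's request_rec does expose a similar r->header_only
flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: HEAD is GET minus the body. */
struct req {
    int header_only;         /* 1 for a HEAD request */
    size_t body_bytes_sent;  /* what actually went to the client */
};

/* Runs the request. Any includes/CGI work would happen here regardless
 * of the method; only the final body transmission is conditional. */
static size_t run_request(struct req *r, const char *body, size_t n)
{
    (void)body;
    if (!r->header_only) {
        r->body_bytes_sent = n;   /* GET: send the body */
    }
    return n;   /* length reported in the headers either way */
}
```

This is why SSI and CGI still execute on HEAD: the server cannot know the
correct headers without doing the work.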
RE: how many EOS buckets should a filter expect? (subrequest, PR 9644)
> From: [EMAIL PROTECTED] [mailto:trawick@rdu88-251-
>
> Initially I would think that a filter should see at most one EOS.
> mod_ext_filter doesn't have logic to ignore subsequent ones, resulting in
> a superfluous error message from a failed syscall when it tries to re-do
> some cleanup when it hits a second EOS.
>
> In this case, the subrequest is handled by default_handler, which passes
> down a FILE bucket and an EOS bucket. After that has completed,
> ap_finalize_sub_req_protocol() passes down another EOS bucket.
>
> Why does ap_finalize_sub_req_protocol() pass down an EOS? Isn't the
> handler responsible for that? Is this to clean up in case the handler
> encountered an error and failed to pass down an EOS?

Output filters can only support and expect a single EOS bucket. Input
filters, however, seem to be moving to a multi-EOS model.

ap_finalize_sub_req_protocol() sends down an EOS bucket just like
ap_finalize_request_protocol() does. That means that it is only sent if the
handler didn't send it. The sub-request's EOS is stripped off by the
SUB_REQ_FILTER, and is only used to signify the end of the sub-request.

Ryan
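The defensive pattern implied by the bug report above — run end-of-stream
cleanup exactly once even if a duplicate EOS arrives — can be sketched in
plain C. The names (struct filter_ctx, on_eos, eos_seen) are hypothetical;
in a real module the flag would live in the filter's per-request context,
and the "cleanup" would be closing pipes or reaping a child process.

```c
#include <assert.h>

/* Hypothetical filter context tracking whether EOS was already handled. */
struct filter_ctx {
    int eos_seen;
    int cleanups_run;
};

/* Returns 1 if cleanup ran, 0 if this EOS was a duplicate and was
 * ignored (avoiding a second, failing syscall). */
static int on_eos(struct filter_ctx *ctx)
{
    if (ctx->eos_seen)
        return 0;            /* duplicate EOS: do nothing */
    ctx->eos_seen = 1;
    ctx->cleanups_run++;     /* close pipes, reap child, etc. */
    return 1;
}
```

With this guard, the superfluous error message from mod_ext_filter's
repeated cleanup would not occur even when a second EOS reaches the filter.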
RE: how many EOS buckets should a filter expect? (subrequest, PR 9644)
> From: [EMAIL PROTECTED] [mailto:trawick@rdu88-251-
>
> Jeff Trawick [EMAIL PROTECTED] writes:
>
> I suspect you're talking about this line of code, which doesn't exist in
> CVS:
>
> Index: server/protocol.c
> ===================================================================
> RCS file: /home/cvs/httpd-2.0/server/protocol.c,v
> retrieving revision 1.105
> diff -u -r1.105 protocol.c
> --- server/protocol.c  7 Jun 2002 22:31:34 -  1.105
> +++ server/protocol.c  10 Jun 2002 18:33:54 -
> @@ -1033,7 +1033,10 @@
>  void ap_finalize_sub_req_protocol(request_rec *sub)
>  {
> -    end_output_stream(sub);
> +    /* tell the filter chain there is no more content coming */
> +    if (!sub->eos_sent) {
> +        end_output_stream(sub);
> +    }
>  }
>
> It probably should have been added here back in Sept 2000 when you added
> the check to ap_finalize_request_protocol(). I'll add it for the
> subrequest path now.

Yeah, it should have been added at the same time.

Ryan
RE: Recursive error processing.
I can't reproduce this. This test case is actually tested for in the test
suite. Which SSL library are you using? I was going off of the assumption
that the ap_discard_request_body() changes had broken this, but since I have
the most up-to-date code, I don't believe that the two are related. Please
make sure that your code is up to date, because the server is supposed to
have logic that protects us from getting into an infinite loop.

Wait a sec, the problem could be the ErrorDocument path. The test suite
doesn't exercise that path. Will report back soon.

Ryan

--
Ryan Bloom                  [EMAIL PROTECTED]
645 Howard St.              [EMAIL PROTECTED]
San Francisco, CA

> -----Original Message-----
> From: Ryan Bloom [mailto:[EMAIL PROTECTED]]
> Sent: Monday, June 10, 2002 2:55 PM
> To: [EMAIL PROTECTED]
> Subject: RE: Recursive error processing.
>
> > From: Paul J. Reder [mailto:[EMAIL PROTECTED]]
> >
> > While Allan Edwards and I were doing some testing of SSL we ran into a
> > case where we were able to send Apache into an infinite loop which
> > eventually consumed the machine's resources. The problem occurs if you
> > send a request to "http://some.where.com:443" (instead of
> > "https://some.where.com:443").
>
> This was working a few days ago, and is in fact a part of the test suite.
>
> > The problem seems to be related to the fact that ap_die should be
> > killing the custom_response and just dropping the connection (which is
> > what 1.3 does) rather than falling through and trying to send a custom
> > response via internal_redirect.
>
> 1.3 doesn't drop the connection, it sends a custom response.
>
> > Is this an artifact of the recent changes for 401/413 processing? Is
> > this symptomatic of a bigger problem of infinite loops during error
> > redirects?
> >
> > This all starts because the SSL post_read_request hook function
> > (ssl_hook_ReadReq) returns HTTP_BAD_REQUEST after finding
> > sslconn->non_ssl_request set to 1 (by ssl_io_filter_input after it
> > notices ssl_connect fails in ssl_hook_process_connection).
Hold on, I think I know what the problem is. I'll try to commit a fix in a few minutes. Ryan
RE: Recursive error processing.
> From: Cliff Woolley [mailto:[EMAIL PROTECTED]]
>
> On Mon, 10 Jun 2002, Ryan Bloom wrote:
>
> > Please make sure that your code is up to date, because the server is
> > supposed to have logic that protects us from getting into an infinite
> > loop.
>
> Paul, I notice the line numbers in your back trace don't quite match up
> with mine... is this HEAD? Or are there local mods?
>
> > Wait a sec, the problem could be the ErrorDocument path. The test suite
> > doesn't exercise that path. Will report back soon.
>
> Ah. Well, I'll wait for Ryan to check that then.

I've tried everything I can think of to make this fail. It refuses to fail
for me. Please make sure that your code is up to date, and let me know what
version of the SSL libraries you are using.

For completeness, here are my test cases:

1) Run the test suite (this tests http://localhost:8350, where 8350 is the
   SSL port). Also requested a page through telnet and Konqueror.
2) Add a plain-text ErrorDocument for 400 requests. Request a page.
3) Copy the HTTP_BAD_REQUEST.html.var files and the config to my test
   server, request a page.

All three scenarios work for me on Linux. There is a problem in the 3rd
case, which looks to be from a non-terminated string (bad, but not a buffer
overflow, we just forgot to add a '\0'). I'll fix that quickly.

Paul or Allan, can either of you provide more details? There really is logic
in the server to stop the ap_die calls from being recursive, so this bug
really surprises me.

Ryan
RE: Recursive error processing.
> I'm running with CVS head as of Friday morning with OpenSSL 0.9.6b
> [engine] 9 Jul 2001 on Linux (RedHat 7.2). I've attached my httpd.conf,
> ssl.conf, and config.nice files.
>
> I have been able to reproduce it on worker and prefork on two different
> Linux boxes (both RedHat 7.2). All I do is bring the box up and use
> Mozilla to send the request "http://sol.Reders:443" and watch the CPU
> start spinning.

Please update your tree. There were changes to how Apache handles calling
ap_die and ap_discard_request_body() on Friday evening.

Ryan
RE: Recursive error processing.
I don't have any ideas. I can't reproduce this problem though. I'll keep
debugging on my end. Cliff, this may take some time.

Ryan

--
Ryan Bloom                  [EMAIL PROTECTED]
645 Howard St.              [EMAIL PROTECTED]
San Francisco, CA

> -----Original Message-----
> From: Paul J. Reder [mailto:[EMAIL PROTECTED]]
> Sent: Monday, June 10, 2002 4:51 PM
> To: [EMAIL PROTECTED]
> Subject: Re: Recursive error processing.
>
> Bad news. I just finished running
>
>     cvs update -dP httpd-2.0; cd httpd-2.0; make distclean; buildconf;
>     config.nice; make; make install
>
> and tested it. The same thing still happens with the config I referenced
> earlier. Any other ideas?
>
> Paul J. Reder wrote:
> > Hmmm, I missed them. I'm updating and building now; I'll have an answer
> > shortly after dinner.
> >
> > Ryan Bloom wrote:
> > > > I'm running with CVS head as of Friday morning with OpenSSL 0.9.6b
> > > > [engine] 9 Jul 2001 on Linux (RedHat 7.2). I've attached my
> > > > httpd.conf, ssl.conf, and config.nice files.
> > > >
> > > > I have been able to reproduce it on worker and prefork on two
> > > > different Linux boxes (both RedHat 7.2). All I do is bring the box
> > > > up and use Mozilla to send the request "http://sol.Reders:443" and
> > > > watch the CPU start spinning.
> > >
> > > Please update your tree. There were changes to how Apache handles
> > > calling ap_die and ap_discard_request_body() on Friday evening.
> > >
> > > Ryan
>
> --
> Paul J. Reder
> -----------------------------------------------------------
> "The strength of the Constitution lies entirely in the determination of
> each citizen to defend it. Only if every single citizen feels duty bound
> to do his share in this defense are the constitutional rights secure."
> -- Albert Einstein