Open patches; all applied?
Quick question: are there any patches (other than some initialization checking that's been brought up) that I've misplaced, before rolling the next candidate of mod_aspdotnet? I did just apply Larry's fix for request host iterator allocation in the request instead of conf pool. Bill
Re: cvs commit: httpd-2.0/server protocol.c
[EMAIL PROTECTED] wrote: server protocol.c Log: This will put some messages in the error log when some people try a lame DoS by just opening a socket, and never sending any data.

+    else if (r->connection->keepalive != AP_CONN_KEEPALIVE) {
+        ap_log_rerror(APLOG_MARK, APLOG_NOTICE, rv, r,
+                      "request line read error.");
+    }

Is it possible to put a more descriptive message than "request line read error."? If I was to see that in a logfile, I would have absolutely no idea whatsoever what it was trying to tell me :( Regards, Graham --
Re: cvs commit: httpd-2.0/server protocol.c
On 25 Oct 2004 06:40:08 -, [EMAIL PROTECTED] wrote:

Index: protocol.c
===
RCS file: /home/cvs/httpd-2.0/server/protocol.c,v
retrieving revision 1.155
retrieving revision 1.156
diff -u -r1.155 -r1.156
--- protocol.c  23 Oct 2004 22:39:53 -  1.155
+++ protocol.c  25 Oct 2004 06:40:08 -  1.156
@@ -603,7 +603,10 @@
         r->proto_num = HTTP_VERSION(1,0);
         r->protocol = apr_pstrdup(r->pool, "HTTP/1.0");
     }
-
+    else if (r->connection->keepalive != AP_CONN_KEEPALIVE) {
+        ap_log_rerror(APLOG_MARK, APLOG_NOTICE, rv, r,
+                      "request line read error.");
+    }

1.3 issues such a message only if it is a timeout error. 1.3 uses LOGLEVEL_INFO. With LOGLEVEL_NOTICE there is no way to keep this out of the error log. Some users have relatively large numbers of connection-oriented problems (I see this most often with users in SE Asia). Something like this will result in a lot of messages for the types of conditions they encounter on a regular basis. Checking specifically for a timeout error (like 1.3*) and using LOGLEVEL_INFO seems appropriate. *In 1.3, it is the timeout handling which issues the message; I don't see any similar messages in 1.3 mainline.
Re: cvs commit: apache-1.3/src/modules/standard mod_include.c
On Fri, Oct 22, 2004 at 07:31:09PM -, Jim Jagielski wrote:

 if (d == len + dest) {
+    ap_log_rerror(APLOG_MARK, APLOG_NOERRNO|APLOG_ERR, r,
+                  "mod_include: directive length exceeds limit "
+                  "(%d) in %s", len+1, r->filename);

This adds a GCC warning on 64-bit platforms; it probably should be %lu and (unsigned long)len + 1 if it's really necessary to print the length.

mod_include.c: In function `get_directive':
mod_include.c:434: warning: int format, different type arg (arg 6)

joe
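Joe's suggested fix can be sketched in plain C (the helper name, snprintf call, and buffer handling here are illustrative, not the actual mod_include logging code): on LP64 platforms size_t is 64 bits wide while int is 32, so %d with a size_t argument draws exactly the warning quoted above, and casting to unsigned long with %lu is the portable pre-C99 spelling.

```c
#include <assert.h>
#include <stdio.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper mirroring the log call under discussion.
 * Passing a size_t where %d expects an int is what triggers the
 * "int format, different type arg" warning on 64-bit platforms. */
static int format_len(char *buf, size_t bufsz, size_t len)
{
    /* was: "... (%d) ...", len + 1   -- int format, size_t argument */
    return snprintf(buf, bufsz,
                    "mod_include: directive length exceeds limit (%lu)",
                    (unsigned long)(len + 1));
}
```

C99's %zu would be the cleaner spelling, but the 1.3 tree predates reliable C99 support, which is why the cast-to-unsigned-long idiom is the one suggested here.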
Re: Event MPM
Paul Querna wrote:

Non-KeepAlive: `ab -c 25 -n 10 http://10.10.10.10:6080/index.html`
Worker MPM: 2138.28
Event MPM: 2147.95

KeepAlive: `ab -k -c 25 -n 10 http://10.10.10.10:6080/index.html`
Worker MPM: 4396.38
Event MPM: 4119.40

Have you tried it with a higher number of clients -- i.e., -c 1024? We are interested in the event MPM mainly for dealing with keep-alives. -- Brian Akins Lead Systems Engineer CNN Internet Technologies
Re: cvs commit: httpd-2.0/server protocol.c
On 25-Oct-04, at 3:37 AM, Graham Leggett wrote:

+    else if (r->connection->keepalive != AP_CONN_KEEPALIVE) {
+        ap_log_rerror(APLOG_MARK, APLOG_NOTICE, rv, r,
+                      "request line read error.");
+    }

Is it possible to put a more descriptive message than "request line read error."? If I was to see that in a logfile, I would have absolutely no idea whatsoever what it was trying to tell me :(

I believe that ap_log_rerror() inserts the OS error description, resulting in something like:

[client 127.0.0.1] (70007)The timeout specified has expired: request line read error

Of course, this is OS dependent.
Re: cvs commit: httpd-2.0/server protocol.c
Rici Lake wrote: I believe that ap_log_rerror() inserts the OS error description, resulting in something like: [client 127.0.0.1] (70007)The timeout specified has expired: request line read error Of course, this is OS dependent. That still means little to me as an end user, as I would think a "request line read error" was some OS error that I needed to be concerned about, leading me on a wild goose chase. What would make more sense is "Error while reading HTTP request line. (remote browser didn't send a request?)". This indicates exactly what httpd was trying to do when the error occurred, and gives a hint of why the error might have occurred. The clearer the error message, the less chance of an admin wasting time fiddling in the dark trying to understand what it all means. Regards, Graham --
Re: cvs commit: httpd-2.0/server core.c protocol.c request.c scoreboard.c util.c util_script.c
Brad Nicholes wrote: -1 as well. This is now causing compiler errors on NetWare. Please revert this patch! Can you provide an indication of exactly what broke so we will know what to avoid in future. Or was the breakage actually due to the mod_cache problem reported last night? Thanks, Allan
Re: cvs commit: httpd-2.0/server protocol.c
Graham Leggett wrote: Rici Lake wrote: I believe that ap_log_rerror() inserts the OS error description, resulting in something like: [client 127.0.0.1] (70007)The timeout specified has expired: request line read error Of course, this is OS dependent. That still means little to me as an end user, as I would think a "request line read error" was some OS error that I needed to be concerned about, leading me on a wild goose chase. What would make more sense is "Error while reading HTTP request line. (remote browser didn't send a request?)". This indicates exactly what httpd was trying to do when the error occurred, and gives a hint of why the error might have occurred. Well.. Apache isn't just an HTTP server, right? Now committed: "Error while reading request line. (client didn't send a request?)" I also changed it to INFO. I forgot how much NOTICE sucked. Thanks for the comments. -Paul
Re: Mod_Cache build problem on NetWare...
Thanks for reporting the problem Norm, The patch was submitted. Jean-Jacques

[EMAIL PROTECTED] 10/24/04 5:08 PM

Greetings All, Just trying a 'build' of current 2.1 CVS on a Windows machine for NetWare and get the following error...

Calling NWGNUmod_cach
Compiling cache_util.c
Compiling mod_cache.c
### mwccnlm Compiler:
#   File: mod_cache.c
#
#     805: (*new) = header;
#   Error: ^
#   illegal implicit conversion from 'const char *' to 'char *'
Errors caused tool to abort.
make[3]: *** [Release/mod_cache.o] Error 1
make[2]: *** [Release/mod_cach.nlm] Error 2
make[1]: *** [experimental] Error 2
make: *** [modules] Error 2

Have a good week, Norm
Re: cvs commit: httpd-2.0/server core.c protocol.c request.c scoreboard.c util.c util_script.c
mod_cache is a different issue. The compiler used to build the NetWare NLMs is very sensitive to type mismatches.

@@ -3793,7 +3794,7 @@
     core_net_rec *net = f->ctx;
     core_ctx_t *ctx = net->in_ctx;
     const char *str;
-    apr_size_t len;
+    apr_ssize_t len;

Changing the type from apr_size_t to apr_ssize_t introduced a type mismatch in the call to apr_bucket_read(), which expects an apr_size_t. Type-casting it back to an apr_size_t to fix the problem seems like it would have defeated the whole purpose of doing it in the first place. Besides, apr_bucket_read() can't give you back anything bigger than an apr_size_t anyway. Brad

[EMAIL PROTECTED] Monday, October 25, 2004 10:01:53 AM Brad Nicholes wrote: -1 as well. This is now causing compiler errors on NetWare. Please revert this patch! Can you provide an indication of exactly what broke so we will know what to avoid in future. Or was the breakage actually due to the mod_cache problem reported last night? Thanks, Allan
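Brad's point can be illustrated with a standalone sketch, using plain size_t/ssize_t in place of the APR typedefs and a stand-in for apr_bucket_read (none of these names are the real APR API): a pointer to a signed length variable is not compatible with a function that writes through an unsigned one, so keeping the variable unsigned and converting once after the call avoids both the strict compiler error and the pointless round-trip cast.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Stand-in for apr_bucket_read(): reports a length through an
 * unsigned out-parameter, just as the real call uses apr_size_t*. */
static int read_len(const char *data, size_t *len)
{
    size_t n = 0;
    while (data[n] != '\0')
        n++;
    *len = n;
    return 0;
}

static ssize_t bytes_read(const char *data)
{
    size_t len;            /* unsigned, like apr_size_t */
    /* ssize_t slen; read_len(data, &slen);
     *   -- rejected by strict compilers: ssize_t* is not size_t* */
    read_len(data, &len);
    return (ssize_t)len;   /* convert once, after the call */
}
```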
Re: Mod_Cache build problem on NetWare...
Jean-Jacques Clar wrote: Thanks for reporting the problem Norm, The patch was submitted. Is casting to (char *) really the right thing to do here? I mean why not just make new const so the cast isn't needed at all? It seems like a cast in this situation is a bit ugly... -garrett
Re: Event MPM
Brian Akins wrote: Paul Querna wrote: Non-KeepAlive: `ab -c 25 -n 10 http://10.10.10.10:6080/index.html` Worker MPM: 2138.28 Event MPM: 2147.95 KeepAlive: `ab -k -c 25 -n 10 http://10.10.10.10:6080/index.html` Worker MPM: 4396.38 Event MPM: 4119.40 Have you tried it with a higher number of clients -- i.e., -c 1024? Nope. I was already maxing out my 100mbit LAN at 25 clients. I don't have a good testing area for static content request benchmarking. I am thinking of trying to find an old Pentium I with PCI and putting a GigE card in it just for benchmarking. The current patch does not spawn new threads on demand. You need to set ThreadsPerChild 1200 to test that many clients. This is on my short list of things to change before committing the Event MPM to CVS. We are interested in the event MPM mainly for dealing with keep-alives. Yes, this is the target the Event MPM aims at :) -Paul Querna
Re: Event MPM
Paul Querna wrote: Brian Akins wrote: We are interesting in the event mpm mainly for dealing with keep alives. Yes, this is the target the Event MPM aims at :) If I understand the nature of the patch correctly then you don't need to go increasing the number of clients at all. Instead you should be looking at changing ab to pause for a few seconds between every two requests carried out over the same connection. With that change, and limiting the maximum number of processes to, say, 100, you should be able to obtain meaningful results with your current setup. Worker MPM will max out at some point, with Event MPM going past it to max at a higher value. -- ModSecurity (http://www.modsecurity.org) [ Open source IDS for Web applications ]
Re: Event MPM
Paul Querna wrote: Brian Akins wrote: Paul Querna wrote: [...] Have you tried it with higher number of clients -- i.e,. -c 1024? Nope. I was already maxing out my 100mbit LAN at 25 clients. I don't have a good testing area for static content request benchmarking. I am thinking of trying to find an old pentium I with PCI and putting a GigE card in it just for benchmarking. How about modifying ab to add a delay of a second or two between successive read(2) calls on the same connection (and limiting the max read size to a small value, to make sure each response requires multiple reads)? The throughput numbers wouldn't be very impressive, of course, but you'd be able to see how much memory and how many threads the Event MPM uses in handling a real-world number of concurrent connections, compared to Worker. The current patch does not spawn new threads on demand. You need to set ThreadsPerChild 1200 to test that many clients. This is on my short list of things to change before committing the Event MPM to CVS. This part seems surprising. What's the relationship between clients and threads in the Event MPM--does it use a thread per connection? I haven't had a chance to check out the Event MPM yet, but I'm planning to download it and study the code later this week. For what it's worth, I can think of at least one killer app for the Event MPM beyond keep-alive handling: as an efficient (low memory thread use per connection) connection multiplexer in front of an inefficient (high memory thread use per connection) appserver. -Brian
Re: Event MPM
++1 for moving this along. For any newbies, the object of the game is to increase httpd scalability by reducing the number of worker threads and their associated stack memory. Paul Querna wrote: First is a patch to APR that provides an extension to apr_pollset that optionally make some parts of it threadsafe. This patch has been submitted to APR before, and hopefully it will get accepted there soon. I need to catch up here. quick questions: will this event MPM work with conventional poll (i.e. no epoll or kqueue)? Can it handle a worker thread adding directly to the pollset? The other patch is the actual MPM and with a few changes to other parts of the source to take advantage of the Event Thread for handling Keep Alive Requests. IMO the few changes to other parts are the most important to do first. If we can get consensus on the core parts and commit them to 2.1, much of the work can be done in the experimental mpm dir without bothering uninterested parties. These patches total over 160k. Since many mail servers are extremely lame, I have put them up on the web at: http://www.apache.org/~pquerna/event-mpm/ diffing worker.c to event.c and worker/fdqueue.c to event/fdqueue.c should make the mpm changes easier to follow. MPM Structure: I have made the MPM a single process with multiple threads interesting! but then per process limits (fd's esp. for sockets, virtual memory) could put a ceiling on how high this can scale. Plus if a buggy module seg faults, it's a lot more disruptive when there's only one worker process. It currently does not spawn new threads on demand, but I plan to add this soon. By making it run as a single process, I was able to completely remove the accept() locking. This also greatly simplifies the Listener Thread's poll()`ing of the sockets. understood, but see above. My vote is to keep it multiprocess capable and perhaps allow a single process ./configure option. 
The Listener thread includes all the Sockets that it is listening to for incoming requests on in the same pollset as all requests waiting for IO/KeepAlive. Greg's original patch had a separate 'Event' Thread, but I have chosen to combine them. +1 That means that some of the changes to how fdqueue.c manipulates idlers aren't strictly necessary since the listener is the only thread that ever tries to reserve a worker thread. OTOH I don't think those changes to fdqueue.c hurt anything either so it's a minor detail. The Worker Threads work much like the 'Worker' MPM, but when a connection is ready for a Keep Alive, they will push the client back to the Listener/Event Thread. This thread does not need to be woken up like in Greg Ames' patch. This is because of the enhancement to apr_pollset that enables other threads to _add() or _remove() while the main thread is inside a _poll(). even with plain ol' vanilla poll() ? The place where the Event MPM should shine is with the more common case of relatively high-latency Internet clients. The Event MPM isn't super powerful on the KeepAlive-over-LAN case because it forces a context switch to process the client again when it does another request as part of a KeepAlive. I heard of an application where many clients send a heartbeat to a server over http every few seconds. That would benefit greatly by using keepalives and some form of event MPM, even over a LAN. Greg
Re: Event MPM
Brian Pane wrote: Paul Querna wrote: Brian Akins wrote: Have you tried it with higher number of clients -- i.e,. -c 1024? Nope. I was already maxing out my 100mbit LAN at 25 clients. I don't have a good testing area for static content request benchmarking. I am thinking of trying to find an old pentium I with PCI and putting a GigE card in it just for benchmarking. I'd love to find one like this too. I sometimes use a 180 MHz Pentium Pro box. But it only has 160M of EDO memory which is hard to scrounge around here, and probably can't use modern hard drives effectively due to BIOS limitations. How about modifying ab to add a delay of a second or two between successive read(2) calls on the same connection an excellent idea. Or maybe add a delay after any read if -k is in effect. The current patch does not spawn new threads on demand. You need to set ThreadsPerChild 1200 to test that many clients. This is on my short list of things to change before committing the Event MPM to CVS. This part seems surprising. What's the relationship between clients and threads in the Event MPM--does it use a thread per connection? one thread per connection with an active http request, plus the listener/event thread who owns all the connections in keepalive. I believe Paul is saying set ThreadsPerChild to 1200 to handle the worst case behavior - 100% of the connections are doing real work at some instant and none are in keepalive timeouts. Greg
Re: Event MPM
Greg Ames wrote: First is a patch to APR that provides an extension to apr_pollset that optionally makes some parts of it threadsafe. This patch has been submitted to APR before, and hopefully it will get accepted there soon. I need to catch up here. Quick questions: will this event MPM work with conventional poll (i.e. no epoll or kqueue)? Can it handle a worker thread adding directly to the pollset? The patch to APR adds a new 'APR_POLLSET_THREADSAFE' flag for apr_pollset_create(...). This only works for the EPoll and KQueue backends. This allows a worker thread to _add() or _remove() directly, without having to touch the thread in _poll(). It could be implemented for plain Poll by having the pollset contain an internal pipe. This pipe could be pushed by an _add() or _remove() and force the _poll() thread to wake up. The thread in _poll() could then add or remove any sockets from its set, and then start the _poll() again. This is how your original patch did it, but we could push it down to APR. I don't think in the end it will yield good performance with this method. My personal view is that most platforms will have either EPoll or KQueue support. I guess it's only an issue for people running 2.2/2.4 Linux, since KQueue has been in *BSD for a long time. As time passes, I believe that implementing it for plain poll() will become less important. Other platforms like Solaris have better poll() replacements that we should add to apr_pollset_*. [...snip...] I have made the MPM a single process with multiple threads interesting! but then per-process limits (fd's esp. for sockets, virtual memory) could put a ceiling on how high this can scale. Plus if a buggy module seg faults, it's a lot more disruptive when there's only one worker process. We don't have buggy modules :-) Isn't the per-process FD limit only really a problem for platforms like older Solaris? I thought that on Linux/BSD it is quite high by default now.
It currently does not spawn new threads on demand, but I plan to add this soon. By making it run as a single process, I was able to completely remove the accept() locking. This also greatly simplifies the Listener Thread's poll()`ing of the sockets. understood, but see above. My vote is to keep it multiprocess capable and perhaps allow a single process ./configure option. I am not completely averse to putting multi-processes back in; I mostly wanted to get a working patch out there for review. The current logic used in Worker is flawed for poll`ing of listening sockets. I just ripped all of it out, and ended up with a much simpler MPM that avoids extra locking. [...snip...] The Worker Threads work much like the 'Worker' MPM, but when a connection is ready for a Keep Alive, they will push the client back to the Listener/Event Thread. This thread does not need to be woken up like in Greg Ames' patch. This is because of the enhancement to apr_pollset that enables other threads to _add() or _remove() while the main thread is inside a _poll(). even with plain ol' vanilla poll() ? Nope. It's just not easy to implement and support plain old vanilla poll(). You cannot just add an FD to its array of sockets from other threads. The poll() will also not see any changes in its list until you wake it up and start the poll() again. This is not impossible to implement, but it's going to have a huge number of context switches for what should be a very common operation. [...snip..] The place where the Event MPM should shine is with the more common case of relatively high-latency Internet clients. The Event MPM isn't super powerful on the KeepAlive-over-LAN case because it forces a context switch to process the client again when it does another request as part of a KeepAlive. I heard of an application where many clients send a heartbeat to a server over http every few seconds. That would benefit greatly by using keepalives and some form of event MPM, even over a LAN. Yup.
Another place the event MPM would rock is on an FTP server. (see mod_ftpd). Yet Another place is an IMAPv4 server written as an Apache Module. (This is on my ~/TODO list) Thanks for the Feedback! -Paul Querna
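The "internal pipe" workaround Paul describes for a plain-poll() backend is the classic self-pipe trick. A minimal standalone sketch of the idea, with illustrative names rather than actual APR API: the poller always watches the read end of a private pipe, and any thread that mutates the set writes one byte, which wakes the poller so it can rebuild its descriptor array before polling again.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical pollset carrying an internal wakeup pipe. */
typedef struct {
    int wakeup_read;   /* read end: always included in the poll set */
    int wakeup_write;  /* worker threads write here from _add()/_remove() */
} pollset_t;

static int pollset_init(pollset_t *ps)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;
    ps->wakeup_read = fds[0];
    ps->wakeup_write = fds[1];
    return 0;
}

/* Called by a worker thread; forces the poller out of poll(). */
static int pollset_interrupt(pollset_t *ps)
{
    char b = 1;
    return (write(ps->wakeup_write, &b, 1) == 1) ? 0 : -1;
}

/* Poller side: wait, then drain the pipe if the wakeup fd fired.
 * Returns 1 when woken by another thread (rescan the set), else 0. */
static int pollset_wait(pollset_t *ps, int timeout_ms)
{
    struct pollfd pfd = { ps->wakeup_read, POLLIN, 0 };
    int n = poll(&pfd, 1, timeout_ms);
    if (n > 0 && (pfd.revents & POLLIN)) {
        char buf[64];
        ssize_t drained = read(ps->wakeup_read, buf, sizeof buf);
        return drained > 0 ? 1 : 0;
    }
    return 0;
}
```

The cost Paul alludes to is visible here: every _add() or _remove() is a write(2) plus a wakeup plus a full rescan of the descriptor array, whereas epoll and kqueue mutate the kernel-side set directly without disturbing the waiting thread.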
Re: Event MPM
Brian Pane wrote: How about modifying ab to add a delay of a second or two between successive read(2) calls on the same connection (and limiting the max read size to a small value, to make sure each response requires multiple reads)? The throughput numbers wouldn't be very impressive, of course, but you'd be able to see how much memory and how many threads the Event MPM uses in handling a real-world number of concurrent connections, compared to Worker. +1, I will look at patching this tonight. I think some other HTTP benchmarkers might already have this feature. Flood? The current patch does not spawn new threads on demand. You need to set ThreadsPerChild 1200 to test that many clients. This is on my short list of things to change before committing the Event MPM to CVS. This part seems surprising. What's the relationship between clients and threads in the Event MPM--does it use a thread per connection? A thread per connection that is currently being processed. Note that this is not the traditional 'event' model that people write huge papers about and thttpd raves about, but rather a hybrid that uses a Worker Thread to do the processing, and a single 'event' thread to handle places where we are waiting for IO (currently accept() and Keep Alive requests). Perhaps it needs a different name? (eworker?) A future direction to investigate would be to make all of the initial header parsing be done async, and then switch to a Worker thread to perform all the post_read hooks, handlers, and filters. I believe this could be done without breaking many 3rd party modules. (SSL could be a bigger challenge here...) I haven't had a chance to check out the Event MPM yet, but I'm planning to download it and study the code later this week.
For what it's worth, I can think of at least one killer app for the Event MPM beyond keep-alive handling: as an efficient (low memory thread use per connection) connection multiplexer in front of an inefficient (high memory thread use per connection) appserver. Yes, another thing on my ~/TODO is to use this design as the base for a perchild replacement. My idea is to use a lightweight 'event' frontend to determine which backend to pass a socket to. These backends would be running as different UIDs, and could be running in a threaded or non-threaded manner to optionally support PHP. -Paul Querna
Re: Event MPM
Paul Querna wrote: A thread per connection that is currently being processed. Note that this is not the traditional 'event' model that people write huge papers about and thttpd raves about, but rather a hybrid that uses a Worker Thread to do the processing, and a single 'event' thread to handle places where we are waiting for IO (currently accept() and Keep Alive requests). Perhaps it needs a different name? (eworker?) A future direction to investigate would be to make all of the initial header parsing be done async, and then switch to a Worker thread to perform all the post_read hooks, handlers, and filters. I believe this could be done without breaking many 3rd party modules. (SSL could be a bigger challenge here...) Some academics have played with this model of an event thread + worker threads. Best I can find is the Java 'SEDA: An Architecture for Highly Concurrent Server Applications'. [1] They have some pretty graphs in their papers of their Haboob web server, which can scale very nicely compared to traditional event or pure-threaded servers. [2][3] -Paul Querna [1] http://www.eecs.harvard.edu/~mdw/proj/seda/ [2] http://www.enhyper.com/content/eventsbadidea.pdf [3] http://www.cis.upenn.edu/~hhl/cse434/lectures/seda.pdf
Re: Event MPM
Greg Ames wrote: one thread per connection with an active http request, plus the listener/event thread who owns all the connections in keepalive. I believe Paul is saying set ThreadsPerChild to 1200 to handle the worst case behavior - 100% of the connections are doing real work at some instant and none are in keepalive timeouts. Can you still have multiple processes? We use 10k plus threads per box with worker. -- Brian Akins Lead Systems Engineer CNN Internet Technologies
Re: Mod_Cache build problem on NetWare...
Garrett Rooney wrote: Jean-Jacques Clar wrote: Thanks for reporting the problem Norm, The patch was submitted. Is casting to (char *) really the right thing to do here? I mean why not just make new const so the cast isn't needed at all? It seems like a cast in this situation is a bit ugly... -garrett

Am I missing anything, or would we need *new to be const, and not new?

Regards, Rüdiger
Re: Event MPM
Brian Akins wrote: Can you still have multiple processes? We use 10k plus threads per box with worker. with my patch, yes. with Paul's, no. But Paul's has some very nice features that mine doesn't have, so I think a hybrid is the way to go. Assuming you have a high percentage of threads in keepalive timeouts, you will be able to cut down on the number of threads per box. Greg
Re: Event MPM
Greg Ames wrote: Assuming you have a high percentage of threads in keepalive timeouts, you will be able to cut down on the number of threads per box. Yes, we do. I can certainly provide you guys with some testing, if nothing else. We have some home grown benchmarks that could help, not to mention lots of live traffic :) -- Brian Akins Lead Systems Engineer CNN Internet Technologies
Re: Event MPM
Brian Akins wrote: I can certainly provide you guys with some testing, if nothing else. excellent! We have some home grown benchmarks that could help, If they simulate user think time or would otherwise cause a lot of keepalive timeouts, great! Finding the right client/benchmark is a problem for me right now and I believe for Paul too. not to mention lots of live traffic :) ummm, I'm glad you put a smiley on the end of that :) Greg
Re: Event MPM
Greg Ames wrote: Brian Akins wrote: Can you still have multiple processes? We use 10k plus threads per box with worker. with my patch, yes. with Paul's, no. Correct at the moment. I think based on the feedback so far, I will investigate making it multi-processed again.
Re: Event MPM
Greg Ames wrote: Brian Akins wrote: We have some home grown benchmarks that could help, If they simulate user think time or would otherwise cause a lot of keepalive timeouts, great! Finding the right client/benchmark is a problem for me right now and I believe for Paul too. Yup. This is my biggest problem right now with the benchmarks. `ab` just doesn't give a realistic model of how clients use Keep Alives. If you have some home grown programs that do, I would love to use them.
Re: Event MPM
Paul Querna wrote: This only works for the EPoll and KQueue backends. This allows a worker thread to _add() or _remove() directly, without having to touch the thread in _poll(). It could be implemented for plain Poll by having the pollset contain an internal pipe. This Pipe could be pushed by an _add() or _remove() and force the _poll() thread to wake up. The thread in _poll() could then add or remove any sockets from it's set, and then start the _poll() again. This is how your original patch did it, but we could push it down to APR. let's stick with your version at least for now. I don't think in the end it will yield good performance with this method. agreed. I wasn't happy with the extra pipe and the overhead, though some of that could go away. The point was to get something out there as a proof-of-concept that would work for just about anybody. Today the barrier to entry is much lower in the Linux world anyway. I'm not sure if a distro with a 2.6 kernel will install easily on my Pentium Pro benchmarking box, but that's a pretty weird environment. Greg
More musings about asynchronous MPMs Re: Event MPM
Paul Querna wrote: Paul Querna wrote: A thread per-connection that is currently being processed. Note that this is not the traditional 'event' model that people write huge papers about and thttpd raves about, but rather a hybrid that uses a Worker Thread todo the processing, and a single 'event' thread to handle places where we are waiting for IO. (Currently accept() and Keep Alive Requests). Perhaps it needs a different name? (eworker?) A future direction to investigate would be to make all of the initial Header parsing be done Async, and then switch to a Worker thread to preform all the post_read hooks, handlers, and filters. I believe this could be done without breaking many 3rd party modules. (SSL could be a bigger challenge here...) Some academics have played with this model of a event thread + worker threads. Best I can find is the Java 'SEDA: An Architecture for Highly Concurrent Server Applications'. [1] Yeah, SEDA's model of processing stages--basically a succession of thread pools through which a request passes, each with a queue in front--looks like a promising way of mixing event-based and non-event-based processing. Another approach I've looked at is to use Schmidt's Reactor design pattern, in which a central listener thread dispatches each received event to one of a pool of worker threads. The worker is then allowed to do arbitrarily complex processing (provided you have enough worker threads) before deciding to tell the listener thread that it wants to wait for another event. I did some prototyping of a Leader/Followers variant of this: have a pool of worker threads that take turns grabbing the next event from a queue of received events. If there are no events left in the queue, the next worker thread does a select/poll/etc (and any other idle workers block on a condition variable). 
This seemed to work reasonably well in a toy test app (it was *almost* capable of implementing HTTP/0.9, but I never had time to try 1.0 or 1.1; oh, and it depended on the java.nio classes to do the select efficiently, so I didn't even try to commit it to server/mpm/experimental. :-) On a more general topic...and at the risk of igniting a distracting debate...does anybody else out there have an interest in doing an async httpd in Java? There are a lot of reasons *not* to do so, mostly related to all the existing httpd-2.0 modules that wouldn't work. The things that seem appealing about trying a Java-based httpd, though, are: - The pool memory model at the core of httpd-1.x and -2.x isn't well suited to MPM designs where multiple threads need to handle the same connection--possibly at the same time, for example when a handler that's generating content needs to push output buckets to an I/O completion thread to avoid having to block the (probably heavyweight) handler thread or make it event-based. Garbage collection on a per-object basis would be a lot easier. - Modern Java implementations seem to be doing smart things from a scalability perspective, like using kqueue/epoll/etc. - And (minor issue) it's a lot easier to refactor things in Java than in C, and I expect that building a good async MPM that handles dynamic content and proxying effectively will require a lot of iterations of design trial and error. Brian
Re: Mod_Cache build problem on NetWare...
What about doing:

 static const char *add_ignore_header(cmd_parms *parms, void *dummy,
-                                     const char *header)
+                                     char *header)
 {
     cache_server_conf *conf;
     char **new;
@@ -802,7 +802,7 @@
      * (When 'None' is passed, IGNORE_HEADERS_SET nelts == 0.)
      */
     new = (char **)apr_array_push(conf->ignore_headers);
-    (*new) = (char *)header;
+    (*new) = header;

[EMAIL PROTECTED] 10/25/04 10:32 AM Jean-Jacques Clar wrote: Thanks for reporting the problem Norm, The patch was submitted. Is casting to (char *) really the right thing to do here? I mean why not just make new const so the cast isn't needed at all? It seems like a cast in this situation is a bit ugly... -garrett
Re: Mod_Cache build problem on NetWare...
Jean-Jacques Clar wrote: What about doing:

 static const char *add_ignore_header(cmd_parms *parms, void *dummy,
-                                     const char *header)
+                                     char *header)
 {
     cache_server_conf *conf;
     char **new;
@@ -802,7 +802,7 @@
      * (When 'None' is passed, IGNORE_HEADERS_SET nelts == 0.)
      */
     new = (char **)apr_array_push(conf->ignore_headers);
-    (*new) = (char *)header;
+    (*new) = header;

That seems fine, the only reason I didn't suggest it was that I wasn't sure if we were ever passing anything const in as an argument; if we aren't, then just making header non-const should clean up the problem nicely without the cast. -garrett
Event MPM w/ multiple processes
Brian Akins wrote: Greg Ames wrote: one thread per connection with an active http request, plus the listener/event thread who owns all the connections in keepalive. I believe Paul is saying set ThreadsPerChild to 1200 to handle the worst case behavior - 100% of the connections are doing real work at some instant and none are in keepalive timeouts. Can you still have multiple processes? We use 10k plus threads per box with worker. The updated patch for today adds multiple processes. (same directives as the worker MPM): http://www.apache.org/~pquerna/event-mpm/event-mpm-2004-10-25.patch However, the big thing it doesn't use is accept serialization. This means all event threads are listening for incoming clients. The first one to process the incoming connection gets it. This does not block the other event threads, since they set the listening socket to non-blocking before starting their loop. This seems to work fine on my tests. It has the sucky side effect of waking up threads sometimes when they are not needed, but on a busy server, trying to accept() will likely be fine, as there will be a backlog of clients to accept(). -Paul Querna
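Paul's no-serialization scheme -- every event thread calls accept() on a shared listener that has been made non-blocking, so losing the race costs only an EAGAIN instead of a block -- can be sketched in plain POSIX C. This is an illustration of the pattern, not code from the patch; names like try_accept and make_nonblocking_listener are made up for the sketch.

```c
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Attempt a non-blocking accept. Returns the client fd, or -1 if there
 * was nothing to accept (another thread won the race, or the wakeup was
 * spurious) -- the caller should just go back to polling. */
static int try_accept(int listen_fd)
{
    int client = accept(listen_fd, NULL, NULL);
    if (client < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -1;
    return client;
}

/* Create a loopback listener on an ephemeral port and mark it
 * non-blocking, as each event thread would see it before its loop. */
static int make_nonblocking_listener(void)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                       /* ephemeral port */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, 128) < 0 ||
        fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

The "sucky side effect" Paul mentions is the thundering-herd wakeup: several threads may poll ready at once, but all except the winner immediately get EAGAIN and return to their loops rather than blocking.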
mod_cache: Content Generation Dependencies?
I have been doing some stuff with mod_transform (XSLT processor) and mod_cache. The problem is, mod_cache doesn't have any easy way to know if a request needs to be regenerated. Right now, it just blindly caches until a timeout. What I would prefer is that it knows what files or URLs a specific request depends upon, and if any of those change, then regenerate the request. An example:

cache_add_depends(r, "/home/httpd/site/xsl/foo.xsl");

This would add 'foo.xsl' as a dependency of the current request. If the file's mtime changes, mod_cache would invalidate the cache of the current request. Any opinions or suggestions? A stat() call on several files is hundreds of times faster than having mod_transform re-generate the output. While I would hate to stat() hundreds of files on every request, this method could eliminate all unnecessary regeneration of cached content. -Paul Querna
Re: mod_cache: Content Generation Dependencies?
Paul Querna wrote: I have been doing some stuff with mod_transform (XSLT processor) and mod_cache. The problem is, mod_cache doesn't have any easy way to know if a request needs to be regenerated. Right now, it just blindly caches until a timeout. What I would prefer is that it knows what files or URLs a specific request depends upon, and if any of those change, then regenerate the request. An example: cache_add_depends(r, "/home/httpd/site/xsl/foo.xsl"); This would add 'foo.xsl' as a dependency of the current request. If the file's mtime changes, mod_cache would invalidate the cache of the current request. Any opinions or suggestions? A stat() call on several files is hundreds of times faster than having mod_transform re-generate the output. While I would hate to stat() hundreds of files on every request, this method could eliminate all unnecessary regeneration of cached content.

The cache has no knowledge about underlying files, never mind multiple dependencies. It relies on HTTP/1.1 to work out cache freshness. Dependencies should be tracked by mod_transform, not mod_cache - if either the source file or the XSL file changes, then the ETag should change, which will signal mod_cache (and any other caching proxies along the way) that the content is no longer fresh. If mod_transform isn't supporting ETag properly, then I'd say mod_transform was broken, and fixing it would probably solve your problem. Regards, Graham --
Re: cvs commit: httpd-2.0/server protocol.c
What would make more sense is "Error while reading HTTP request line. (remote browser didn't send a request?)". This indicates exactly what httpd was trying to do when the error occurred, and gives a hint of why the error might have occurred.

We used to have such a message. It was removed from httpd because too many users complained about the log file growing too fast, particularly since that is the message which will be logged every time a browser connects and then its initial request packet gets dropped by the network. This is not an error that the server admin can solve -- it is normal life on the Internet. We really shouldn't be logging it except when on DEBUG level. Roy
Re: cvs commit: httpd-2.0/server protocol.c
On 25-Oct-04, at 11:04 PM, Roy T. Fielding wrote: This is not an error that the server admin can solve -- it is normal life on the Internet. We really shouldn't be logging it except when on DEBUG level. That was my first reaction, too. However, Ivan Ristic pointed out that (in some cases, anyway) it is the result of a DoS attack, and there may be something a server admin can do about it. Or, if not, at least they might know why their server is suddenly performing badly. For example, we had a problem report on #apache a couple of days ago which turned out, after considerable investigation, to be the result of a single host IP issuing hundreds of connections in a few minutes. Whether this was a deliberate attack or simply a buggy client is not clear (to me), but the temporary solution of blocking the IP address was certainly within the server admin's abilities.
Re: Mod_Cache build problem on NetWare...
--On Monday, October 25, 2004 3:56 PM -0600 Jean-Jacques Clar [EMAIL PROTECTED] wrote: What about doing:

static const char *add_ignore_header(cmd_parms *parms, void *dummy,
-                                    const char *header)
+                                    char *header)

Um, no, you can't change that. The prototype for the ITERATE directive is:

const char *(*take1) (cmd_parms *parms, void *mconfig, const char *w);

(see cmd_func in http_config.h) I wonder if we could change the cast on apr_array_push's result to (const char **) instead. I don't see a particular reason why that wouldn't work. -- justin
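Justin's suggestion in miniature: since the ITERATE prototype forces `header` to stay `const char *`, cast the slot returned by the push function to `(const char **)` rather than casting const away from the value. The sketch below uses a toy stand-in for apr_array_push (a real build would use APR; toy_array and its push are made-up names) purely to show the cast pattern compiling cleanly.

```c
#include <stdlib.h>

/* Toy stand-in for an APR-style array: push grows the storage and
 * returns a pointer to the new, uninitialized slot as void *. */
typedef struct {
    void *elts;
    int nelts;
    int nalloc;
    size_t elt_size;
} toy_array;

static void *toy_array_push(toy_array *a)
{
    if (a->nelts == a->nalloc) {
        a->nalloc = a->nalloc ? a->nalloc * 2 : 4;
        a->elts = realloc(a->elts, (size_t)a->nalloc * a->elt_size);
    }
    return (char *)a->elts + a->elt_size * (size_t)a->nelts++;
}

/* The const-correct version of the mod_cache hunk: `header` keeps its
 * const qualifier, and the cast lands on the slot, not the value. */
static void add_ignore_header(toy_array *ignore_headers, const char *header)
{
    const char **new = (const char **)toy_array_push(ignore_headers);
    *new = header;          /* no (char *) cast needed */
}
```

This keeps the command-handler signature that http_config.h demands while getting rid of the ugly cast garrett objected to.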
Re: mod_cache: Content Generation Dependencies?
--On Tuesday, October 26, 2004 4:32 AM +0200 Graham Leggett [EMAIL PROTECTED] wrote: If mod_transform isn't supporting ETag properly, then I'd say mod_transform was broken, and fixing it would probably solve your problem. +1. If the content changes, so should the ETag. mod_transform could also set some Cache-Control headers. In short, think about what an intermediary caching HTTP proxy would do with that request. It'd check the expiration header ('freshness' tests), and, if that fails, the external cache would try to send an If-Modified-Since (or some variant) request to the upstream server: if httpd responds that it hasn't changed, then it'd serve from the cache until the timeout. So, it'd act exactly the same as mod_cache. Hence, adding 'private' hooks for mod_cache would still allow stale responses from external HTTP caches. -- justin
Re: cvs commit: httpd-2.0/server protocol.c
For example, we had a problem report on #apache a couple of days ago which turned out, after considerable investigation, to be the result of a single host IP issuing hundreds of connections in a few minutes. Whether this was a deliberate attack or simply a buggy client is not clear (to me), but the temporary solution of blocking the IP address was certainly within the server admin's abilities. That could have easily just been one of these 'commercial' companies that are just testing sites for 'availability' and publishing the results. There are lots of these. Only one example... http://www.internethealthreport.com These guys might hit you hard and fast at any moment and they aren't sending any data... they just want to see how fast you can 'answer the phone' and they turn that into a 'site health' statistic. Roy was right in his previous message. Apache USED to try and log something for all broken inbound connect requests, but that, itself, turned into a 'please fix this right away' bug report when people's log files went through the roof. In the case you just mentioned... it is going to take a special 'filter' to 'sense' that a possible DoS attack is in progress. Just fair amounts of 'dataless' connection requests from one or a small number of origins doesn't qualify. There are plenty of official algorithms around now to 'sense' most of these brute force attacks and (only then) pop you an 'alert' or something. Just relying on a gazillion entries in a log file isn't the right way to 'officially' distinguish a DoS attack from just (as Roy says) 'life on the Internet'. All major browsers will abandon pending connect threads for a web page whenever you hit the BACK button, as well. Connects in progress at the socket level will still complete but no data will be sent because the threads have all died. Happens 24x7x365. The 5 second rule still applies.
If people don't see good content showing up within 5 seconds they will click away from you, all the threads with pending connects to you die immediately, and all you might see are tons of 'dataless' connect completions on your end. It's not worth logging any of it. Yours... Kevin Kiley In a message dated 10/25/2004 11:23:24 PM Central Daylight Time, [EMAIL PROTECTED] writes: This is not an error that the server admin can solve -- it is normal life on the Internet. We really shouldn't be logging it except when on DEBUG level. That was my first reaction, too. However, Ivan Ristic pointed out that (in some cases, anyway) it is the result of a DoS attack, and there may be something a server admin can do about it. Or, if not, at least they might know why their server is suddenly performing badly. For example, we had a problem report on #apache a couple of days ago which turned out, after considerable investigation, to be the result of a single host IP issuing hundreds of connections in a few minutes. Whether this was a deliberate attack or simply a buggy client is not clear (to me) but the temporary solution of blocking the IP address was certainly within the server admin's abilities.
Re: Event MPM
--On Monday, October 25, 2004 2:17 PM -0400 Greg Ames [EMAIL PROTECTED] wrote: I am thinking of trying to find an old pentium I with PCI and putting a GigE card in it just for benchmarking. I'd love to find one like this too. I sometimes use a 180 MHz Pentium Pro box. But it only has 160M of EDO memory which is hard to scrounge around here, and probably can't use modern hard drives effectively due to BIOS limitations. Good luck. I really wouldn't recommend throwing money down this path. I spent $300 I really don't have on a GigE switch and cards in order to get more out of mod_cache, to little avail other than more useless benchmark numbers. Yet, if you send me some flood scripts and httpd patches, I can throw some numbers back at ya though. No sense other httpd'ers wasting their money. "How about modifying ab to add a delay of a second or two between successive read(2) calls on the same connection" -- an excellent idea. Or maybe add a delay after any read if -k is in effect. Um, flood already lets you do this and lots more. ab isn't a load simulator and trying to produce benchmarks with it is laughable. -- justin
Re: cvs commit: httpd-2.0/server protocol.c
With all due respect, I don't think the following scenario will be logged by the patch; it only reports when the attempt to read the initial request line fails, not when the socket is closed prior to data transfer terminating. At least, that was the intent. I'll leave it to Ivan to try and make the case within the context of mod_security, if he chooses to. Reinstating the notification message was basically his request in the first place. On 26-Oct-04, at 12:14 AM, [EMAIL PROTECTED] wrote: All major browsers will abandon pending connect threads for a web page whenever you hit the BACK button, as well. Connects in progress at the socket level will still complete but no data will be sent because the threads have all died. Happens 24x7x365.
Re: cvs commit: httpd-2.0/server protocol.c
--On Monday, October 25, 2004 9:04 PM -0700 Roy T. Fielding [EMAIL PROTECTED] wrote: This is not an error that the server admin can solve -- it is normal life on the Internet. We really shouldn't be logging it except when on DEBUG level. +1. Info loglevel is way way too high for realistic sites. -- justin