Open patches; all applied?

2004-10-25 Thread William A. Rowe, Jr.
Quick question; are there any patches (other than some
initialization checking that's been brought up) that I've
misplaced, before rolling the next candidate of mod_aspdotnet?

I did just apply Larry's fix for request host iterator
allocation in the request instead of conf pool.

Bill



Re: cvs commit: httpd-2.0/server protocol.c

2004-10-25 Thread Graham Leggett
[EMAIL PROTECTED] wrote:
   server   protocol.c
  Log:
  This will put some messages in the error log when some people try a lame
  DoS by just opening a socket, and never sending any data.

  +        else if (r->connection->keepalive != AP_CONN_KEEPALIVE) {
  +            ap_log_rerror(APLOG_MARK, APLOG_NOTICE, rv, r,
  +                          "request line read error.");
  +        }
Is it possible to put a more descriptive message than "request line read 
error"? If I were to see that in a logfile, I would have absolutely no 
idea whatsoever what it was trying to tell me :(

Regards,
Graham
--


Re: cvs commit: httpd-2.0/server protocol.c

2004-10-25 Thread Jeff Trawick
On 25 Oct 2004 06:40:08 -0000, [EMAIL PROTECTED] wrote:
   Index: protocol.c
   ===
   RCS file: /home/cvs/httpd-2.0/server/protocol.c,v
   retrieving revision 1.155
   retrieving revision 1.156
   diff -u -r1.155 -r1.156
   --- protocol.c	23 Oct 2004 22:39:53 -0000	1.155
   +++ protocol.c	25 Oct 2004 06:40:08 -0000	1.156
   @@ -603,7 +603,10 @@
        r->proto_num = HTTP_VERSION(1,0);
        r->protocol  = apr_pstrdup(r->pool, "HTTP/1.0");
    }
   -
   +    else if (r->connection->keepalive != AP_CONN_KEEPALIVE) {
   +        ap_log_rerror(APLOG_MARK, APLOG_NOTICE, rv, r,
   +                      "request line read error.");
   +    }

1.3 issues such a message only if it is a timeout error.  1.3 uses
LOGLEVEL_INFO.  With LOGLEVEL_NOTICE there is no way to keep this out
of the error log.

Some users have relatively large numbers of connection-oriented
problems (I see this most often with users in SE Asia).  Something
like this will result in a lot of messages for the types of conditions
they encounter on a regular basis.

Checking specifically for a timeout error (like 1.3*) and using
LOGLEVEL_INFO seems appropriate.

*In 1.3, it is the timeout handling which issues the message; I don't
see any similar messages in 1.3 mainline.


Re: cvs commit: apache-1.3/src/modules/standard mod_include.c

2004-10-25 Thread Joe Orton
On Fri, Oct 22, 2004 at 07:31:09PM -, Jim Jagielski wrote:
    if (d == len + dest) {
   +        ap_log_rerror(APLOG_MARK, APLOG_NOERRNO|APLOG_ERR, r,
   +                      "mod_include: directive length exceeds limit "
   +                      "(%d) in %s", len + 1, r->filename);

This adds a GCC warning on 64-bit platforms, probably should be %lu and
(unsigned long)len + 1 if it's really necessary to print the length.

mod_include.c: In function `get_directive':
mod_include.c:434: warning: int format, different type arg (arg 6)

joe


Re: Event MPM

2004-10-25 Thread Brian Akins
Paul Querna wrote:
Non-KeepAlive:
 `ab -c 25 -n 10 http://10.10.10.10:6080/index.html`
 Worker MPM: 2138.28
 Event MPM: 2147.95
KeepAlive:
 `ab -k -c 25 -n 10 http://10.10.10.10:6080/index.html`
 Worker MPM: 4396.38
 Event MPM: 4119.40
Have you tried it with a higher number of clients -- i.e., -c 1024?
We are interested in the event mpm mainly for dealing with keep alives.
--
Brian Akins
Lead Systems Engineer
CNN Internet Technologies


Re: cvs commit: httpd-2.0/server protocol.c

2004-10-25 Thread Rici Lake
On 25-Oct-04, at 3:37 AM, Graham Leggett wrote:

  +        else if (r->connection->keepalive != AP_CONN_KEEPALIVE) {
  +            ap_log_rerror(APLOG_MARK, APLOG_NOTICE, rv, r,
  +                          "request line read error.");
  +        }
Is it possible to put a more descriptive message than "request line 
read error"? If I were to see that in a logfile, I would have 
absolutely no idea whatsoever what it was trying to tell me :(
I believe that ap_log_rerror() inserts the OS error description, 
resulting in something like:

[client 127.0.0.1] (70007)The timeout specified has expired: request 
line read error

Of course, this is OS dependent.


Re: cvs commit: httpd-2.0/server protocol.c

2004-10-25 Thread Graham Leggett
Rici Lake wrote:
I believe that ap_log_rerror() inserts the OS error description, 
resulting in something like:

[client 127.0.0.1] (70007)The timeout specified has expired: request 
line read error

Of course, this is OS dependent.
That still means little to me as an end user, as I would think a 
"request line read error" was some OS error that I needed to be 
concerned about, leading me on a wild goose chase.

What would make more sense is "Error while reading HTTP request line. 
(remote browser didn't send a request?)". This indicates exactly what 
httpd was trying to do when the error occurred, and gives a hint of why 
the error might have occurred.

The clearer the error message, the less chance of an admin wasting time 
fiddling in the dark trying to understand what it all means.

Regards,
Graham
--




Re: cvs commit: httpd-2.0/server core.c protocol.c request.c scoreboard.c util.c util_script.c

2004-10-25 Thread Allan Edwards
Brad Nicholes wrote:
-1 as well.  This is now causing compiler errors on NetWare.  Please
revert this patch!
Can you provide an indication of exactly what broke so we
will know what to avoid in future. Or was the breakage
actually due to the mod_cache problem reported
last night?
Thanks, Allan


Re: cvs commit: httpd-2.0/server protocol.c

2004-10-25 Thread Paul Querna
Graham Leggett wrote:
Rici Lake wrote:
I believe that ap_log_rerror() inserts the OS error description, 
resulting in something like:

[client 127.0.0.1] (70007)The timeout specified has expired: request 
line read error

Of course, this is OS dependent.

That still means little to me as an end user, as I would think a 
"request line read error" was some OS error that I needed to be 
concerned about, leading me on a wild goose chase.

What would make more sense is "Error while reading HTTP request line. 
(remote browser didn't send a request?)". This indicates exactly what 
httpd was trying to do when the error occurred, and gives a hint of why 
the error might have occurred.
Well... Apache isn't just an HTTP server, right?
Now Committed:
"Error while reading request line. (client didn't send a request?)"
I also changed it to INFO. I forgot how much NOTICE sucked.
Thanks for the comments.
-Paul


Re: Mod_Cache build problem on NetWare...

2004-10-25 Thread Jean-Jacques Clar


Thanks for reporting the problem Norm,
The patch was submitted.
Jean-Jacques

[EMAIL PROTECTED] 10/24/04 5:08 PM:
Greetings All,

Just trying a 'build' of current 2.1 CVS on a Windows machine for NetWare
and get the following error...

Calling NWGNUmod_cach
Compiling cache_util.c
Compiling mod_cache.c
### mwccnlm Compiler:
#    File: mod_cache.c
#
#      805:  (*new) = header;
#    Error:           ^
#    illegal implicit conversion from 'const char *' to
#    'char *'
Errors caused tool to abort.
make[3]: *** [Release/mod_cache.o] Error 1
make[2]: *** [Release/mod_cach.nlm] Error 2
make[1]: *** [experimental] Error 2
make: *** [modules] Error 2

Have a good week,
Norm


Re: cvs commit: httpd-2.0/server core.c protocol.c request.c scoreboard.c util.c util_script.c

2004-10-25 Thread Brad Nicholes
   mod_cache is a different issue.  The compiler used to build the
netware NLMs is very sensitive to type mismatches.  

   @@ -3793,7 +3794,7 @@
        core_net_rec *net = f->ctx;
        core_ctx_t *ctx = net->in_ctx;
        const char *str;
   -    apr_size_t len;
   +    apr_ssize_t len;


Changing the type from apr_size_t to apr_ssize_t introduced a type
mismatch in the call to apr_bucket_read() which expects an apr_size_t. 
Type casting it back to an apr_size_t to fix the problem seems like it
would have defeated the whole purpose of doing it in the first place. 
Besides the fact that apr_bucket_read() can't give you back anything
bigger than an apr_size_t anyway.

Brad 

 [EMAIL PROTECTED] Monday, October 25, 2004 10:01:53 AM 
Brad Nicholes wrote:
 -1 as well.  This is now causing compiler errors on NetWare.  Please
 revert this patch!

Can you provide an indication of exactly what broke so we
will know what to avoid in future. Or was the breakage
actually due to the mod_cache problem reported
last night?

Thanks, Allan



Re: Mod_Cache build problem on NetWare...

2004-10-25 Thread Garrett Rooney
Jean-Jacques Clar wrote:
Thanks for reporting the problem Norm,
The patch was submitted.
Is casting to (char *) really the right thing to do here?  I mean why 
not just make new const so the cast isn't needed at all?  It seems like 
a cast in this situation is a bit ugly...

-garrett


Re: Event MPM

2004-10-25 Thread Paul Querna
Brian Akins wrote:
Paul Querna wrote:
Non-KeepAlive:
 `ab -c 25 -n 10 http://10.10.10.10:6080/index.html`
 Worker MPM: 2138.28
 Event MPM: 2147.95
KeepAlive:
 `ab -k -c 25 -n 10 http://10.10.10.10:6080/index.html`
 Worker MPM: 4396.38
 Event MPM: 4119.40
Have you tried it with a higher number of clients -- i.e., -c 1024?
Nope. I was already maxing out my 100mbit LAN at 25 clients.  I don't 
have a good testing area for static content request benchmarking.

I am thinking of trying to find an old pentium I with PCI and putting a 
GigE card in it just for benchmarking.

The current patch does not spawn new threads on demand. You need to set 
ThreadsPerChild 1200 to test that many clients.  This is on my short 
list of things to change before committing the Event MPM to CVS.

We are interested in the event mpm mainly for dealing with keep alives.
Yes, this is the target the Event MPM aims at :)
-Paul Querna


Re: Event MPM

2004-10-25 Thread Ivan Ristic
Paul Querna wrote:

 Brian Akins wrote:

 We are interested in the event mpm mainly for dealing with keep alives.

 Yes, this is the target the Event MPM aims at :)

  If I understand the nature of the patch correctly then you don't
  need to go increasing the number of clients at all. Instead you
  should be looking at changing ab to pause for a few seconds
  between every two requests carried out over the same connection.
  With that change, and limiting the maximum number of processes to,
  say, 100, you should be able to obtain meaningful results with your
  current setup. Worker MPM will max out at some point, with Event MPM
  going past it to max at a higher value.

-- 
ModSecurity (http://www.modsecurity.org)
[ Open source IDS for Web applications ]


Re: Event MPM

2004-10-25 Thread Brian Pane
Paul Querna wrote:
Brian Akins wrote:
Paul Querna wrote:

[...]
Have you tried it with a higher number of clients -- i.e., -c 1024?

Nope. I was already maxing out my 100mbit LAN at 25 clients.  I don't 
have a good testing area for static content request benchmarking.

I am thinking of trying to find an old pentium I with PCI and putting 
a GigE card in it just for benchmarking.

How about modifying ab to add a delay of a second or two between 
successive read(2) calls
on the same connection (and limiting the max read size to a small value, 
to make sure each
response requires multiple reads)?  The throughput numbers wouldn't be 
very impressive, of
course, but you'd be able to see how much memory and how many threads 
the Event MPM
uses in handling a real-world number of concurrent connections, compared 
to Worker.

The current patch does not spawn new threads on demand. You need to 
set ThreadsPerChild 1200 to test that many clients.  This is on my 
short list of things to change before committing the Event MPM to CVS.

This part seems surprising.  What's the relationship between clients and 
threads in the
Event MPM--does it use a thread per connection?

I haven't had a chance to check out the Event MPM yet,  but I'm planning 
to download it and
study the code later this week.

For what it's worth, I can think of at least one killer app for the 
Event MPM beyond keep-alive
handling: as an efficient (low memory  thread use per connection) 
connection multiplexer in
front of an inefficient (high memory & thread use per connection) appserver.

-Brian


Re: Event MPM

2004-10-25 Thread Greg Ames
++1 for moving this along.
For any newbies, the object of the game is to increase httpd scalability by 
reducing the number of worker threads and their associated stack memory.

Paul Querna wrote:
First is a patch to APR that provides an extension to apr_pollset that 
optionally make some parts of it threadsafe.  This patch has been 
submitted to APR before, and hopefully it will get accepted there soon.
I need to catch up here.  quick questions: will this event MPM work with 
conventional poll (i.e. no epoll or kqueue)?  Can it handle a worker thread 
adding directly to the pollset?

The other patch is the actual MPM and with a few changes to other parts 
of the source to take advantage of the Event Thread for handling Keep 
Alive Requests.
IMO the few changes to other parts are the most important to do first.  If we 
can get consensus on the core parts and commit them to 2.1, much of the work can 
be done in the experimental mpm dir without bothering uninterested parties.

These patches total over 160k. Since many mail servers are extremely 
lame, I have put them up on the web at:
http://www.apache.org/~pquerna/event-mpm/
diffing worker.c to event.c and worker/fdqueue.c to event/fdqueue.c should make 
the mpm changes easier to follow.

MPM Structure:
I have made the MPM a single process with multiple threads  
interesting!  but then per process limits (fd's esp. for sockets, virtual 
memory) could put a ceiling on how high this can scale.  Plus if a buggy module 
seg faults, it's a lot more disruptive when there's only one worker process.

   It currently 
does not spawn new threads on demand, but I plan to add this soon.  By 
making it run as a single process, I was able to completely remove the 
accept() locking.  This also greatly simplifies the Listener Thread's 
poll()`ing of the sockets.
understood, but see above.  My vote is to keep it multiprocess capable and 
perhaps allow a single process ./configure option.

The Listener thread includes all the Sockets that it is listening to for 
incoming requests on in the same pollset as all requests waiting for 
IO/KeepAlive.  Greg's original patch had a separate 'Event' Thread, but 
I have chosen to combine them.  
+1
That means that some of the changes to how fdqueue.c manipulates idlers aren't 
strictly necessary since the listener is the only thread that ever tries to 
reserve a worker thread.  OTOH I don't think those changes to fdqueue.c hurt 
anything either so it's a minor detail.

The Worker Threads work much like the 'Worker' MPM, but when a 
connection is ready for a Keep Alive, they will push the client back to 
the Listener/Event Thread.  This thread does not need to be woken up 
like in Greg Ames' patch.  This is because of the enhancement to 
apr_pollset that enables other threads to _add() or _remove() while the 
main thread is inside a _poll().
even with plain ol' vanilla poll() ?
The place where the Event MPM should shine is with the more common case 
of relatively high-latency Internet clients.  The Event MPM isn't super 
powerful on the KeepAlive-over-LAN case because it forces a context 
switch to process the client again when it does another request as 
part of a KeepAlive.
I heard of an application where many clients send a heartbeat to a server over 
http every few seconds.  That would benefit greatly by using keepalives and some 
form of event MPM, even over a LAN.

Greg


Re: Event MPM

2004-10-25 Thread Greg Ames
Brian Pane wrote:
Paul Querna wrote:
Brian Akins wrote:

Have you tried it with a higher number of clients -- i.e., -c 1024?

Nope. I was already maxing out my 100mbit LAN at 25 clients.  I don't 
have a good testing area for static content request benchmarking.

I am thinking of trying to find an old pentium I with PCI and putting 
a GigE card in it just for benchmarking.
I'd love to find one like this too.  I sometimes use a 180 MHz Pentium Pro box. 
 But it only has 160M of EDO memory which is hard to scrounge around here, and 
probably can't use modern hard drives effectively due to BIOS limitations.

How about modifying ab to add a delay of a second or two between 
successive read(2) calls
on the same connection 
an excellent idea.  Or maybe add a delay after any read if -k is in effect.
The current patch does not spawn new threads on demand. You need to 
set ThreadsPerChild 1200 to test that many clients.  This is on my 
short list of things to change before committing the Event MPM to CVS.

This part seems surprising.  What's the relationship between clients and 
threads in the
Event MPM--does it use a thread per connection?
one thread per connection with an active http request, plus the listener/event 
thread who owns all the connections in keepalive.  I believe Paul is saying set 
ThreadsPerChild to 1200 to handle the worst case behavior - 100% of the 
connections are doing real work at some instant and none are in keepalive 
timeouts.

Greg


Re: Event MPM

2004-10-25 Thread Paul Querna
Greg Ames wrote:
First is a patch to APR that provides an extension to apr_pollset that 
optionally make some parts of it threadsafe.  This patch has been 
submitted to APR before, and hopefully it will get accepted there soon.

I need to catch up here.  quick questions: will this event MPM work with 
conventional poll (i.e. no epoll or kqueue)?  Can it handle a worker 
thread adding directly to the pollset?
The patch to APR adds a new 'APR_POLLSET_THREADSAFE' flag for 
apr_pollset_create(...).

This only works for the EPoll and KQueue backends.  This allows a worker 
thread to _add() or _remove() directly, without having to touch the 
thread in _poll().

It could be implemented for plain Poll by having the pollset contain an 
internal pipe.  This Pipe could be pushed by an _add() or _remove() and 
force the _poll() thread to wake up. The thread in _poll() could then 
add or remove any sockets from its set, and then start the _poll() 
again.

This is how your original patch did it, but we could push it down to 
APR.  I don't think in the end it will yield good performance with this 
method.

My personal view is that most platforms will have either EPoll() or 
KQueue() support.  I guess it's only an issue for people running 2.2/2.4 
Linux, since KQueue has been in *BSD for a long time.  As time passes, I 
believe that implementing it for plain poll() will become less important.

Other platforms like Solaris have better poll() replacements that we 
should add to apr_pollset_*.

[...snip...]
I have made the MPM a single process with multiple threads  

interesting!  but then per process limits (fd's esp. for sockets, 
virtual memory) could put a ceiling on how high this can scale.  Plus if 
a buggy module seg faults, it's a lot more disruptive when there's only 
one worker process.
We don't have buggy modules :-)
Isn't the per-process FD limit only really a problem for platforms like 
older Solaris?  I thought that on Linux/BSD it is quite high by default now.


   It currently does not spawn new threads on demand, but I plan to 
add this soon.  By making it run as a single process, I was able to 
completely remove the accept() locking.  This also greatly simplifies 
the Listener Thread's poll()`ing of the sockets.

understood, but see above.  My vote is to keep it multiprocess capable 
and perhaps allow a single process ./configure option.
I am not completely averse to putting multi-process support back in; I mostly 
wanted to get a working patch out there for review.

The current logic used in Worker is flawed for poll`ing of Listening 
sockets.  I just ripped all of it out, and ended up with a much simpler 
MPM that avoids extra locking.

[...snip...]
The Worker Threads work much like the 'Worker' MPM, but when a 
connection is ready for a Keep Alive, they will push the client back 
to the Listener/Event Thread.  This thread does not need to be woken 
up like in Greg Ames' patch.  This is because of the enhancement to 
apr_pollset that enables other threads to _add() or _remove() while 
the main thread is inside a _poll().

even with plain ol' vanilla poll() ?
Nope. It's just not easy to implement and support plain old vanilla 
poll().  You cannot just add an FD to its array of sockets from other 
threads.  The poll() will also not see any changes in its list until you 
wake it up, and start the poll() again.

This is not impossible to implement, but it's going to have a huge number 
of context switches for what should be a very common operation.

[...snip..]
The place where the Event MPM should shine is with the more common 
case of relatively high-latency Internet clients.  The Event MPM isn't 
super powerful on the KeepAlive-over-LAN case because it forces a 
context switch to process the client again when it does another 
request as part of a KeepAlive.

I heard of an application where many clients send a heartbeat to a 
server over http every few seconds.  That would benefit greatly by using 
keepalives and some form of event MPM, even over a LAN.
Yup.
Another place the event MPM would rock is on an FTP server. (see mod_ftpd).
Yet Another place is an IMAPv4 server written as an Apache Module. (This 
is on my ~/TODO list)

Thanks for the Feedback!
-Paul Querna


Re: Event MPM

2004-10-25 Thread Paul Querna
Brian Pane wrote:
How about modifying ab to add a delay of a second or two between 
successive read(2) calls
on the same connection (and limiting the max read size to a small value, 
to make sure each
response requires multiple reads)?  The throughput numbers wouldn't be 
very impressive, of
course, but you'd be able to see how much memory and how many threads 
the Event MPM
uses in handling a real-world number of concurrent connections, compared 
to Worker.
+1, I will look at patching this tonight.  I think some other HTTP 
benchmarkers might already have this feature. Flood?

The current patch does not spawn new threads on demand. You need to 
set ThreadsPerChild 1200 to test that many clients.  This is on my 
short list of things to change before committing the Event MPM to CVS.
This part seems surprising.  What's the relationship between clients and 
threads in the
Event MPM--does it use a thread per connection?
A thread per-connection that is currently being processed.
Note that this is not the traditional 'event' model that people write 
huge papers about and thttpd raves about, but rather a hybrid that uses 
a Worker Thread to do the processing, and a single 'event' thread to 
handle places where we are waiting for IO. (Currently accept() and Keep 
Alive Requests).  Perhaps it needs a different name? (eworker?)

A future direction to investigate would be to make all of the initial 
Header parsing be done Async, and then switch to a Worker thread to 
perform all the post_read hooks, handlers, and filters.  I believe this 
could be done without breaking many 3rd party modules. (SSL could be a 
bigger challenge here...)

I haven't had a chance to check out the Event MPM yet,  but I'm planning 
to download it and
study the code later this week.

For what it's worth, I can think of at least one killer app for the 
Event MPM beyond keep-alive
handling: as an efficient (low memory & thread use per connection) 
connection multiplexer in
front of an inefficient (high memory & thread use per connection) 
appserver.
Yes, another thing on my ~/TODO is to use this design as the base 
for a perchild replacement.

My idea is to use a lightweight 'event' frontend to determine which 
backend to pass a socket to.  These backends would be running as 
different UIDs, and could be running in a threaded or non-threaded manner 
to optionally support PHP.

-Paul Querna


Re: Event MPM

2004-10-25 Thread Paul Querna
Paul Querna wrote:
  A thread per-connection that is currently being processed.
Note that this is not the traditional 'event' model that people write 
huge papers about and thttpd raves about, but rather a hybrid that uses 
a Worker Thread to do the processing, and a single 'event' thread to 
handle places where we are waiting for IO. (Currently accept() and Keep 
Alive Requests).  Perhaps it needs a different name? (eworker?)

A future direction to investigate would be to make all of the initial 
Header parsing be done Async, and then switch to a Worker thread to 
perform all the post_read hooks, handlers, and filters.  I believe this 
could be done without breaking many 3rd party modules. (SSL could be a 
bigger challenge here...)
Some academics have played with this model of an event thread + worker 
threads.  Best I can find is the Java 'SEDA: An Architecture for Highly 
Concurrent Server Applications'. [1]

They have some pretty graphs in their papers of their Haboob web server 
that can scale very nicely compared to traditional event or 
pure-threaded servers.[2] [3]

-Paul Querna
[1] http://www.eecs.harvard.edu/~mdw/proj/seda/
[2] http://www.enhyper.com/content/eventsbadidea.pdf
[3] http://www.cis.upenn.edu/~hhl/cse434/lectures/seda.pdf


Re: Event MPM

2004-10-25 Thread Brian Akins
Greg Ames wrote:
one thread per connection with an active http request, plus the 
listener/event thread who owns all the connections in keepalive.  I 
believe Paul is saying set ThreadsPerChild to 1200 to handle the worst 
case behavior - 100% of the connections are doing real work at some 
instant and none are in keepalive timeouts.

Can you still have multiple processes?  We use 10k plus threads per box 
with worker.

--
Brian Akins
Lead Systems Engineer
CNN Internet Technologies


Re: Mod_Cache build problem on NetWare...

2004-10-25 Thread Rüdiger Plüm

Garrett Rooney wrote:
Jean-Jacques Clar wrote:
Thanks for reporting the problem Norm,
The patch was submitted.

Is casting to (char *) really the right thing to do here?  I mean why 
not just make new const so the cast isn't needed at all?  It seems like 
a cast in this situation is a bit ugly...
-garrett

Do I miss anything or would we need *new to be const and not new?

Regards
Rüdiger


Re: Event MPM

2004-10-25 Thread Greg Ames
Brian Akins wrote:
Can you still have multiple processes?  We use 10k plus threads per box 
with worker.
with my patch, yes.  with Paul's, no.
But Paul's has some very nice features that mine doesn't have, so I think a 
hybrid is the way to go.

Assuming you have a high percentage of threads in keepalive timeouts, you will 
be able to cut down on the number of threads per box.

Greg


Re: Event MPM

2004-10-25 Thread Brian Akins
Greg Ames wrote:
Assuming you have a high percentage of threads in keepalive timeouts, 
you will be able to cut down on the number of threads per box.
Yes, we do.
I can certainly provide you guys with some testing, if nothing else.  We 
have some home grown benchmarks that may help, not to mention lots 
of live traffic :)

--
Brian Akins
Lead Systems Engineer
CNN Internet Technologies


Re: Event MPM

2004-10-25 Thread Greg Ames
Brian Akins wrote:
I can certainly provide you guys with some testing, if nothing else.  
excellent!
We have some home grown benchmarks that may help, 
If they simulate user think time or would otherwise cause a lot of keepalive 
timeouts, great!  Finding the right client/benchmark is a problem for me right 
now and I believe for Paul too.

not to mention lots of live traffic :)
ummm, I'm glad you put a smiley on the end of that :)
Greg


Re: Event MPM

2004-10-25 Thread Paul Querna
Greg Ames wrote:
Brian Akins wrote:
Can you still have multiple processes?  We use 10k plus threads per 
box with worker.

with my patch, yes.  with Paul's, no.
Correct at the moment.
I think based on the feedback so far, I will investigate making it 
multi-processed again.


Re: Event MPM

2004-10-25 Thread Paul Querna
Greg Ames wrote:
Brian Akins wrote:
We have some home grown benchmarks that may help, 

If they simulate user think time or would otherwise cause a lot of 
keepalive timeouts, great!  Finding the right client/benchmark is a 
problem for me right now and I believe for Paul too.
Yup. This is my biggest problem right now with the benchmarks.
`ab` just doesn't give a realistic modeling of how clients use Keep 
Alives.  If you have some home grown programs that do, I would love to 
use them.


Re: Event MPM

2004-10-25 Thread Greg Ames
Paul Querna wrote:
This only works for the EPoll and KQueue backends.  This allows a worker 
thread to _add() or _remove() directly, without having to touch the 
thread in _poll().

It could be implemented for plain Poll by having the pollset contain an 
internal pipe.  This Pipe could be pushed by an _add() or _remove() and 
force the _poll() thread to wake up. The thread in _poll() could then 
add or remove any sockets from its set, and then start the _poll() 
again.

This is how your original patch did it, but we could push it down to 
APR.  
let's stick with your version at least for now.
I don't think in the end it will yield good performance with this method.
agreed.  I wasn't happy with the extra pipe and the overhead, though some of 
that could go away.  The point was to get something out there as a 
proof-of-concept that would work for just about anybody.  Today the barrier to 
entry is much lower in the Linux world anyway.  I'm not sure if a distro with a 
2.6 kernel will install easily on my Pentium Pro benchmarking box, but that's a 
pretty weird environment.

Greg


More musings about asynchronous MPMs Re: Event MPM

2004-10-25 Thread Brian Pane
Paul Querna wrote:
Paul Querna wrote:
  A thread per-connection that is currently being processed.
Note that this is not the traditional 'event' model that people write 
huge papers about and thttpd raves about, but rather a hybrid that 
uses a Worker Thread to do the processing, and a single 'event' thread 
to handle places where we are waiting for IO. (Currently accept() and 
Keep Alive Requests).  Perhaps it needs a different name? (eworker?)

A future direction to investigate would be to make all of the initial 
Header parsing be done Async, and then switch to a Worker thread to 
perform all the post_read hooks, handlers, and filters.  I believe 
this could be done without breaking many 3rd party modules. (SSL 
could be a bigger challenge here...)

Some academics have played with this model of a event thread + worker 
threads.  Best I can find is the Java 'SEDA: An Architecture for 
Highly Concurrent Server Applications'. [1]

Yeah, SEDA's model of processing stages--basically a succession of 
thread pools through
which a request passes, each with a queue in front--looks like a 
promising way of mixing
event-based and non-event-based processing.

Another approach I've looked at is to use Schmidt's Reactor design 
pattern, in which a central
listener thread dispatches each received event to one of a pool of 
worker threads.  The worker
is then allowed to do arbitrarily complex processing (provided you have 
enough worker threads)
before deciding to tell the listener thread that it wants to wait for 
another event.  I did some
prototyping of a Leader/Followers variant of this: have a pool of worker 
threads that take turns
grabbing the next event from a queue of received events.  If there are 
no events left in the queue,
the next worker thread does a select/poll/etc (and any other idle 
workers block on a condition
variable).  This seemed to work reasonably well in a toy test app (it 
was *almost* capable of
implementing HTTP/0.9, but I never had time to try 1.0 or 1.1; oh, and 
it depended on the
java.nio classes to do the select efficiently, so I didn't even try to 
commit it to
server/mpm/experimental. :-)

On a more general topic...and at the risk of igniting a distracting 
debate...does anybody else
out there have an interest in doing an async httpd in Java?

There are a lot of reasons *not* to do so, mostly related to all the 
existing httpd-2.0 modules
that wouldn't work.  The things that seem appealing about trying a 
Java-based httpd, though,
are:

- The pool memory model at the core of httpd-1.x and -2.x isn't well 
suited to MPM
designs where multiple threads need to handle the same 
connection--possibly at the same
time, for example when a handler that's generating content needs to push 
output buckets
to an I/O completion thread to avoid having to block the (probably 
heavyweight) handler
thread or make it event-based.  Garbage collection on a per-object basis 
would be a lot
easier.
- Modern Java implementations seem to be doing smart things from a 
scalability perspective,
like using kqueue/epoll/etc.
- And (minor issue) it's a lot easier to refactor things in Java than in 
C, and I expect that
building a good async MPM that handles dynamic content and proxying 
effectively will
require a lot of iterations of design trial and error.

Brian


Re: Mod_Cache build problem on NetWare...

2004-10-25 Thread Jean-Jacques Clar


What about doing:

 static const char *add_ignore_header(cmd_parms *parms, void *dummy,
-                                     const char *header)
+                                     char *header)
 {
     cache_server_conf *conf;
     char **new;
@@ -802,7 +802,7 @@
      * (When 'None' is passed, IGNORE_HEADERS_SET && nelts == 0.)
      */
     new = (char **)apr_array_push(conf->ignore_headers);
-    (*new) = (char *)header;
+    (*new) = header;

[EMAIL PROTECTED] 10/25/04 10:32 AM:
Jean-Jacques Clar wrote:
 Thanks for reporting the problem Norm,
 The patch was submitted.
Is casting to (char *) really the right thing to do here?  I mean why
not just make new const so the cast isn't needed at all?  It seems like
a cast in this situation is a bit ugly...
-garrett


Re: Mod_Cache build problem on NetWare...

2004-10-25 Thread Garrett Rooney
Jean-Jacques Clar wrote:
What about doing:
 
 static const char *add_ignore_header(cmd_parms *parms, void *dummy,
- const char *header)
+ char *header)
 {
 cache_server_conf *conf;
 char **new;
@@ -802,7 +802,7 @@
  * (When 'None' is passed, IGNORE_HEADERS_SET && nelts == 0.)
  */
 new = (char **)apr_array_push(conf->ignore_headers);
-(*new) = (char *)header;
+(*new) = header;

That seems fine, the only reason I didn't suggest it was that I wasn't 
sure if we were ever passing anything const in as an argument, if we 
aren't then just making header non-const should clean up the problem 
nicely without the cast.

-garrett


Event MPM w/ multiple processes

2004-10-25 Thread Paul Querna
Brian Akins wrote:
Greg Ames wrote:
one thread per connection with an active http request, plus the 
listener/event thread who owns all the connections in keepalive.  I 
believe Paul is saying set ThreadsPerChild to 1200 to handle the worst 
case behavior - 100% of the connections are doing real work at some 
instant and none are in keepalive timeouts.

Can you still have multiple processes?  We use 10k plus threads per box 
with worker.
The updated patch for today adds multiple processes. (same directives as 
the worker MPM):

http://www.apache.org/~pquerna/event-mpm/event-mpm-2004-10-25.patch
However, the big thing it doesn't use is accept serialization.
This means all event threads are listening for incoming clients. The 
first one to process the incoming connection gets it.  This does not 
block the other event threads, since they set the listening socket to 
non-blocking before starting their loop.

This seems to work fine on my tests.  It has the sucky side effect of 
waking up threads sometimes when they are not needed, but on a busy 
server, trying to accept() will likely be fine, as there will be a 
backlog of clients to accept().
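The non-blocking accept being described boils down to something like this standalone sketch (not the patch itself): with O_NONBLOCK set on the listener, an accept() with nothing pending returns immediately with EAGAIN/EWOULDBLOCK instead of blocking the event thread.

```c
/* Minimal illustration of a non-blocking listener: accept() with no
 * pending connection comes back immediately rather than blocking. */
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <fcntl.h>
#include <unistd.h>
#include <errno.h>

/* Returns 1 if accept() returned immediately with EAGAIN/EWOULDBLOCK,
 * 0 if it somehow got a connection, -1 on setup failure. */
int try_nonblocking_accept(void)
{
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in sa = { 0 };
    sa.sin_family = AF_INET;
    sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    sa.sin_port = 0;                    /* any free port */
    if (lfd < 0 ||
        bind(lfd, (struct sockaddr *)&sa, sizeof(sa)) < 0 ||
        listen(lfd, 16) < 0)
        return -1;

    /* the step the event threads perform before their loop */
    fcntl(lfd, F_SETFL, fcntl(lfd, F_GETFL, 0) | O_NONBLOCK);

    int cfd = accept(lfd, NULL, NULL);  /* nobody is connecting */
    int immediate = (cfd < 0 && (errno == EAGAIN || errno == EWOULDBLOCK));
    if (cfd >= 0)
        close(cfd);
    close(lfd);
    return immediate;
}
```

This is also why all event threads can poll the same listener without serialization: a thread that loses the race just gets EAGAIN and goes back to its loop.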

-Paul Querna


mod_cache: Content Generation Dependencies?

2004-10-25 Thread Paul Querna
I have been doing some stuff with mod_transform (XSLT processor) and 
mod_cache.

The problem is, mod_cache doesn't have any easy way to know if a request 
needs to be regenerated.  Right now, it just blindly caches until a 
timeout.  What I would prefer is that it knows what files or URLs a 
specific request depends upon, and if any of those change, then 
regenerate the request.

An example:
cache_add_depends(r, "/home/httpd/site/xsl/foo.xsl");
This would add 'foo.xsl' as a dependency of the current request.  If the 
file's mtime changes, mod_cache would invalidate the cache of the 
current request.

Any opinions or suggestions?
A stat() call on several files is hundreds of times faster than having 
mod_transform re-generate the output.  While I would hate to stat() 
hundreds of files on every request, this method could eliminate all 
unnecessary regeneration of cached content.
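The mtime check itself is simple; something like the following (cache_add_depends() and the surrounding mod_cache plumbing are hypothetical, this shows only the stat() comparison):

```c
/* Sketch of the proposed dependency check: a cached entry is stale if
 * any dependency file has changed since the entry was generated. */
#include <sys/stat.h>
#include <time.h>

/* Return 1 if any dependency is newer than the cache entry (or can no
 * longer be stat()ed), 0 if all are unchanged. */
int deps_stale(const char **paths, int npaths, time_t cached_at)
{
    struct stat st;
    for (int i = 0; i < npaths; i++) {
        if (stat(paths[i], &st) != 0)
            return 1;                   /* vanished: regenerate */
        if (st.st_mtime > cached_at)
            return 1;                   /* changed: regenerate */
    }
    return 0;                           /* cache entry still valid */
}
```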

-Paul Querna


Re: mod_cache: Content Generation Dependencies?

2004-10-25 Thread Graham Leggett
Paul Querna wrote:
I have been doing some stuff with mod_transform (XSLT processor) and 
mod_cache.

The problem is, mod_cache doesn't have any easy way to know if a request 
needs to be regenerated.  Right now, it just blindly caches until a 
timeout.  What I would prefer is that it knows what files or URLs a 
specific request depends upon, and if any of those change, then 
regenerate the request.

An example:
cache_add_depends(r, "/home/httpd/site/xsl/foo.xsl");
This would add 'foo.xsl' as a dependency of the current request.  If the 
file's mtime changes, mod_cache would invalidate the cache of the 
current request.

Any opinions or suggestions?
A stat() call on several files is hundreds of times faster than having 
mod_transform re-generate the output.  While I would hate to stat() 
hundreds of files on every request, this method could eliminate all 
unnecessary regeneration of cached content.
The cache has no knowledge about underlying files, never mind multiple 
dependencies. It relies on HTTP/1.1 to work out cache freshness.

Dependencies should be tracked by mod_transform, not mod_cache - if 
either the source file, or XSL file changes, then the Etag should 
change, which will signal mod_cache (and any other caching proxies along 
the way) that the content is no longer fresh.

If mod_transform isn't supporting Etag properly, then I'd say 
mod_transform was broken, and fixing it would probably solve your problem.
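For instance, mod_transform could derive the ETag from the mtimes of both inputs, so a change to either the source or the stylesheet invalidates downstream caches. A hypothetical sketch (not actual mod_transform code):

```c
/* Hypothetical: build an entity tag from the mtimes of the source
 * document and the XSL stylesheet, so a change to either produces a
 * different ETag. */
#include <stdio.h>

/* Format both mtimes into an ETag-style quoted string. */
void make_etag(char *buf, size_t len, long src_mtime, long xsl_mtime)
{
    snprintf(buf, len, "\"%lx-%lx\"", src_mtime, xsl_mtime);
}

/* Tiny strcmp stand-in: 1 if the two tags are identical, else 0. */
int etag_equal(const char *a, const char *b)
{
    while (*a && *a == *b) { a++; b++; }
    return *a == *b;
}
```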

Regards,
Graham
--




Re: cvs commit: httpd-2.0/server protocol.c

2004-10-25 Thread Roy T . Fielding
What would make more sense is "Error while reading HTTP request line. 
(remote browser didn't send a request?)". This indicates exactly what 
httpd was trying to do when the error occurred, and gives a hint of 
why the error might have occurred.
We used to have such a message.  It was removed from httpd because too
many users complained about the log file growing too fast, particularly
since that is the message which will be logged every time a browser
connects and then its initial request packet gets dropped by the 
network.

This is not an error that the server admin can solve -- it is normal
life on the Internet.  We really shouldn't be logging it except when
on DEBUG level.
Roy


Re: cvs commit: httpd-2.0/server protocol.c

2004-10-25 Thread Rici Lake
On 25-Oct-04, at 11:04 PM, Roy T. Fielding wrote:
This is not an error that the server admin can solve -- it is normal
life on the Internet.  We really shouldn't be logging it except when
on DEBUG level.
That was my first reaction, too. However, Ivan Ristic pointed out that
(in some cases, anyway) it is the result of a DoS attack, and there may
be something a server admin can do about it. Or, if not, at least they
might know why their server is suddenly performing badly.
For example, we had a problem report on #apache a couple of days ago
which turned out, after considerable investigation, to be the result
of a single host ip issuing hundreds of request connections in a few
minutes. Whether this was a deliberate attack or simply a buggy
client is not clear (to me) but the temporary solution of blocking
the ip address was certainly within the server admin's abilities.


Re: Mod_Cache build problem on NetWare...

2004-10-25 Thread Justin Erenkrantz
--On Monday, October 25, 2004 3:56 PM -0600 Jean-Jacques Clar 
[EMAIL PROTECTED] wrote:

What about doing:
 static const char *add_ignore_header(cmd_parms *parms, void *dummy,
- const char *header)
+ char *header)
Um, no, you can't change that.  The prototype for the ITERATE directive is:
const char *(*take1) (cmd_parms *parms, void *mconfig, const char *w);
(see cmd_func in http_config.h)
I wonder if we could modify apr_array_push's cast to (const char **) instead. 
I don't see a particular reason why that wouldn't work.  -- justin
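A standalone illustration of that option (push_slot() stands in for apr_array_push(), which likewise returns a pointer to the new element; this is not APR code): casting the returned slot pointer, rather than casting away the const on header, lets the ITERATE prototype keep its const char *.

```c
/* Stand-in for an APR-style growable array: push_slot() returns a
 * pointer to the next element, as apr_array_push() does. */
#include <stddef.h>

static const char *slots[8];
static int nelts;

static void *push_slot(void)
{
    return &slots[nelts++];
}

static void add_header(const char *header)
{
    /* Cast the slot to (const char **) instead of casting header to
     * (char *): the parameter stays const, and no const is cast away. */
    const char **new = (const char **)push_slot();
    *new = header;
}
```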


Re: mod_cache: Content Generation Dependencies?

2004-10-25 Thread Justin Erenkrantz
--On Tuesday, October 26, 2004 4:32 AM +0200 Graham Leggett [EMAIL PROTECTED] 
wrote:

If mod_transform isn't supporting Etag properly, then I'd say mod_transform
was broken, and fixing it would probably solve your problem.
+1.  If the content changes, so should the ETag.  mod_transform could also set 
some Cache-Control headers.

In short, think about what an intermediary caching HTTP proxy would do with 
that request.  It'd check the expiration header ('freshness' tests), and, if 
that fails, then the external cache would try to send an If-Modified-Since (or 
some variant) request to the upstream server: if httpd responds it hasn't 
changed, then it'd serve from the cache until the timeout.  So, it'd exactly 
act the same as mod_cache.  Hence, adding 'private' hooks for mod_cache would 
still allow stale responses from external HTTP caches.  -- justin
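The freshness half of that decision reduces to an age comparison (grossly simplified; real HTTP freshness also involves Expires, Age, and the various Cache-Control directives):

```c
/* Simplified version of the intermediary-cache decision described
 * above: serve from cache while fresh, otherwise revalidate upstream
 * with a conditional (If-Modified-Since style) request. */
enum cache_action { SERVE_CACHED, REVALIDATE };

enum cache_action check_entry(long now, long stored_at, long max_age)
{
    if (now - stored_at < max_age)
        return SERVE_CACHED;    /* freshness test passed */
    return REVALIDATE;          /* send conditional request upstream */
}
```

An external cache runs exactly this logic, which is why mod_cache-private invalidation hooks would not stop stale copies being served from proxies along the way.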


Re: cvs commit: httpd-2.0/server protocol.c

2004-10-25 Thread TOKILEY

 For example, we had a problem report on #apache a couple of days ago
 which turned out, after considerable investigation, to be the result
 of a single host ip issuing hundreds of request connections in a few
 minutes. Whether this was a deliberate attack or simply a buggy
 client is not clear (to me) but the temporary solution of blocking
 the ip address was certainly within the server admin's abilities.

That could have easily just been one of these 'commercial'
companies that are just testing sites for 'availabilty' and 
publishing the results. There are lots of these.

Only one example...
http://www.internethealthreport.com

These guys might hit you hard and fast at any moment
and they aren't sending any data... they just want to
see how fast you can 'answer the phone' and they
turn that into a 'site health' statistic.

Roy was right in his previous message.

Apache USED to try and log something for all
broken inbound connect requests but that, itself,
turned into a 'please fix this right away' bug report
when people's log files went through the roof.

In the case you just mentioned... it is going to take
a special 'filter' to 'sense' that a possible DOS 
attack is in progress. Just fair amounts of 'dataless'
connection requests from one or a small number of origins
doesn't qualify. There are plenty of official
algorithms around now to 'sense' most of these
brute force attacks and ( only then ) pop you an
'alert' or something.

Just relying on a gazillion entries in a log file isn't
the right way to 'officially' distinguish a DOS attack
from just ( as Roy says ) 'life on the Internet'.

All major browsers will abandon pending connect threads
for a web page whenever you hit the BACK button, as well.
Connects in progress at the socket level will still complete
but no data will be sent because the threads have all died.
Happens 24x7x365.

The 5 second rule still applies.
If people don't see good content showing up within 5 seconds
they will click away from you and all the threads pending
connects to you die immediately but all you might
see are tons of 'dataless' connect completions on your end.
It's not worth logging any of it.

Yours...
Kevin Kiley


In a message dated 10/25/2004 11:23:24 PM Central Daylight Time, [EMAIL PROTECTED] writes:


 This is not an error that the server admin can solve -- it is normal
 life on the Internet. We really shouldn't be logging it except when
 on DEBUG level.

That was my first reaction, too. However, Ivan Ristic pointed out that
(in some cases, anyway) it is the result of a DoS attack, and there may
be something a server admin can do about it. Or, if not, at least they
might know why their server is suddenly performing badly.

For example, we had a problem report on #apache a couple of days ago
which turned out, after considerable investigation, to be the result
of a single host ip issuing hundreds of request connections in a few
minutes. Whether this was a deliberate attack or simply a buggy
client is not clear (to me) but the temporary solution of blocking
the ip address was certainly within the server admin's abilities.





Re: Event MPM

2004-10-25 Thread Justin Erenkrantz
--On Monday, October 25, 2004 2:17 PM -0400 Greg Ames [EMAIL PROTECTED] 
wrote:

I am thinking of trying to find an old pentium I with PCI and putting
a GigE card in it just for benchmarking.
I'd love to find one like this too.  I sometimes use a 180 MHz Pentium Pro
box.   But it only has 160M of EDO memory which is hard to scrounge around
here, and probably can't use modern hard drives effectively due to BIOS
limitations.
Good luck.  I really wouldn't recommend throwing money down this path.  I 
spent $300 I really don't have on a GigE switch and cards in order to get more 
out of mod_cache to little avail other than more useless benchmark numbers.

Yet, if you send me some flood scripts and httpd patches, I can throw some 
numbers back at ya though.  No sense other httpd'ers wasting their money.

How about modifying ab to add a delay of a second or two between
successive read(2) calls
on the same connection
an excellent idea.  Or maybe add a delay after any read if -k is in effect.
Um, flood already lets you do this and lots more.  ab isn't a load simulator 
and trying to produce benchmarks with it is laughable.  -- justin


Re: cvs commit: httpd-2.0/server protocol.c

2004-10-25 Thread Rici Lake
With all due respect, I don't think the following scenario
will be logged by the patch; it only reports when the attempt
to read the initial request line fails, not when the socket
is closed prior to data transfer terminating. At least, that
was the intent.
I'll leave it to Ivan to try and make the case within the
context of mod_security, if he chooses to. Reinstating the
notification message was basically his request in the first
place.
On 26-Oct-04, at 12:14 AM, [EMAIL PROTECTED] wrote:
 All major browsers will abandon pending connect threads
 for a web page whenever you hit the BACK button, as well.
 Connects in progress at the socket level will still complete
 but no data will be sent because the threads have all died.
 Happens 24x7x365.



Re: cvs commit: httpd-2.0/server protocol.c

2004-10-25 Thread Justin Erenkrantz
--On Monday, October 25, 2004 9:04 PM -0700 Roy T. Fielding 
[EMAIL PROTECTED] wrote:

This is not an error that the server admin can solve -- it is normal
life on the Internet.  We really shouldn't be logging it except when
on DEBUG level.
+1.  Info loglevel is way way too high for realistic sites.  -- justin