Re: cvs commit: httpd-2.0/modules/experimental mod_cache.c

2001-09-04 Thread Greg Marr

At 06:19 PM 09/03/2001, Graham Leggett wrote:
Greg Marr wrote:

  How exactly do you use Cache-Control directives so that the content
  that is cached is before includes are processed, and that when it is
  retrieved from the cache, the includes are processed?  It just doesn't
  work that way.  In this case you have to put the includes before the
  cache.

mod_cache and mod_include should have no knowledge of each other.

Exactly.  They shouldn't care in which order they are run; that 
should be up to the admin.  You're saying the opposite: that 
mod_cache should only be able to run after mod_include.

mod_cache is interested in content going to the browser.

You're interested in having mod_cache only interested in content 
going to the browser.  There's a difference.

There are already two caches in the loop (a transparent ISP cache, 
and the browser cache), so if mod_include doesn't generate proper 
Cache-Control: headers, then mod_include is broken already without any 
help from mod_cache.

This has absolutely nothing to do with using mod_cache to cache the 
page just before it is processed by mod_include, and processing the 
page retrieved from the cache using mod_include.

I don't see any reason why mod_cache should interfere with 
mod_include (or vice versa).

Apparently you do.

mod_cache only cares about what is spat out at the end of the filter 
chain

Why should it be restricted like that?  That makes it less useful.

One of the fundamental design points about mod_cache was to separate 
it from all other modules (specifically mod_proxy).  Tying mod_cache 
behaviour to mod_include (or any other module) is a step backwards 
in the design.

So why are you saying it should be done that way?  It should be able 
to be placed at any point in the filter chain.  Requiring it to come 
after mod_whatever is tying its behavior to mod_whatever, saying 
that mod_whatever must process the file before it is cached.

-- 
Greg Marr
[EMAIL PROTECTED]
We thought you were dead.
I was, but I'm better now. - Sheridan, The Summoning




Re: cvs commit: httpd-2.0/server util_filter.c

2001-09-04 Thread Ryan Bloom

On Monday 03 September 2001 23:57, [EMAIL PROTECTED] wrote:
 jerenkrantz01/09/03 23:57:58

   Modified:server   util_filter.c
   Log:
   **NO CODE CHANGES**
   This is a reformat commit *ONLY*
   Please drive on through.

   (One spelling tpyo fixed...)

Didn't we decide a LONG time ago not to do this unless it was absolutely
necessary?  Format changes just add cruft to the CVS logs.  I have noticed
a lot of changes to the format in this patch that were more opinion than code
style.  For example, the ap_pass_brigade declaration.  We had three characters
wrapping to the next line, but this patch split it into two lines.  That was
unnecessary.

Formatting changes should only be committed to CVS if you are about to do 
a major re-write of the code, and need the style to be changed for readability.
And even then, it is a bad idea, because it makes it much harder to actually
trace through the history, because we have a useless patch in the middle
of the history of the file.

Ryan

__
Ryan Bloom  [EMAIL PROTECTED]
Covalent Technologies   [EMAIL PROTECTED]
--



Re: cvs commit: httpd-2.0/server util_filter.c

2001-09-04 Thread Jeff Trawick

[EMAIL PROTECTED] writes:

 jerenkrantz01/09/03 23:50:52
 
   Modified:server   util_filter.c
   Log:
   The ap_add_input_filter/ap_add_output_filter functions do an O(n) scan
   through the list of registered filters.  This patch replaces the linear
   list with a hash table for better performance.
   Submitted by:   Brian Pane [EMAIL PROTECTED]
   Reviewed by:Justin Erenkrantz
   
   Revision  ChangesPath
   1.66  +30 -10httpd-2.0/server/util_filter.c
   
   Index: util_filter.c
   ===
   RCS file: /home/cvs/httpd-2.0/server/util_filter.c,v
   retrieving revision 1.65
   retrieving revision 1.66
   diff -u -r1.65 -r1.66
   --- util_filter.c   2001/08/30 05:25:31 1.65
   +++ util_filter.c   2001/09/04 06:50:52 1.66
   @@ -126,12 +132,26 @@

static ap_filter_t *add_any_filter(const char *name, void *ctx, 
  request_rec *r, conn_rec *c, 
   -  ap_filter_rec_t *frec,
   +   apr_hash_t *reg_filter_set,
  ap_filter_t **r_filters,
  ap_filter_t **c_filters)
{
   -for (; frec != NULL; frec = frec->next) {
   -if (!strcasecmp(name, frec->name)) {
   +if (reg_filter_set) {
   +ap_filter_rec_t *frec;
   +int len = strlen(name);
   +int size = len + 1;
   +char name_lower[size];

not portable, not to mention unbounded stack size; the HP C compiler
won't compile it (maybe a reason to keep it as-is :) )

array size must be constant expression
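A minimal standalone sketch of the portable alternative Jeff is pointing at: allocate the lowercase copy instead of declaring a C99 variable-length array on the stack. In the actual commit the memory would come from an apr_pool_t (apr_palloc); malloc is used here only so the sketch compiles on its own.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Portable alternative to the VLA `char name_lower[size]`: take the
 * lowercase copy from the heap instead of the stack.  In the actual
 * commit the allocation would come from an apr_pool_t (apr_palloc);
 * malloc is used here only so the sketch stands alone. */
static char *lowercase_copy(const char *name)
{
    size_t len = strlen(name);
    char *lower = malloc(len + 1);  /* no constant-expression limit */
    size_t i;

    if (lower == NULL)
        return NULL;
    for (i = 0; i < len; i++)
        lower[i] = (char)tolower((unsigned char)name[i]);
    lower[len] = '\0';
    return lower;
}
```

This sidesteps both problems at once: no compiler support for VLAs is required, and the allocation size is no longer bounded by the stack.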

-- 
Jeff Trawick | [EMAIL PROTECTED] | PGP public key at web site:
   http://www.geocities.com/SiliconValley/Park/9289/
 Born in Roswell... married an alien...



Re: cvs commit: httpd-2.0/server util_script.c

2001-09-04 Thread Cliff Woolley

On Mon, 3 Sep 2001, Ryan Bloom wrote:

  <!--#include virtual="/cgi-bin/redir.cgi" --> ASSERT FAILED (see below)

 I would be willing to bet that this is a bug in mod_include, not the
 change that I made earlier today.

Oh, I'm right with you on that one... I seriously doubt the two are
related.  I should have changed the subject to a new thread... (these
tests were just to see what would happen, not to try to show that your
change broke something).

--Cliff

--
   Cliff Woolley
   [EMAIL PROTECTED]
   Charlottesville, VA





Re: [PATCH] RE: make distclean doesn't

2001-09-04 Thread Ryan Bloom

On Sunday 02 September 2001 01:22, Greg Stein wrote:
 On Fri, Aug 31, 2001 at 09:16:15PM -0700, Ryan Bloom wrote:
  On Friday 31 August 2001 19:31, William A. Rowe, Jr. wrote:
   From: Greg Stein [EMAIL PROTECTED]
   Sent: Friday, August 31, 2001 9:30 PM
  
On Fri, Aug 31, 2001 at 03:02:32PM -0700, Ryan Bloom wrote:
...
 exports.c shouldn't be cleaned, correct, because it is a part of
 the distribution, or at least it should be if it isn't already.
 config.nice is not a part of the distribution however, and should
 be removed by make distclean.
   
-1 on *any* form of clean that tosses config.nice
   
That holds *my* information about how I repeatedly configure Apache.
That is a file that I use, and is outside of the scope of the
config/build/whatever processes. Its entire existence is to retain
the information. Cleaning it is not right.
  
   What are you talking about?  We are talking about cleaning for
   packaging to _other_ computers, not yours.  That's what rbb is speaking
   of by 'distclean', clean enough for redistribution.
 
  Exactly.  The whole point and definition of make distclean, is that it
  cleans things to the point that it could be redistributed to another
  machine.  If you are just trying to clean the directory, then make clean
  is what you want.  If make clean doesn't remove enough for you, then
  something else is wrong.

 I use distclean on my computer all the time. Along with extraclean. Neither
 of those targets should toss config.nice. *That* is what I mean.

 To be clear: nothing in our build/config/whatever should remove config.nice


 Clean rules are about cleaning out state that might affect a build in
 some way. So we toss object files, generate makefiles, the configure
 script, whatever. But config.nice doesn't fall into that camp because it is
 not a stateful file. It is for the user to rebuild what they had before.

Just to point out, Apache 1.3 had config.status which is analogous to 2.0's
config.nice.  It turns out that make distclean in 1.3 removes config.status.

I would say this is proof that we should be removing config.nice with 2.0.

Ryan
__
Ryan Bloom  [EMAIL PROTECTED]
Covalent Technologies   [EMAIL PROTECTED]
--



Re: cvs commit: httpd-2.0/modules/experimental mod_cache.c

2001-09-04 Thread William A. Rowe, Jr.

From: Bill Stoddard [EMAIL PROTECTED]
Sent: Tuesday, September 04, 2001 10:02 AM


  I agree with this. Our current AP_FTYPE_* classifications are not granular enough to
  support this but that is easily fixed. Patch on the way...
 
 
 Err, or not. Jeff convinced me that it was premature to add additional AP_FTYPES.
 For now, everything < FTYPE_HEADERS (cache, content encoding, charset translations,
 et al.) should be an FTYPE_CONTENT filter.

That works for me ... I'd agree, if this is what it looks like;

  Bill
 
   A transfer encoding isn't a byterange or chunking output.  It's a compression
   scheme, and that we _want_ to cache, to avoid the cpu overhead.

   Handler (e.g. Core/autoindex/mod_webapp etc.)
        V
 FTYPE_CONTENT
   Includes and other Filters
        V
   Charset Translation (transform to client's preference)
        V
   Content Encoding (gz) (body -> large packets -> higher compression)
 END FTYPE_CONTENT
        V
X  cache here
        V
   Byterange
        V

I forgot to insert about here...

   Transfer-Encoding (gz) (really unsupported today by any clients, per Kiley)
        V
   Chunking
        V
   Headers
        V
   SSL Crypto
        V
   Network I/O
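One way to read the diagram: each stage has a filter-type priority, and a filter inserts after everything with a lower type value. A tiny illustration of that ordering rule; the names and numbers below are invented for this sketch, while httpd's real constants are the AP_FTYPE_* values.

```c
/* Invented priorities mirroring the pipeline above; httpd 2.0 orders
 * output filters by comparing its real AP_FTYPE_* values the same way. */
enum ex_ftype {
    EX_FTYPE_CONTENT   = 10,  /* includes, charset, content encoding */
    EX_FTYPE_CACHE     = 20,  /* proposed: cache the finished body   */
    EX_FTYPE_TRANSCODE = 30,  /* byterange, chunking                 */
    EX_FTYPE_NETWORK   = 40   /* headers, SSL, network I/O           */
};

/* A filter runs earlier in the output chain iff its type is smaller. */
static int runs_before(enum ex_ftype a, enum ex_ftype b)
{
    return a < b;
}
```

Under this scheme "cache here" means giving the cache filter a type between content generation and transcoding, so it sees the fully formed (and compressed) entity body but not byterange or chunked framing.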
  
  
  
 
 
 




Re: cvs commit: httpd-2.0/support Makefile.in

2001-09-04 Thread Ian Holsman

[EMAIL PROTECTED] wrote:

 rbb 01/09/02 20:27:48
 
   Modified:buildrules.mk.in
support  Makefile.in
   Log:
   Make Apache 2.0 install all files in the same location as Apache 1.3
   did.
   PR: 7626
   
   Revision  ChangesPath
   1.4   +2 -2  httpd-2.0/build/rules.mk.in
   
   Index: rules.mk.in
   ===
   RCS file: /home/cvs/httpd-2.0/build/rules.mk.in,v
   retrieving revision 1.3
   retrieving revision 1.4
   diff -u -r1.3 -r1.4
   --- rules.mk.in 2001/08/31 17:02:23 1.3
   +++ rules.mk.in 2001/09/03 03:27:48 1.4
   @@ -197,9 +197,9 @@

local-install: $(TARGETS) $(SHARED_TARGETS) $(INSTALL_TARGETS)
   @if test -n '$(PROGRAMS)'; then \
   -   test -d $(bindir) || $(MKINSTALLDIRS) $(bindir); \
   +   test -d $(sbindir) || $(MKINSTALLDIRS) $(sbindir); \
   list='$(PROGRAMS)'; for i in $$list; do \
   -   $(INSTALL_PROGRAM) $$i $(bindir); \
   +   $(INSTALL_PROGRAM) $$i $(sbindir); \
   done; \
   fi

   
   
   
   1.21  +5 -5  httpd-2.0/support/Makefile.in
   
   Index: Makefile.in
   ===
   RCS file: /home/cvs/httpd-2.0/support/Makefile.in,v
   retrieving revision 1.20
   retrieving revision 1.21
   diff -u -r1.20 -r1.21
   --- Makefile.in 2001/07/09 02:31:09 1.20
   +++ Makefile.in 2001/09/03 03:27:48 1.21
   @@ -13,12 +13,12 @@

install:
   @test -d $(bindir) || $(MKINSTALLDIRS) $(bindir)
   -   @cp -p $(top_srcdir)/server/httpd.exp $(bindir)
   -   @cp -p apachectl $(bindir)
   -   chmod 755 $(bindir)/apachectl
   +   @cp -p $(top_srcdir)/server/httpd.exp $(libexecdir)
   +   @cp -p apachectl $(sbindir)


hi Ryan.
is this right??
every 1.3 install I remember always put these in apache/bin

If it is right,
you need to modify apachectl so that HTTPD points to apache/sbin
instead of bin.

..Ian


   +   chmod 755 $(sbindir)/apachectl
   @if test -f $(builddir)/apxs; then \
   -   cp -p apxs $(bindir); \
   -   chmod 755 $(bindir)/apxs; \
   +   cp -p apxs $(sbindir); \
   +   chmod 755 $(sbindir)/apxs; \
   fi

htpasswd_OBJECTS = htpasswd.lo
   
   
   
 






Re: cvs commit: httpd-2.0/support Makefile.in

2001-09-04 Thread Ryan Bloom

On Tuesday 04 September 2001 08:52, Ian Holsman wrote:

Index: rules.mk.in
===
RCS file: /home/cvs/httpd-2.0/build/rules.mk.in,v
retrieving revision 1.3
retrieving revision 1.4
diff -u -r1.3 -r1.4
--- rules.mk.in   2001/08/31 17:02:23 1.3
+++ rules.mk.in   2001/09/03 03:27:48 1.4
@@ -197,9 +197,9 @@
 
 local-install: $(TARGETS) $(SHARED_TARGETS) $(INSTALL_TARGETS)
  @if test -n '$(PROGRAMS)'; then \
- test -d $(bindir) || $(MKINSTALLDIRS) $(bindir); \
+ test -d $(sbindir) || $(MKINSTALLDIRS) $(sbindir); \
  list='$(PROGRAMS)'; for i in $$list; do \
- $(INSTALL_PROGRAM) $$i $(bindir); \
+ $(INSTALL_PROGRAM) $$i $(sbindir); \
  done; \
  fi
diff -u -r1.20 -r1.21
--- Makefile.in   2001/07/09 02:31:09 1.20
+++ Makefile.in   2001/09/03 03:27:48 1.21
@@ -13,12 +13,12 @@
 
 install:
  @test -d $(bindir) || $(MKINSTALLDIRS) $(bindir)
- @cp -p $(top_srcdir)/server/httpd.exp $(bindir)
- @cp -p apachectl $(bindir)
- chmod 755 $(bindir)/apachectl
+ @cp -p $(top_srcdir)/server/httpd.exp $(libexecdir)
+ @cp -p apachectl $(sbindir)

 hi Ryan.
 is this right??
 every 1.3 install I remember always put these in apache/bin

 If it is right,
 you need to modify apachectl so that HTTPD points to apache/sbin
 instead of bin.

It is correct.  In the stock Apache layout, sbindir == bindir.  Hence, no
modifications to apachectl are necessary.
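For reference, httpd-2.0's stock config.layout ties the two directories together; a rough sketch of the relevant entries (from memory, so treat the exact paths as illustrative and check config.layout in your tree):

```
<Layout Apache>
    prefix:        /usr/local/apache2
    exec_prefix:   ${prefix}
    bindir:        ${exec_prefix}/bin
    sbindir:       ${exec_prefix}/bin
    libexecdir:    ${exec_prefix}/modules
</Layout>
```

Only a non-default layout (e.g. one modeled on GNU or distribution conventions) separates bindir and sbindir, and only then would apachectl's HTTPD path need adjusting.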

Ryan
__
Ryan Bloom  [EMAIL PROTECTED]
Covalent Technologies   [EMAIL PROTECTED]
--



Re: cvs commit: httpd-2.0/server util_filter.c

2001-09-04 Thread Justin Erenkrantz

On Tue, Sep 04, 2001 at 06:40:03AM -0700, Ryan Bloom wrote:
 Didn't we decide a LONG time ago not to do this unless it was absolutely
 necessary?  Format changes just add cruft to the CVS logs.  I have noticed
 a lot of changes to the format in this patch that were more opinion than code
 style.  For example, the ap_pass_brigade declaration.  We had three characters
 wrapping to the next line, but this patch split it into two lines.  That was
 unnecessary.

Well, I use terminals with 80 column width.  Therefore, long lines are 
a PITA for me.  I will *not* work with source files that do not fit 
our standard.  I'm an asshole in this respect.

I feel that code in our repository should meet our code standard:

http://dev.apache.org/styleguide.html

If *anyone* wants to submit patches to bring a file into compliance with
our published style guide, I *will* commit it.

Ryan, you may veto this commit.  However, if you wish to repeal or 
modify our style guide, I suggest you call a vote.  -- justin




Re: [PATCH] Add mod_gz to httpd-2.0

2001-09-04 Thread Günter Knauf

Hi Kevin,

 Guenter Knauf wrote...

 Hi,
 I was glad when Ian contributed his mod_gz; I tested it on Linux and Win32
 and it works for me.
 What did you test?
that it compiles, loads into the server, and compresses.
 How 'heavily loaded' was the Server?
you're right, I only did a quick test with some huge text pages, and I didn't compare 
against your mod_gzip; a real comparison isn't possible yet, because I would also have to 
compare Apache 1.3 with 2.0, and I don't have your 2.0 gzip module.

 Did you just ask for 1 thing, see if it came back compressed, and you
 are calling that 'success'?
see above: I cannot compare against what I don't have.
 There's a LOT more to it than that.
maybe; and after reading your long letter I see some things more clearly...

 And even if a module compiles without changes and no
 porting is needed it's not guaranteed to run.
 The best sample is mod_gzip: I use it on Linux and Win32, but on
 NetWare the module compiles fine but doesnt work!
first of all, you can see that I use your module and I know the benefits of saving bandwidth, 
or else I wouldn't try to get it onto all platforms.

 This was/is an Apache 1.3.x issue only and this issue
 was resolved on the mod_gzip forum. mod_gzip forum users have been
 VERY good at helping each other out. mod_gzip for Apache 1.3.x doesn't
 work 'out of the box' for IBM's rewrite of Apache, either, and in both
 cases it's because those vendors are re-writing Apache headers and making
 changes that are not in the standard Apache distributions ( or even
 available anywhere online ).
That's not true! I have downloaded your complete 12MB archive and searched for 
'netware': not one thread dealing with NetWare, so please direct me to the message 
where the issue is solved!!
I found 'netware' a couple of times and it was always an #ifdef line from a patch for 
getting POST to work. I found nothing related to the general failure of mod_gzip on 
NetWare! Also you can see in your archive what happens on other platforms which are 
not delivered with a complete C/C++ development environment like Unix/Linux: people try 
to compile with Borland and many other things...
That's my main reason for wanting to see a gzip module in the Apache sources: then it 
will be built for all platforms and distributed with the binaries. 
On all non-*nix platforms you used to have to buy very expensive compilers; only 
recently have we been able to compile with gcc for Win32 (Cygwin) and NetWare 
(gcc/nlmconv)...

 That being said... if I recall a number of the Netware problem
 reports were simply from people that didn't realize you CAN use
I wonder who that could be? I know only one other person who compiles for NetWare himself...
and again, please show me these reports; there is nothing in the 12 MB mail archive!

 mod_gzip to compress your SSL output but it takes a special
 configuration. People were reporting output lengths of ZERO
 in the compression statistics in ACCESS_LOG and didn't realize
 that what happens is that SSL 'steals away' the connection handles
 under Apache 1.3.x and delivers the responses outside of the
 normal Apache delivery process. The pages were being delivered
 fine but without the special configuration for mod_gzip they
 were simply not being compressed.
Well Kevin, if you believe I'm too stupid to check whether I have loaded SSL, TLS or 
something like that, then we can stop the discussion here. I have told you more than once 
that I DON'T USE SSL, NOR TLS, NOR ANY MODULE OTHER THAN THE STANDARD COMPILED-IN ONES!!!

Also, I don't know why you stopped helping me find the bug (wherever it sits). 
I was glad that you replied immediately a few times, and it seemed to me that you were 
interested in supporting a new platform. The last thing was that I sent you a debug file 
created by mod_gzip, and I hoped that you could point me to something or give some 
hints, because you know exactly what your code should do, but then nothing came back...

Again: I'm interested in getting mod_gzip working on NetWare with Apache 1.3. I'm able 
to compile your module as well as the whole server, I have a test machine, and when you 
give me patches, hints, tips or whatever you have, I will test them and let you know the 
results.

 There are 'patches' available for mod_gzip that solve the Netware
 and IBM HTTPD issues. I believe Bill Stoddard himself is currently
please point me to this patch or send it to me and I will check it immediately...

 I don't personally have/use Netware ( or IBM's HTTPD ) but other users on
 the mod_gzip forum worked this out for themselves, as any good forum group
 will do.
again: please point me to this patch or send it to me and I will check it immediately...

please note that I like your module and am still interested in getting it working on 
NetWare...

Thanks, Guenter.




Re: per-dir config

2001-09-04 Thread William A. Rowe, Jr.

From: Rasmus Lerdorf [EMAIL PROTECTED]
Sent: Tuesday, September 04, 2001 11:05 AM


 Ok, so the Apache 1.3 walks/hooks flow looks like this:

I'll rework this for you just a bit...
 
 
 ::post read-request hook::
 
 location_walk
   - checks r->uri and sets server-wide config
 
 XX uri->filename translation hook XX -- let's refer to this as
  ::uri translate name hook::

Modules translate the URI to r->filename, and/or determine they
will interfere^H^H^H^H^H^H^H^H^Hserve this request without the
filesystem at all :)

  ::map_to_storage hook::

Modules merge whatever per_dir_configs apply to their request (mod_proxy
does a proxy_walk), or return OK if there are no applicable, resource-specific
per_dir_configs (say, for server-status), or let the request fall
through to the...

--core map_to_storage function (hooked last)--
 directory_walk
 - checks fullpath r->filename and sets per-dir config
 file_walk
 - checks file part of r->filename
 
 location_walk
   - second call (in case anyone changed r->uri or r->per_dir_config)
 
 ::header parser hook::
 
 ::check access hook::
   ::check user id hook::
   ::check auth hook::
 
 ::type check hook::
 
 ::fixup hook::
 
 ::content handler hook::
 
 ::logger hook::
 
 
 Could someone more intimate with the 2.0 code post a similar flow for 2.0?
 I am just trying to make sure that this code I am writing for 1.3 right
 now won't have to be completely redesigned for 2.0.

It shouldn't.  You potentially need to protect yourself from NULL r->filename
(we haven't decided what to do yet with *this*is*not*a*file* requests.)  Since
PHP generally serves -files-, this won't be a huge change for you.  For serving
from an SQL database, or proxied requests, or internal handlers (even php-status)
you can speed things up/protect the right resources with the map_to_storage hook.

mod_proxy now keeps its own <Proxy[Match] {uri}> sections instead of using the
<Directory proxy:{uri}> fooness.  Apache 2.0 won't run bogus paths through the
<Directory> sections anymore; see the (rather straightforward) patch to mod_proxy
for how to implement containers against a different resource for non-filesystem
requests, using map_to_storage.  Instead of the core map_to_storage function,
proxy hooks the map_to_storage phase and uses proxy_walk instead.
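The shape of that hook, reduced to a standalone decision function. EX_OK and EX_DECLINED stand in for httpd's OK/DECLINED return codes, and the "/sqldb/" prefix is purely hypothetical; the point is that a module claiming a request at map_to_storage keeps directory_walk from ever touching it.

```c
#include <string.h>

/* Standalone sketch of the map_to_storage decision described above.
 * EX_OK/EX_DECLINED stand in for httpd's OK/DECLINED, and the
 * "/sqldb/" prefix is purely hypothetical.  A module that returns
 * OK here has claimed the request, so the core function (and thus
 * directory_walk) never runs for it. */
enum { EX_OK = 0, EX_DECLINED = -1 };

static int example_map_to_storage(const char *uri)
{
    /* Served straight from a database: no <Directory> merge needed. */
    if (strncmp(uri, "/sqldb/", 7) == 0)
        return EX_OK;

    /* Anything else falls through to the core map_to_storage. */
    return EX_DECLINED;
}
```

In a real 2.0 module this function would take a request_rec* and be registered in the module's hook-registration function; the proxy_walk case works the same way, just with its own per-section config merge instead of an early OK.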

I'll try to document this completely on my way into the office tomorrow ;)

Bill





Re: cvs commit: httpd-2.0/server util_filter.c

2001-09-04 Thread Justin Erenkrantz

On Tue, Sep 04, 2001 at 04:28:45PM -, [EMAIL PROTECTED] wrote:
 jerenkrantz01/09/04 09:28:45
 
   Modified:server   util_filter.c
   Log:
   Jeff pointed out that the character array must be constant.
   Well, it's not, so make it allocated from the correct pool rather than
   the heap.

Obviously I meant s/heap/stack/ -- justin




Re: per-dir config

2001-09-04 Thread Rasmus Lerdorf

 It shouldn't.  You potentially need to protect yourself from NULL r->filename
 (we haven't decided what to do yet with *this*is*not*a*file* requests.)  Since
 PHP generally serves -files-, this won't be a huge change for you.  For serving
 from an SQL database, or proxied requests, or internal handlers (even php-status)
 you can speed things up/protect the right resources with the map_to_storage hook.

Well, I am actually writing the code that will let people supply PHP
scripts that get called by any of the hooks which means that in this case
the request will not necessarily result in a file being served.

 I'll try to document this complete on my way into the office tommorow ;)

That would be appreciated.  Especially if it references the 1.3 API as
well and highlights the differences.

-Rasmus




Re: cvs commit: httpd-2.0/server util_filter.c

2001-09-04 Thread Ryan Bloom

On Tuesday 04 September 2001 09:16, Justin Erenkrantz wrote:
 On Tue, Sep 04, 2001 at 06:40:03AM -0700, Ryan Bloom wrote:
  Didn't we decide a LONG time ago not to do this unless it was absolutely
  necessary?  Format changes just add cruft to the CVS logs.  I have
  noticed a lot of changes to the format in this patch that were more
  opinion than code style.  For example, the ap_pass_brigade declaration. 
  We had three characters wrapping to the next line, but this patch split
  it into two lines.  That was unnecessary.

 Well, I use terminals with 80 column width.  Therefore, long lines are
 a PITA for me.  I will *not* work with source files that do not fit
 our standard.  I'm an asshole in this respect.

 I feel that code in our repository should meet our code standard:

 http://dev.apache.org/styleguide.html

 If *anyone* wants to submit patches to bring a file into compliance with
 our published style guide, I *will* commit it.

 Ryan, you may veto this commit.  However, if you wish to repeal or
 modify our style guide, I suggest you call a vote.  -- justin

That is complete BS.  We have a long standing tradition of NOT making
commits just to follow the code style.  There is no need for a vote, because
this has been discussed to death and formatting only commits have been
vetoed in the past in every thread that they come up in.  Review the archives
for Roy and Dean's opinions of formatting changes.  They are completely
bogus, and just serve to make CVS hard to use.

I won't veto the commit, because backing it out just serves to muck up
CVS even worse.  Please do not commit any more of them.  The group
decided a LONG time ago that they were completely bogus, and you are
going against a long-standing decision by the group to NOT make commits
like this.

Ryan
__
Ryan Bloom  [EMAIL PROTECTED]
Covalent Technologies   [EMAIL PROTECTED]
--



Re: cvs commit: httpd-2.0/modules/experimental mod_cache.c

2001-09-04 Thread Bill Stoddard


 just to add my 2c to the picture.
 I dont see why the cache-filter could not live anywhere in the filter chain
 it could sit just after the handler (where it could cache a report generator)
 which would then feed into php/include

 it could sit after php/include and possibly after the gzip/byte ranges as well.

 the thing it needs to do, (which I haven't seen yet) is address how it uniquely
 identifies what is the same request.

Are you referring to the fact that a single URL may actually represent multiple views 
(and
perhaps multiple cache entries)?

I believe Graham's idea was to allow caching multiple views using the same URL. 
Searching
the cache using a URL as key would return a collection of cache objects.
cache_select_url() would negotiate across the collection to determine which object, if
any, should be served. An alternate solution would be to cache content using complex
search keys, rather than just the URL (i.e., don't return collections, only unique 
single
cache objects).


 It needs to be configurable so that it could use the
 * incoming vhost+request (simple case)
 * the user agent
 * a cookie value (or part of)
 * any part of the incoming request header.


 so.. you might have something like
 SetOutputfilter CACHE INCLUDE GZ
 CACHECRITERIA REQ, UserAgent
 or
 SetOutputFilter INCLUDE GZ CACHE
 CACHECRITERIA REQ, Cookie: Foo+8 (last 8 bytes of Foo)


We need a whole new class of configuration directives to control how cache search keys 
are
built.  For ESI (and other HTML tag driven caches) you would probably want to allow
dynamically constructing a rule set that contains rules for building search keys.
cache_select_url() would use the rule set as a guide for building sophisticated search
keys using cookies, request parameters, etc.  The rule set could be built at startup 
based
on config directives or at runtime based on whatever runtime criterion you choose.
Interesting stuff...
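The key-building idea above can be sketched standalone. Everything here is hypothetical (the function name, the "vhost+uri;cookie-tail" key shape, the fixed buffer); it just shows one concrete reading of a rule like "CACHECRITERIA REQ, Cookie: Foo+8" — append the last N bytes of a cookie value to the vhost+URI key.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical sketch of building a composite cache search key from
 * configured criteria: the vhost+URI, plus the last `tail` bytes of a
 * cookie value (as in "CACHECRITERIA REQ, Cookie: Foo+8" above).
 * The name, key format, and fixed buffer are illustrative only. */
static void build_cache_key(char *key, size_t keylen,
                            const char *vhost, const char *uri,
                            const char *cookie_val, size_t tail)
{
    const char *frag = "";

    if (cookie_val) {
        size_t clen = strlen(cookie_val);
        /* Take the trailing `tail` bytes, or the whole value if shorter. */
        frag = (clen > tail) ? cookie_val + (clen - tail) : cookie_val;
    }
    snprintf(key, keylen, "%s%s;%s", vhost, uri, frag);
}
```

A rule-set-driven version would loop over configured criteria (request fields, headers, cookie fragments) and append each contribution, which is exactly the sophistication cache_select_url() would need for ESI-style caches.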

Bill




[PATCH] Make filter names lowercase

2001-09-04 Thread Justin Erenkrantz

Both Cliff and OtherBill sent me emails complaining about the
change to strcasecmp in some places to support Brian Pane's
new hash code in util_filter.c.  (Never mind that a lot of 
the code already did strcasecmp...)

This patch makes the filter name lowercase for the searches
and lets us use strcmp.  (I only searched for files that 
used frec-name directly...)

Please review and test.  My commits are getting an awful lot
of negative feedback lately, so I'm going to switch to R-T-C.
So, I'll need 3 +1s to commit this.  -- justin

Index: ./modules/experimental/mod_charset_lite.c
===
RCS file: /home/cvs/httpd-2.0/modules/experimental/mod_charset_lite.c,v
retrieving revision 1.51
diff -u -r1.51 mod_charset_lite.c
--- ./modules/experimental/mod_charset_lite.c   2001/09/04 07:59:55 1.51
+++ ./modules/experimental/mod_charset_lite.c   2001/09/04 19:03:08
@@ -116,9 +116,9 @@
 } ees_t;
 
 /* registered name of the output translation filter */
-#define XLATEOUT_FILTER_NAME "XLATEOUT"
+#define XLATEOUT_FILTER_NAME "xlateout"
 /* registered name of input translation filter */
-#define XLATEIN_FILTER_NAME  "XLATEIN"
+#define XLATEIN_FILTER_NAME  "xlatein"
 
 typedef struct charset_dir_t {
 /** debug level; -1 means uninitialized, 0 means no debug */
@@ -379,7 +379,7 @@
 struct ap_filter_t *filter = filter_list;
 
 while (filter) {
-if (!strcasecmp(filter_name, filter->frec->name)) {
+if (!strcmp(filter_name, filter->frec->name)) {
 return 1;
 }
 filter = filter->next;
@@ -623,7 +623,7 @@
 charset_filter_ctx_t *curctx, *last_xlate_ctx = NULL,
 *ctx = f->ctx;
 int debug = ctx->dc->debug;
-int output = !strcasecmp(f->frec->name, XLATEOUT_FILTER_NAME);
+int output = !strcmp(f->frec->name, XLATEOUT_FILTER_NAME);
 
 if (ctx->noop) {
 return;
@@ -634,7 +634,7 @@
  */
 curf = output ? f->r->output_filters : f->r->input_filters;
 while (curf) {
-if (!strcasecmp(curf->frec->name, f->frec->name) &&
+if (!strcmp(curf->frec->name, f->frec->name) &&
 curf->ctx) {
 curctx = (charset_filter_ctx_t *)curf->ctx;
 if (!last_xlate_ctx) {
Index: ./modules/experimental/mod_disk_cache.c
===
RCS file: /home/cvs/httpd-2.0/modules/experimental/mod_disk_cache.c,v
retrieving revision 1.10
diff -u -r1.10 mod_disk_cache.c
--- ./modules/experimental/mod_disk_cache.c 2001/09/04 07:59:55 1.10
+++ ./modules/experimental/mod_disk_cache.c 2001/09/04 19:03:08
@@ -94,7 +94,7 @@
  * again.
  */
 for ((f = r->output_filters); (f = f->next);) {
-if (!strcasecmp(f->frec->name, "CACHE")) {
+if (!strcmp(f->frec->name, "cache")) {
 ap_remove_output_filter(f);
 }
 }
Index: ./modules/experimental/cache_util.c
===
RCS file: /home/cvs/httpd-2.0/modules/experimental/cache_util.c,v
retrieving revision 1.4
diff -u -r1.4 cache_util.c
--- ./modules/experimental/cache_util.c 2001/08/24 16:57:13 1.4
+++ ./modules/experimental/cache_util.c 2001/09/04 19:03:08
@@ -84,9 +84,9 @@
 ap_filter_t *f = r->output_filters;
 
 while (f) {
-if (!strcasecmp(f->frec->name, "CORE") ||
-!strcasecmp(f->frec->name, "CONTENT_LENGTH") ||
-!strcasecmp(f->frec->name, "HTTP_HEADER")) {
+if (!strcmp(f->frec->name, "core") ||
+!strcmp(f->frec->name, "content_length") ||
+!strcmp(f->frec->name, "http_header")) {
 f = f->next;
 continue;
 }
Index: ./modules/http/http_request.c
===
RCS file: /home/cvs/httpd-2.0/modules/http/http_request.c,v
retrieving revision 1.113
diff -u -r1.113 http_request.c
--- ./modules/http/http_request.c   2001/08/31 03:49:42 1.113
+++ ./modules/http/http_request.c   2001/09/04 19:03:08
@@ -96,11 +96,11 @@
 ap_filter_t *f = r->output_filters;
 int has_core = 0, has_content = 0, has_http_header = 0;
 while (f) {
-if(!strcasecmp(f->frec->name, "CORE"))
+if(!strcmp(f->frec->name, "core"))
 has_core = 1; 
-else if(!strcasecmp(f->frec->name, "CONTENT_LENGTH"))
+else if(!strcmp(f->frec->name, "content_length"))
 has_content = 1; 
-else if(!strcasecmp(f->frec->name, "HTTP_HEADER")) 
+else if(!strcmp(f->frec->name, "http_header")) 
 has_http_header = 1;
 f = f->next;
 }
Index: ./server/mpm/perchild/perchild.c
===
RCS file: /home/cvs/httpd-2.0/server/mpm/perchild/perchild.c,v
retrieving revision 1.78
diff -u -r1.78 perchild.c
--- ./server/mpm/perchild/perchild.c2001/09/04 07:59:55 1.78
+++ ./server/mpm/perchild/perchild.c2001/09/04 19:03:08
@@ -1490,7 +1490,7 @@
  

Re: [PATCH] Add mod_gz to httpd-2.0

2001-09-04 Thread TOKILEY


In a message dated 01-09-04 09:35:50 EDT, Jim wrote...

 That's right and one of them is...
  
  Will Apache accept ZLIB into the Apache source tree in either
  source or binary library format for all platforms.
  
  Check one box only...
  
  [__] Yes
  [__] No
   
  
  Actually, it's not a binary answer... Since Yes implies that the
  ASF would accept binary library, which ain't the case, and No implies
  that we wouldn't accept source, which may not be

Yea... I guess the minute I hit send I realized that 'if' statement
was missing some 'else'. ROFL

How about this...

Will the Apache group accept the ZLIB source code
into the distribution tree at this time?...

[__] Yes
[__] No

if ( Yes )
  {
1. Where will it be? /srclib/zlib?
2. Who is going to add it and when?
  }
else
  {
Will Apache accept pre-compiled ZLIB libraries into the
 source tree ( for any/all platforms )?

[__] Yes
[__] No

if ( Yes )
  {
1. Where will the ZLIB libraries be? /srclib/?  /support/zlib?
 
2. Who is going to add them and when?
  }  
else
  {
1. How can anyone vote on adding Ian's mod_gz to the source tree
when the public domain libraries it needs can't be in the source 
tree?
2. How is anything in Apache that ever needs ZLIB supposed to  
compile? Users must always 'go and get' the libs themselves?
  }

  }/* End 'else' */

  zlib is a very large chunk of code, 

Not really... but I guess some might think so.

ASIDE: You really only need the 'compression' part of it. You
are a Server, not a client.

 and we resist requiring external
 code unless:
  
1. It's small
2. It's very stable (ie: don't have to keep updating it all the
   time, nor worry about passing patches back).
3. It's used by a large chunk of the Apache code (eg: regex).

  Does zlib fit the bill?? Well, not for #1 and not really for #3...

Numbers 1 and 2 already sound like the compressor that's
in mod_gzip and is NOT ZLIB ( minus all the debug code, of course ).
It's VERY stable and VERY small ( 20k or so ).

As for number 3... I am still of the opinion that your perspective
is wrong... you are playing 'chicken or egg'. I am of the opinion
that if ZLIB WAS there then in very short order there WOULD 
be a 'large chunk of Apache code' that uses it. This is ( and I guess
always has been ) my point.

'Build it and they will come...'
'Include the libs and they will be used ( a lot )...'

Something like that

How in the heck did the REGEX stuff ever make it in?
Was that some huge 'catch 22' or 'chicken and egg' scenario
as well when it was being proposed?

What was the final straw that broke the stalemate over the
'regexec' library inclusion(s)?

  Does that mean we'd never consider it... I'm not willing to say so.

Then it's happening all over again.

Everyone has an opinion but no one will VOTE one
way or the other.

What's it going to take to find out once and for all if
ZLIB can be included in the Apache source tree?

I don't know anymore. I've tried to explain why I think
it would be a great benefit to Apache to have it there
( numerous times going back over a year or more )
and I have tried to supply as much information as I
have about the licensing issues ( IANAL ) and I
have asked for 'real' votes about it... nothing happens...
just more talk.

Somebody else needs to take this into the end-zone.
I don't even know where the football is on this one
anymore.

Yours...
Kevin Kiley



Re: cvs commit: httpd-2.0/server util_filter.c

2001-09-04 Thread dean gaudet



On Tue, 4 Sep 2001, Ryan Bloom wrote:

 On Tuesday 04 September 2001 09:16, Justin Erenkrantz wrote:

  Ryan, you may veto this commit.  However, if you wish to repeal or
  modify our style guide, I suggest you call a vote.  -- justin

 That is complete BS.  We have a long standing tradition of NOT making
 commits just to follow the code style.  There is no need for a vote, because
 this has been discussed to death and formatting only commits have been
 vetoed in the past in every thread that they come up in.  Review the archives
 for Roy and Dean's opinions of formatting changes.  They are completely
 bogus, and just serve to make CVS hard to use.

heheheh.  this thread again :)

aside from the obvious difficulty of dealing with differences across style
commits, here's my favourite way of explaining my position on this issue:

coding styles are much like language dialects and accents.  if you can
operate in only one coding-style then you can live happily in your own
little valley, but once you wander up and over the hills and visit other
valleys you will be confused as all hell.  learning other coding styles
allows one to share in the intellectual wealth of others.

(i thought i'd posted this to the list years ago, but it seems i've only
sent it in private email.)

-dean





Re: [PATCH] Add mod_gz to httpd-2.0

2001-09-04 Thread TOKILEY


In a message dated 01-09-04 12:39:44 EDT, Guenter writes...

  Guenter Knauf wrote...
  
   Hi,
   I was glad as Ian contributed his mod_gz; I tested it on Linux and Win32
   and it works for me.
   What did you test?
  that it compiles, loads into server and compresses.
   How 'heavily loaded' was the Server?
  you're right, I did only a quick test with some huge text pages; and I 
 didn't 
 compare against your mod_gzip; but real comparing isn't possible yet because 
 then I have to compare also Apache 1.3 with 2.0: I don't have your 2.0 gzip 
 module.

I wasn't asking if you had tested mod_gzip... You said you had
tested 'mod_gz' and that 'it worked' and I was just curious what
you were calling 'success'. You don't even say what MIME 
type you tested. text/plain or text/html? Did you try anything
else and did you try it under real circumstances like a fully
loaded Server?

[rest of message snipped]

Guenter... the rest of the message really should have been
sent to the mod_gzip forum. Check the title of this message
thread... it really concerns someone suggesting that a 
filtering demo called 'mod_gz' be added to the Apache 
base tarball. This is not really 'about' mod_gzip at all.

I WILL address every point you raised and I will check
my notes... I am SURE someone else found the answer
as to why Novell's version of Apache is not a standard
'distribution' of Apache and doesn't 'behave' the way 
'normal' Apache does. I will locate the info and send it
to you. It may be that you have to get the patch from
Novell since it's only their 'altered' version of Apache
that isn't playing by the rules ( Actually... IBM's rewrite
falls into same category but that's not your concern ).

Please direct any further mod_gzip 'support' questions to
either the mod_gzip forum or to myself. mod_gzip is not
part of Apache and these guys are going to explode if
this 'please support my Netware' discussion goes any
farther on this forum under this message thread.

Yours...
Kevin Kiley




RE: [PATCH] hash table for registered filter list

2001-09-04 Thread Charles Randall

What type of performance improvements did you see with this (under what
workload)?

Charles

-Original Message-
From: Brian Pane [mailto:[EMAIL PROTECTED]]
Sent: Monday, September 03, 2001 11:50 PM
To: [EMAIL PROTECTED]
Subject: [PATCH] hash table for registered filter list


The ap_add_input_filter/ap_add_output_filter functions do an O(n) scan
through the list of registered filters.  This patch replaces the linear
list with a hash table for better performance.

--Brian


Index: server/util_filter.c
===================================================================
RCS file: /home/cvspublic/httpd-2.0/server/util_filter.c,v
retrieving revision 1.65
diff -u -r1.65 util_filter.c
--- server/util_filter.c	2001/08/30 05:25:31	1.65
+++ server/util_filter.c	2001/09/04 05:42:17
@@ -54,16 +54,18 @@
 
 #define APR_WANT_STRFUNC
 #include "apr_want.h"
+#include "apr_lib.h"
+#include "apr_hash.h"
+#include "apr_strings.h"
 
 #include "httpd.h"
 #include "http_log.h"
 #include "util_filter.h"
 
 /* ### make this visible for direct manipulation?
- * ### use a hash table
  */
-static ap_filter_rec_t *registered_output_filters = NULL;
-static ap_filter_rec_t *registered_input_filters = NULL;
+static apr_hash_t *registered_output_filters = NULL;
+static apr_hash_t *registered_input_filters = NULL;
 
 /* NOTE: Apache's current design doesn't allow a pool to be passed thu,
so we depend on a global to hold the correct pool
@@ -92,16 +94,21 @@
 static void register_filter(const char *name,
 ap_filter_func filter_func,
 ap_filter_type ftype,
-ap_filter_rec_t **reg_filter_list)
+apr_hash_t **reg_filter_set)
 {
 ap_filter_rec_t *frec = apr_palloc(FILTER_POOL, sizeof(*frec));
 
-frec->name = name;
+if (!*reg_filter_set) {
+*reg_filter_set = apr_hash_make(FILTER_POOL);
+}
+
+frec->name = apr_pstrdup(FILTER_POOL, name);
+ap_str_tolower((char *)frec->name);
 frec->filter_func = filter_func;
 frec->ftype = ftype;
+frec->next = NULL;
 
-frec->next = *reg_filter_list;
-*reg_filter_list = frec;
+apr_hash_set(*reg_filter_set, frec-name, APR_HASH_KEY_STRING, frec);
 
 apr_pool_cleanup_register(FILTER_POOL, NULL, filter_cleanup, 
apr_pool_cleanup_null);
 }
@@ -126,12 +133,26 @@
 
 static ap_filter_t *add_any_filter(const char *name, void *ctx,
request_rec *r, conn_rec *c,
-   ap_filter_rec_t *frec,
+   apr_hash_t *reg_filter_set,
ap_filter_t **r_filters,
ap_filter_t **c_filters)
 {
-for (; frec != NULL; frec = frec->next) {
-if (!strcasecmp(name, frec->name)) {
+if (reg_filter_set) {
+ap_filter_rec_t *frec;
+int len = strlen(name);
+int size = len + 1;
+char name_lower[size];
+char *dst = name_lower;
+const char *src = name;
+
+/* Normalize the name to all lowercase to match 
register_filter() */
+do {
+*dst++ = apr_tolower(*src++);
+} while (--size);
+
+frec = (ap_filter_rec_t *)apr_hash_get(reg_filter_set,
+   name_lower, len);
+if (frec) {
 apr_pool_t *p = r ? r->pool : c->pool;
 ap_filter_t *f = apr_pcalloc(p, sizeof(*f));
 ap_filter_t **outf = r ? r_filters : c_filters;




Re: [PATCH] Make filter names lowercase

2001-09-04 Thread Brian Pane

Justin Erenkrantz wrote:

Both Cliff and OtherBill sent me emails complaining about the
change to strcasecmp in some places to support Brian Pane's
new hash code in util_filter.c.  (Never mind that a lot of 
the code already did strcasecmp...)

This patch makes the filter name lowercase for the searches
and lets us use strcmp.  (I only searched for files that 
used frec->name directly...)

Please review and test.  My commits are getting an awful lot
of negative feedback lately, so I'm going to switch to R-T-C.
So, I'll need 3 +1s to commit this.  -- justin

Alternatively, how about just leaving everything capitalized
and removing the case-insensitivity support in add_any_filter?
I wasn't happy about having to normalize the case in add_any_filter
in order to make the hash tables match the semantics of the
strcasecmp loop that they replaced, so I'd rather drop the
lowercasing in add_any_filter and just use the capital forms
throughout (with case-insensitive comparisons) unless that
breaks something else.

--Brian







Re: [PATCH] Add mod_gz to httpd-2.0

2001-09-04 Thread Ian Holsman

On Tue, 2001-09-04 at 12:29, [EMAIL PROTECTED] wrote:
 
 
 ASIDE: You really only need the 'compression' part of it. You
 are a Server, not a client.

we also can be a client.
think mod-proxy

..The Weekend Warrior
 
 Yours...
 Kevin Kiley
-- 
Ian Holsman  [EMAIL PROTECTED]
Performance Measurement & Analysis
CNET Networks   -   (415) 364-8608




mod_include performance update

2001-09-04 Thread Brian Pane

The good news:

With some of the recent performance patches (thanks to Justin for all
the commits), the throughput of 2.0 on SSI requests has improved quite
a bit.  IanH ran the old and new code through his benchmark setup in
which clients request a .shtml file with two included files (threaded mpm,
Solaris, 8-CPU server).  The latest CVS code has about 60% higher throughput
than the code base from a week ago.

  http://webperf.org/a2/v25/

The bad news:

The usr CPU usage (see the 'server stats' links on the benchmark results
pages) is roughly equal to the sys CPU usage.  Even for SSI requests, I'm
surprised that the percentage of usr CPU is that high.  (To look at it from
a more positive angle, these numbers mean that there may be 
opportunities for
significant additional speedups.  Time to do some more profiling...)

One phenomenon in the truss data looks a bit strange:
  http://webperf.org/a2/v25/truss.2001_01_04

The server appears to be logging the request (the write to file descriptor
4) before closing its connection to the client (the shutdown that 
follows the
write).  For a non-keepalive request, shouldn't it do the shutdown first?

--Brian





Re: [PATCH] Make filter names lowercase

2001-09-04 Thread Ryan Bloom

On Tuesday 04 September 2001 12:08, Justin Erenkrantz wrote:

Wouldn't this be a much cleaner patch if we went to upper case
instead of lower-case?  I realize that is a detail, but currently, most
of the filters are registered as upper case, so we would have fewer
places to modify.

Ryan

 Both Cliff and OtherBill sent me emails complaining about the
 change to strcasecmp in some places to support Brian Pane's
 new hash code in util_filter.c.  (Never mind that a lot of
 the code already did strcasecmp...)

 This patch makes the filter name lowercase for the searches
 and lets us use strcmp.  (I only searched for files that
 used frec->name directly...)

 Please review and test.  My commits are getting an awful lot
 of negative feedback lately, so I'm going to switch to R-T-C.
 So, I'll need 3 +1s to commit this.  -- justin

__
Ryan Bloom  [EMAIL PROTECTED]
Covalent Technologies   [EMAIL PROTECTED]
--



Re: [PATCH] hash table for registered filter list

2001-09-04 Thread Greg Stein

On Mon, Sep 03, 2001 at 11:53:59PM -0700, Justin Erenkrantz wrote:
 On Mon, Sep 03, 2001 at 10:50:01PM -0700, Brian Pane wrote:
  The ap_add_input_filter/ap_add_output_filter functions do an O(n) scan
  through the list of registered filters.  This patch replaces the linear
  list with a hash table for better performance.
 
 Yup.  The "### use a hash table" comment was a dead giveaway that someone
 else thought this was needed.

That was me :-)  +1 on the concept, but the patch is busted.

Seeing that this is early in my set of new messages to read, I bet somebody
has already discovered that... :-)

 Committed.  Thanks.  -- justin

eek.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



Re: [PATCH] RE: make distclean doesn't

2001-09-04 Thread Greg Stein

On Tue, Sep 04, 2001 at 08:15:08AM -0700, Ryan Bloom wrote:
 On Sunday 02 September 2001 01:22, Greg Stein wrote:
...
  I use distclean on my computer all the time. Along with extraclean. Neither
  of those targets should toss config.nice. *That* is what I mean.
 
  To be clear: nothing in our build/config/whatever should remove config.nice
 
 
  Clean rules are about cleaning out state that might affect a build in
  some way. So we toss object files, generate makefiles, the configure
  script, whatever. But config.nice doesn't fall into that camp because it is
  not a stateful file. It is for the user to rebuild what they had before.
 
 Just to point out, Apache 1.3 had config.status which is analogous to 2.0's
 config.nice.  It turns out that make distclean in 1.3 removes config.status.
 
 I would say this is proof that we should be removing config.nice with 2.0.

That isn't proof, that is an opinion -- that you happen to like what was
done in 1.3. I see it as 1.3 attempted to look like ./configure and create a
config.status, and along those lines it torched config.status.

But in Apache 2.0, we have a *real* config.status which gets tossed because
it is stateful and you should be tossing it. config.nice is for the user to
retain the information about how to reconfigure their Apache after a
thorough cleaning. It contains no state that could mess up a future config
and build. And it retains *very* useful information for the user.

The only thing that tossing config.nice will do is inconvenience our users.
What is the point in that? I'm for helping users, not pissing them off.


How many more times do I need to say this, Ryan? Here is number three: -1 on
removing config.nice. Drop it already.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



Re: [PATCH] Add mod_gz to httpd-2.0

2001-09-04 Thread Greg Stein

On Mon, Sep 03, 2001 at 05:47:02PM -0700, Ryan Bloom wrote:
...
 I have a big problem with this.  We had a hard enough time contributing
 patches back to MM.  The only reason we keep expat and pcre up to date,
 is that we NEVER make any changes to them.  I would be very much against
 adding zlib to our tree.

Not to mention that I'm also an Expat developer, so I can cross-port changes
back and forth between the trees. :-)

But yes: the stability of PCRE and Expat are a big help. However, I'd point
out that we didn't make a lot of changes to MM either; it was quite stable,
too. No idea why the changes didn't go back and forth, tho.

In fact, the ASF has been a very good influence on Expat. Our ideas are
being used for its config/build setup. A lot of the portability work and
research that the ASF is good for, is being reflected into Expat to ensure
that it is just as portable.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



Re: [PATCH] pre-merge of per-dir configs to speed up directory_walk

2001-09-04 Thread William A. Rowe, Jr.

From: Brian Pane [EMAIL PROTECTED]
Sent: Tuesday, September 04, 2001 6:20 PM


 William A. Rowe, Jr. wrote:
 
 
 We are on the same page :)  FWIW, I've made a few comments around your patch,
 and am reviewing it in more detail.  I'll have my comments by the end of the
 week (I'm all over creation this week trying to get things done.)
 
 Interestingly, I'm not seeing a measurable speedup from the patch in
 benchmark testing (which is surprising because Quantify data has shown
 dir_merge functions to be a major consumer of CPU time).  It may not be
 worth considering the patch further.

May I first finish the work, then, on allowing location_walk to cache/preserve
the individual 'merge steps' on its way to creating the end-result?

The patch I will commit tomorrow aftn extends location_walk's cache in absolute
contradiction to my earlier, proposed optimization :)  It will save each unit
of work, so that any subreq/redirect will use each sequential cache match, until
that subreq/redirect misses a step that was cached, or hits a location that wasn't
originally cached.  Then it starts merging again, from however far it got, until
the end of the locations list.

If broad locations are listed _first_, followed by very specific (file entity)
locations, then the usual subreq/redirect will have a substantial match, and only
fill in the 'little bits' at the end.

I've also added peek-at-parent/peek-at-previous to help the cache along.  This
becomes _absolutely_ critical in the directory_walk-for-dirent-subrequest code 
path since we are absolutely in a jam without the prior optimizations we just
ripped from the subrequest code.  But it goes ONE STEP FURTHER, since even the
directory entries as subrequests pick right back up from where the parent request
left off :)

Obviously, all of these optimizations are meant to be applied to proxy_walk and
file_walk (almost exact ports, feel free to beat me to these as soon as the final
location_walk patch is submitted) and, with more effort, directory_walk :)

Bill






Re: [PATCH] pre-merge of per-dir configs to speed up directory_walk

2001-09-04 Thread Brian Pane

William A. Rowe, Jr. wrote:

From: Brian Pane [EMAIL PROTECTED]
Sent: Tuesday, September 04, 2001 6:20 PM


William A. Rowe, Jr. wrote:

We are on the same page :)  FWIW, I've made a few comments around your patch,
and am reviewing it in more detail.  I'll have my comments by the end of the
week (I'm all over creation this week trying to get things done.)

Interestingly, I'm not seeing a measurable speedup from the patch in
benchmark testing (which is surprising because Quantify data has shown
dir_merge functions to be a major consumer of CPU time).  It may not be
worth considering the patch further.


May I first finish the work, then, on allowing location_walk to cache/preserve
the individual 'merge steps' on its way to creating the end-result?

The patch I will commit tomorrow aftn extends location_walk's cache in absolute
contradiction to my earlier, proposed optimization :)  It will save each unit
of work, so that any subreq/redirect will use each sequential cache match, until
that subreq/redirect misses a step that was cached, or hits a location that wasn't
originally cached.  Then it starts merging again, from however far it got, until
the end of the locations list.

This sounds good.  If the static-cache concept survives, I can just apply it
against the new location_walk.

Meanwhile, I'm looking at profile data that explains why I couldn't measure
the speedup: while dir_merge operations are one of the more time-consuming
parts of Apache, they only account for around 1% of the total CPU usage.
As other things get optimized, and the dir_merge functions represent a
bigger percentage of the remaining CPU time, the effect of the pre-merge
may be more noticeable.
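The non-result has a simple arithmetic explanation: Amdahl's law bounds the win from optimizing a component that is only 1% of total run time (a generic illustration, not Apache code):

```c
/* Amdahl's law: overall speedup when a fraction p of total run time
 * is accelerated by factor s.  With p = 0.01 (the dir_merge share
 * measured above), even an infinite speedup of that component moves
 * the needle by at most about 1% -- below benchmark noise. */
static double amdahl(double p, double s)
{
    return 1.0 / ((1.0 - p) + p / s);
}
```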

--Brian





zlib inclusion and mod_gz(ip) recap (was: [PATCH] Add mod_gz to httpd-2.0)

2001-09-04 Thread Greg Stein

On Tue, Sep 04, 2001 at 03:29:04PM -0400, [EMAIL PROTECTED] wrote:
...
 Will the Apache group accept the ZLIB source code
 into the distribution tree at this time?...
 
 [__] Yes
 [__] No

No. The zlib library is popular enough (read: typically installed) that we
will link against it, rather than include it. A Windows installer may bundle
it, though. (yes, zlib *is* available for Windows; Python uses it)

 if ( Yes )
..
 else
   {
 Will Apache accept pre-compiled ZLIB libraries into the
  source tree ( for any/all platforms )?

Definitely not.

...
 if ( Yes )
...
 else
   {
 1. How can anyone vote on adding Ian's mod_gz to the source tree
 when the public domain libraries it needs can't be in the source 
 tree?

See above.

 2. How is anything in Apache that ever needs ZLIB supposed to  
 compile? Users must always 'go and get' the libs themselves?

$ rpm -i zlib-devel-1.1.3-6.i386.rpm

...
 ASIDE: You really only need the 'compression' part of it. You
 are a Server, not a client.

True. But it is all pretty irrelevant if we use zlib by reference.

...
 As for number 3... I am still of the opinion that your perspective
 is wrong... you are playing 'chicken or egg'. I am of the opinion
 that if ZLIB WAS there then in very short order there WOULD 
 be a 'large chunk of Apache code' that uses it. This is ( and I guess
 always has been ) my point.

We haven't need it so far. When we do, then we write some config macros.

...
 How in the heck tod the REGEX stuff ever make it in?
 Was that some huge 'catch 22' or 'chicken and egg' scenario
 as well when it was being proposed?
 
 What was the final straw that broke the stalemate over the
 'regexec' library inclusion(s)?

The regex directives. Regexes are really needed for file pattern handling.
Real world situations usually require some kind of regex matching, so it
went in.

The inclusion of Expat was definitely a chicken/egg thing: the intent was to
make it easier for Apache modules to use XML (given XML's increasing use and
importance; altho it still boggles my mind that we haven't seen mod_soap or
mod_xmlrpc yet... all the blocks are there).

...
 What's it going to take to find out once and for all if
 ZLIB can be included in the Apache source tree?

It won't go in. No need for it. That hasn't been well-stated, but if you
take a half hour and reread the notes, it kind of surfaces in there. This is
also how we would handle a library like this.

As stated elsewhere, pcre and expat are in there because they aren't
typically available, like zlib is.

 I don't know anymore. I've tried to explain why I think
 it would be a great benefit to Apache to have it there
 ( numerous times going back over a year or more )
 and I have tried to supply as much information as I
 have about the licensing issues ( IANAL ) and I
 have asked for 'real' votes about it... nothing happens...
 just more talk.

Nothing needs to happen, so it shouldn't be surprising :-). If/when we need
it, then we will include it. As I said, it is just config macros.

 Somebody else needs to take this into the end-zone.
 I don't even know where the football is on this one
 anymore.

There are three options on the table:

1) include mod_gzip
2) include mod_gz
3) include neither

I believe there is consensus that (3) is not an option. Despite your and
Peter's pushing and stressing and overbearing sell job to get mod_gz(ip)
type functionality into the core, it was just preaching to the choir. (well,
okay: maybe Ryan didn't want to see it in there :-)  That sell job mostly
served to create an air of hostility.

So now the question comes down to using (1) or (2). People are *not* voting
on including mod_gz because they want to see your alternative. I think it is
pretty much that simple.

But then you say to look at the 1.3 version. That is the problem -- people
don't want to see that. They want to see your 2.0 version, which you already
have working in house. Since you've said it exists, they would rather go
that direction, than have to port the 1.3 version up to 2.0. We're all so
busy, that this kind of laziness is excusable :-) Whether the changes are
large or small, they'd rather see your current work because we already know
the port has been completed and *tested*. We'd have to redo all of that
work, which just seems silly.

So the inclusion of either is blocked on seeing the source to mod_gzip for
Apache 2.0.

Now: you state that you don't want to release it until we hit beta. But that
is not how we work, and you should know that by now. We want the module in
there now, *before* beta hits. You say that you don't want to release it
while the APIs are in flux -- that they should be stable. But that is bogus.
If we include mod_gzip *today*, then it will get fixed along with everything
else as we change the APIs. You aren't going to be responsible for keeping
it up to date with the changes. We are. That is part of what going into the
core means 

Re: [PATCH] Make filter names lowercase

2001-09-04 Thread Cliff Woolley

On Tue, 4 Sep 2001, Justin Erenkrantz wrote:

 All uppercase.  6 of one, half-dozen of the other.  -- justin

I'll take the 6.  I like this a whole lot better.  Thanks, Justin.  :)

--Cliff

--
   Cliff Woolley
   [EMAIL PROTECTED]
   Charlottesville, VA





the rollup issue (was: Re: [PATCH] Add mod_gz to httpd-2.0)

2001-09-04 Thread Greg Stein

On Sat, Sep 01, 2001 at 06:19:32PM -0700, Ryan Bloom wrote:
...
 3)  I don't believe that we
 should be adding every possible module to the core distribution.  I
 personally think we should leave the core as minimal as possible, and
 only add more modules if they implement a part of the HTTP spec.

The list has already covered the minimalist thing, but the rollup issue is
still outstanding.

A suggestion: create httpd-rollup to hold the tools/scripts/web pages and
whatnot for creating a combo of httpd releases plus supporting modules.

The -rollup project could create rollups of just ASF bits (proxy and ???),
but could also be a way to rollup third-party modules (like Ralf's module
bundles). I believe we have also discussed how the Apache Toolbox could
become an ASF project. I'd suggest that it goes into httpd-rollup as one of
its outputs.

For example:

  /home/cvs/httpd-rollup
contrib-1.3/Apache 1.3 plus a bunch of contrib modules
toolbox/Apache Toolbox
asf-2.0/Apache 2.0 plus ASF bits
contrib-2.0/Apache 2.0 plus ASF plus contribs
...

Under httpd-site, we'd create the /rollup/ subdirectory and tightly
incorporate references to it with our distribution pages (to ensure that the
blob with proxy in it is just as easy to find/download as the core).

Does this seem like a reasonable approach to get out of the rollup logjam?
Given this kind of arrangement, would some module contributions or inclusion
have a lower bar to become part of ASF-distributed bits? etc.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



Re: [PATCH] Add mod_gz to httpd-2.0

2001-09-04 Thread Ian Holsman

[EMAIL PROTECTED] wrote:

 In a message dated 01-09-04 19:17:25 EDT, Ian writes...
 
 
ASIDE: You really only need the 'compression' part of it. You

  are a Server, not a client.
 
 we also can be a client.
 think mod-proxy

 
 What current ( or future ) operation would require mod_proxy
 to 'decompress' something? I would imagine that if mod_proxy
 can ever accept Content-Encodings it would do what SQUID
 does... it just stores the response and pays attention to the
 'Vary' headers and such but there's never a need to 'decompress' 
 anything.
 

I was thinking that the proxy could request gzip'd data, and possibly
get gzip'd returned. it would then examine the client's request and
if it can't handle a gzip reply it would ungzip it.

(this is the approach that rsync's rproxy takes.)

the proxy also reverse-proxies, and the output of that needs to be
plain text so it could be merged (via mod-include) (and then possibly
gzip'd again)

I'm going to run mod_gz through our benchmark lab tomorrow (8-way machine)

to see what happens (CPU Usage mainly, and to see if there are any resource leaks)
If I get a copy of mod_gzip for 2.0 I will do the same for it.


 Maintaining an 'object compression' cache is a bit different from
 regular caching and has to follow different 'rules'. Even if there is
 a way to store both a compressed and non-compressed version
 of the same URL in a cache there still isn't any real need to
 'decompress' anything.
 
 Transfer-encoding: gzip, chunked  or some other 'hop to hop' deal
 is a different story but there's still no known browser that can handle 
 that ( nor any known Proxy that can, either, AFAIK ).
 

check out the work being done at rproxy.samba.org on rproxy.
if we had rsync compression (and a proxy to un-rsync it) it would be
really cool... (especially if mozilla could talk rproxy) but that is
pie in the sky.


 It was just a thought.
 If people think that ZLIB is 'too big' you can easily cut it in
 half by only including what you need.
 

nah... I'd rather just link to the one already living on people's systems.

the only advantage is if apache's memory pool handling is much faster than
zlibs plain old malloc.. but I'm not sold on that (we could link in hoard)


 Ian... are you a committer?

not on Apache, just APR & Proxy


 What do you say about adding ZLIB to Apache source ASAP.
 Yea or nay?
 
 Yours...
 Kevin Kiley
 

..Ian







remaining CPU bottlenecks in 2.0

2001-09-04 Thread Brian Pane

I'm currently studying profiling data from an httpd built from
a CVS snapshot earlier today.

In general, the performance of 2.0 is starting to look good.

Here are the top 30 functions, ranked according to their CPU utilization.
(This summary is derived from a Quantify profile of a threaded-MPM httpd
handling mod_include requests on Solaris; thanks to Blaise Tarr at CNET
for the profile data.)

    function              CPU time (% of total)
    -----------------------------------------
 1. find_start_sequence23.9
 2. mmap   14.1
 3. munmap 10.2
 4. _writev 9.4
 5. _so_accept  4.7
 6. _lwp_mutex_unlock   4.3
 7. stat3.2
 8. _close  2.4
 9. _write  2.4
10. __door_return   2.3
 cumulative total: 76.9%
11. __open  2.0
12. _so_getsockname 1.6
13. find_entry  1.5
14. strlen  1.2
15. strcasecmp  1.2
16. _so_shutdown1.2
17. _read   0.84
18. fstat64 0.82
19. __lwp_sema_post 0.80
20. memset  0.78
 cumulative total: 88.8%
21. apr_hash_next   0.53
22. apr_file_read   0.46
23. _lrw_unlock 0.39
24. __fork1 0.39
25. apr_palloc  0.39
26. strncpy 0.28
27. memchr  0.26
28. _ltzset_u   0.24
29. overlay_extension_mappings  0.20
30. apr_hash_set                0.20
 cumulative total: 92.18%

Some notes on the data:

* The discussion here covers only CPU utilization.  There are other
  aspects of performance, like multiprocessor scalability, that
  are independent of this data.

* 7 of the top 10 functions are syscalls that I'd expect to see in
  the top 10 for an httpd. :-)

* find_start_sequence() is the main scanning function within
  mod_include.  There's some research in progress to try to speed
  this up significantly.

* With the exception of find_start_sequence(), all the opportunities
  for optimization are individually small.  I expect that the only
  way to get an additional 5% reduction in CPU usage, for example,
  will be to fix ten things that each account for 0.5%.

* _lwp_mutex_unlock() gets called from pthread_mutex_unlock(),
  but only from a small fraction of pthread_mutex_unlock calls
  (Can someone familiar with Solaris threading internals explain
  this one?)

* Collectively, stat and open comprise 5% of the total CPU time.
  It would be faster to do open+fstat rather than stat+open (as
  long as the server is delivering mostly 200s rather than 304s),
  but that might be too radical a change.  Anybody have thoughts
  on this?
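The open+fstat pattern suggested above can be sketched as follows (a bare sketch of the idea, not Apache code; error handling is reduced to early returns):

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Open a file and fetch its metadata with open + fstat instead of
 * stat + open.  This saves one syscall on the common 200 path, and
 * as a bonus it closes the race where the file changes between the
 * stat and the open. */
static int open_with_stat(const char *path, struct stat *st)
{
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    if (fstat(fd, st) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

The trade-off noted above still applies: on a 304-heavy workload the open is wasted work, since a stat alone would have sufficed to answer the conditional request.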

* I think the __door_return() call is an anomaly of the profiling
  instrumentation itself.

* find_entry() is part of the apr_hash_t implementation.  Most of
  the calls to find_entry()and apr_hash_next() are coming from the
  mod_mime dir-merge function.

* strlen() is called from lots of places throughout the code, with
  the most frequent calls being from apr_pstrdup, apr_pstrcat, and
  time-formatting functions used in apr_rfc822_date.

* strcasecmp() is used heavily by the apr_table_t implementation
  (sort_overlap, apr_table_get, apr_table_setn).  Converting tables
  to hashes is a potential speedup.  However, most of the table
  API calls are on request_rec->headers_(in|out); changing those
  would impact a *lot* of code.

* fstat64() happens during fcntl() on sockets (it appears to be a
  Solaris implementation detail).

* memset() is called mostly from apr_pcalloc(), which in turn is
  used in too many places to yield any easy optimization opportunities.

* memchr() is called from apr_pstrndup(), used mostly in apr_uri_parse()
  and mod_mime's analyze_ct().  It may be possible to optimize away
  some of the calls in the latter.

--Brian





Re: remaining CPU bottlenecks in 2.0

2001-09-04 Thread Justin Erenkrantz

On Tue, Sep 04, 2001 at 08:00:35PM -0700, Brian Pane wrote:
 I'm currently studying profiling data from an httpd built from
 a CVS snapshot earlier today.
 
 In general, the performance of 2.0 is starting to look good.

Cool.  This probably means the code is starting to look good, too.

 * The discussion here covers only CPU utilization.  There are other
   aspects of performance, like multiprocessor scalability, that
   are independent of this data.

Once we get the syscalls optimized (I'm reminded of Dean's attack
on our number of syscalls in 1.3 - I believe he went through, syscall
by syscall, trying to eliminate all of the unnecessary ones), I think 
the next performance point will be MP scalability (see below for
lock scalability on Solaris).  But we do need to see what we can 
do about optimizing the syscalls first...

 * find_start_sequence() is the main scanning function within
   mod_include.  There's some research in progress to try to speed
   this up significantly.

Based on the patches you submitted (and my quasi-errant formatting
patch), I had to read most of the code in mod_include, so I'm more 
familiar with mod_include now.  I do think there are some obvious 
ways to optimize find_start_sequence.  I wonder if we could apply 
a KMP-string matching algorithm here.  I dunno.  I'll take a look 
at it though.  Something bugs me about the restarts.  I bet that 
we spend even more time in find_start_sequence when a HTML file 
has lots of comments.  =-)
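
As a rough illustration of the kind of skipping that might help
(this is NOT mod_include's code; find_ssi_start is a made-up name),
one option is to let memchr() race ahead to each '<' and only then
check whether the "!--#" tail follows:

```c
#include <string.h>

/* Return a pointer to the first "<!--#" in buf, or NULL. */
const char *find_ssi_start(const char *buf, size_t len)
{
    const char *p = buf;
    const char *end = buf + len;

    while ((p = memchr(p, '<', (size_t)(end - p))) != NULL) {
        if ((size_t)(end - p) < 5)
            return NULL;            /* too close to the end to match */
        if (memcmp(p + 1, "!--#", 4) == 0)
            return p;               /* start of an SSI directive */
        p++;                        /* false alarm, keep scanning */
    }
    return NULL;
}
```

Plain HTML comments would still cost a memcmp per '<', which is
consistent with the worry about comment-heavy files above.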

 * strlen() is called from lots of places throughout the code, with
   the most frequent calls being from apr_pstrdup, apr_pstrcat, and
   time-formatting functions used in apr_rfc822_date.

I think someone has brought up that apr_pstrdup does an extra strlen.
I'll have to review that code.
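
The general pattern, if the length is already known at the call site, is
to duplicate with memcpy and skip the strlen entirely.  A generic
illustration with plain malloc (my_strdup and my_strmemdup are made-up
stand-ins here, not the APR functions):

```c
#include <stdlib.h>
#include <string.h>

char *my_strdup(const char *s)          /* pays for a strlen() */
{
    size_t n = strlen(s);
    char *d = malloc(n + 1);
    if (d)
        memcpy(d, s, n + 1);            /* copies the NUL too */
    return d;
}

char *my_strmemdup(const char *s, size_t n)  /* caller supplies length */
{
    char *d = malloc(n + 1);
    if (d) {
        memcpy(d, s, n);
        d[n] = '\0';
    }
    return d;
}
```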

 * _lwp_mutex_unlock() gets called from pthread_mutex_unlock(),
   but only from a small fraction of pthread_mutex_unlock calls
   (Can someone familiar with Solaris threading internals explain
   this one?)

The LWP scheduler may also call _lwp_mutex_unlock() implicitly -
the LWP scheduler is a user-space library, so its time gets thrown
in with our numbers, I bet.

Here's some background on Solaris's implementation that I
think may provide some useful information as to how the locks 
will perform overall.  (If you spot any inconsistencies, it is
probably my fault...I'm going to try to explain this as best as
I can...)

First off, Solaris has adaptive locks.  If the owner of the lock
is currently running, a waiter will spin.  If the system sees
that the owner of the held lock is not currently running, the
waiter will sleep (they call this an adaptive lock - it now enters
a turnstile).

Okay, so what happens when a mutex unlocks?  This depends on
whether you are in a spin or adaptive lock.  With spin locks,
the waiters immediately see the freed lock and the first one on
a CPU grabs it (ignoring priority inversion here).

But, the more interesting issue is what happens when we are in
an adaptive lock.  According to Mauro, Solaris 7+ has a thundering 
herd condition for adaptive kernel locks.  It will wake up *all* 
waiters and let them fight it out.  This is a change from 
Solaris 2.5.1 and 2.6.  

Since you should never have a lot of threads sitting in a mutex 
(according to Sun, it is typical in practice to only have one 
kernel thread waiting), this thundering herd is okay and 
actually performs better than freeing the lock and only waking 
up one waiter.  They say it makes the code much cleaner.  I'm
betting we're overusing the mutexes which changes the equation
considerably.  =-)

Okay, that tells you how it is done in kernel-space.  For
user-space (see below for how to remove user-space threading), 
it is slightly different.  Remember, in Solaris we have
a two-tier thread model - user-space and kernel threads.

In user-space, when we call pthread_mutex_unlock, we hit 
_mutex_unlock in liblwp, which calls mutex_unlock_adaptive 
as we aren't a special case lock.

So, what does mutex_unlock_adaptive do in lwp?  In pseudocode (as 
best as I can explain it), it first tries to see if there is a 
waiter in the current LWP; if so, it clears the lock bit and the 
other thread in the same LWP takes it.  If no thread in the same 
LWP is waiting, it will then spin for ~500 while-loop 
iterations to let another thread in the same LWP take the lock.
(On a UP box, it'll exit here before doing the while loop, as it 
knows that spinlocks are stupid.  I think you were testing on an 
MP box.)  If the while loop concludes without anyone 
acquiring the lock, it hands it to the kernel, as no one in this
LWP cares for that lock (and we hit the semantics described above).
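
Restating that unlock path as pseudocode (my paraphrase of the above,
not the actual liblwp source):

```text
mutex_unlock_adaptive(m):
    clear the lock bit
    if a thread in this LWP is waiting on m:
        return                       # same-LWP waiter takes it
    if multiprocessor:
        spin for ~500 while-loop iterations
        if another thread acquired m meanwhile:
            return
    hand m off to the kernel         # kernel wakes *all* waiters
```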

I'm wondering if this while loop iteration (essentially a spin
lock) may be the root cause of the lwp_mutex_unlock utilization 
you are seeing.

Anyway, in Solaris 9, IIRC, they have removed the userspace 
scheduler and all threads are now bound (all threads map 
directly to kernel threads).  You may achieve the same result 
in 

Re: Fw: Regarding lower-case HTML tags

2001-09-04 Thread Justin Erenkrantz

On Tue, Sep 04, 2001 at 10:12:32PM -0500, William A. Rowe, Jr. wrote:
 Josh just came up with what I believe is the best explanation of if and
 when to reformat - I think this applies equally well to sources and docs.
 
 I'd add only one caveat - changes to the format should always -precede-
 the patch to the actual code, and there shouldn't be format changes if there
 is no work to commit that the existing format didn't interfere with.
 
 Note that +/- whitespace patches (including newlines) are _simple_ to ignore.
 Changes to anything else (capitalization, etc) are most definitely not.
 
 I personally reformat often - but only if it (1) increases legibility in
 (2) a module I'm actively refactoring.  But Josh's explanation is great :)

I'm wondering if we can add this to the site somewhere so that future 
committers don't end up in the same trap I found myself in.  

(There is no place anywhere that says that we don't actively follow 
the style guide...I was under the (false) impression that we should 
be actively following it...) -- justin




Re: remaining CPU bottlenecks in 2.0

2001-09-04 Thread William A. Rowe, Jr.

From: Justin Erenkrantz [EMAIL PROTECTED]
Sent: Tuesday, September 04, 2001 11:46 PM


 Based on the patches you submitted (and my quasi-errant formatting
 patch), I had to read most of the code in mod_include, so I'm more 
 familiar with mod_include now.  I do think there are some obvious 
 ways to optimize find_start_sequence.  I wonder if we could apply 
 a KMP-string matching algorithm here.  I dunno.  I'll take a look 
 at it though.  Something bugs me about the restarts.  I bet that 
 we spend even more time in find_start_sequence when a HTML file 
 has lots of comments.  =-)

You were discussing the possibility of parsing for <!--# as a skip by 5.

Consider jumping to a 4-byte alignment, truncating to char, and skipping
by dwords.  E.g., you only have to test three values, not four, and you
can use the machine's most optimal path.  But I'd ask: if strstr() isn't
optimized on the platform, why are we reinventing it?

This is DSS for little endian (that char bugger comes from the first byte,
so skipping in 0-3 is not an issue), but for big endian architectures you 
need to backspace to the dword alignment so you don't miss the tag by
skipping up to 6 (wrong by 3, and then reading the fourth byte of the dword).

That has got to be your most optimal search pattern.
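
A hedged sketch of the word-at-a-time idea (find_lt and word_has_byte
are made-up names; the endianness backspacing Bill describes and the
full "!--#" confirmation are omitted).  It aligns to a 4-byte boundary,
then tests each 32-bit word for a '<' byte with the classic
"has-zero-byte" trick before falling back to byte checks:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Nonzero iff any byte of w equals c. */
static int word_has_byte(uint32_t w, unsigned char c)
{
    uint32_t x = w ^ (0x01010101u * c);   /* zero byte where w had c */
    return ((x - 0x01010101u) & ~x & 0x80808080u) != 0;
}

/* Return a pointer to the first '<' in buf, or NULL. */
const char *find_lt(const char *buf, size_t len)
{
    const char *p = buf, *end = buf + len;

    /* scan byte-wise up to the first 4-byte alignment */
    while (p < end && ((uintptr_t)p & 3u)) {
        if (*p == '<') return p;
        p++;
    }
    /* then skip by dwords, stopping only when a word may hold '<' */
    while ((size_t)(end - p) >= 4) {
        uint32_t w;
        memcpy(&w, p, 4);                 /* avoids strict-aliasing UB */
        if (word_has_byte(w, '<'))
            break;
        p += 4;
    }
    /* confirm the hit (or handle the tail) byte-wise */
    while (p < end) {
        if (*p == '<') return p;
        p++;
    }
    return NULL;
}
```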

Bill




Re: [PATCH] Add mod_gz to httpd-2.0

2001-09-04 Thread Ryan Bloom

On Tuesday 04 September 2001 18:23, Greg Stein wrote:
 On Mon, Sep 03, 2001 at 05:47:02PM -0700, Ryan Bloom wrote:
 ...
  I have a big problem with this.  We had a hard enough time contributing
  patches back to MM.  The only reason we keep expat and pcre up to date,
  is that we NEVER make any changes to them.  I would be very much against
  adding zlib to our tree.

 Not to mention that I'm also an Expat developer, so I can cross-port
 changes back and forth between the trees. :-)

 But yes: the stability of PCRE and Expat is a big help. However, I'd point
 out that we didn't make a lot of changes to MM either; it was quite stable,
 too. No idea why the changes didn't go back and forth, tho.

We actually made a lot of changes to MM.  Which is also why the changes
didn't go back and forth.  We needed things that Ralf didn't want to put in
the MM dist.  Once we weren't exactly the same, the patching just fell
apart.

Ryan
__
Ryan Bloom  [EMAIL PROTECTED]
Covalent Technologies   [EMAIL PROTECTED]
--