>> So we need any gzip filter to drop out quickly if 1. it knows this
>> mime type should not be encoded; 2. it sees the content-encoding is
>> already gz; 3. it sees the uri/filename whatever in a list of
>> exclusions (that could be automagically grown when it hits files
>> that _just_don't_compress_well_).

Items 1 and 2 are already built into mod_gzip.

>> But it still burns CPU that we don't want to waste on useless tasks.

mod_gzip includes logic that prevents it from compressing items deemed
too small. As for burning CPU: in benchmarking it has been shown (not
only with Apache Bench but also with Mercury Interactive's tools) that
it is consistently faster to send a smaller file than a larger one,
which makes sense. You will find that the TCP/IP subsystem spends more
time (CPU cycles) pushing out large files (some JavaScript files are
now over 100K but compress down to 4K!) than the compression and
transmission phases of the smaller file require.
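As a back-of-envelope illustration (my numbers, not from those
benchmarks): on a 1.5 Mbit/s link, a 100K file needs roughly half a
second just to go onto the wire (800 kbit / 1.5 Mbit/s ~= 0.53 s),
while the 4K compressed version is gone in about 20 ms. Even if the
deflate pass costs a few milliseconds of CPU, the worker and the socket
are freed an order of magnitude sooner.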

>> Probably always need to set some 'threshold' of 8kb (minimally) that
>> the web server absolutely ignores, and some include or exclude list
>> by mime type to describe the value of further compression.  Even if
>> the file is requested to be gzip'ped, and it's targeted for the
>> cache, set a flag at the end that says "hey, I saved -2% on this
>> file, don't let us do _that_ again!  File foo shouldn't be cached",
>> and then internally add foo to the excludes list for any gzip
>> filter.

See Items 1 & 2 above. This is already built in.

Here is something for this forum if you are really interested in
mod_gzip. Check out this link: http://www.schroepl.net/mod_gzip/

It's a thorough analysis of mod_gzip's effect on HTTP traffic and
covers all the different browser types. Here is the amazing part: check
out the "saved" column and the "average" savings across all the
different stats... about 51%.

That's a HUGE benefit to ALL apache users. Why wouldn't you use it?

Regards


Peter


-----Original Message-----
From: William A. Rowe, Jr. [mailto:[EMAIL PROTECTED]] 
Sent: Sunday, September 02, 2001 11:43 AM
To: [EMAIL PROTECTED]
Subject: Re: [PATCH] Add mod_gz to httpd-2.0


From: "Greg Stein" <[EMAIL PROTECTED]>
Sent: Sunday, September 02, 2001 4:43 AM


> On Sat, Sep 01, 2001 at 07:50:19PM -0700, Ryan Bloom wrote:
> > On Saturday 01 September 2001 18:53, Cliff Woolley wrote:
> > > On Sat, 1 Sep 2001, Ryan Bloom wrote:
> >...
> We ship code that uses it. zlib fixes their code (or somebody else 
> fixes it). Now our module rocks. Chicken and egg... A ton of other 
> projects use zlib. I see no reason for us to avoid it. If it *does* 
> happen to have leaks, then (when somebody finds this to be true) they 
> could fix it.

I kind of wonder about the 'leaks like a sieve' comments.  Perhaps the
actual users are the ones not following the API properly, so zlib never
gets the chance to clean up after itself.

Let's not forget ... expat, pcre, and probably zlib all have
malloc/free fn ptrs in their code.  There is _nothing_ stopping us from
using pools to initialize/mop up, by thunking their malloc/free fn ptrs
into our pool architecture :)
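A minimal sketch of what that thunking might look like (hypothetical
code, not anything in the tree today):

    #include <zlib.h>
    #include "apr_pools.h"

    /* Route zlib's allocator hooks into an APR pool, so everything
     * zlib grabs is mopped up when the pool is destroyed -- even if a
     * caller forgets deflateEnd(). */
    static voidpf pool_zalloc(voidpf opaque, uInt items, uInt size)
    {
        return apr_palloc((apr_pool_t *)opaque, (apr_size_t)items * size);
    }

    static void pool_zfree(voidpf opaque, voidpf address)
    {
        /* Deliberately a no-op: pool memory is freed en masse. */
        (void)opaque; (void)address;
    }

    static int init_deflate(z_stream *zs, apr_pool_t *p)
    {
        zs->zalloc = pool_zalloc;  /* must be set before deflateInit() */
        zs->zfree  = pool_zfree;
        zs->opaque = p;            /* handed back to the thunks above */
        return deflateInit(zs, Z_DEFAULT_COMPRESSION);
    }

The one trade-off: repeated init/end cycles against a long-lived pool
only grow it, so a per-request or per-connection pool is the natural
thing to hand in as the opaque pointer.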

> > > > 3)  I don't believe that we should be adding every possible 
> > > > module to the core distribution.  I personally think we should 
> > > > leave the core as minimal as possible, and only add more modules
> > > > if they implement a part of the HTTP spec.
> 
> The gzip content encoding is part of the HTTP spec.

By implementation, or by reference?  Sure, Content-Encoding is part of
the spec; it's defined to let authors extend their support to any
number of encodings, and it forces the server to use only what the
client claims as supported encodings.

However, gzip is _defined_ by RFC (RFC 1952), so we have a
standards-based encoding specification, which should be a requirement,
IMHO, for inclusion in any ASF project.
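On the wire, the negotiation looks roughly like this (an illustrative
exchange, not a capture):

    GET /big.js HTTP/1.1
    Host: www.example.com
    Accept-Encoding: gzip

    HTTP/1.1 200 OK
    Content-Type: application/x-javascript
    Content-Encoding: gzip
    Content-Length: 4096

    ...gzip-compressed entity body...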

> > > My personal opinion is that this one is important enough that it
> > > should go in.  Most clients support gzip transfer coding, and it's
> > > a very real solution to the problem of network bandwidth being the
> > > limiting factor on many heavily-loaded web servers and on
> > > thin-piped clients (read: modem users).
> 
> Agreed!

+1 with caveat to follow.

> But it isn't "invisible" if you do it with Perl, PHP, Python, or CGI. 
> A person has to explicitly code it.
> 
> I'm really looking forward to mod_gz(ip) in Apache 2.0 so that 
> Subversion can transfer its content in compressed form. All of that 
> comes out of a database... it can't be precompressed, so that leaves
> a filter to do the job as it hits the wire.  Large checkouts are
> almost *always* network bound rather than server/client bound.
> Compressing those files is a *huge* win.

So we need any gzip filter to drop out quickly if 1. it knows this mime
type should not be encoded; 2. it sees the content-encoding is already
gz; 3. it sees the uri/filename whatever in a list of exclusions (that
could be automagically grown when it hits files that
_just_don't_compress_well_).

The fact that gzip doesn't _expand_ data is kind of nice.  But it still
burns CPU that we don't want to waste on useless tasks.
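A rough sketch of that early-exit logic as a 2.0 output filter (the two
lookup helpers are invented names standing in for real config checks):

    #include "httpd.h"
    #include "http_protocol.h"
    #include "util_filter.h"
    #include "apr_buckets.h"
    #include "apr_tables.h"

    /* Hypothetical helpers standing in for real exclusion lookups. */
    static int mime_type_excluded(const char *type) { (void)type; return 0; }
    static int uri_excluded(const char *uri)        { (void)uri;  return 0; }

    static apr_status_t gz_out_filter(ap_filter_t *f, apr_bucket_brigade *bb)
    {
        request_rec *r = f->r;

        /* Drop out quickly: 1. excluded MIME type, 2. response is
         * already content-encoded, 3. URI is on the exclusion list. */
        if (mime_type_excluded(r->content_type)
            || apr_table_get(r->headers_out, "Content-Encoding") != NULL
            || uri_excluded(r->uri)) {
            ap_remove_output_filter(f);           /* step aside for good */
            return ap_pass_brigade(f->next, bb);  /* pass untouched */
        }

        /* ... the actual compression path would go here ... */
        return ap_pass_brigade(f->next, bb);
    }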

> > -1 (vote, not veto), for putting mod_gz in the core.
>
> Needless to say, I'm +1 on the concept. It's a big win for everybody. 
> I haven't reviewed the code yet, so no commentary there.

I'm +1 on concept as well, provided we don't keep implementing reusable
algorithms in the core (those are definitely apr-util style tasks), we
might support both zlib and zlibc (maybe thunk through apr-util?  I
don't see a benefit just yet), and that we implement this legibly and
simply, in as few lines of code as possible.

If we like, the entire zlib (168kb tar/gz'ed) could be distributed
through /srclib/ instead of by reference.  It really isn't that large,
and it matches what we do today with pcre and expat.  If we like, drop
it into apr-util/encoding/unix/lib/zlib and only expose it through
thunks (allowing someone to come up with more robust or faster apr-util
thunks to source that we can _not_ redistribute).  The more I
contemplate this, the stronger my +1 for apr-util remapping, where
appropriate on some platforms.

The gzip encoding isn't going to change anytime soon.  I'd say that
makes it a core candidate.  If someday later we add 'another' encoding,
then that's a new module, and mod_gzip can keep chugging away.  Or we
implement this a la proxy: mod_encoding for the generic HTTP encoding
field parsing, and encoding_gzip for the backend.
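To make the proxy analogy concrete, the split might be nothing more
than a small codec table that backends register (entirely hypothetical;
no such interface exists today):

    #include "util_filter.h"
    #include "apr_buckets.h"

    /* Hypothetical codec interface for a generic mod_encoding. */
    typedef struct encoding_codec {
        const char *token;   /* name used in Accept-/Content-Encoding */
        apr_status_t (*encode)(ap_filter_t *f, apr_bucket_brigade *bb);
    } encoding_codec;

    /* encoding_gzip would register { "gzip", gzip_encode } at startup;
     * mod_encoding parses the request headers, picks a codec the
     * client accepts, and installs the matching output filter. */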

Bill 

p.s. I believe this license is most _definitely_ compatible with the
ASF; see http://www.gzip.org/zlib/zlib_license.html



