The module looks good. Here are some comments and questions (based on
a reading of the code; I haven't had time to run it yet).
Ian Holsman wrote:
[...]
>/* ===========================================================================
> Outputs a long in LSB order to the given file
>*/
>static void putLong(char *string, unsigned long x)
>{
>    int n;
>    for (n = 0; n < 4; n++) {
>        string[n] = (int) (x & 0xff);
>        x >>= 8;
>    }
>}
>
On platforms where long is 64 bits, is there any scenario where
you'll ever need to output more than the lower 4 bytes?
[...]
>    token = ap_get_token(r->pool, &accepts, 0);
>    while (token && token[0] && strcmp(token, "gzip")) {
>        accepts++; /* skip token */
>        token = ap_get_token(r->pool, &accepts, 0);
>    }
>    if (token == NULL || token[0] == '\0') {
>        return ap_pass_brigade(f->next, pbbIn);
>    }
>
I have some performance concerns about the use of ap_get_token
here, but it's not worth worrying about this unless profiling
shows it to be a bottleneck.
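If it ever does show up in a profile, a single pass over the header would
avoid the repeated calls. Roughly this, reusing the accepts/f/pbbIn names
from the patch (untested; it ignores q-values such as "gzip;q=0", and the
boundary checks are only there to keep "gzip" from matching inside a
longer token name):

    const char *p = accepts;
    int found = 0;

    while (p != NULL && (p = strstr(p, "gzip")) != NULL) {
        /* accept only if "gzip" is a whole token, not part of a
         * longer token name */
        if ((p == accepts || p[-1] == ',' || p[-1] == ' ' || p[-1] == '\t')
            && (p[4] == '\0' || p[4] == ',' || p[4] == ';'
                || p[4] == ' ' || p[4] == '\t')) {
            found = 1;
            break;
        }
        p += 4;
    }
    if (!found) {
        return ap_pass_brigade(f->next, pbbIn);
    }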
[...]
>    pbbOut = apr_brigade_create(f->r->pool);
>
Is there any way to implement the compression without having
to create a new brigade? E.g., can you replace each bucket
in the original brigade with its compressed version?
(apr_brigade_create calls apr_pool_cleanup_register, which
incurs some overhead at the time of the call and more when
the cleanup callback is invoked later.)
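Something along these lines, maybe? Completely untested, and it glosses
over a lot: metadata buckets like EOS need to pass through untouched, and
zlib output doesn't map one-to-one onto input buckets, so
deflate_to_bucket() here is just a placeholder for whatever runs the data
through deflate and wraps the result in a new bucket.

    apr_bucket *e = APR_BRIGADE_FIRST(pbbIn);

    while (e != APR_BRIGADE_SENTINEL(pbbIn)) {
        apr_bucket *next = APR_BUCKET_NEXT(e);

        if (!APR_BUCKET_IS_EOS(e)) {
            const char *data;
            apr_size_t len;

            /* error handling omitted for brevity */
            apr_bucket_read(e, &data, &len, APR_BLOCK_READ);
            /* splice the compressed replacement in where the original
             * bucket sat, then throw the original away */
            APR_BUCKET_INSERT_BEFORE(e, deflate_to_bucket(ctx, data, len));
            APR_BUCKET_REMOVE(e);
            apr_bucket_destroy(e);
        }
        e = next;
    }
    return ap_pass_brigade(f->next, pbbIn);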
>    if (!f->ctx) {
>        f->ctx = apr_pcalloc(f->c->pool, sizeof(*ctx));
>        ctx = f->ctx;
>        ctx->strm.zalloc = (alloc_func) 0;
>        ctx->strm.zfree = (free_func) 0;
>        ctx->strm.opaque = (voidpf) 0;
>        ctx->crc = 0L;
>
Aren't all those fields 0 already, following the pcalloc?
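If so, the whole block could presumably shrink to just:

    if (!f->ctx) {
        f->ctx = apr_pcalloc(f->c->pool, sizeof(*ctx));
        ctx = f->ctx;
        /* strm.zalloc, strm.zfree, strm.opaque, and crc are all
         * zero already, courtesy of apr_pcalloc */
    }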
[...]
>
>    buf =
>        apr_psprintf(r->pool, "%c%c%c%c%c%c%c%c", szCRC[0], szCRC[1],
>                     szCRC[2], szCRC[3], szLen[0], szLen[1], szLen[2],
>                     szLen[3]);
>
The sprintf might be a bit slow (I've found it to be a bottleneck in other
contexts). If it's indeed a performance problem, how about something
like...
    char *b;
    ...
    buf = apr_palloc(r->pool, 8);
    b = buf;
    *b++ = szCRC[0];
    *b++ = szCRC[1];
    ...
    *b = szLen[3];
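Or, since szCRC and szLen presumably already hold the bytes in the right
order (they look like plain char arrays filled by putLong), a couple of
memcpy calls would do the same job:

    buf = apr_palloc(r->pool, 8);
    memcpy(buf, szCRC, 4);
    memcpy(buf + 4, szLen, 4);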
[...]
Aside from the code, I have some more general thoughts on the
design of a compression filter.
Is there a way to designate buckets as "pre-compressed" so that
mod_gz could pass them through without modification? This could
support an interesting optimization for compressed, server-parsed
content...
Consider a file format that represents an HTML document as a list
of blocks, where each block is either:
- some text, included in both original and lz-compressed form, or
- a directive to include another URI, or
- a directive to invoke some C function to produce output (equivalent
to the extension directive mechanism in mod_include)
Each block needs a header that designates its type and its length,
of course. And in the case of a block of text, the header should
include pointers to both the normal and compressed versions of the
text.
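To make that concrete, the per-block header might look something like
this (purely a sketch; the field names and widths are invented):

    typedef enum {
        BLK_TEXT,       /* literal text, stored both raw and compressed */
        BLK_INCLUDE,    /* include another URI */
        BLK_FUNC        /* call a registered C function to produce output */
    } blk_type;

    typedef struct {
        apr_uint32_t type;      /* one of blk_type */
        apr_uint32_t raw_off;   /* offset/length of the uncompressed text */
        apr_uint32_t raw_len;
        apr_uint32_t gz_off;    /* offset/length of the precompressed copy */
        apr_uint32_t gz_len;
        /* for BLK_INCLUDE and BLK_FUNC blocks, raw_off/raw_len would
         * locate the URI or function name instead */
    } blk_header;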
With documents in this format, plus a filter or handler that
understands how to process them, you could get the benefits of
zero-copy delivery even for gzip'ed and server-parsed content
(each chunk of static text in the file turns into an mmap or
sendfile bucket, and only dynamically created content needs to
be compressed in real time).
--Brian