Re: [PATCH] Rework httpd buckets/filtering

Justin Erenkrantz Sun, 23 Sep 2001 09:20:43 -0700
On Sun, Sep 23, 2001 at 08:33:07AM -0700, Ryan Bloom wrote:
> 
> None of this can be committed until it has passed all of the tests in the perl test
> framework. The last time somebody (me) mucked with these filters it broke all of the
> CGI script handling. I would also like to hear from Jeff about this, because when we
> wrote the filters originally, we had input filtering use this design, and Jeff and 
>Greg
> were adamant that filters needed to be able to return more than they were asked for.
> I am uncomfortable committing this until they have a chance to look it over.

Yeah, I have no intention of committing this until it has been suitably
reviewed.  =-)  This is a big change, but I do hope that it is a step
in the right direction.  Or, that at least, we can figure out how to
go in the right direction.

> Other than that, I made some comments in-line that should be addressed. I will
> try to run it through the test framework today or tomorrow, so that I can give it
> a +1 or not.
> 
> Ryan
> 
> 
> > Index: include/apr_buckets.h
> > ===================================================================
> > RCS file: /home/cvs/apr-util/include/apr_buckets.h,v
> > retrieving revision 1.118
> > diff -u -r1.118 apr_buckets.h
> > --- include/apr_buckets.h   2001/09/22 22:36:07     1.118
> > +++ include/apr_buckets.h   2001/09/23 10:36:12
> > @@ -689,6 +689,17 @@
> >                                               apr_off_t *length);
> >
> >  /**
> > + * create a flat buffer of the elements in a bucket_brigade.
> > + * @param b The bucket brigade to create the buffer from
> > + * @param c The pointer to the returned character string
> > + * @param len The length of the returned character string
> > + * @param p The pool to allocate the character string from.
> > + * Note: You *must* have enough space allocated to store all of the data.
> > + * See apr_brigade_length().
> > + */
> > +APU_DECLARE(apr_status_t) apr_brigade_to_buffer(apr_bucket_brigade *b,
> > char *c); +
> > +/**
> 
> Please change the docs to match the API.

Oops.  Yeah.

> > Index: buckets/apr_brigade.c
> > ===================================================================
> > RCS file: /home/cvs/apr-util/buckets/apr_brigade.c,v
> > retrieving revision 1.25
> > diff -u -r1.25 apr_brigade.c
> > --- buckets/apr_brigade.c   2001/09/03 22:00:36     1.25
> > +++ buckets/apr_brigade.c   2001/09/23 10:36:12
> > @@ -221,6 +221,29 @@
> >      return APR_SUCCESS;
> >  }
> >
> > +APU_DECLARE(apr_status_t) apr_brigade_to_buffer(apr_bucket_brigade *b,
> > char *c) +{
> > +    apr_bucket *e;
> > +    char *current;
> > +    apr_size_t curlen;
> > +    apr_status_t status;
> > +
> > +    current = c;
> > +
> > +    APR_BRIGADE_FOREACH(e, b) {
> > +
> > +        status = apr_bucket_read(e, (const char **)&current, &curlen,
> > +                                 APR_BLOCK_READ);
> 
> This is most likely wrong. You probably want to do a non-blocking read, and
> return immediately if you can't find any data. That way, the caller can keep working
> with the data he has. You also probably want to split the brigade at that location,
> so that the caller can keep track of where they are in the brigade.

The code that it replaced in http_protocol.c used a blocking read
(see http_protocol.c:1563).  I guess the mode could be passed in
to this function?

> > +        if (status != APR_SUCCESS)
> > +            return status;
> > +
> > +        current += curlen;
> > +    }
> > +
> > +    return APR_SUCCESS;
> > +
> > +}
> > +
> >  APU_DECLARE(apr_status_t) apr_brigade_to_iovec(apr_bucket_brigade *b,
> >                                                 struct iovec *vec, int
> > *nvec) {
> > Index: buckets/apr_buckets_pipe.c
> > ===================================================================
> > RCS file: /home/cvs/apr-util/buckets/apr_buckets_pipe.c,v
> > retrieving revision 1.41
> > diff -u -r1.41 apr_buckets_pipe.c
> > --- buckets/apr_buckets_pipe.c      2001/09/22 22:36:07     1.41
> > +++ buckets/apr_buckets_pipe.c      2001/09/23 10:36:13
> > @@ -106,15 +106,8 @@
> >      }
> >      else {
> >          free(buf);
> > -        a = apr_bucket_immortal_make(a, "", 0);
> > -        *str = a->data;
> > -        if (rv == APR_EOF) {
> > +        if (rv == APR_EOF)
> >              apr_file_close(p);
> > -            if (block == APR_NONBLOCK_READ) {
> > -                /* XXX: this is bogus, should return APR_SUCCESS */
> > -                return APR_EOF;
> > -            }
> > -        }
> 
> You can't do this.  The reason for the immortal bucket, was that we are closing 
> the pipe or socket bucket.  What this change does, is it closes the pipe, but
> leaves the bucket in the brigade.  This will cause problems later. You can't
> just remove the bucket, because you only have access to the one bucket,
> and if you remove this bucket, then you also need to update the pointer from
> the previous bucket, which you can't do.

This adds all of the 0-length buckets that make some of the higher-level
code more complicated.  Wouldn't calling apr_bucket_delete(a) work here?
The bucket contains a link to which brigade it is in - so it should be
able to update the right pointers, no?

> > Index: modules/http/http_protocol.c
> > ===================================================================
> > RCS file: /home/cvs/httpd-2.0/modules/http/http_protocol.c,v
> > retrieving revision 1.362
> > diff -u -r1.362 http_protocol.c
> > --- modules/http/http_protocol.c    2001/09/17 13:12:37     1.362
> > +++ modules/http/http_protocol.c    2001/09/23 10:56:30
> > @@ -487,6 +487,7 @@
> >      enum {
> >          WANT_HDR /* must have value zero */,
> >          WANT_BODY,
> > +        WANT_ENDCHUNK,
> >          WANT_TRL
> >      } state;
> >  };
> > @@ -498,9 +499,7 @@
> >  {
> >      apr_status_t rv;
> >      struct dechunk_ctx *ctx = f->ctx;
> > -    apr_bucket *b;
> > -    const char *buf;
> > -    apr_size_t len;
> > +    apr_off_t len;
> >
> >      if (!ctx) {
> >          f->ctx = ctx = apr_pcalloc(f->r->pool, sizeof(struct dechunk_ctx)); 
> >  @@ -508,43 +507,38 @@
> >
> >      do {
> >          if (ctx->chunk_size == ctx->bytes_delivered) {
> > -            /* Time to read another chunk header or trailer...  ap_http_filter() 
>is 
> > -             * the next filter in line and it knows how to return a brigade with 
> > -             * one line.
> > -             */
> > +            /* Time to read another chunk header or trailer...  We only want 
> > +             * one line - by specifying 0 as the length to read.  */
> >               char line[30];
> >
> > -            if ((rv = ap_getline(line, sizeof(line), f->r,
> > -                                 0 /* readline */)) < 0) {
> > +            if ((rv = ap_getline(line, sizeof(line), f->r, 0)) < 0) {
> >                  return rv;
> >              }
> >              switch(ctx->state) {
> >              case WANT_HDR:
> >                  ctx->chunk_size = get_chunk_size(line);
> >                  ctx->bytes_delivered = 0;
> > -                if (ctx->chunk_size == 0) {
> > +                if (ctx->chunk_size == 0)
> >                      ctx->state = WANT_TRL;
> > -                }
> > -                else {
> > +                else
> >                      ctx->state = WANT_BODY;
> > -                }
> 
> Don't do that.  The { and } aren't strictly required, but it is good practice
> to have them. It makes people who are modifying the code later not have
> to worry about them. It also shrinks the size of the patch, which is good for
> people who have to review it. In this case, we would remove about 8 lines
> from the size of the patch.  Not much, but every line that isn't in the patch
> helps.

Yeah, I'll remove the reformatting...  I was tired and forgot to
remove the reformatting before I submitted.

> > @@ -655,12 +635,16 @@
> >      if (*readbytes == -1) {
> >          apr_bucket *e;
> >          apr_off_t total;
> > +        const char *str;
> > +        apr_size_t len;
> >          APR_BRIGADE_FOREACH(e, ctx->b) {
> > -            const char *str;
> > -            apr_size_t len;
> > +            /* We don't care about these values.  We just want to force the
> > +             * lower level to convert it.  This seems hackish. */
> >              apr_bucket_read(e, &str, &len, APR_BLOCK_READ);
> 
> Why does this seem hack-ish?  We can't pass a socket up the filter stack
> and we need to get all the data.  This is the defined way to do that.

It seems like we should have a function that says, "go through and
read all the data but don't return the value to me."  Actually,
calling apr_brigade_length(ctx->b, 1, &len) would do the same
thing - but that's even more hackish.

> 
> >          }
> >          APR_BRIGADE_CONCAT(b, ctx->b);
> > +
> > +        /* Force a recompute of the length */
> >          apr_brigade_length(b, 1, &total);
> >          *readbytes = total;
> >          e = apr_bucket_eos_create();
> > @@ -670,67 +654,40 @@
> >      /* readbytes == 0 is "read a single line". otherwise, read a block. */
> >      if (*readbytes) {
> >          apr_off_t total;
> > +        apr_bucket *e;
> > +        apr_bucket_brigade *newbb;
> >
> > -        /* ### the code below, which moves bytes from one brigade to the
> > -           ### other is probably bogus. presuming the next filter down was
> > -           ### working properly, it should not have returned more than
> > -           ### READBYTES bytes, and we wouldn't have to do any work.
> > -        */
> > -
> > -        APR_BRIGADE_NORMALIZE(ctx->b);
> > -        if (APR_BRIGADE_EMPTY(ctx->b)) {
> > -            if ((rv = ap_get_brigade(f->next, ctx->b, mode, readbytes)) !=
> > APR_SUCCESS) { -                return rv;
> > -            }
> > -        }
> > -
> > +        newbb = NULL;
> > +
> >          apr_brigade_partition(ctx->b, *readbytes, &e);
> > -        APR_BRIGADE_CONCAT(b, ctx->b);
> > +        /* Must do split before CONCAT */
> >          if (e != APR_BRIGADE_SENTINEL(ctx->b)) {
> > -            apr_bucket_brigade *temp;
> > +            newbb = apr_brigade_split(ctx->b, e);
> > +        }
> > +        APR_BRIGADE_CONCAT(b, ctx->b);
> >
> > -            temp = apr_brigade_split(b, e);
> > +        /* FIXME: Is this really needed?  Due to pointer use in sentinels,
> > +         * I think so. */
> > +        if (newbb)
> > +            APR_BRIGADE_CONCAT(ctx->b, newbb);
> >
> > -            /* ### darn. gotta ensure the split brigade is in the proper pool.
> > -               ### this is a band-aid solution; we shouldn't even be doing
> > -               ### all of this brigade munging (per the comment above).
> > -               ### until then, this will get the right lifetimes. */
> > -            APR_BRIGADE_CONCAT(ctx->b, temp);
> > -        }
> > -        else {
> > -            if (!APR_BRIGADE_EMPTY(ctx->b)) {
> > -                ctx->b = NULL; /*XXX*/
> > -            }
> > -        }
> > +        /* FIXME: Are we assuming that b is empty when we enter? */
> 
> You are in an input filter.  All brigades must be empty when passed to an
> input filter.  Data is put into the filter on the way back up.

Okay.

I'll resubmit without the reformatting and with fresh eyes to see
if I catch anything else this morning.  =-) 

Thanks.  -- justin
Re: [PATCH] Rework httpd buckets/filtering

Reply via email to