Re: appending to the content brigade

Eric Prud'hommeaux Sat, 25 Aug 2001 12:40:21 -0700

On Fri, Aug 24, 2001 at 06:54:31PM -0400, Cliff Woolley wrote:
> On Fri, 24 Aug 2001, Greg Stein wrote:
> 
> > Okay... let's back up. I believe that some poor advice may have been given.
> >
> > For now, let's assume that bPayload *does* have the entire content. Given
> > that, there is *no* reason to copy the darn payload into a new brigade.
> >
> > The whole *idea* of the filters and the brigades is that you modify the
> > brigade passed to you, and then pass it on down. Thus, your operation is
> > like this:
> 
> Oh, duh, yeah of course.  Thanks for clearing the air.  =-)  I wasn't
> thinking from the beginning.  I started with the assumption that we had
> two brigades and we wanted to combine them or twiddle them in some way
> (which filters do all the time), missing the obvious fact that in THIS
> case you don't have to do that.  Tack some stuff onto the beginning, pass
> it down.  That's it.  Well, anyway, now you (Eric) know what the rules are
> if you have a more complicated case.

Yeah, I figured I could be more economical and have been shamed into
it. I wrote up some docs for filters and buckets (attached) but didn't
work this lesson into it. Take a look and see if you think they are
useful. I haven't put a lot of polish on them, but they should be a
good conversation starter. The documents are written to be peers of
http://httpd.apache.org/docs-2.0/developer/hooks.html.

> Thanks Greg.  :)

And thanks from me.
-- 
-eric

([EMAIL PROTECTED])
Feel free to forward this message to any list for any purpose other than
email address distribution.

Title: Apache 2.0 Filter Functions

Apache HTTP Server Version 2.0

Apache Filter Functions

HTTP requests are read from the network and passed through a series of input filters. Responses are fabricated by generators and passed through a series of output filters. These filters may prepend to or append to the stream, or replace it entirely. Modules and core functions register and enable filters via the filter API (@@@ would like an href to some docs auto-generated from include/util_filter.h if such exist).

Registering a filter

During initialization, modules and core functions register input and output filters using the filter API:

    void ap_register_input_filter(const char *name, ap_in_filter_func filter_func, ap_filter_type ftype)
    void ap_register_output_filter(const char *name, ap_out_filter_func filter_func, ap_filter_type ftype)


    Modules are likely to call these functions from the module initializer:

        static void my_register_hooks(apr_pool_t *p)
        {
            ap_register_input_filter("MY_INPUT_FILTER_NAME", my_input_filter, AP_FTYPE_TRANSCODE);
            ap_register_output_filter("MY_OUTPUT_FILTER_NAME", my_output_filter, AP_FTYPE_HTTP_HEADER);
        }

        module AP_MODULE_DECLARE_DATA my_module = {
            STANDARD20_MODULE_STUFF,
            my_dir_config,      /* dir config creator */
            my_dir_merge,       /* dir merger --- default is to override */
            my_server_config,   /* server config */
            my_server_merge,    /* merge server config */
            my_cmds,            /* command table */
            my_register_hooks   /* register hooks */
        };

    Implement the filter function

    Include the appropriate header and define a static function of the correct type. For input (ap_in_filter_func), the filter pulls data from the next filter with ap_get_brigade:

        static apr_status_t my_input_filter(ap_filter_t *f, apr_bucket_brigade *b, ap_input_mode_t mode, apr_size_t *readbytes)
        {
            return ap_get_brigade(f->next, b, mode, readbytes);
        }

    and for output (ap_out_filter_func), the filter pushes data to the next filter with ap_pass_brigade:

        static apr_status_t my_output_filter(ap_filter_t *f, apr_bucket_brigade *bb)
        {
            return ap_pass_brigade(f->next, bb);
        }

    The bucket brigade is documented separately. The minimal code needed to propagate the existing content is the single call shown in each function.

    Add a filter

    At request processing time, modules and core functions selectively enable registered filters by name. Most filters are applied to a request:

        ap_add_input_filter("MY_INPUT_FILTER_NAME", NULL, r, r->connection);
        ap_add_output_filter("MY_OUTPUT_FILTER_NAME", NULL, r, r->connection);

    Note: if you are hacking wire protocol extensions, you may need to have your filter called before a request structure has been constructed. As of this writing, an example of this is the pre_connection callback in modules/tls/mod_tls.c:

        static int tls_filter_inserter(conn_rec *c)
        {
            ...
            pCtx = apr_pcalloc(c->pool, sizeof *pCtx);
            ...
            pCtx->pInputFilter = ap_add_input_filter(s_szTLSFilterName, pCtx, NULL, c);
            pCtx->pOutputFilter = ap_add_output_filter(s_szTLSFilterName, pCtx, NULL, c);
            ...
        }

    Filter type

    The type of a filter controls when it is invoked. The filter types are defined in util_filter.h:

    
      AP_FTYPE_CONTENT
      These filters are used to alter the content that is passed through them. Examples are SSI or PHP.

      AP_FTYPE_HTTP_HEADER (may be renamed or removed)
      This special type ensures that the HTTP header filter ends up in the proper location in the filter chain.

      AP_FTYPE_TRANSCODE
      These filters implement transport encodings (e.g., chunking).

      AP_FTYPE_CONNECTION
      These filters will alter the content, but in ways that are more strongly associated with the connection. Examples are splitting an HTTP connection into multiple requests and buffering HTTP responses across multiple requests.

      It is important to note that these types of filters are not allowed in a sub-request. A sub-request's output can certainly be filtered by AP_FTYPE_CONTENT filters, but all of the "final processing" is determined by the main request.

      AP_FTYPE_NETWORK
      These filters don't alter the content.  They are responsible for sending/receiving data to/from the client.
    

    Calling conventions and contracts

    Filters may be called repeatedly. Filters store context needed between calls in the ap_filter_t's ctx pointer. Eventually, each filter should call the downstream filters by calling ap_pass_brigade with ap_filter_t's next filter.
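
The contract above (keep state in ctx between calls, then hand the data to the next filter) can be sketched without the Apache headers. This is a minimal model under stated assumptions: toy_filter, toy_pass, count_ctx and count_filter are hypothetical stand-ins for ap_filter_t, ap_pass_brigade and a module's own context, a plain string replaces the brigade, and calloc replaces pool allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for ap_filter_t; not the Apache type. */
typedef struct toy_filter toy_filter;
typedef int (*toy_filter_func)(toy_filter *f, const char *data, size_t len);

struct toy_filter {
    toy_filter_func func; /* the filter function */
    void *ctx;            /* per-filter state kept between calls */
    toy_filter *next;     /* downstream filter, NULL at the end of the chain */
};

/* Analogue of ap_pass_brigade: hand the data to the next filter, if any. */
static int toy_pass(toy_filter *next, const char *data, size_t len)
{
    if (next == NULL) {
        return 0; /* end of chain: nothing left to do */
    }
    return next->func(next, data, len);
}

typedef struct {
    size_t calls;     /* how many times the filter has run */
    size_t total_len; /* bytes seen across all calls */
} count_ctx;

/* Counts bytes across repeated calls, then passes the data downstream. */
static int count_filter(toy_filter *f, const char *data, size_t len)
{
    count_ctx *ctx;

    assert(f != NULL);
    ctx = f->ctx;
    if (ctx == NULL) { /* first time through: create the context */
        ctx = calloc(1, sizeof(*ctx));
        if (ctx == NULL) {
            return -1;
        }
        f->ctx = ctx;
    }
    ctx->calls++;
    ctx->total_len += len;
    return toy_pass(f->next, data, len);
}
```

In a real module the context would come from a pool (for example apr_pcalloc on the request pool) so that it is released with the request instead of being freed by hand.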

    Example: serving a file

    In the common paradigm, one or more AP_FTYPE_CONTENT filters are passed a brigade by the content generator. For example, from the default_handler which serves file requests:

        bb = apr_brigade_create(r->pool);
        e = apr_bucket_file_create(fd, 0, r->finfo.size);

        APR_BRIGADE_INSERT_HEAD(bb, e);
        e = apr_bucket_eos_create();
        APR_BRIGADE_INSERT_TAIL(bb, e);

        return ap_pass_brigade(r->output_filters, bb);

    The next filters executed are those registered as AP_FTYPE_HTTP_HEADER. The ap_http_header_filter function generates a brigade with the response code and MIME headers. This brigade is passed to the TRANSCODE and CONNECTION filters:

        ...
        b2 = apr_brigade_create(r->pool);
        basic_http_header(r, b2, protocol);
        ...
        ap_pass_brigade(f->next, b2);
        ...

    ap_http_header_filter removes itself (why?) from the list of filters and calls the downstream filters again with the content brigade:

        ...
        ap_remove_output_filter(f);
        return ap_pass_brigade(f->next, b);
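
One plausible reading of the self-removal, offered as an assumption and not as a statement about the server's design, is that the headers must be emitted exactly once even though the filter chain may be invoked several times for one response. The sketch below models that with hypothetical toy_* types and a string in place of a brigade; none of these names belong to the Apache API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define OUT_MAX 256

typedef struct toy_request toy_request;
typedef struct toy_filter toy_filter;
typedef int (*toy_out_func)(toy_filter *f, const char *data);

struct toy_filter {
    toy_out_func func;
    toy_filter *next;
    toy_request *r;
};

struct toy_request {
    toy_filter *output_filters; /* head of the output filter chain */
    char out[OUT_MAX];          /* what the "network" has received so far */
};

static int toy_pass(toy_filter *next, const char *data)
{
    assert(next != NULL);
    return next->func(next, data);
}

/* Unlink f from its request's chain. f->next is left intact so the
 * caller can still pass data downstream after removing itself. */
static void toy_remove_filter(toy_filter *f)
{
    toy_filter **pp = &f->r->output_filters;
    while (*pp != NULL && *pp != f) {
        pp = &(*pp)->next;
    }
    if (*pp == f) {
        *pp = f->next;
    }
}

/* Last filter in the chain: append the data to the request's output. */
static int network_filter(toy_filter *f, const char *data)
{
    size_t used = strlen(f->r->out);
    size_t len = strlen(data);
    if (used + len >= OUT_MAX) {
        return -1;
    }
    memcpy(f->r->out + used, data, len + 1);
    return 0;
}

/* Send the headers, remove ourselves, then pass the content on. */
static int header_filter(toy_filter *f, const char *data)
{
    int rv = toy_pass(f->next, "HTTP/1.1 200 OK\r\n\r\n");
    if (rv != 0) {
        return rv;
    }
    toy_remove_filter(f);
    return toy_pass(f->next, data);
}

/* Analogue of a handler calling ap_pass_brigade(r->output_filters, bb). */
static int toy_send(toy_request *r, const char *data)
{
    return toy_pass(r->output_filters, data);
}
```

After the first toy_send the chain starts at the network filter, so a second call delivers content without repeating the headers.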


    Eric Prud'hommeaux, 24th August 2001

    

    

Title: Apache 2.0 Buckets and Bucket Brigades


Buckets and Bucket Brigades

Buckets provide uniform access to a variety of data sources. Bucket brigades are collections of buckets, usually used to represent a block of data. For instance, a bucket brigade may contain buckets of headers allocated from a request pool followed by a bucket representing a local file. A filter can add data before or after this file block without having to copy the data. apr_brigade_to_iovec makes it easy to pass this scatter-write paradigm through the sockets layer all the way to the ethernet and disk controllers.

Types of buckets
Bucket virtual functions
Using buckets and bucket brigades

Types of buckets

Apache provides a set of standard bucket types allowing data collection from a variety of sources:

Marker buckets

transient (const char *buf, apr_size_t nbyte)
    Represents data allocated off the stack. When the setaside function is called, this data is copied onto the heap.

heap (const char *buf, apr_size_t nbyte, int copy, apr_size_t *w)
    Represents data allocated from the heap.

pool (const char *buf, apr_size_t length, apr_pool_t *pool)
    Represents data that was allocated from a pool. If this bucket is still available when the pool is cleared, the data is copied onto the heap.

file (apr_file_t *fd, apr_off_t offset, apr_size_t len)
    Represents a file on disk.

mmap (apr_mmap_t *mm, apr_off_t start, apr_size_t length)
    Represents an MMAP'ed file.

socket (apr_socket_t *thissock)
    Represents a socket connection to another machine.

pipe (apr_file_t *thispipe)
    Represents a pipe to another program.

eos (void)
    Signifies that there will be no more data, ever. All filters MUST send all data to the next filter when they receive a bucket of this type.

flush (void)
    Signifies that all data should be flushed to the next filter. The flush bucket should be sent with the other buckets.

Opaque buckets

immortal (const char *buf, apr_size_t nbyte)
    Represents a segment of data that the creator is willing to take responsibility for. The core will do nothing with the data in an immortal bucket.

Bucket virtual functions

Buckets implement a set of virtual functions so that access and state in different bucket types may be handled uniformly. Following is a list of the parameters and a description of each function, followed by the function to be used if the bucket type does not implement it:

apr_bucket_read (apr_bucket *b, const char **str, apr_size_t *len, apr_read_type_e block)
    Returns the address and size of the data in the bucket. If the data isn't in memory then it is read in and the bucket changes type so that it can refer to the new location of the data. If all the data doesn't fit in the bucket then a new bucket is inserted into the brigade to hold the rest of it.
    If not implemented: no default; must be implemented.

apr_bucket_split (apr_bucket *e, apr_off_t point)
    Divides the data in a bucket into two regions. After a split the original bucket refers to the first part of the data and a new bucket inserted into the brigade after the original bucket refers to the second part of the data. Reference counts are maintained as necessary. If this is not implemented, apr_brigade_partition simply reads the bucket and creates new ones with the contents from either side of the divider.
    If not implemented: apr_bucket_split_notimpl

apr_bucket_setaside (apr_bucket *e)
    Ensures that the data in the bucket has a long enough lifetime. Sometimes it is convenient to create a bucket referring to data on the stack in the expectation that it will be consumed (output to the network) before the stack is unwound. If that expectation turns out not to be valid, the setaside function is called to move the data somewhere safer.
    If not implemented: apr_bucket_setaside_notimpl

apr_bucket_copy (apr_bucket *e, apr_bucket **c)
    Makes a duplicate of the bucket structure as long as it's possible to have multiple references to a single copy of the data itself. Not all bucket types can be copied.
    If not implemented: apr_bucket_copy_notimpl

apr_bucket_destroy (void *data)
    Maintains the reference counts on the resources used by a bucket and frees them if necessary.
    If not implemented: apr_bucket_destroy_notimpl

apr_bucket_delete (void *data)
    @@@ calls apr_bucket_destroy.
    Implemented via a macro.

Note: all of the above except destroy and delete return an apr_status_t.

Because some buckets have unimplemented functions, and some do not maintain a length, callers must be prepared to work around these limitations.
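
To illustrate what "working around" an unimplemented function can look like, here is a small self-contained model of a bucket type table with a notimpl default and a caller that falls back to reading, in the spirit of the apr_bucket_split description above. Every name in it (toy_bucket, toy_bucket_type, toy_split_with_fallback, the TOY_* codes) is hypothetical; the real types and status codes live in the APR headers.

```c
#include <assert.h>
#include <stddef.h>

enum { TOY_SUCCESS = 0, TOY_ENOTIMPL = 1, TOY_EINVAL = 2 };

typedef struct toy_bucket toy_bucket;

/* Per-type table of virtual functions, one entry per operation. */
typedef struct {
    const char *name;
    int (*read)(toy_bucket *b, const char **str, size_t *len);
    int (*split)(toy_bucket *b, size_t point, toy_bucket *rest);
} toy_bucket_type;

struct toy_bucket {
    const toy_bucket_type *type;
    const char *data;
    size_t length;
};

static int mem_read(toy_bucket *b, const char **str, size_t *len)
{
    *str = b->data;
    *len = b->length;
    return TOY_SUCCESS;
}

/* Split in place: b keeps [0, point), rest gets [point, length). */
static int mem_split(toy_bucket *b, size_t point, toy_bucket *rest)
{
    if (point > b->length) {
        return TOY_EINVAL;
    }
    rest->type = b->type;
    rest->data = b->data + point;
    rest->length = b->length - point;
    b->length = point;
    return TOY_SUCCESS;
}

/* Default used by types that cannot split themselves. */
static int split_notimpl(toy_bucket *b, size_t point, toy_bucket *rest)
{
    (void)b; (void)point; (void)rest;
    return TOY_ENOTIMPL;
}

static const toy_bucket_type toy_type_mem = { "mem", mem_read, mem_split };
static const toy_bucket_type toy_type_opaque = { "opaque", mem_read, split_notimpl };

/* Try the type's own split; if it is not implemented, read the data
 * and rebuild both halves as plain memory buckets. */
static int toy_split_with_fallback(toy_bucket *b, size_t point, toy_bucket *rest)
{
    const char *str;
    size_t len;
    int rv;

    assert(b != NULL && rest != NULL);
    rv = b->type->split(b, point, rest);
    if (rv != TOY_ENOTIMPL) {
        return rv;
    }
    rv = b->type->read(b, &str, &len);
    if (rv != TOY_SUCCESS) {
        return rv;
    }
    if (point > len) {
        return TOY_EINVAL;
    }
    b->type = &toy_type_mem;
    b->data = str;
    b->length = point;
    rest->type = &toy_type_mem;
    rest->data = str + point;
    rest->length = len - point;
    return TOY_SUCCESS;
}
```

The caller never inspects the bucket's type directly; it only reacts to the "not implemented" status, which keeps the fallback independent of which types happen to lack a split function.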

Using buckets and bucket brigades

This section will (attempt to) provide an outline of the uses of buckets and a small vocabulary of idioms used to accomplish common tasks.

Bucket type and function names

There are strict naming conventions for the bucket types. The name of a type's descriptor is formed by prefixing the bucket name with "apr_bucket_type_". For example, a pool bucket b satisfies:

    b->type == &apr_bucket_type_pool

The create function takes the parameters described above, e.g.:

apr_bucket * apr_bucket_pool_create(const char *buf, apr_size_t nbyte, apr_pool_t *pool);

The make function, recommended only for bucket implementers, takes a pointer to an apr_bucket plus the prototype described above:

apr_bucket * apr_bucket_pool_make(apr_bucket *b, const char *buf, apr_size_t nbyte, apr_pool_t *pool);

Each type also has a macro to test type. For brevity and maintenance purposes, it is recommended that you use this macro and not manipulation of the bucket data structure.

APR_BUCKET_IS_POOL(bucket)

Calling conventions and contracts

Filter conventions dictate the use of buckets. Filters that manipulate a brigade may rely on brigade and bucket functions but are expected to deliver a coherent brigade to subsequent filters via ap_pass_brigade. Filters may be called repeatedly, but the brigade passed on the last call must end with an EOS bucket.

Common idioms

default_handler creates a new bucket brigade and adds a file bucket - from server/core.c

    bb = apr_brigade_create(r->pool);
    e = apr_bucket_file_create(fd, 0, r->finfo.size);

    APR_BRIGADE_INSERT_HEAD(bb, e);
    e = apr_bucket_eos_create();
    APR_BRIGADE_INSERT_TAIL(bb, e);

ap_http_header_filter looks through all the buckets for a particular kind - from modules/http/http_protocol.c

    apr_status_t ap_http_header_filter(ap_filter_t *f, apr_bucket_brigade *b)
    {
        ...
        apr_bucket *e;
        ...
        APR_BRIGADE_FOREACH(e, b) {
            if (e->type == &ap_bucket_type_error) {
                ap_bucket_error *eb = e->data;

                ap_die(eb->status, f->r);
                return AP_FILTER_ERROR;
            }
        }
        ...
    }

ap_content_length_filter maintains context between calls - from server/protocol.c

    ctx = f->ctx;
    if (!ctx) { /* first time through */
        f->ctx = ctx = apr_pcalloc(r->pool, sizeof(struct content_length_ctx));
    }
    ...
    APR_BRIGADE_FOREACH(e, b) {
        apr_size_t length;
        ...
        length = e->length;
        ...
        ctx->curr_len += length;
        r->bytes_sent += length;
    }

ap_content_length_filter handles buckets that don't know their length - from server/protocol.c

    if (e->length == -1) { /* if length unknown */
        rv = apr_bucket_read(e, &ignored, &length, APR_BLOCK_READ);
        if (rv != APR_SUCCESS) {
            return rv;
        }
    }
    else {
        length = e->length;
    }

Brigade functions

Following is a list of the bucket brigade API functions:

Brigade creation and destruction

apr_bucket_brigade * apr_brigade_create (apr_pool_t *p): Create a new bucket brigade. The resulting brigade's cleanup is registered with p.
apr_status_t apr_brigade_destroy (apr_bucket_brigade *b): Destroy entire bucket brigade b. This includes destroying all of the buckets within the bucket brigade's bucket list.
apr_status_t apr_brigade_cleanup (void *b): Empty out an entire bucket brigade. This includes destroying all of the buckets within the bucket brigade's bucket list. This is similar to apr_brigade_destroy(), except that it does not deregister the brigade's pool cleanup function.

Adding data to brigades

int apr_brigade_vputstrs (apr_bucket_brigade *b, apr_brigade_flush flush, void *ctx, va_list va): Write strings of data indicated by va into brigade b. If the flush is non-NULL and b is empty and the length of a string in va is greater than APR_BUCKET_BUFF_SIZE, flush is called with context ctx.
int apr_brigade_write (apr_bucket_brigade *b, apr_brigade_flush flush, void *ctx, const char *str, apr_size_t nbyte): Writes nbyte bytes from str into brigade b. If the flush is non-NULL and b is empty and nbyte is greater than APR_BUCKET_BUFF_SIZE, flush is called with context ctx.
int apr_brigade_puts (apr_bucket_brigade *b, apr_brigade_flush flush, void *ctx, const char *str): Write zero-terminated string str into brigade b. If the flush is non-NULL and b is empty and strlen(str) is greater than APR_BUCKET_BUFF_SIZE, flush is called with context ctx.
int apr_brigade_putc (apr_bucket_brigade *b, apr_brigade_flush flush, void *ctx, const char c): Write character c into brigade b. flush and ctx are just window dressing.
int apr_brigade_vprintf (apr_bucket_brigade *b, apr_brigade_flush flush, void *ctx, const char *fmt, va_list va): Writes string resulting from vprintf(fmt, va) into brigade b. If the flush is non-NULL and b is empty and the length of the resulting string is greater than APR_BUCKET_BUFF_SIZE, flush is called with context ctx.

Brigade data consumption

apr_bucket_brigade * apr_brigade_split (apr_bucket_brigade *b, apr_bucket *e): Split a bucket brigade into two, such that the given bucket is the first in the new bucket brigade. This function is useful when a filter wants to pass only the initial part of a brigade to the next filter. Note, individual buckets may also be split.
apr_bucket * apr_brigade_partition (apr_bucket_brigade *b, apr_off_t point): Partition a bucket brigade at a given offset (in bytes from the start of the brigade). This is useful whenever a filter wants to use known ranges of bytes from the brigade; the ranges can even overlap.
void apr_brigade_consume (apr_bucket_brigade *b, int nbytes): Consume nbytes from the beginning of b, calling apr_bucket_destroy as appropriate and/or modifying the start of the last element. Not done yet...
apr_status_t apr_brigade_length (apr_bucket_brigade *bb, int read_all, apr_ssize_t *length): Set length to the total length of bb's buckets. read_all controls the behaviour when a bucket with an unknown length is encountered. If read_all is set, such buckets are read. Otherwise, length is set to -1 and the function reports success.
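
The read_all behaviour described for apr_brigade_length can be sketched in isolation. This is a toy model, not the APR implementation: toy_bucket, toy_read and toy_brigade_length are hypothetical names, an array stands in for the brigade, and a length of -1 marks a bucket whose size is unknown until it is read.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *data; /* backing data, revealed by toy_read */
    long length;      /* -1 means "not known until read" */
} toy_bucket;

/* Reading a bucket makes its length known, much as reading a pipe or
 * socket bucket does in the real API. */
static int toy_read(toy_bucket *b, const char **str, size_t *len)
{
    assert(b != NULL && b->data != NULL);
    *str = b->data;
    *len = strlen(b->data);
    b->length = (long)*len;
    return 0;
}

/* Sum the lengths of n buckets. When a bucket of unknown length is met:
 * with read_all set, read it to learn its length; otherwise report -1
 * and still return success. */
static int toy_brigade_length(toy_bucket *buckets, size_t n, int read_all, long *length)
{
    long total = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (buckets[i].length == -1) {
            const char *ignored;
            size_t len;
            int rv;

            if (!read_all) {
                *length = -1;
                return 0;
            }
            rv = toy_read(&buckets[i], &ignored, &len);
            if (rv != 0) {
                return rv;
            }
        }
        total += buckets[i].length;
    }
    *length = total;
    return 0;
}
```

A caller that cannot afford a blocking read passes read_all as 0 and treats a result of -1 as "length not available", which is the case ap_content_length_filter handles explicitly in the idiom above.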

Misc

int apr_brigade_to_iovec (apr_bucket_brigade *brigade, struct iovec *vec, int nvec): Fill out the iovec vec with read data from up to nvec buckets in brigade. Returns the number of elements actually assimilated. This is useful for writing to a file or to the network efficiently.
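
The efficiency comes from gather writes: once the data pointers are collected into an iovec array, a single writev call can send several separate buffers without first copying them together. The sketch below shows that step using only POSIX; fill_iovec and write_gathered are hypothetical helpers written for this illustration, with an array of strings standing in for the brigade's buckets.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#define TOY_MAX_VEC 8

/* Point each iovec entry at one buffer, filling at most nvec entries.
 * Returns the number of entries filled. */
static int fill_iovec(const char *const *bufs, int nbufs, struct iovec *vec, int nvec)
{
    int i;

    assert(bufs != NULL && vec != NULL);
    for (i = 0; i < nbufs && i < nvec; i++) {
        vec[i].iov_base = (void *)bufs[i]; /* writev only reads from it */
        vec[i].iov_len = strlen(bufs[i]);
    }
    return i;
}

/* Write up to TOY_MAX_VEC buffers to fd with one system call. */
static ssize_t write_gathered(int fd, const char *const *bufs, int nbufs)
{
    struct iovec vec[TOY_MAX_VEC];
    int n = fill_iovec(bufs, nbufs, vec, TOY_MAX_VEC);

    return writev(fd, vec, n);
}
```

Here the headers and the body stay in their own buffers right up to the system call, which is the property the brigade design is meant to preserve.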

Brigade macros

Following is a set of macros to manipulate and test a bucket brigade. For brevity and maintenance purposes, it is recommended that you use these macros and not manipulation of the bucket and brigade data structures.

APR_BRIGADE_SENTINEL (brigade): Bucket brigades are implemented over Apache's ring interface (srclib/apr-util/include/apr_ring.h). This macro returns the marker for the limits of the ring, i.e. the end of the brigade.
APR_BRIGADE_EMPTY (brigade): Determine if brigade is empty.
APR_BRIGADE_FIRST (brigade): Return the first bucket in a brigade.
APR_BRIGADE_LAST (brigade): Return the last bucket in brigade.
APR_BRIGADE_FOREACH (bucket, brigade): Iterate through a brigade.
APR_BRIGADE_INSERT_HEAD (brigade, bucket): Insert a list of buckets at the front of a brigade.
APR_BRIGADE_INSERT_TAIL (brigade, bucket): Insert a list of buckets at the back end of a brigade.
APR_BRIGADE_CONCAT (brigade1, brigade2): Concatenate brigade2 onto the end of brigade1.

Bucket list manipulation macros

Buckets are elements of a ring (see srclib/apr-util/include/apr_ring.h), so each bucket has pointers to its previous and next neighbors. Following is a list of macros for manipulating the list features of buckets. For brevity and maintenance purposes, it is recommended that you use these macros and not manipulation of the bucket and brigade data structures.

APR_BUCKET_INSERT_BEFORE (insertMe, beforeMe): Insert a list of buckets starting with insertMe before beforeMe.
APR_BUCKET_INSERT_AFTER (insertMe, afterMe): Insert a list of buckets starting with insertMe after afterMe.
APR_BUCKET_NEXT (bucket): Get bucket's next neighbor.
APR_BUCKET_PREV (bucket): Get bucket's previous neighbor.
APR_BUCKET_REMOVE (e): Remove bucket from its brigade.
APR_BUCKET_INIT (bucket): Initialize a new bucket's prev/next pointers.
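
For readers unfamiliar with ring lists, the following self-contained sketch shows the idea behind the brigade and bucket macros: a circular doubly linked list whose sentinel marks both ends. It is a simplified model under stated assumptions. All toy_* names are hypothetical, and the sentinel here is an ordinary embedded node, whereas apr_ring.h obtains its sentinel differently.

```c
#include <assert.h>
#include <stddef.h>

typedef struct toy_bucket toy_bucket;
struct toy_bucket {
    toy_bucket *next;
    toy_bucket *prev;
    int id;
};

typedef struct {
    toy_bucket sentinel; /* marks the limits of the ring */
} toy_brigade;

static void toy_brigade_init(toy_brigade *bb)
{
    bb->sentinel.next = &bb->sentinel;
    bb->sentinel.prev = &bb->sentinel;
    bb->sentinel.id = -1;
}

static int toy_empty(const toy_brigade *bb)
{
    return bb->sentinel.next == &bb->sentinel;
}

/* Link ins into the ring immediately before the given element. */
static void toy_insert_before(toy_bucket *ins, toy_bucket *before)
{
    assert(ins != NULL && before != NULL);
    ins->prev = before->prev;
    ins->next = before;
    before->prev->next = ins;
    before->prev = ins;
}

/* Link ins into the ring immediately after the given element. */
static void toy_insert_after(toy_bucket *ins, toy_bucket *after)
{
    assert(ins != NULL && after != NULL);
    toy_insert_before(ins, after->next);
}

/* Unlink e from whatever ring it is in. */
static void toy_remove(toy_bucket *e)
{
    e->prev->next = e->next;
    e->next->prev = e->prev;
    e->next = e->prev = NULL;
}

/* Head and tail insertion are both just insertion next to the sentinel. */
static void toy_insert_head(toy_brigade *bb, toy_bucket *e)
{
    toy_insert_after(e, &bb->sentinel);
}

static void toy_insert_tail(toy_brigade *bb, toy_bucket *e)
{
    toy_insert_before(e, &bb->sentinel);
}

/* Walk from the first element until the sentinel is reached again. */
static int toy_count(toy_brigade *bb)
{
    int n = 0;
    toy_bucket *e;

    for (e = bb->sentinel.next; e != &bb->sentinel; e = e->next) {
        n++;
    }
    return n;
}
```

Because the list is circular, no operation needs a special case for an empty list or for the first or last element, which is why head and tail insertion reduce to one primitive each.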

Eric Prud'hommeaux, 23rd August 2001
