This patch trims about 2500 instructions out of the path between ap_read_request and ap_graceful_stop_signalled when handling a simple GET request (will post some profiles in a few minutes). The patch demonstrates some primary principles but is still far from complete. POST requests will certainly fail, as will pipelined requests, but the basic framework is complete enought to allow these to be fixed without requiring major rewriting of code.

Quick Overview and Interesting (controversial) Design Points
1. HTTP header parsing moved out of protocol.c and into the HTTP_IN filter (ap_http_filter in http_protocol.c)
2. HTTP_IN is now nearly purely state driven. ap_http_filter registers a cleanup on the request pool that drives the filter to the 'new request' state.
3. A pointer to the current request req is placed in the conn_rec for use by ap_http_filter
4. New function, ap_brigade_getline() replaces ap_get_brigade(AP_MODE_GETLINE). This new function has some interesting properties..
- You can pass -in- a brigade to the function.
- If you pass in an empty brigade, it will issue ap_get_brigade(AP_MODE_READBYTES) to fetch bytes from lower level filters
- User of this function must be prepared to setaside the portion of the brigade not immediately needed (eg, the second request in a pipelined request in http_input_filter_ctx_t)


I generally like the idea of putting protocol parsing in a connection level filter. I think this is a more straight forward way to use the Apache 2 infrastructure to handle protocols.

Issues to resolve:
1. net_time_filter sits on top of the HTTP_IN filter. I have given this zero thought but it may need to repositioned to sit below HTTP_IN.
2. Lots of broken error paths.
3. Need to implement and debug code to handle pipelined requests and request bodies (chunked, unchunked, etc).
4. Formally address (by API changes?) the notion of a PROTOCOL filter that lasts the duration of a connection
5. Reevaluate what services should be provided by the core_input_filter vs http_in (or any other protocol) filter.


Bill


Index: include/http_protocol.h
===================================================================
RCS file: /home/cvs/httpd-2.0/include/http_protocol.h,v
retrieving revision 1.84
diff -u -r1.84 http_protocol.h
--- include/http_protocol.h     3 Feb 2003 17:52:53 -0000       1.84
+++ include/http_protocol.h     24 Feb 2003 17:03:20 -0000
@@ -717,6 +717,9 @@
                                                               apr_bucket_brigade *);
 AP_DECLARE_NONSTD(apr_status_t) ap_old_write_filter(ap_filter_t *f, 
apr_bucket_brigade *b);
 
+AP_CORE_DECLARE_NONSTD(apr_status_t) ap_http_header_input_filter(ap_filter_t *f, 
apr_bucket_brigade *b,
+                                                                 ap_input_mode_t 
mode, apr_read_type_e block,
+                                                                 apr_off_t readbytes);
 /*
  * Setting up the protocol fields for subsidiary requests...
  * Also, a wrapup function to keep the internal accounting straight.
Index: include/httpd.h
===================================================================
RCS file: /home/cvs/httpd-2.0/include/httpd.h,v
retrieving revision 1.194
diff -u -r1.194 httpd.h
--- include/httpd.h     3 Feb 2003 17:52:53 -0000       1.194
+++ include/httpd.h     24 Feb 2003 17:03:20 -0000
@@ -1013,6 +1013,7 @@
     void *sbh;
     /** The bucket allocator to use for all bucket/brigade creations */
     struct apr_bucket_alloc_t *bucket_alloc;
+    request_rec *r;
 };
 
 /* Per-vhost config... */
Index: modules/http/http_core.c
===================================================================
RCS file: /home/cvs/httpd-2.0/modules/http/http_core.c,v
retrieving revision 1.309
diff -u -r1.309 http_core.c
--- modules/http/http_core.c    3 Feb 2003 17:53:03 -0000       1.309
+++ modules/http/http_core.c    24 Feb 2003 17:03:20 -0000
@@ -277,11 +277,18 @@
     int csd_set = 0;
     apr_socket_t *csd = NULL;
 
+    /* Add the http_header_input_filter to the filter stack. This filter should
+     * remain installed for the duration of the connection.
+     * XXX: should this be added in a pre_connection hook? I think doing it
+     * here makes the code easier to read and understand.
+     */
+    ap_add_input_filter_handle(ap_http_input_filter_handle, 
+                               NULL, NULL, c); 
+
     /*
      * Read and process each request found on our connection
      * until no requests are left or we decide to close.
      */
- 
     ap_update_child_status(c->sbh, SERVER_BUSY_READ, NULL);
     while ((r = ap_read_request(c)) != NULL) {
 
@@ -327,7 +334,6 @@
 
     return OK;
 }
-
 static void register_hooks(apr_pool_t *p)
 {
     ap_hook_process_connection(ap_process_http_connection,NULL,NULL,
@@ -338,7 +344,7 @@
     ap_hook_create_request(http_create_request, NULL, NULL, APR_HOOK_REALLY_LAST);
     ap_http_input_filter_handle =
         ap_register_input_filter("HTTP_IN", ap_http_filter,
-                                 NULL, AP_FTYPE_PROTOCOL);
+                                 NULL, AP_FTYPE_CONNECTION);
     ap_http_header_filter_handle =
         ap_register_output_filter("HTTP_HEADER", ap_http_header_filter, 
                                   NULL, AP_FTYPE_PROTOCOL);
Index: modules/http/http_protocol.c
===================================================================
RCS file: /home/cvs/httpd-2.0/modules/http/http_protocol.c,v
retrieving revision 1.465
diff -u -r1.465 http_protocol.c
--- modules/http/http_protocol.c        19 Feb 2003 06:50:10 -0000      1.465
+++ modules/http/http_protocol.c        24 Feb 2003 17:03:21 -0000
@@ -739,75 +739,526 @@
     /* it wasn't found in the hash */
     return NULL;
 }
+/*
+ * Iterate across bbIn looking for an APR_ASCII_LF. If no LF is found, attempt to 
+ * read more bytes from the filter stack and keep searching until a LF is found or 
+ * an error condition is encountered. Unnused (unconsumed?) buckets are returned to 
+ * the caller in bbIn.
+ *
+ */
+static apr_status_t ap_brigade_getline(ap_filter_t *f,
+                                       apr_bucket_brigade *bbIn,
+                                       apr_read_type_e block,
+                                       apr_off_t maxbytes,
+                                       char **s,
+                                       apr_size_t *slen,
+                                       apr_pool_t *p)
+{
+    apr_status_t rv;
+    apr_size_t readbytes = 0;
+    char *pos;
+    char *cur = NULL;
+    const char *str;
+    apr_size_t len;
+    apr_bucket *e;
+    apr_bucket_brigade *bb = bbIn;
+    apr_bucket_brigade *bbNew = NULL;
+
+    *s = NULL;
+
+    while (1) {
+        for (e = APR_BRIGADE_FIRST(bb);  e != APR_BRIGADE_SENTINEL(bb); 
+             e = APR_BUCKET_NEXT(e)) {
+            /* */
+            rv = apr_bucket_read(e, &str, &len, block);
+            if (rv != APR_SUCCESS) {
+                return rv;
+            }
+            if (!len) {
+                continue;
+            }
+            pos = memchr(str, APR_ASCII_LF, len);
+            if (!pos) {
+                readbytes = readbytes + len;
+                /* do a length check */
+                if (readbytes > maxbytes) {
+                    return APR_ENOSPC;
+                }
+                continue;
+            }
+            /* Found a LF */
+            if (!readbytes) {
+                /* The common case: the entire LF terminated string is in 
+                 * the first bucket.
+                 */
+                len = pos - str + 1;
+                /* Optimization to prevent leaving a 0 byte bucket in the brigade */
+                if (len != e->length) {
+                    apr_bucket_split(e, len);
+                }
+            }
+            else {
+                /* The uncommon case: The LF terminated string spans multiple
+                 * buckets.
+                 * XXX: pflatten allocates storage then apr_palloc below
+                 * allocates storage for the exact same content. Optimize 
+                 * away one of the pallocs and memcpy.
+                 */
+                apr_bucket_brigade *tmpbb;
+                apr_brigade_partition(bb, readbytes + len, &e);
+                tmpbb = apr_brigade_split(bb, e);
+                apr_brigade_pflatten(bb, (char **) &str, &len, p);
+                apr_brigade_cleanup(bb);
+                APR_BRIGADE_CONCAT(bb, tmpbb);
+                apr_brigade_destroy(tmpbb);
+                apr_brigade_destroy(bbNew);
+                pos = memchr(str, APR_ASCII_LF, len);
+            }
+
+            /* Trim off the APR_ASCII_LF, APS_ASCII_CR and any extra trailing spaces 
+             * or tabs except for the first space or tab at the beginning of a blank 
+             * string.  This makes it much easier to check field values for exact 
matches.
+             */
+            if (pos > str && *(pos - 1) == APR_ASCII_CR) {
+                --pos;
+            }
+            while (pos > (str + 1)
+                   && (*(pos - 1) == APR_ASCII_BLANK || *(pos - 1) == APR_ASCII_TAB)) 
{
+                --pos;
+            }
+            /* Since we want to remove the LF from the line, we'll go ahead
+             * and set this last character to be the term NULL.
+             */
+            *pos = '\0';
+            *slen = len = pos - str;
+            if (len) {
+                *s = apr_palloc(p, len + 1);
+                memcpy(*s, str, len + 1);
+            }
+            if (!readbytes) {
+                apr_bucket_delete(e);
+            }
+            return APR_SUCCESS;
+        }
+
+        /* Oh jeesh, the brigade does not contain a LF. Fetch another brigade 
+         * and keep searching until we find one or hit an error. This should
+         * not be a common case.
+         */
+        if (!bbNew) {
+            bbNew = apr_brigade_create(p, f->c->bucket_alloc);
+        }
+        rv = ap_get_brigade(f, bbNew, AP_MODE_GETLINE, APR_BLOCK_READ, 0);
+        if (rv != APR_SUCCESS) {
+            return rv;
+        }
+        if (APR_BRIGADE_EMPTY(bbNew)) {
+            return APR_EGENERAL;
+        }
+        APR_BRIGADE_CONCAT(bb, bbNew);
+    }
+    return APR_SUCCESS;
+}
+
+static apr_status_t read_request_line(ap_filter_t *fnext, request_rec *r, 
apr_bucket_brigade *b,
+                                      ap_input_mode_t mode)
+{
+    apr_status_t rv;
+    const char *ll;
+    const char *uri;
+    const char *pro;
+    int major = 1, minor = 0;   /* Assume HTTP/1.0 if non-"HTTP" protocol */
+    char http[5];
+    apr_size_t len = 0;
+
+    r->the_request = NULL;
+    r->request_time = apr_time_now();
+    do {
+        /* Read past empty lines until we get a real request line,
+         * a read error, the connection closes (EOF), or we timeout.
+         * We skip empty lines because browsers have to tack a CRLF on to the end
+         * of POSTs to support old CERN webservers.  
+         */
+        rv = ap_brigade_getline(fnext, b, mode, DEFAULT_LIMIT_REQUEST_LINE+2,
+                                &r->the_request, &len, r->pool);
+        if (rv != APR_SUCCESS) {
+            return rv;
+        }
+    } while (len <= 0);
+
+    ll = r->the_request;
+    r->method = ap_getword_white(r->pool, &ll);
+    uri = ap_getword_white(r->pool, &ll);
+
+    /* Provide quick information about the request method as soon as known */
+    r->method_number = ap_method_number_of(r->method);
+    if (r->method_number == M_GET && r->method[0] == 'H') {
+        r->header_only = 1;
+    }
+
+    ap_parse_uri(r, uri);
+
+    /* ap_getline returns (size of max buffer - 1) if it fills up the
+     * buffer before finding the end-of-line.  This is only going to
+     * happen if it exceeds the configured limit for a request-line.
+     * The cast is safe, limit_req_line cannot be negative
+     */
+    if (len > (apr_size_t) r->server->limit_req_line) {
+        r->status    = HTTP_REQUEST_URI_TOO_LARGE;
+        r->proto_num = HTTP_VERSION(1,0);
+        r->protocol  = apr_pstrdup(r->pool, "HTTP/1.0");
+        return 0; /* XXX return an error code */
+    }
+
+    if (ll[0]) {
+        r->assbackwards = 0;
+        pro = ll;
+        len = strlen(ll);
+    } else {
+        r->assbackwards = 1;
+        pro = "HTTP/0.9";
+        len = 8;
+    }
+    r->protocol = apr_pstrmemdup(r->pool, pro, (apr_size_t) len);
+
+    /* Avoid sscanf in the common case */
+    if (len == 8
+        && pro[0] == 'H' && pro[1] == 'T' && pro[2] == 'T' && pro[3] == 'P'
+        && pro[4] == '/' && apr_isdigit(pro[5]) && pro[6] == '.'
+        && apr_isdigit(pro[7])) {
+        r->proto_num = HTTP_VERSION(pro[5] - '0', pro[7] - '0');
+    }
+    else if (3 == sscanf(r->protocol, "%4s/%u.%u", http, &major, &minor)
+             && (strcasecmp("http", http) == 0)
+             && (minor < HTTP_VERSION(1, 0)) ) /* don't allow HTTP/0.1000 */
+        r->proto_num = HTTP_VERSION(major, minor);
+    else
+        r->proto_num = HTTP_VERSION(1, 0);
+
+    return APR_SUCCESS;
+}
+static apr_status_t read_header_fields(ap_filter_t *fnext, request_rec *r, 
+                                       apr_bucket_brigade *b, 
+                                       ap_input_mode_t mode)
+{
+    apr_status_t rv;
+    char *last_field = NULL;
+    apr_size_t last_len = 0;
+    apr_size_t alloc_len = 0;
+    char *field;
+    char *value;
+    apr_size_t len;
+    int fields_read = 0;
+    apr_table_t *tmp_headers;
+
+    if (r->assbackwards) {
+        if (r->header_only) {
+#if 0
+            /*
+             * Client asked for headers only with HTTP/0.9, which doesn't send
+             * headers! Have to dink things just to make sure the error message
+             * comes through...
+             */
+            ap_log_rerror(APLOG_MARK, APLOG_ERR, 0, r,
+                          "client sent invalid HTTP/0.9 request: HEAD %s",
+                          r->uri);
+            r->header_only = 0;
+            r->status = HTTP_BAD_REQUEST;
+            ap_send_error_response(r, 0);
+            ap_run_log_transaction(r);
+#endif
+        }
+
+        /* No headers on an HTTP/0.9 request */
+        return APR_SUCCESS;
+    }
+
+    /* We'll use apr_table_overlap later to merge these into r->headers_in. */
+    tmp_headers = apr_table_make(r->pool, 50);
+
+    /*
+     * Read header lines until we get the empty separator line, a read error,
+     * the connection closes (EOF), reach the server limit, or we timeout.
+     */
+    while (1) {
+        int folded = 0;
+        rv = ap_brigade_getline(fnext, b, mode, DEFAULT_LIMIT_REQUEST_LINE+2,
+                                &field, &len, r->pool);
+
+        /* ap_rgetline returns APR_ENOSPC if it fills up the buffer before
+         * finding the end-of-line.  This is only going to happen if it
+         * exceeds the configured limit for a field size.
+         * The cast is safe, limit_req_fieldsize cannot be negative
+         */
+        if (rv == APR_ENOSPC
+            || (rv == APR_SUCCESS 
+                && len > (apr_size_t)r->server->limit_req_fieldsize)) {
+            r->status = HTTP_BAD_REQUEST;
+            apr_table_setn(r->notes, "error-notes",
+                           apr_pstrcat(r->pool,
+                                       "Size of a request header field "
+                                       "exceeds server limit.<br />\n"
+                                       "<pre>\n",
+                                       ap_escape_html(r->pool, field),
+                                       "</pre>\n", NULL));
+            return APR_ENOSPC;
+        }
+
+        if (rv != APR_SUCCESS) {
+            r->status = HTTP_BAD_REQUEST;
+            return rv;
+        }
+
+        if (last_field != NULL) {
+            if ((len > 0) && ((*field == '\t') || *field == ' ')) {
+                /* This line is a continuation of the preceding line(s),
+                 * so append it to the line that we've set aside.
+                 * Note: this uses a power-of-two allocator to avoid
+                 * doing O(n) allocs and using O(n^2) space for
+                 * continuations that span many many lines.
+                 */
+                if (last_len + len > alloc_len) {
+                    char *fold_buf;
+                    alloc_len += alloc_len;
+                    if (last_len + len > alloc_len) {
+                        alloc_len = last_len + len;
+                    }
+                    fold_buf = (char *)apr_palloc(r->pool, alloc_len);
+                    memcpy(fold_buf, last_field, last_len);
+                    last_field = fold_buf;
+                }
+                memcpy(last_field + last_len, field, len +1); /* +1 for nul */
+                last_len += len;
+                folded = 1;
+            }
+            else {
+                if (r->server->limit_req_fields
+                    && (++fields_read > r->server->limit_req_fields)) {
+                    r->status = HTTP_BAD_REQUEST;
+                    apr_table_setn(r->notes, "error-notes",
+                                   "The number of request header fields "
+                                   "exceeds this server's limit.");
+                    return APR_ENOSPC;  /* XXX: ??? */
+                }
+
+                if (!(value = strchr(last_field, ':'))) { /* Find ':' or    */
+                    r->status = HTTP_BAD_REQUEST;      /* abort bad request */
+                    apr_table_setn(r->notes, "error-notes",
+                                   apr_pstrcat(r->pool,
+                                               "Request header field is "
+                                               "missing ':' separator.<br />\n"
+                                               "<pre>\n",
+                                               ap_escape_html(r->pool,
+                                                              last_field),
+                                               "</pre>\n", NULL));
+                    return APR_ENOSPC; /* XXX ??? */
+                }
+
+                *value = '\0';
+                ++value;
+                while (*value == ' ' || *value == '\t') {
+                    ++value;            /* Skip to start of value   */
+                }
+
+                apr_table_addn(tmp_headers, last_field, value);
+
+                /* reset the alloc_len so that we'll allocate a new
+                 * buffer if we have to do any more folding: we can't
+                 * use the previous buffer because its contents are
+                 * now part of tmp_headers
+                 */
+                alloc_len = 0;
+
+            } /* end if current line is not a continuation starting with tab */
+        }
+
+        /* Found a blank line, stop. */
+        if (len == 0) {
+            break;
+        }
+
+        /* Keep track of this line so that we can parse it on
+         * the next loop iteration.  (In the folded case, last_field
+         * has been updated already.)
+         */
+        if (!folded) {
+            last_field = field;
+            last_len = len;
+        }
+    }
+
+    apr_table_overlap(r->headers_in, tmp_headers, APR_OVERLAP_TABLES_MERGE);
+    return APR_SUCCESS;
+}
+static apr_status_t check_header_fields(request_rec *r)
+{
+    const char *expect;
+    if ((!r->hostname && (r->proto_num >= HTTP_VERSION(1, 1)))
+        || ((r->proto_num == HTTP_VERSION(1, 1))
+            && !apr_table_get(r->headers_in, "Host"))) {
+        /*
+         * Client sent us an HTTP/1.1 or later request without telling us the
+         * hostname, either with a full URL or a Host: header. We therefore
+         * need to (as per the 1.1 spec) send an error.  As a special case,
+         * HTTP/1.1 mentions twice (S9, S14.23) that a request MUST contain
+         * a Host: header, and the server MUST respond with 400 if it doesn't.
+         */
+        r->status = HTTP_BAD_REQUEST;
+        ap_log_rerror(APLOG_MARK, APLOG_ERR, 0, r,
+                      "client sent HTTP/1.1 request without hostname "
+                      "(see RFC2616 section 14.23): %s", r->uri);
+    }
+
+    if (r->status != HTTP_OK) {
+#if 0
+        ap_send_error_response(r, 0);
+        ap_run_log_transaction(r);
+        return r;
+#endif
+    }
+
+    if (((expect = apr_table_get(r->headers_in, "Expect")) != NULL)
+        && (expect[0] != '\0')) {
+        /*
+         * The Expect header field was added to HTTP/1.1 after RFC 2068
+         * as a means to signal when a 100 response is desired and,
+         * unfortunately, to signal a poor man's mandatory extension that
+         * the server must understand or return 417 Expectation Failed.
+         */
+        if (strcasecmp(expect, "100-continue") == 0) {
+            r->expecting_100 = 1;
+        }
+        else {
+            r->status = HTTP_EXPECTATION_FAILED;
+#if 0
+            ap_log_rerror(APLOG_MARK, APLOG_INFO, 0, r,
+                          "client sent an unrecognized expectation value of "
+                          "Expect: %s", expect);
+            ap_send_error_response(r, 0);
+            ap_run_log_transaction(r);
+            return r;
+#endif
+        }
+    }
+    return APR_SUCCESS;
+}
 
 static long get_chunk_size(char *);
 
-typedef struct http_filter_ctx {
+typedef struct http_input_filter_ctx {
+    apr_bucket_brigade *b;
+    enum {
+        HIF_READ_HEADER_FIELDS,
+        HIF_READ_BODY_LENGTH,
+        HIF_READ_BODY_CHUNKED_SEND_100,
+        HIF_READ_BODY_CHUNKED,
+        HIF_BODY_NONE,
+        HIF_EOS_SENT
+    } state;
     apr_off_t remaining;
     apr_off_t limit;
     apr_off_t limit_used;
-    enum {
-        BODY_NONE,
-        BODY_LENGTH,
-        BODY_CHUNK
-    } state;
-    int eos_sent;
-} http_ctx_t;
-
-/* This is the HTTP_INPUT filter for HTTP requests and responses from 
- * proxied servers (mod_proxy).  It handles chunked and content-length 
- * bodies.  This can only be inserted/used after the headers
- * are successfully parsed. 
- */
+    const char *tenc;
+    const char *lenp;
+} http_input_filter_ctx_t;
+static apr_status_t reset_state(void *arg)
+{
+    http_input_filter_ctx_t *ctx = (http_input_filter_ctx_t *) arg;
+    ctx->state = HIF_READ_HEADER_FIELDS;
+    return APR_SUCCESS;
+}
 apr_status_t ap_http_filter(ap_filter_t *f, apr_bucket_brigade *b,
                             ap_input_mode_t mode, apr_read_type_e block,
                             apr_off_t readbytes)
 {
-    apr_bucket *e;
-    http_ctx_t *ctx = f->ctx;
     apr_status_t rv;
+    http_input_filter_ctx_t *ctx = f->ctx;
+    apr_bucket *e;
     apr_off_t totalread;
-
-    /* just get out of the way of things we don't want. */
+    request_rec *r;
+    r = f->r = f->c->r;
+    
     if (mode != AP_MODE_READBYTES && mode != AP_MODE_GETLINE) {
         return ap_get_brigade(f->next, b, mode, block, readbytes);
     }
 
     if (!ctx) {
-        const char *tenc, *lenp;
-        f->ctx = ctx = apr_palloc(f->r->pool, sizeof(*ctx));
-        ctx->state = BODY_NONE;
-        ctx->remaining = 0;
-        ctx->limit_used = 0;
-        ctx->eos_sent = 0;
+        f->ctx = ctx = apr_pcalloc(f->c->pool, sizeof(*ctx));
+    }
 
-        /* LimitRequestBody does not apply to proxied responses.
-         * Consider implementing this check in its own filter. 
-         * Would adding a directive to limit the size of proxied 
-         * responses be useful?
+    switch (ctx->state) {
+    case HIF_READ_HEADER_FIELDS: 
+    {
+        /* Read and parse the headers. 
+         * Begin by fetching and reading the request line 
          */
-        if (!f->r->proxyreq) {
-            ctx->limit = ap_get_limit_req_body(f->r);
+        rv = ap_get_brigade(f->next, b, mode, block, readbytes);
+        if (rv != APR_SUCCESS) {
+            return rv;
         }
-        else {
-            ctx->limit = 0;
+        if (ctx->b && !APR_BRIGADE_EMPTY(ctx->b)) {
+            APR_BRIGADE_PREPEND(b, ctx->b);
+        }
+        rv = read_request_line(f->next, r, b, mode);
+        if (rv != APR_SUCCESS) {
+            return rv;
+        }
+        rv = read_header_fields(f->next, r, b, mode);
+        if (rv != APR_SUCCESS) {
+#if 0
+            /* Handle errors at the top of the filter chain? */
+            if (r->status != HTTP_REQUEST_TIME_OUT) {
+                ap_log_rerror(APLOG_MARK, APLOG_ERR, 0, r,
+                              "request failed: error reading the headers");
+                ap_send_error_response(r, 0);
+                ap_run_log_transaction(r);
+            }
+#endif
+            return rv;
+        }
+        /* Set aside any unused bytes in the brigade 
+         * XXX: this check is not right
+         */
+        if (!APR_BRIGADE_EMPTY(b)) {
+            if (!ctx->b) {
+                ctx->b = apr_brigade_create(f->c->pool, f->c->bucket_alloc);
+            }
+            APR_BRIGADE_CONCAT(ctx->b, b);
         }
+        /* Register a cleanup to reset the ctx->state to HIF_READ_HEADER_FIELDS upon
+         * destruction of the request pool
+         */
+        apr_pool_cleanup_register(r->pool, ctx, reset_state, apr_pool_cleanup_null);
 
-        tenc = apr_table_get(f->r->headers_in, "Transfer-Encoding");
-        lenp = apr_table_get(f->r->headers_in, "Content-Length");
+        r->status = HTTP_OK;
+        ap_update_vhost_from_headers(r);
 
-        if (tenc) {
-            if (!strcasecmp(tenc, "chunked")) {
-                ctx->state = BODY_CHUNK;
-            }
+        /* we may have switched to another server */
+        r->per_dir_config = r->server->lookup_defaults;
+
+        rv = check_header_fields(r);
+        if (rv != APR_SUCCESS) {
+            return rv;
+        }
+
+        /* What we do next depends on whether the request includes a 
+         * body and if so, whether the body is transfer encoded or not
+         */
+        ctx->tenc = apr_table_get(f->r->headers_in, "Transfer-Encoding");
+        ctx->lenp = apr_table_get(f->r->headers_in, "Content-Length");
+
+        /* If either tenc or lenp, then the request has a body */
+        if (ctx->tenc) {
+            /* Humm... assume chunked... */
+            ctx->state = HIF_READ_BODY_CHUNKED_SEND_100;
         }
-        else if (lenp) {
-            int conversion_error = 0;
+        else if (ctx->lenp) {
             char *endstr;
+            ctx->state = HIF_READ_BODY_LENGTH;
 
-            ctx->state = BODY_LENGTH;
+            /* Perform one time checks on the content-length of the body */
             errno = 0;
-            ctx->remaining = strtol(lenp, &endstr, 10);        /* we depend on ANSI */
+            ctx->remaining = strtol(ctx->lenp, &endstr, 10);   /* we depend on ANSI */
 
             /* This protects us from over/underflow (the errno check),
              * non-digit chars in the string (excluding leading space)
@@ -815,68 +1266,78 @@
              * on the strtol implementation, the errno check may also
              * trigger on an all whitespace string */
             if (errno || (endstr && *endstr) || (ctx->remaining < 0)) {
-                 conversion_error = 1; 
-            }
-
-            if (conversion_error) {
+                /* Conversion error */
                 apr_bucket_brigade *bb;
-
                 ap_log_rerror(APLOG_MARK, APLOG_ERR, 0, f->r,
                               "Invalid Content-Length");
-
                 bb = apr_brigade_create(f->r->pool, f->c->bucket_alloc);
                 e = ap_bucket_error_create(HTTP_REQUEST_ENTITY_TOO_LARGE, NULL,
                                            f->r->pool, f->c->bucket_alloc);
                 APR_BRIGADE_INSERT_TAIL(bb, e);
                 e = apr_bucket_eos_create(f->c->bucket_alloc);
                 APR_BRIGADE_INSERT_TAIL(bb, e);
-                ctx->eos_sent = 1;
+                ctx->state = HIF_EOS_SENT;
                 return ap_pass_brigade(f->r->output_filters, bb);
             }
-            
             /* If we have a limit in effect and we know the C-L ahead of
              * time, stop it here if it is invalid. 
              */ 
             if (ctx->limit && ctx->limit < ctx->remaining) {
                 apr_bucket_brigade *bb;
                 ap_log_rerror(APLOG_MARK, APLOG_ERR, 0, f->r,
-                          "Requested content-length of %" APR_OFF_T_FMT 
-                          " is larger than the configured limit"
-                          " of %" APR_OFF_T_FMT, ctx->remaining, ctx->limit);
+                              "Requested content-length of %" APR_OFF_T_FMT 
+                              " is larger than the configured limit"
+                              " of %" APR_OFF_T_FMT, ctx->remaining, ctx->limit);
                 bb = apr_brigade_create(f->r->pool, f->c->bucket_alloc);
                 e = ap_bucket_error_create(HTTP_REQUEST_ENTITY_TOO_LARGE, NULL,
                                            f->r->pool, f->c->bucket_alloc);
                 APR_BRIGADE_INSERT_TAIL(bb, e);
                 e = apr_bucket_eos_create(f->c->bucket_alloc);
                 APR_BRIGADE_INSERT_TAIL(bb, e);
-                ctx->eos_sent = 1;
+                ctx->state = HIF_EOS_SENT;
                 return ap_pass_brigade(f->r->output_filters, bb);
             }
         }
+        else {
+            ctx->state = HIF_BODY_NONE;
+            /* XXX: proxy resp is broken... */
+            
+            /* If we don't have a request entity indicated by the headers, EOS.
+             * (BODY_NONE is a valid intermediate state due to trailers,
+             *  but it isn't a valid starting state.)
+             *
+             * RFC 2616 Section 4.4 note 5 states that connection-close
+             * is invalid for a request entity - request bodies must be
+             * denoted by C-L or T-E: chunked.
+             *
+             * Note that since the proxy uses this filter to handle the
+             * proxied *response*, proxy responses MUST be exempt.
+             */
+            if (f->r->proxyreq != PROXYREQ_RESPONSE) {
+                e = apr_bucket_eos_create(f->c->bucket_alloc);
+                APR_BRIGADE_INSERT_TAIL(b, e);
+                ctx->state = HIF_EOS_SENT;
+                return APR_SUCCESS;
+            }
+        }
+        /* We have a request body to handle */
 
-        /* If we don't have a request entity indicated by the headers, EOS.
-         * (BODY_NONE is a valid intermediate state due to trailers,
-         *  but it isn't a valid starting state.)
-         *
-         * RFC 2616 Section 4.4 note 5 states that connection-close
-         * is invalid for a request entity - request bodies must be
-         * denoted by C-L or T-E: chunked.
-         *
-         * Note that since the proxy uses this filter to handle the
-         * proxied *response*, proxy responses MUST be exempt.
+        /* LimitRequestBody does not apply to proxied responses.
+         * Consider implementing this check in its own filter. 
+         * Would adding a directive to limit the size of proxied 
+         * responses be useful?
          */
-        if (ctx->state == BODY_NONE && f->r->proxyreq != PROXYREQ_RESPONSE) {
-            e = apr_bucket_eos_create(f->c->bucket_alloc);
-            APR_BRIGADE_INSERT_TAIL(b, e);
-            ctx->eos_sent = 1;
-            return APR_SUCCESS;
+        if (!f->r->proxyreq) {
+            ctx->limit = ap_get_limit_req_body(f->r);
+        }
+        else {
+            ctx->limit = 0;
         }
-
         /* Since we're about to read data, send 100-Continue if needed.
-         * Only valid on chunked and C-L bodies where the C-L is > 0. */
-        if ((ctx->state == BODY_CHUNK || 
-            (ctx->state == BODY_LENGTH && ctx->remaining > 0)) &&
-            f->r->expecting_100 && f->r->proto_num >= HTTP_VERSION(1,1)) {
+         * Only valid on chunked and C-L bodies where the C-L is > 0. 
+         * XXX: Is it valid to assume C-L is > 0 here?
+         */
+        if (f->r->expecting_100 && f->r->proto_num >= HTTP_VERSION(1,1)) {
             char *tmp;
             apr_bucket_brigade *bb;
 
@@ -892,138 +1353,89 @@
             ap_pass_brigade(f->c->output_filters, bb);
         }
 
-        /* We can't read the chunk until after sending 100 if required. */
-        if (ctx->state == BODY_CHUNK) {
-            char line[30];
-            apr_bucket_brigade *bb;
-            apr_size_t len = 30;
-            apr_off_t brigade_length;
-
-            bb = apr_brigade_create(f->r->pool, f->c->bucket_alloc);
-
-            rv = ap_get_brigade(f->next, bb, AP_MODE_GETLINE,
-                                APR_BLOCK_READ, 0);
-
-            if (rv == APR_SUCCESS) {
-                /* We have to check the length of the brigade we got back.
-                 * We will not accept partial lines.
-                 */
-                rv = apr_brigade_length(bb, 1, &brigade_length);
-                if (rv == APR_SUCCESS
-                    && brigade_length > f->r->server->limit_req_line) {
-                    rv = APR_ENOSPC;
-                }
-                if (rv == APR_SUCCESS) {
-                    rv = apr_brigade_flatten(bb, line, &len);
-                    if (rv == APR_SUCCESS) {
-                        ctx->remaining = get_chunk_size(line);
-                    }
-                }
-            }
-            apr_brigade_cleanup(bb);
-
-            /* Detect chunksize error (such as overflow) */
-            if (rv != APR_SUCCESS || ctx->remaining < 0) {
-                ctx->remaining = 0; /* Reset it in case we have to
-                                     * come back here later */
-                e = ap_bucket_error_create(HTTP_REQUEST_ENTITY_TOO_LARGE, NULL,
-                                           f->r->pool,
-                                           f->c->bucket_alloc);
-                APR_BRIGADE_INSERT_TAIL(bb, e);
-                e = apr_bucket_eos_create(f->c->bucket_alloc);
-                APR_BRIGADE_INSERT_TAIL(bb, e);
-                ctx->eos_sent = 1;
-                return ap_pass_brigade(f->r->output_filters, bb);
-            }
-
-            if (!ctx->remaining) {
-                /* Handle trailers by calling ap_get_mime_headers again! */
-                ctx->state = BODY_NONE;
-                ap_get_mime_headers(f->r);
-                e = apr_bucket_eos_create(f->c->bucket_alloc);
-                APR_BRIGADE_INSERT_TAIL(b, e);
-                ctx->eos_sent = 1;
-                return APR_SUCCESS;
-            }
-        } 
-    }
+        return ap_http_filter(f, b, mode, block, readbytes);
 
-    if (ctx->eos_sent) {
-        e = apr_bucket_eos_create(f->c->bucket_alloc);
-        APR_BRIGADE_INSERT_TAIL(b, e);
-        return APR_SUCCESS;
+        break;
     }
-        
-    if (!ctx->remaining) {
-        switch (ctx->state) {
-        case BODY_NONE:
-            break;
-        case BODY_LENGTH:
+    case HIF_READ_BODY_LENGTH:  
+    {
+        if (!ctx->remaining) {
             e = apr_bucket_eos_create(f->c->bucket_alloc);
             APR_BRIGADE_INSERT_TAIL(b, e);
-            ctx->eos_sent = 1;
+            ctx->state = HIF_EOS_SENT;
             return APR_SUCCESS;
-        case BODY_CHUNK:
-            {
-                char line[30];
-                apr_bucket_brigade *bb;
-                apr_size_t len = 30;
-
-                bb = apr_brigade_create(f->r->pool, f->c->bucket_alloc);
-
-                /* We need to read the CRLF after the chunk.  */
-                rv = ap_get_brigade(f->next, bb, AP_MODE_GETLINE,
-                                    APR_BLOCK_READ, 0);
-                apr_brigade_cleanup(bb);
+        }
+        break;
+    }
+    case HIF_READ_BODY_CHUNKED_SEND_100:
+    {
+        /* We can't read the chunk until after sending 100 if required */
+        char *line;
+        apr_size_t len = 0;
 
-                if (rv == APR_SUCCESS) {
-                    /* Read the real chunk line. */
-                    rv = ap_get_brigade(f->next, bb, AP_MODE_GETLINE,
-                                        APR_BLOCK_READ, 0);
-                    if (rv == APR_SUCCESS) {
-                        rv = apr_brigade_flatten(bb, line, &len);
-                        if (rv == APR_SUCCESS) {
-                            ctx->remaining = get_chunk_size(line);
-                        }
-                    }
-                    apr_brigade_cleanup(bb);
-                }
+        if (!APR_BRIGADE_EMPTY(ctx->b)) {
+            APR_BRIGADE_PREPEND(b, ctx->b);
+        }
+        rv = ap_brigade_getline(f->next, b, APR_BLOCK_READ, 30, &line, &len,
+                                r->pool);
+        if (rv == APR_SUCCESS && len != 0) {
+            ctx->remaining = get_chunk_size(line);
+        }
 
-                /* Detect chunksize error (such as overflow) */
-                if (rv != APR_SUCCESS || ctx->remaining < 0) {
-                    ctx->remaining = 0; /* Reset it in case we have to
-                                         * come back here later */
-                    e = ap_bucket_error_create(HTTP_REQUEST_ENTITY_TOO_LARGE,
-                                               NULL, f->r->pool,
-                                               f->c->bucket_alloc);
-                    APR_BRIGADE_INSERT_TAIL(bb, e);
-                    e = apr_bucket_eos_create(f->c->bucket_alloc);
-                    APR_BRIGADE_INSERT_TAIL(bb, e);
-                    ctx->eos_sent = 1;
-                    return ap_pass_brigade(f->r->output_filters, bb);
-                }
+        /* Detect chunksize error (such as overflow) 
+         * XXX: should we use this brigade to send a response?
+         * It should be emptied at least...
+         */
+        if (rv != APR_SUCCESS || ctx->remaining < 0) {
+            ctx->remaining = 0; /* Reset it in case we have to
+                                 * come back here later */
+            e = ap_bucket_error_create(HTTP_REQUEST_ENTITY_TOO_LARGE, NULL,
+                                       f->r->pool,
+                                       f->c->bucket_alloc);
+            APR_BRIGADE_INSERT_TAIL(b, e);
+            e = apr_bucket_eos_create(f->c->bucket_alloc);
+            APR_BRIGADE_INSERT_TAIL(b, e);
+            ctx->state = HIF_EOS_SENT;
+            return ap_pass_brigade(f->r->output_filters, b);
+        }
 
-                if (!ctx->remaining) {
-                    /* Handle trailers by calling ap_get_mime_headers again! */
-                    ctx->state = BODY_NONE;
-                    ap_get_mime_headers(f->r);
-                    e = apr_bucket_eos_create(f->c->bucket_alloc);
-                    APR_BRIGADE_INSERT_TAIL(b, e);
-                    ctx->eos_sent = 1;
-                    return APR_SUCCESS;
-                }
-            }
-            break;
+        if (!ctx->remaining) {
+            /* Handle trailers by calling ap_get_mime_headers again! */
+            ctx->state = HIF_BODY_NONE;
+/*             ap_get_mime_headers(f->r); This is broken */
+            e = apr_bucket_eos_create(f->c->bucket_alloc);
+            APR_BRIGADE_INSERT_TAIL(b, e);
+            ctx->state = HIF_EOS_SENT;
+            return APR_SUCCESS;
         }
+        ctx->state = HIF_READ_BODY_CHUNKED;
+        /* The break; is intentionally left out */
     }
+    case HIF_READ_BODY_CHUNKED:
+    {
+        /* This is still broken.... */
+        if (!ctx->remaining) {
+            char line[30];
+            apr_bucket_brigade *bb;
+            apr_size_t len = 30;
+            apr_off_t brigade_length;
+        }
+        break; 
+    }
+    case HIF_EOS_SENT:
+    {
+        e = apr_bucket_eos_create(f->c->bucket_alloc);
+        APR_BRIGADE_INSERT_TAIL(b, e);
+        return APR_SUCCESS;
+        break;
+    }
+
 
     /* Ensure that the caller can not go over our boundary point. */
-    if (ctx->state == BODY_LENGTH || ctx->state == BODY_CHUNK) {
-        if (ctx->remaining < readbytes) {
-            readbytes = ctx->remaining;
-        }
-        AP_DEBUG_ASSERT(readbytes > 0);
+    if (ctx->remaining < readbytes) {
+        readbytes = ctx->remaining;
     }
+    AP_DEBUG_ASSERT(readbytes > 0);
 
     rv = ap_get_brigade(f->next, b, mode, block, readbytes);
 
@@ -1038,16 +1450,17 @@
      * it means our assumptions have changed. */
     AP_DEBUG_ASSERT(totalread >= 0);
 
-    if (ctx->state != BODY_NONE) {
+    if (ctx->state != HIF_BODY_NONE) {
         ctx->remaining -= totalread;
     }
 
     /* If we have no more bytes remaining on a C-L request, 
      * save the callter a roundtrip to discover EOS.
      */
-    if (ctx->state == BODY_LENGTH && ctx->remaining == 0) {
+    if (ctx->state == HIF_READ_BODY_LENGTH && ctx->remaining == 0) {
         e = apr_bucket_eos_create(f->c->bucket_alloc);
         APR_BRIGADE_INSERT_TAIL(b, e);
+        /* WGS: set ctx->eos_sent? */
     }
 
     /* We have a limit in effect. */
@@ -1069,12 +1482,13 @@
             APR_BRIGADE_INSERT_TAIL(bb, e);
             e = apr_bucket_eos_create(f->c->bucket_alloc);
             APR_BRIGADE_INSERT_TAIL(bb, e);
-            ctx->eos_sent = 1;
+            ctx->state = HIF_EOS_SENT;
             return ap_pass_brigade(f->r->output_filters, bb);
         }
     }
 
     return APR_SUCCESS;
+    }
 }
 
 /* The index is found by its offset from the x00 code of each level.
@@ -1542,6 +1956,7 @@
         ap_add_output_filters_by_type(r);
     }
 }
+
 
 typedef struct header_filter_ctx {
     int headers_sent;
Index: modules/http/mod_core.h
===================================================================
RCS file: /home/cvs/httpd-2.0/modules/http/mod_core.h,v
retrieving revision 1.22
diff -u -r1.22 mod_core.h
--- modules/http/mod_core.h     3 Feb 2003 17:53:04 -0000       1.22
+++ modules/http/mod_core.h     24 Feb 2003 17:03:21 -0000
@@ -75,7 +75,7 @@
 extern AP_DECLARE_DATA ap_filter_rec_t *ap_http_header_filter_handle;
 extern AP_DECLARE_DATA ap_filter_rec_t *ap_chunk_filter_handle;
 extern AP_DECLARE_DATA ap_filter_rec_t *ap_byterange_filter_handle;
-
+extern AP_DECLARE_DATA ap_filter_rec_t *ap_http_header_input_filter_handle;
 /*
  * These (input) filters are internal to the mod_core operation.
  */
Index: server/protocol.c
===================================================================
RCS file: /home/cvs/httpd-2.0/server/protocol.c,v
retrieving revision 1.127
diff -u -r1.127 protocol.c
--- server/protocol.c   3 Feb 2003 17:53:19 -0000       1.127
+++ server/protocol.c   24 Feb 2003 17:03:21 -0000
@@ -245,7 +245,7 @@
                                           apr_size_t *read, request_rec *r,
                                           int fold, apr_bucket_brigade *bb)
 {
-    apr_status_t rv;
+    apr_status_t rv = APR_SUCCESS;
     apr_bucket *e;
     apr_size_t bytes_handled = 0, current_alloc = 0;
     char *pos, *last_char = *s;
@@ -742,7 +742,7 @@
 
     return 1;
 }
-
+#if 0
 AP_DECLARE(void) ap_get_mime_headers_core(request_rec *r, apr_bucket_brigade *bb)
 {
     char *last_field = NULL;
@@ -880,13 +880,14 @@
     apr_brigade_destroy(tmp_bb);
 }
 
+#endif
 request_rec *ap_read_request(conn_rec *conn)
 {
+    apr_status_t rv;
     request_rec *r;
     apr_pool_t *p;
-    const char *expect;
     int access_status;
-    apr_bucket_brigade *tmp_bb;
+    apr_bucket_brigade *bb;
 
     apr_pool_create(&p, conn->pool);
     r = apr_pcalloc(p, sizeof(request_rec));
@@ -923,117 +924,16 @@
     r->status          = HTTP_REQUEST_TIME_OUT;  /* Until we get a request */
     r->the_request     = NULL;
 
-    tmp_bb = apr_brigade_create(r->pool, r->connection->bucket_alloc);
-
-    /* Get the request... */
-    if (!read_request_line(r, tmp_bb)) {
-        if (r->status == HTTP_REQUEST_URI_TOO_LARGE) {
-            ap_log_rerror(APLOG_MARK, APLOG_ERR, 0, r,
-                          "request failed: URI too long");
-            ap_send_error_response(r, 0);
-            ap_run_log_transaction(r);
-            apr_brigade_destroy(tmp_bb);
-            return r;
-        }
-
-        apr_brigade_destroy(tmp_bb);
-        return NULL;
-    }
-
-    if (!r->assbackwards) {
-        ap_get_mime_headers_core(r, tmp_bb);
-        if (r->status != HTTP_REQUEST_TIME_OUT) {
-            ap_log_rerror(APLOG_MARK, APLOG_ERR, 0, r,
-                          "request failed: error reading the headers");
-            ap_send_error_response(r, 0);
-            ap_run_log_transaction(r);
-            apr_brigade_destroy(tmp_bb);
-            return r;
-        }
-    }
-    else {
-        if (r->header_only) {
-            /*
-             * Client asked for headers only with HTTP/0.9, which doesn't send
-             * headers! Have to dink things just to make sure the error message
-             * comes through...
-             */
-            ap_log_rerror(APLOG_MARK, APLOG_ERR, 0, r,
-                          "client sent invalid HTTP/0.9 request: HEAD %s",
-                          r->uri);
-            r->header_only = 0;
-            r->status = HTTP_BAD_REQUEST;
-            ap_send_error_response(r, 0);
-            ap_run_log_transaction(r);
-            apr_brigade_destroy(tmp_bb);
-            return r;
-        }
-    }
-
-    apr_brigade_destroy(tmp_bb);
-
-    r->status = HTTP_OK;                         /* Until further notice. */
-
-    /* update what we think the virtual host is based on the headers we've
-     * now read. may update status.
-     */
-    ap_update_vhost_from_headers(r);
-
-    /* we may have switched to another server */
-    r->per_dir_config = r->server->lookup_defaults;
-
-    if ((!r->hostname && (r->proto_num >= HTTP_VERSION(1, 1)))
-        || ((r->proto_num == HTTP_VERSION(1, 1))
-            && !apr_table_get(r->headers_in, "Host"))) {
-        /*
-         * Client sent us an HTTP/1.1 or later request without telling us the
-         * hostname, either with a full URL or a Host: header. We therefore
-         * need to (as per the 1.1 spec) send an error.  As a special case,
-         * HTTP/1.1 mentions twice (S9, S14.23) that a request MUST contain
-         * a Host: header, and the server MUST respond with 400 if it doesn't.
-         */
-        r->status = HTTP_BAD_REQUEST;
-        ap_log_rerror(APLOG_MARK, APLOG_ERR, 0, r,
-                      "client sent HTTP/1.1 request without hostname "
-                      "(see RFC2616 section 14.23): %s", r->uri);
-    }
-
-    if (r->status != HTTP_OK) {
-        ap_send_error_response(r, 0);
-        ap_run_log_transaction(r);
-        return r;
-    }
-
+    r->connection->r = r;
+    bb = apr_brigade_create(r->pool, r->connection->bucket_alloc);
+    rv = ap_get_brigade(r->input_filters, bb, AP_MODE_READBYTES,
+                        APR_BLOCK_READ, HUGE_STRING_LEN);
+    apr_brigade_destroy(bb);
     if ((access_status = ap_run_post_read_request(r))) {
         ap_die(access_status, r);
         ap_run_log_transaction(r);
         return NULL;
     }
-
-    if (((expect = apr_table_get(r->headers_in, "Expect")) != NULL)
-        && (expect[0] != '\0')) {
-        /*
-         * The Expect header field was added to HTTP/1.1 after RFC 2068
-         * as a means to signal when a 100 response is desired and,
-         * unfortunately, to signal a poor man's mandatory extension that
-         * the server must understand or return 417 Expectation Failed.
-         */
-        if (strcasecmp(expect, "100-continue") == 0) {
-            r->expecting_100 = 1;
-        }
-        else {
-            r->status = HTTP_EXPECTATION_FAILED;
-            ap_log_rerror(APLOG_MARK, APLOG_INFO, 0, r,
-                          "client sent an unrecognized expectation value of "
-                          "Expect: %s", expect);
-            ap_send_error_response(r, 0);
-            ap_run_log_transaction(r);
-            return r;
-        }
-    }
-
-    ap_add_input_filter_handle(ap_http_input_filter_handle,
-                               NULL, r, r->connection);
 
     return r;
 }

Reply via email to