protocol.c

Brian Pane Tue, 03 Jan 2006 01:21:08 -0800


On Jan 2, 2006, at 3:41 PM, Justin Erenkrantz wrote:

+static apr_status_t process_request_line(request_rec *r, char *line,
+                                         int skip_first)
+{
+    if (!skip_first && (r->the_request == NULL)) {
+        /* This is the first line of the request */
+        if ((line == NULL) || (*line == '\0')) {

+ /* We skip empty lines because browsers have to tacka CRLFon to the end + * of POSTs to support old CERNwebservers.

+             */
+            int max_blank_lines = r->server->limit_req_fields;
+            if (max_blank_lines <= 0) {
+                max_blank_lines = DEFAULT_LIMIT_REQUEST_FIELDS;
+            }
+            r->num_blank_lines++;
+            if (r->num_blank_lines < max_blank_lines) {
+                return APR_SUCCESS;
+            }
+        }
+        set_the_request(r, line);
+    }

This will cause a segfault if line is null and we are at or abovemax_blank_lines. Perhaps you meant for an else clause here?


Yes, thanks for catching that.

+    else {
+ /* We've already read the first line of the request.This iseither + * a header field or the blank line terminatingthe header
+         */
+        if ((line == NULL) || (*line == '\0')) {
+            if (r->pending_header_line != NULL) {
+                apr_status_t rv = set_mime_header(r,
r->pending_header_line); +                if (rv != APR_SUCCESS) {
+                    return rv;
+                }
+                r->pending_header_line = NULL;
+            }
+            if (r->status == HTTP_REQUEST_TIME_OUT) {
+                apr_table_compress(r->headers_in,
APR_OVERLAP_TABLES_MERGE);
+                r->status = HTTP_OK;
Say what?  ...looks at rest of file...
Is this because r->status is unset and we're saying that's it's'okay' to proceed with the request. If so, this *really* needs acomment to that effect. It makes no sense whatsoever otherwise.(We should probably remove the hack to set it toHTTP_REQUEST_TIME_OUT initially as part of these changes.)


Yeah, setting r->status to HTTP_OK is done here solely to make it work
with the existing logic about HTTP_REQUEST_TIME_OUT meaning "still
reading the request header."

+1 for of removing the HTTP_REQUEST_TIME_OUT hack.  I was trying

to be conservative by preserving that part of the original logic, butnowthat you mention it, we might as well replace it with something lesssubtle

in 2.3.  This will break any 3rd party modules that depend upon the
HTTP_REQUEST_TIME_OUT convention, but for a major release
like 2.4 or 3.0 that's a defensible choice.

+            }
+        }
+        else {
+            if ((*line == ' ') || (*line == '\t')) {

+ /* This line is a continuation of the previousone */

+                if (r->pending_header_line == NULL) {
+                    r->pending_header_line = line;
+                    r->pending_header_size = 0;
+                }
+                else {
+                    apr_size_t pending_len =
strlen(r->pending_header_line);
+                    apr_size_t fold_len = strlen(line);

This seems really expensive. You shouldn't need to recountpending_len each time.


Good point; I'll add something to keep track of the length.

+            }
+            break;
If I understand your direction, it'd bail out here if it ever gotEAGAIN?


Yes indeed.  That's what the version on the async-read-dev branch does.

+request_rec *ap_read_request(conn_rec *conn)
+{
+    request_rec *r;
+    apr_status_t rv;
+
+    r = init_request(conn);
+
+    rv = read_partial_request(r);
+    /* TODO poll if EAGAIN */
+    if (r->status != HTTP_OK) {
+        ap_send_error_response(r, 0);
+        ap_update_child_status(conn->sbh, SERVER_BUSY_LOG, r);
+        ap_run_log_transaction(r);
+        return NULL;
+    }
Obviously, this can't be the case if you want to do real polling.This would be the wrong place to poll. You have to exit out ofap_read_request and go back up to an 'event thread' that waitsuntil there's data to read on any incoming sockets. Then, you'dhave to call ap_read_request again to 'restart' the parser wheneverthere is more data to read.
In my opinion, this code isn't even close to being async.

I'm relieved to hear that it's not async, since you're looking at theblockingversion. :-) See ap_read_async_request() (still on the async-read-dev branch).

So, I wonder why it was even merged to trunk right now. You'dhave to deal with partial lines and the state the header parser isin when it gets the next chunk of data - which this code doesn'teven try to do. The current code is just going to bail when itdoesn't have a 'complete' line.
My hunch is that you plan to build up pending_header_line and delayparsing until you have the line terminators; but that's not goingto work really well with non-blocking sockets as you have to storethe data you just read non-blocking'ly if you actually readmultiple MIME header lines or, icky, even part of the request bodyor the next request. This just lends more credence to the argumentthat request_rec is the wrong place for the buffer. It's clearlyconnection or connection-filter oriented...

The async impl addesses this problem by only grabbing a line at atime fromthe input filter chain. The filters are responsible for bufferinganything that's beenread from the socket but not consumed by the current request. If youhave achance to review the async code (the version on the branch), I'meager to get

some feedback on that part of the design.

(IIRC, another point that'll kill httpd is that httpd mis-usesEAGAIN; serf has a definition that EAGAIN means some data is read*and* you need to ask for more. httpd's filters don't have that.One more issue that httpd got wrong...)
To me, it sounds like we need a way for the MPM (or whatever) toredefine the accept logic instead of relying upon the core. Oh,yah, we do that already (heh) by allowing them to call somethingother than ap_process_connection: such as withap_process_http_async_connection(). So, why are we even touchingap_read_request() at all? This is going to necessitate anap_async_read_request() anyway...

Agreed on the need for async_read_request(); I purposely created thatas a separate function,following the pattern set by ap_process_http_async_connection(), sothat things depending onthe semantics of the original ap_read_request() wouldn't break. Thebulk of the refactoringthat I've recently checked into the trunk is designed to allow thesync and async versions ofap_read_request() to share code. Given that the mime-folding codehas been a source ofsecurity vulnerabilities in the past, I'm trying to stick with asingle impl of the header parsing

code that the sync and async code can both use.

Brian

Re: svn commit: r360461 - in /httpd/httpd/trunk: CHANGES include/ap_mmn.h include/httpd.h server/protocol.c

Reply via email to