Greetings all.
A couple of days ago I was trying to proxy Zope through Apache (1.3.26) on OpenBSD. Zope was listening on port 8080, and I had a rewrite rule in my http.conf file to proxy requests through to the Zope web server. Unfortunately, when I tried to test this setup, I ended up getting multiple HTTP headers in the response, which ended up appearing on the client browser.
What I discovered was that when mod_proxy was processing the headers, if it got a line that didn't have a colon in it, (other than the first line), it would return an NULL header list back to the calling function indicating that the headers were invalid. Unfortunately, the Zope server was sending back the string:
Server: Zope/(Zope 2.5.1b1 (OpenBSD package zope-2.5.1b1) , python 2.1.2, openbsd3) ZServer/1.1b1
split on two lines, which meant mod_proxy wouldn't process the headers properly. Of course, this is a problem with the zope compilation rather than mod_proxy, but I was wondering just how strict the proxying code should be in parsing the headers from the remote server? It seems to me that if the header parsing stuff in proxy_util.c was slightly more forgiving of irregularities then life would be just that little bit better :)
Here is a patch to proxy_util.c that will allow certain header fields to have (up to) a certain number additional continuation lines... at the moment I've only defined the "Server" field name and specified that it can have up to 2 additional invalid lines after it, but more header fields can be added to fields_allowing_additional_lines array.
Cheers, Brett
diff -Naur old/proxy_util.c new/proxy_util.c --- old/proxy_util.c Fri Aug 16 10:16:56 2002 +++ new/proxy_util.c Fri Aug 16 10:17:05 2002 @@ -377,12 +377,23 @@ * headers, which contain commas within the date field) do not get stuffed * up. */ +static struct field_allowing_additional_lines { + char *field_name; + unsigned short max_num_additional_lines; +} fields_allowing_additional_lines[] = { + { "Server", 2 } /* allow the server field a max of 2 additional lines */ +}; + table *ap_proxy_read_headers(request_rec *r, char *buffer, int size, BUFF *f) { table *resp_hdrs; int len; char *value, *end; char field[MAX_STRING_LEN]; + char field_name[MAX_STRING_LEN]; /* to store the last field name read */ + int num_additional_lines = 0; /* to count the number of crap header lines */ + + field_name[0] = '\0'; resp_hdrs = ap_make_table(r->pool, 20); @@ -391,8 +402,33 @@ * the connection closes (EOF), or we timeout. */ while ((len = ap_getline(buffer, size, f, 1)) > 0) { - if (!(value = strchr(buffer, ':'))) { /* Find the colon separator */ + /* check to see if the last header field read is + one we can have continuation lines for... + */ + if (field_name[0]) /* check the 1st byte for null to be quick */ + { + int i = 0; + int i_max = sizeof(fields_allowing_additional_lines) / sizeof(struct field_allowing_additional_lines); + int ok_to_continue = 0; + + while (i < i_max) + { + if (!strcmp(field_name, fields_allowing_additional_lines[i].field_name)) + { + if (num_additional_lines < fields_allowing_additional_lines[i].max_num_additional_lines) + ok_to_continue = 1; + break; + } + i++; + } + + if (ok_to_continue) + { + num_additional_lines++; + continue; + } + } /* * Buggy MS IIS servers sometimes return invalid headers (an @@ -415,6 +451,12 @@ *value = '\0'; ++value; + + /* woo hoo! We got a new header! Do some stuff! */ + /* copy the header field name into a buffer so we can check it later */ + strcpy(field_name, buffer); + num_additional_lines = 0; /* reset the number of additional lines */ + /* * XXX: RFC2068 defines only SP and HT as whitespace, this test is * wrong... and so are many others probably.