Re: Backslashes in HTTP Headers

2007-10-02 Joe Orton
On Mon, Oct 01, 2007 at 02:01:24AM +0100, Nick Kew wrote:
> Coadvisor has several testcases involving a Content-Type line with 
> a lot of qualifier tokens.  These tokens are quoted strings and
> include backslashes.  This is going to wrap when I cut and paste:
> 
> Content-Type: text/other; charset=ISO-8859-4; attribute=value; q=0.9;
> q=9.  ; a=quoted text/html; a=quoted, list=b; a=quoted \r\n
> new line; a=quoted \r\n\t\r\n new lines; a=slashed \alpha;
> a=slashed \\nnew line; a=slashed \\r\\ncrlf; a=slashed \\n\\nnew
> lines; a=slashed \string; a-rvlmxgisq=v-r808478;
> a-rtbtrjxmwqirv=v-r797440; a-rwsqj=v-r9946045539;
> a-rkdrdmk=v-r93968576355\r\n
> 
> Our ap_rgetline_core is seeing those quoted \-r-\-n sequences as
> newlines and getting hopelessly confused (the outcome is 400
> in the case of a request header, 502 from a response).

For exactly what request does the server give a 400, and what error is 
given (attach or reference to avoid line-wrapping if necessary)?  I 
doubt there is any code in the server which will backslash-expand header 
values as you describe; certainly ap_rgetline_core doesn't do it.
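
To make the distinction concrete, here is a minimal sketch of a line splitter that only looks for the LF octet. It is a simplified stand-in and not the real ap_rgetline_core; the name first_line_len and the X-Test header are made up for illustration. The point is only that a backslash followed by the letter 'n' is two ordinary octets, so a reader that works this way cannot mistake it for a line ending.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a line reader; NOT the real ap_rgetline_core.
 * Returns the length of the first line in buf, up to and including the
 * first LF octet (0x0A), or len if buf contains no LF at all. */
size_t first_line_len(const char *buf, size_t len)
{
    const char *lf = memchr(buf, '\n', len);

    return lf ? (size_t)(lf - buf) + 1 : len;
}
```

Given the wire bytes of a header whose quoted value contains a literal backslash-r-backslash-n, followed by a real CRLF, the function stops at the real CRLF and not inside the quoted value.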

joe


Backslashes in HTTP Headers

2007-09-30 Nick Kew
Coadvisor has several testcases involving a Content-Type line with 
a lot of qualifier tokens.  These tokens are quoted strings and
include backslashes.  This is going to wrap when I cut and paste:

Content-Type: text/other; charset=ISO-8859-4; attribute=value; q=0.9;
q=9.  ; a=quoted text/html; a=quoted, list=b; a=quoted \r\n
new line; a=quoted \r\n\t\r\n new lines; a=slashed \alpha;
a=slashed \\nnew line; a=slashed \\r\\ncrlf; a=slashed \\n\\nnew
lines; a=slashed \string; a-rvlmxgisq=v-r808478;
a-rtbtrjxmwqirv=v-r797440; a-rwsqj=v-r9946045539;
a-rkdrdmk=v-r93968576355\r\n

Our ap_rgetline_core is seeing those quoted \-r-\-n sequences as
newlines and getting hopelessly confused (the outcome is 400
in the case of a request header, 502 from a response).

A simple search of RFC2616 gives:

   message-header = field-name ":" [ field-value ]
   field-name     = token
   field-value    = *( field-content | LWS )
   field-content  = <the OCTETs making up the field-value
                    and consisting of either *TEXT or combinations
                    of token, separators, and quoted-string>

   quoted-string  = ( <"> *(qdtext | quoted-pair ) <"> )
   quoted-pair    = "\" CHAR

   CHAR           = <any US-ASCII character (octets 0 - 127)>

from which it appears that the header in the testcase is legitimate
and our parser is screwed.
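
For reference, a rough sketch of how a consumer might read a quoted-string under that grammar. This is illustrative only: unquote is a hypothetical helper, not an httpd or APR function, and it assumes a NUL-terminated input, so the NUL octet that CHAR formally allows cannot be represented. Each quoted-pair simply yields the CHAR after the backslash; nothing here ever turns backslash-n into a newline.

```c
#include <stddef.h>

/* Hypothetical helper, not an httpd API.  Copies the contents of an
 * RFC 2616 quoted-string that starts at in[0] == '"' into out,
 * replacing each quoted-pair ("\" CHAR) with the CHAR itself.
 * Returns the number of input octets consumed, including both quotes,
 * or 0 if the input is malformed or out is too small. */
size_t unquote(const char *in, char *out, size_t outlen)
{
    size_t i = 1, o = 0;

    if (outlen == 0 || in[0] != '"')
        return 0;
    while (in[i] != '\0' && in[i] != '"') {
        if (in[i] == '\\') {
            i++;                            /* skip the backslash */
            if (in[i] == '\0')              /* dangling backslash */
                return 0;
            if ((unsigned char)in[i] > 127) /* CHAR is US-ASCII only */
                return 0;
        }
        if (o + 1 >= outlen)                /* keep room for the NUL */
            return 0;
        out[o++] = in[i++];
    }
    if (in[i] != '"')                       /* no closing quote */
        return 0;
    out[o] = '\0';
    return i + 1;
}
```

Fed a quoted-string containing a backslash followed by 'r' and a backslash followed by 'n', the output contains the plain letters 'r' and 'n' and no CR or LF octets.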

Anyone?

-- 
Nick Kew

Application Development with Apache - the Apache Modules Book
http://www.apachetutor.org/