On Dec 10, 2004, at 4:19 AM, Joe Orton wrote:
> My understanding was that the forced default charset *does* prevent
> browsers (or maybe, MSIE) from guessing the charset as UTF-7; UTF-7
> being the special case as it's already an "escaped" encoding and hence
> defies normal escaping-of-client-provided-data tricks.  Is that not
> correct?

Yes and no -- it is both the source of the problem and the biggest reason that we should NOT set charset as a default.

Consider the following two identical content resources, the first
being sent as

     Content-Type: text/html; charset=ISO-8859-15

  http://www-uxsup.csx.cam.ac.uk/~jw35/docs/cross-site-demo.html

and the second being sent with only

     Content-Type: text/html

  http://www.ics.uci.edu/~fielding/xss-demo.html

I've tested the above with all of my browsers. Safari and MSIE-Mac do not
support utf-7 at all. Firefox (Mac and Win) supports utf-7 but only when
manually set (it does not auto-detect utf-7, even when read from a local file).


MSIE (Windows), of course, does the least intelligent thing -- it does
not allow users to select utf-7 manually, but does auto-detect and interpret
utf-7 if it is read from a local file, or if "auto-detect" is enabled,
regardless of the content-type charset parameter -- setting charset has
no effect on MSIE's auto-detect results. In other words, MSIE is at risk
for XSS via utf-7 only if auto-detect is enabled.
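
To make the utf-7 case concrete, here is a minimal Python sketch of why
ordinary escaping does not help. The payload string is a hypothetical
example chosen for illustration, not taken from either demo page above,
and the final decode step only simulates what a client would see if it
decided to treat the response as utf-7.

```python
import html

# Hypothetical attacker-supplied input: "<script>alert(1)</script>"
# already written in utf-7, so it contains no '<', '>' or '&'.
payload = "+ADw-script+AD4-alert(1)+ADw-/script+AD4-"

# Ordinary HTML escaping finds nothing to escape and returns it unchanged.
escaped = html.escape(payload)

# Simulation of a client that interprets the response bytes as utf-7.
as_utf7 = escaped.encode("ascii").decode("utf-7")

print(escaped == payload)
print(as_utf7)
```

The escaped output is byte-for-byte the same as the input, and the markup
only appears once something chooses to decode it as utf-7.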


The problem we have created is that AddDefaultCharset causes entire
sites to default to one charset, usually iso-8859-1.  And because it
is set by default (no brains spent thinking about the right value),
it is often set that way even when installed in non-Latin countries
[and there is also a problem in Europe, since iso-8859-15 is where
the euro symbol was added].  As a result, normal users get a higher
frequency of wrong charset declarations in HTTP, for which the only
"standards-compliant" solution short of manually adjusting every
page received is to turn on auto-detect!  In other words, our default
is now causing more users to be vulnerable to utf-7 XSS attacks than
they would otherwise be if we never sent a default charset.
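
As a rough illustration of what a wrong site-wide default does to a page,
the following Python sketch decodes the same bytes under the declared and
the intended charset. The sample text is made up for the example; only
the two charset names come from the discussion above.

```python
# A page authored in iso-8859-15, containing the euro sign (U+20AC).
body = "Price: 10 \u20ac".encode("iso-8859-15")

# What a client shows if it trusts a default of charset=iso-8859-1.
as_declared = body.decode("iso-8859-1")

# What the author intended.
as_intended = body.decode("iso-8859-15")

print(as_declared)
print(as_intended)
```

Under the wrong label the euro sign comes out as the generic currency
sign (U+00A4), which is the kind of breakage that pushes users toward
turning on auto-detect.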

In any case, the only tutorials on cross-site scripting that still
emphasize setting charset are our own (written by Marc) and CERT's
(based on input from Marc).  Those were intended to be temporary
workarounds until folks had a chance to fix the real problems, which
were non-validating scripts that echo untrusted content to users.
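
For contrast, a minimal sketch of a script that does not echo untrusted
content verbatim might look like the following. The function name
render_greeting and the sample input are hypothetical, and this shows
only output escaping, not a complete input-validation scheme.

```python
import html

def render_greeting(name: str) -> str:
    """Build an HTML fragment, escaping the untrusted value first."""
    return "<p>Hello, " + html.escape(name) + "</p>"

# Markup characters in the input are neutralized before being echoed.
print(render_greeting("<script>alert(1)</script>"))
```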

After doing another afternoon of research on this one, I am now convinced
that AddDefaultCharset does far more harm than good.


....Roy


