On Tue, Apr 26, 2005 at 10:31:42AM -0400, Stas Bekman wrote:
>>> Since when unescaped & in the QUERY_STRING part of the URL are not allowed?
>> I dunno the specifics, but if you try using the w3c validator you end up
>> with something like this
>>
>> reference not terminated by REFC delimiter
>> <a href="http://example.com/foo.pl?foo=bar&reg=foobar">is this valid?</a>
>>
>> If you meant to include an entity that starts with "&", then you should
>> terminate it with ";". Another reason for this error message is that you
>> inadvertently created an entity by failing to escape an "&" character just
>> before this text.
> OK, in which case it must be some relatively recent change, since an
> unescaped & in the QUERY_STRING was a valid separator. A pointer to the
> relevant RFC would be nice so we can add that to the URL that started this
> thread.

Actually, I think it's been like this since the beginning, but it's one of
those things that browsers are very forgiving about, so most people never
bump into problems with it.

Yes, & is a valid separator in a query string, but when you include that
query string in an HTML document, it has to follow the same rules as the
rest of the document, which means that & must be escaped as &amp;.

The HTML spec says that URI must be of the type CDATA:

http://www.w3.org/TR/1998/REC-html40-19980424/sgml/dtd.html#URI

and that CDATA content must be treated like this:

http://www.w3.org/TR/1998/REC-html40-19980424/types.html#type-cdata

 - Replace character entities with characters,
 - Ignore line feeds,
 - Replace each carriage return or tab with a single space.

This doesn't say that you _must_ encode certain characters, like &, as
entities, though, only that browsers must decode the entities that you've
encoded. Likewise, the documentation on entities only recommends escaping
&, including in CDATA attribute values:

http://www.w3.org/TR/1998/REC-html40-19980424/charset.html#h-5.3.2

  Authors should use "&amp;" (ASCII decimal 38) instead of "&" to avoid
  confusion with the beginning of a character reference (entity reference
  open delimiter). Authors should also use "&amp;" in attribute values
  since character references are allowed within CDATA attribute values.

However, the XHTML spec clearly states that & must be written as an
entity within attribute values:

http://www.w3.org/TR/xhtml1/#C_12

-- 
Trond Michelsen
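
P.S. To make the two layers concrete, here is a minimal sketch of the rule above in Python, using only the standard library. The URL and parameter names are simply the ones from the validator example, and the variable names are my own; treat it as an illustration, not as how any particular framework does it.

```python
from html import escape, unescape
from urllib.parse import urlencode

# Layer 1: the query string as it travels over HTTP.
# Here a bare "&" is the separator between name=value pairs.
query = urlencode({"foo": "bar", "reg": "foobar"})
url = "http://example.com/foo.pl?" + query

# Layer 2: the same URL embedded in an HTML attribute.
# The attribute value is CDATA, so "&" is written as "&amp;".
href = escape(url, quote=True)
link = '<a href="{}">is this valid?</a>'.format(href)
print(link)

# Decoding the attribute value gives back the original URL,
# which is what the browser ends up requesting.
assert unescape(href) == url
```

The point is that the escaping belongs to the HTML layer only: the server still sees a plain & between the parameters.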