* Joe Cheng <[EMAIL PROTECTED]> [2007-06-19 21:35]:
> * James M Snell <[EMAIL PROTECTED]> [2007-06-19 21:30]:
> > * Joe Cheng <[EMAIL PROTECTED]> [2007-06-19 21:05]:
> > > How should implementers deal with entities in XHTML
> > > payloads? I'm sure this is a generic XML question that many
> > > on this list have dealt with before and was wondering if
> > > there is some consensus as to the best practice.
> > >
> > > As I understand it, you can't just slap in a &copy; into an
> > > XML document and expect it to be interpreted as the
> > > copyright symbol. You need to declare the entity
> > > explicitly, or bring in a DTD, or use the &#nnn; form. Is
> > > that right?
> >
> > Folks should be using the numeric character references rather
> > than the entities.
>
> Thanks, I'm glad I asked--that would not have been my guess.
>
> Is that a fairly uncontroversial stance these days? I have been
> surprised at how much some users are concerned about the
> aesthetics of their (X)HTML (although I admit I have not heard
> anything specifically about the use of named vs. numeric
> entities).

If you’re including unescaped well-formed XHTML with
type="xhtml", then you *can’t* include named entities (beyond
XML’s five predefined ones), because there’s no DTD on the
document to declare them, so using them would just make the
document ill-formed.

If the document is in some legacy encoding like US-ASCII or
Latin-1, then you don’t *have* any other choice than using
numeric char refs for any character the encoding cannot
represent.

If the document is in UTF-8 or another Unicode encoding, I
wouldn’t bother with NCRs at all and would just include the
literal characters.

If you’re putting escaped tagsoup in type="html" content, then
of course you can just include double-escaped named entities
(i.e. &amp;copy; in the Atom source).

(This is really an issue for atom-syntax, btw, not
atom-protocol; re-routing there.)

Regards,
--
Aristotle Pagaltzis // <http://plasmasturm.org/>
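
PS: For illustration, the four cases above can be checked with a rough sketch using Python's standard library. Everything named here is an assumption made for the example, not anything defined by Atom: the helper parse_text is hypothetical, and the bare <div> wrapper merely stands in for the content of a type="xhtml" element in a document that carries no DTD.

```python
import html
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def parse_text(fragment):
    """Wrap fragment in an XHTML div with no DTD (hypothetical helper).

    Returns the div's text, or None if the result is not well-formed XML.
    """
    doc = '<div xmlns="%s">%s</div>' % (XHTML_NS, fragment)
    try:
        return ET.fromstring(doc).text
    except ET.ParseError:
        return None


# type="xhtml": a named entity is undeclared without a DTD, so parsing fails.
named = parse_text("&copy; 2007")

# A numeric character reference needs no declaration.
numeric = parse_text("&#169; 2007")

# In a Unicode encoding the literal character can be used directly.
literal = parse_text("\u00a9 2007")

# For a US-ASCII document, characters it cannot represent become NCRs.
ascii_bytes = "\u00a9 2007".encode("ascii", "xmlcharrefreplace")

# type="html": the tag soup is escaped once more, so the entity survives
# XML parsing as plain text and is only resolved by the HTML consumer.
escaped = html.escape("&copy; 2007")
after_xml = parse_text(escaped)
rendered = html.unescape(after_xml)
```

Here named comes back as None because the parser rejects the undeclared entity, while numeric and literal both yield the same string containing the copyright sign. The type="html" path only works because the ampersand is escaped once more on the way in.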
