On Jul 26, 2006, at 3:19 AM, Bill de hÓra wrote:
A. Pagaltzis wrote:
* Robert Sayre <[EMAIL PROTECTED]> [2006-07-26 01:45]:
On 7/25/06, Bill de hÓra <[EMAIL PROTECTED]> wrote:
And I didn't know whether Atom code could get away with
escaping < and &.
<atom:title type="html">&lt;b>&nbsp;hmm&lt;b></atom:title>

that is an XML fatal error, no doubt, as the ampersand before
"nbsp" must be escaped.
But he did say “escaping < and &”, so it would be. I’m not sure
what Bill’s question even is.

What do I escape, so I know what to unescape?

The point is that once your XML parser has unescaped the content of the element, the result should be suitable for handling as HTML. Escape whatever you must to ensure that the consumer gets HTML from their XML parser. Converting & to &amp; and < to &lt; is sufficient, assuming that you started with HTML. If you started with plain text, you need to double escape, but in that case you should be using type="text" anyway to save yourself the trouble.

You could also convert > to &gt;, " to &quot;, ' to &apos;, and any other characters to numeric character references. Or you can put the whole thing in a CDATA section.
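
To make the round trip concrete, here is a minimal sketch in Python using only the standard library. The helper names atom_title_escaped and atom_title_cdata are made up for illustration and are not part of any Atom library. It assumes you start with an HTML fragment, builds the element both ways, and checks that an XML parser hands back the original HTML.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

ATOM_NS = "http://www.w3.org/2005/Atom"


def atom_title_escaped(html_fragment):
    """Hypothetical helper: escape the HTML once so it survives XML parsing."""
    # escape() turns & into &amp;, < into &lt; and > into &gt;
    body = escape(html_fragment)
    return f'<atom:title xmlns:atom="{ATOM_NS}" type="html">{body}</atom:title>'


def atom_title_cdata(html_fragment):
    """Hypothetical helper: carry the HTML in a CDATA section instead."""
    # A CDATA section cannot contain "]]>", so split any occurrence
    # across two adjacent sections.
    body = html_fragment.replace("]]>", "]]]]><![CDATA[>")
    return f'<atom:title xmlns:atom="{ATOM_NS}" type="html"><![CDATA[{body}]]></atom:title>'


html_fragment = "<b>&nbsp;hmm</b>"
for build in (atom_title_escaped, atom_title_cdata):
    xml_text = build(html_fragment)
    # What the consumer's XML parser returns as the element content.
    recovered = ET.fromstring(xml_text).text
    print(xml_text)
    print(recovered == html_fragment)
```

In both cases the consumer unescapes nothing by hand: the XML parser does the one level of unescaping, and what comes out is handed to an HTML parser as-is.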
