On Thu, Apr 22, 2010 at 10:28 AM, Tibor Simko <[email protected]> wrote:
> On Thu, 22 Apr 2010, Jerome Caffaro wrote:
>> What do other people think? I have the feeling it is easy to escape
>> too much, or not enough...
>
> Victor has been modifying the XML encoding and washing bits, using
> saxutils and friends, as part of his `webtag' branch (which is not
> fully cleaned up for integration yet, but can be inspected in his
> public repo).
>
> Victor, how do your encode_for_xml() changes behave WRT CDATA?

Well, instead of using our own escapes, it uses the standard Python
libraries to escape (from xml.sax.saxutils import escape, quoteattr),
so:

>>> from invenio.textutils import encode_for_xml
>>> encode_for_xml('<subfield code="a">Learn about <![CDATA[ lalala ]]> '
...                'blocks. These are great!</subfield>')
'<subfield code="a">Learn about &lt;![CDATA[ lalala ]]&gt; blocks. These are great!</subfield>'

That should do it, no?
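
For reference, here is a minimal sketch of what the two standard library
helpers do on their own. This is plain xml.sax.saxutils, not the
encode_for_xml() code from the branch, and the sample strings and variable
names are made up for illustration:

```python
from xml.sax.saxutils import escape, quoteattr

# Hypothetical sample text, not taken from any real record.
text = 'Learn about <![CDATA[ lalala ]]> blocks & more'

# escape() replaces &, < and > with entities, so the CDATA markers
# end up as literal text instead of opening a CDATA section.
escaped_text = escape(text)
print(escaped_text)
# Learn about &lt;![CDATA[ lalala ]]&gt; blocks &amp; more

# quoteattr() escapes the value and wraps it in quotes, ready to be
# used as an attribute value.
quoted_code = quoteattr('a')
print('<subfield code=%s>%s</subfield>' % (quoted_code, escaped_text))
```

Note that escape() leaves quote characters alone unless told otherwise,
which is why quoteattr() is the one to use for attribute values.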
--
Victor Engmark