On Thu, 22 Apr 2010, Victor Engmark wrote:
>>>> from invenio.textutils import encode_for_xml
>>>> encode_for_xml('<subfield code="a">Learn about <![CDATA[ lalala ]]>
>>>> blocks. These are great!</subfield>')
>>>> '<subfield code="a">Learn about &lt;![CDATA[ lalala ]]&gt;
>>>> blocks. These are great!</subfield>'
>
> That should do it, no?
Let's test the full upload-index-download cycle on a real-life ADS test
record. I don't know what the CDATA contains in Benoit's use case, or
whether it should be stored `as is' in the Invenio DB tables. I'd say
such content should rather be stored dereferenced whenever possible,
much as we dereference \u03A3 into Σ during upload; this makes indexing
etc. simpler. But maybe the real-life use case is different.
Benoit, can you please send a concrete MARCXML snippet to be loaded, and
specify how you would like to see it stored/indexed/exported?
Alternatively, please just take the XML encoding bits from Victor's
public branch (vengmark/webtag), and test on your own end.
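
To make the "stored dereferenced" idea concrete, here is a minimal sketch
using only the Python standard library, not Invenio code. The snippet and
the helper names (dereference_subfield, reencode_subfield) are made up for
illustration, and I am assuming the CDATA section holds plain text that
should be indexed like any other subfield content.

```python
import xml.etree.ElementTree as ET

# Hypothetical MARCXML-like subfield; not taken from a real ADS record.
SNIPPET = '<subfield code="a">Learn about <![CDATA[ <lalala> ]]> blocks.</subfield>'


def dereference_subfield(xml_text):
    """Return the subfield value with any CDATA section resolved to text."""
    element = ET.fromstring(xml_text)
    # The parser merges CDATA content into ordinary character data,
    # so the CDATA markers themselves never reach the stored value.
    return "".join(element.itertext())


def reencode_subfield(code, value):
    """Serialize a stored value back to XML, escaping &, < and >."""
    element = ET.Element("subfield", {"code": code})
    element.text = value
    return ET.tostring(element, encoding="unicode")


stored = dereference_subfield(SNIPPET)    # what would go into the DB tables
exported = reencode_subfield("a", stored) # what would come out on export
print(repr(stored))
print(exported)
```

With this approach the export no longer contains a CDATA section, only
escaped text, so it matters whether Benoit needs the original markup
preserved byte for byte.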
Best regards
--
Tibor Simko