Hi,

We use CDATA not only when text contains '<' or '&' characters, but also to preserve linebreaks etc.

As it is now, we have to post-process the generated XML files to get the CDATA sections right.

What could work for us is a method like setStringValue() that lets you specify that the value should be kept in a CDATA section, e.g. setCDATAStringValue().

Radu Preotiuc-Pietro wrote:

The issue of CDATA and entitization has come up many times.
XmlBeans is 100% infoset, but the XML infoset doesn't make any distinction as 
to how character data is represented. So the approach that it took was to 
decide on its own when characters should be entitized and when they should be 
saved as a CDATA section. The algorithm is:
- if the length of the text is < 32 chars, entitization is used
- otherwise, if there are at least 5 '<' or '&' characters and they also 
account for at least 1% of the text length, CDATA is used.
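
To make the rule above concrete, here is a minimal Java sketch of the heuristic as it is described in this message. It is not the XmlBeans source: the class and method names (CDataHeuristic, useCData, serialize) are made up for illustration, and the fall-through case (long text with too few special characters) is assumed to be entitized.

```java
// Illustrative sketch only; NOT the actual XmlBeans saver code.
final class CDataHeuristic {
    // Thresholds taken from the description above.
    static final int MIN_LENGTH = 32;
    static final int MIN_SPECIAL_CHARS = 5;

    // Returns true when the described rule would pick a CDATA section.
    static boolean useCData(String text) {
        if (text.length() < MIN_LENGTH) {
            return false; // short text is always entitized
        }
        int special = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '<' || c == '&') {
                special++;
            }
        }
        // At least 5 special characters AND at least 1% of the text length.
        return special >= MIN_SPECIAL_CHARS && special * 100L >= text.length();
    }

    // Writes the text either as one CDATA section or with entities.
    static String serialize(String text) {
        if (useCData(text)) {
            // "]]>" cannot appear inside CDATA, so split the section there.
            return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>";
        }
        // '&' must be replaced first so the other entities are not re-escaped.
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

With this rule a short string such as "a < b & c" is always entitized, while a 40-character string containing five '<' characters ends up in a CDATA section.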

For V2, we looked into making this configurable, since we got feedback on this 
mailing list that it would be useful, but never got around to doing it.

Here is one of the proposals:
- turn entitization on for individual characters via an XmlOption that basically says: 
"I want character x to always be entitized"
- turn CDATA on/off on a per-document basis
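
As a rough illustration of the first bullet, the following Java sketch shows what "always entitize character x" could mean in practice. Everything here is hypothetical: EntitizeSettings, entitize() and escape() are invented names, not an existing or planned XmlBeans option, and the per-document CDATA switch from the second bullet is not modeled.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical settings object; NOT part of the XmlBeans API.
final class EntitizeSettings {
    private final Set<Character> alwaysEntitize = new HashSet<>();

    // Registers a character that must always be written as a numeric reference.
    EntitizeSettings entitize(char c) {
        alwaysEntitize.add(c);
        return this;
    }

    // Escapes '&' and '<' as usual, plus every registered character.
    String escape(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '&') {
                out.append("&amp;");
            } else if (c == '<') {
                out.append("&lt;");
            } else if (alwaysEntitize.contains(c) && !Character.isSurrogate(c)) {
                // Numeric character reference, e.g. '\r' becomes "&#13;".
                out.append("&#").append((int) c).append(';');
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

For example, new EntitizeSettings().entitize('\r').escape(text) would write every carriage return as &#13;, which is one way to keep line breaks from being normalized when the document is parsed again.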

What do people think?
Thanks,
Radu

-----Original Message-----
From: Patrick Hochstenbach [mailto:[EMAIL PROTECTED]
Sent: Thursday, May 05, 2005 11:27 PM
To: [email protected]
Subject: CDATA Heuristics



Hi,

in our library we are very interested in using XMLBeans in a
document archiving project which stores XML files in a
database. The excellent round-tripping characteristics of
XMLBeans are crucial in our project. But with the
serialization of text containing escaped '<'-s and '&'-s
we're at a loss. XMLBeans seems to have some heuristics to
decide when text containing these characters should be saved
as CDATA and when not.


Is it possible to decide at runtime when text should be saved in
CDATA sections and when not? Or, better, can CDATA sections be
preserved in some way?

Best regards,

Patrick


--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]







-- Best regards, Steffen Vinther Sørensen @ Logiva A/S mailto:[EMAIL PROTECTED] http://www.logiva.dk Direct phone: 87 46 44 15

