Re: Reading XML files with ampersands in them

dara Mon, 27 Jun 2005 20:13:47 -0700

Hi,

Quite simply, that is not well-formed XML.

The ampersand is a 'special' character in XML and must be written as its entity reference ( &amp; ) wherever the character itself is meant.
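
If you control the program that produces these files, the escaping can be done when the text is written out. The following is only a minimal sketch: escapeXmlText is a hypothetical helper, not part of Xerces-C++, and it assumes the input is raw text that has not been escaped already.

```cpp
#include <string>

// Hypothetical helper (not a Xerces-C++ API): replaces the characters that
// are special in XML character data with their entity references.
// Assumes 'raw' is unescaped text; running it twice would double-escape.
std::string escapeXmlText(const std::string& raw) {
    std::string out;
    out.reserve(raw.size());
    for (char c : raw) {
        switch (c) {
            case '&': out += "&amp;"; break;  // must be escaped in element content
            case '<': out += "&lt;";  break;  // must be escaped in element content
            case '>': out += "&gt;";  break;  // escaped for safety (e.g. "]]>")
            default:  out += c;       break;
        }
    }
    return out;
}
```

With this, escapeXmlText("BB&T Corp.") produces "BB&amp;T Corp.", and a conforming parser hands the text back to you as "BB&T Corp." when the document is read.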

You should find a lot of stuff on this via various search engines or basic XML tutorials. You can get the full XML specification from www.w3.org, but the following two articles should suffice and provide further pointers for related reading:


http://www.xml.com/pub/a/2003/02/26/qa.html
http://www.xml.com/pub/a/2001/01/31/qanda.html


Regards

Dara


Xiaolei Li wrote:

Hi,
I'm trying to read in all the #text nodes in a set of XML documents, but I'm running into problems when the document content includes ampersands (&) in the text.
So given a document path, I use XercesDOMParser to get the root DOMNode*. Using that node, I traverse the entire tree looking for #text nodes. Whenever I see a #text node, I call getNodeValue() and do an XMLString::transcode() on it to get the char*.
This works fine until I run into a document that has & in its content. For example,
=========================
...
<TEXT>
Maryland Federal Bancorp Inc., a Hyattsville-based thrift, announced yesterday that it will be acquired by BB&T Corp. of Winston-Salem, N.C., for $265.3
million in stock.
...
=========================
For some reason, the char* I get back from XMLString::transcode() only gives me the text up to "BB" (in "BB&T"). If I manually delete the & from the file, it'll parse just fine. So basically, the "&" is ending the text prematurely.
I'm a total XML noob so I have no clue what to do here. I'm probably just missing something very basic. Any guidance would be greatly appreciated.
Thank you.

-Xiaolei
