Let me expand a bit on my earlier question.
 
I created an XML file (using XMLWriter) with an element called 'body' that contains a CDATA mixture of paragraphs (using the <p> and </p> tags), bold text (delimited with <b> and </b> tags) and ordinary text. I then read and parsed this XML file into Document doc1. 
 
Next, I extract doc1.element("body").asXML() into a String called 'stuff'. 
 
What I want to do is parse the contents of "body" into its component parts: paragraphs, bold text and regular text. So, I created a string something like "<doc>" + stuff + "</doc>", used that to create a StringReader 's_in', and called SAXReader.read(s_in) to create a new Document doc2.
 
Unfortunately, doc2 now contains an element 'body', instead of a set of 'p' elements. No matter what I do, I still end up with the 'body' element with all of its contents treated as a single (CDATA) lump. So I am unable to selectively extract the 'p' and 'b' tags or the text. 
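
To make the steps above concrete, here is a minimal sketch that repeats them. Note the assumptions: it uses the JDK's built-in DOM parser and an identity Transformer in place of dom4j's SAXReader and asXML(), purely so that it runs without extra jars, and the sample XML, the wrapping 'article' element, the class name CdataRoundTrip and the helper names are all made up for illustration. It is not my actual code, just the same sequence of operations.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class CdataRoundTrip {
    // Hypothetical sample standing in for the file written with XMLWriter.
    static final String SAMPLE =
        "<article><body><![CDATA[<p>First <b>bold</b> bit</p>plain text]]></body></article>";

    // Parse a string of XML into a DOM document.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Serialize one node back to markup (assumed stand-in for dom4j's asXML()).
    static String serialize(Node node) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(node), new StreamResult(out));
        return out.toString();
    }

    // Repeat the steps from the message and return the re-parsed document (doc2).
    static Document roundTrip() throws Exception {
        Document doc1 = parse(SAMPLE);
        Element body = (Element) doc1.getElementsByTagName("body").item(0);
        String stuff = serialize(body);            // serialized 'body' element, tags included
        return parse("<doc>" + stuff + "</doc>");  // wrap it and parse again
    }

    public static void main(String[] args) throws Exception {
        Document doc2 = roundTrip();
        Element root = doc2.getDocumentElement();
        System.out.println("root: " + root.getTagName());
        System.out.println("body elements: " + doc2.getElementsByTagName("body").getLength());
        System.out.println("p elements: " + doc2.getElementsByTagName("p").getLength());
        System.out.println("body text: " + root.getFirstChild().getTextContent());
    }
}
```

In this sketch the 'body' element comes back nested inside 'doc', and the <p> and <b> markup only shows up as characters in the text of 'body', not as elements, which matches what I am seeing.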
 
That's where I'm stumped.
 
Regards,
 
Terry
----- Original Message -----
Sent: Saturday, July 06, 2002 8:22 AM
Subject: [dom4j-user] Parsing CDATA

Probably a dumb question, but here goes....
 
I have an XML document that contains a 'body' element containing CDATA content - that is, a mixture of tags and content.  What I'd like to do is extract that content and parse it to separate the tags and text and then selectively display it. 
 
I know how to create another document and add elements and so forth.  But I'm stumped on how to do the parsing.  Any advice/insight would be much appreciated.
 
Regards,
 
Terry
 
