Fw: [dom4j-user] Parsing CDATA

Terry Steichen Sat, 06 Jul 2002 07:17:32 -0700

Let me expand a bit on my earlier question.

I created an XML file (using XMLWriter) with an element called 'body' whose content is a CDATA section holding a mixture of paragraphs (marked with <p> and </p> tags), bold text (delimited by <b> and </b> tags) and ordinary text. I then read and parsed this XML file into Document doc1.

Next, I extract doc1.element("body").asXML() into a String called 'stuff'.

What I want to do is parse the contents of "body" into its component parts: paragraphs, bold text and regular text. So, I created a string something like "<doc>" + stuff + "</doc>", used that to create a StringReader 's_in' and called SAXReader.read(s_in) to create a new Document doc2.

Unfortunately, doc2 now contains an element 'body', instead of a set of 'p' elements. No matter what I do, I still end up with the 'body' element with all of its contents treated as a single (CDATA) lump. So I am unable to selectively extract the 'p' and 'b' elements or the text.

That's where I'm stumped.
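
A possible explanation, offered as a sketch rather than a confirmed diagnosis: asXML() serialises the whole 'body' element, including its own tags and the <![CDATA[ ... ]]> wrapper, so re-parsing that string reproduces the same opaque lump. Re-parsing only the text of 'body' should expose the 'p' and 'b' elements. In dom4j that would presumably be doc1.element("body").getText() followed by DocumentHelper.parseText(...). The example below uses the JDK's built-in DOM parser instead, so that it runs without extra jars. The class name, method names and sample input are hypothetical, and it assumes the CDATA content is itself well-formed XML.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: re-parses markup that is stored inside a CDATA section.
final class CdataReparse {

    // Parses an XML string into a DOM document, wrapping checked exceptions.
    static Document parse(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML: " + e.getMessage(), e);
        }
    }

    // Returns the text of each <p> found inside the CDATA content of <body>.
    static List<String> paragraphs(String xml) {
        Element body = (Element) parse(xml).getElementsByTagName("body").item(0);
        // The text content is the characters inside the CDATA section only,
        // without the <body> tags or the <![CDATA[ ... ]]> wrapper.
        String markup = body.getTextContent();
        // A synthetic root makes several top-level <p> elements legal XML.
        Document inner = parse("<doc>" + markup + "</doc>");
        NodeList ps = inner.getDocumentElement().getElementsByTagName("p");
        List<String> result = new ArrayList<>();
        for (int i = 0; i < ps.getLength(); i++) {
            // getTextContent() concatenates the text of nested elements such as <b>.
            result.add(ps.item(i).getTextContent());
        }
        return result;
    }
}
```

Calling CdataReparse.paragraphs(...) on a document whose 'body' holds two 'p' elements in CDATA should return two strings, one per paragraph. If the stored markup is loose HTML rather than well-formed XML (unclosed tags, for instance), the second parse will fail and a more lenient HTML parser would be needed.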

Regards,

Terry

----- Original Message -----

From: Terry Steichen

To: dom4j-user

Sent: Saturday, July 06, 2002 8:22 AM

Subject: [dom4j-user] Parsing CDATA

Probably a dumb question, but here goes....

I have an XML document that contains a 'body' element containing CDATA content - that is, a mixture of tags and content. What I'd like to do is extract that content and parse it to separate the tags and text and then selectively display it.

I know how to create another document and add elements and so forth. But I'm stumped on how to do the parsing. Any advice/insight would be much appreciated.
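
For the "separate the tags and text" step, one option is to walk the child nodes of the parsed fragment and branch on node type. This is a minimal sketch using the JDK's DOM parser, not dom4j. In dom4j the analogous walk would presumably go through Element.content() and Node.getNodeType(). The class name, the "tag:"/"text:" labels and the sample fragment are hypothetical, and the fragment is assumed to be well-formed once wrapped in a root element.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: lists the elements and text runs of a mixed-content fragment.
final class MixedContentWalker {

    // Wraps the fragment in a synthetic <doc> root, parses it, and returns
    // its elements and text nodes in document order.
    static List<String> parts(String fragment) {
        List<String> out = new ArrayList<>();
        try {
            Node root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader("<doc>" + fragment + "</doc>")))
                    .getDocumentElement();
            walk(root, out);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse fragment: " + e.getMessage(), e);
        }
        return out;
    }

    // Depth-first walk: records each element name, then descends into it;
    // records each text node as it is encountered.
    private static void walk(Node parent, List<String> out) {
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                out.add("tag:" + child.getNodeName());
                walk(child, out);
            } else if (child.getNodeType() == Node.TEXT_NODE) {
                out.add("text:" + child.getNodeValue());
            }
        }
    }
}
```

Once the pieces are labelled this way, selective display comes down to filtering the list, for example keeping only the text that follows a "tag:b" entry.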

Regards,

Terry
