Instead of doing text manipulation, can you use selectSingleNode(...) to find the <body> node, then use dom4j to get its textual content?
You may then have to do some text manipulation to wrap an outer tag around the content, though, as a well-formed XML doc has exactly one root element, i.e.:

    String stuff = "<new-root>" + bodyElement.getText() + "</new-root>";

Then you have some (hopefully) well-formed XML you can parse again.

-bob

On Sat, 6 Jul 2002, Terry Steichen wrote:

> Let me expand a bit on my earlier question.
>
> I created an XML file (using XMLWriter) with an element called 'body'
> that contains a CDATA mixture of paragraphs (using the <p> and </p> tags),
> bold text (delineated with <b> and </b> tags) and ordinary text. I then
> read and parsed this XML file into Document doc1.
>
> Next, I extract doc1.element("body").asXML() into a String called
> 'stuff'.
>
> What I want to do is parse the contents of "body" into the component
> paragraph, highlighted text and regular text parts. So, I created a
> string something like "<doc>" + stuff + "</doc>", used that to create
> a StringReader 's_in' and used a SAXReader.read(s_in) to create a new
> Document doc2.
>
> Unfortunately, doc2 now contains an element 'body', instead of a set
> of 'p' elements. No matter what I do, I still end up with the 'body'
> element with all of its contents treated as a (CDATA) lump. So I am
> unable to selectively extract the 'p', 'b' tags or text.
>
> That's where I'm stumped.

_______________________________________________
dom4j-user mailing list
https://lists.sourceforge.net/lists/listinfo/dom4j-user
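
For reference, the approach suggested in the reply (take the text content of 'body' rather than its asXML() serialization, wrap it in a single root element, and parse the result again) can be sketched roughly as below. This is only an illustration under stated assumptions: it uses the JDK's built-in DOM parser instead of dom4j so that it runs without extra jars, and the class name BodyReparse, its methods, and the sample input are hypothetical. With dom4j, the corresponding calls would presumably be selectSingleNode(...), Element.getText() and a second parse of the wrapped string.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper; names are illustrative, not part of dom4j.
class BodyReparse {

    // Parse an XML string into a DOM document.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Return the text of each <p> hidden inside the CDATA content of <body>.
    static List<String> paragraphs(String xml) throws Exception {
        Document doc1 = parse(xml);
        Element body = (Element) doc1.getElementsByTagName("body").item(0);

        // Take the text content, not the serialized element: this drops
        // the <body> tag and the CDATA wrapper, leaving the raw markup.
        String stuff = "<new-root>" + body.getTextContent() + "</new-root>";

        // Second parse: the former CDATA text is now real elements.
        Document doc2 = parse(stuff);
        NodeList ps = doc2.getElementsByTagName("p");
        List<String> out = new ArrayList<>();
        for (int i = 0; i < ps.getLength(); i++) {
            out.add(ps.item(i).getTextContent());
        }
        return out;
    }
}
```

The second parse only succeeds if the CDATA content is itself well-formed markup (every <p> and <b> closed and properly nested); otherwise the parser throws an exception.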