Dennis,

You are correct, this is only half the solution. What I did was read through the DOM unmarshaller examples provided, then use javax.swing.text.html.HTMLDocument to store the HTML section. The DOM parsing allows reading the raw text up to a close tag, which I used; I then passed the entire thing to the HTMLDocument, which did the parsing.
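A rough sketch of that second half, for anyone following along. This is not the exact code used here, and several pieces are assumptions: the class and method names (HtmlSectionSketch, firstHtmlElement, elementToString, toHtmlDocument) are made up for illustration, a plain JAXP DocumentBuilder stands in for the JiBX DOM unmarshaller, and the element is turned back into text with an identity Transformer before being handed to the Swing HTML parser.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class HtmlSectionSketch {

    /** Stand-in for the JiBX DOM unmarshaller (assumption): parse the XML
     *  with JAXP and return the first html element found. */
    public static Element firstHtmlElement(String xml) throws Exception {
        return (Element) DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getElementsByTagName("html")
                .item(0);
    }

    /** Serialize a DOM element, markup included, with an identity Transformer. */
    public static String elementToString(Element element) throws Exception {
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        // Force XML output and drop the <?xml ...?> header so only the markup remains.
        identity.setOutputProperty(OutputKeys.METHOD, "xml");
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        identity.transform(new DOMSource(element), new StreamResult(out));
        return out.toString();
    }

    /** Let the Swing HTML parser build an HTMLDocument from the raw markup. */
    public static HTMLDocument toHtmlDocument(String html) throws Exception {
        HTMLEditorKit kit = new HTMLEditorKit();
        HTMLDocument doc = (HTMLDocument) kit.createDefaultDocument();
        kit.read(new StringReader(html), doc, 0);
        return doc;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<section><text><html><body>"
                + "<p>This is a test<em>item.</em></p>"
                + "</body></html></text><subtitle>subtitel</subtitle></section>";
        String html = elementToString(firstHtmlElement(xml));
        HTMLDocument doc = toHtmlDocument(html);
        System.out.println(html);
        System.out.println(doc.getText(0, doc.getLength()).trim());
    }
}
```

In a real binding, the element would come from the JiBX extras DOM unmarshaller rather than from firstHtmlElement; the rest should carry over unchanged.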
It worked perfectly, allows use of the Java Document editing interfaces and didn't require any changes to the internal JiBX code to work (unlike the marshalling).

--
Thomas Jones-Low
Softstart Services Inc.
[EMAIL PROTECTED]
JobScheduler for Oracle
Ph: 802-398-1012
http://www.softstart.com

Dennis Sosnoski wrote:
> Hi Thomas,
>
> The ICharacterEscaper approach would work for marshalling out a string
> containing markup, but wouldn't help with unmarshalling. The problem
> here is that there's no way to tell the parser to just treat the content
> of an element (<text>, in this case) as a text blob. The parser will
> *always* insist on parsing out the individual elements, and there's
> nothing JiBX can do to avoid this.
>
> The easiest way of dealing with this type of arbitrary (but well-formed)
> content is generally to use a DOM representation. The JiBX extras
> classes include marshaller/unmarshallers for DOM and a couple of
> alternatives (JDOM and dom4j). Once you have the content in the form of
> a DOM, you can work with it directly or use an empty
> javax.xml.transform.Transformer to convert it to text.
>
> - Dennis
>
> Dennis M. Sosnoski
> SOA and Web Services in Java
> Training and Consulting
> http://www.sosnoski.com - http://www.sosnoski.co.nz
> Seattle, WA +1-425-939-0576 - Wellington, NZ +64-4-298-6117
>
>
>
> Thomas Jones-Low wrote:
>> Nick Stolwijk wrote:
>>
>>> I have the following XML structure to work with:
>>>
>>> <?xml version="1.0" encoding="UTF-8"?>
>>> <document>
>>>   <content>
>>>     <section>
>>>       <text>
>>>         <html>
>>>           <body>
>>>             <p>This is a test<em>item.</em></p>
>>>           </body>
>>>         </html>
>>>       </text>
>>>       <subtitle>subtitel</subtitle>
>>>     </section>
>>>   </content>
>>> </document>
>>>
>>> And a Java class with a property name 'text'. After unmarshalling I want
>>> this property to contain all text within the body tag (so including the
>>> elements). Do I need to write a custom unmarshaller for this or is there
>>> already something which does this?
>>>
>>>
>> Looking back through the list archives, Kees de Kooter implemented an
>> ICharacterEscaper class to write the literal string
>> to the output stream. This would work better than my solution of
>> modifying the JiBX Code.
>>
>>
>

_______________________________________________
jibx-users mailing list
jibx-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/jibx-users