Re: [jibx-users] Unmarshalling Elements to a String field

Thomas Jones-Low Fri, 12 Oct 2007 06:53:59 -0700

Dennis,
        You are correct, this is only half the solution. What I did was to read 
through the DOM unmarshaller examples provided, then used the 
javax.swing.text.html.HTMLDocument to store the HTML section. The DOM parsing 
allows reading the raw text up to a close tag, which I used, then passed 
the entire thing to the HTMLDocument, which did the parsing.


        It worked perfectly, allows use of the Java Document editing interfaces, 
and didn't require any changes to the internal JiBX code to work (unlike 
the marshalling).
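
For readers following along, here is a minimal sketch of the second half of that approach only: handing an HTML string to an HTMLDocument and letting it do the parsing. The class and method names (HtmlSectionDemo, parseHtml, plainText) are hypothetical, and the JiBX custom unmarshaller that would collect the raw markup is assumed to exist elsewhere; this is not the code actually used.

```java
import java.io.StringReader;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;

public class HtmlSectionDemo {

    // Parse an HTML string into a Swing HTMLDocument (hypothetical helper).
    static HTMLDocument parseHtml(String html) throws Exception {
        HTMLEditorKit kit = new HTMLEditorKit();
        HTMLDocument doc = (HTMLDocument) kit.createDefaultDocument();
        // Avoid ChangedCharSetException if the markup carries a charset directive.
        doc.putProperty("IgnoreCharsetDirective", Boolean.TRUE);
        kit.read(new StringReader(html), doc, 0);
        return doc;
    }

    // Return the document's text content with the markup stripped.
    static String plainText(String html) throws Exception {
        HTMLDocument doc = parseHtml(html);
        return doc.getText(0, doc.getLength());
    }

    public static void main(String[] args) throws Exception {
        String html = "<html><body><p>This is a test<em>item.</em></p></body></html>";
        System.out.println(plainText(html));
    }
}
```

Once the content is in an HTMLDocument, the usual javax.swing.text.Document methods are available for editing it.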

-- 
        Thomas Jones-Low            Softstart Services Inc.
        [EMAIL PROTECTED]      JobScheduler for Oracle
        Ph: 802-398-1012            http://www.softstart.com

Dennis Sosnoski wrote:
> Hi Thomas,
> 
> The ICharacterEscaper approach would work for marshalling out a string 
> containing markup, but wouldn't help with unmarshalling. The problem 
> here is that there's no way to tell the parser to just treat the content 
> of an element (<text>, in this case) as a text blob. The parser will 
> *always* insist on parsing out the individual elements, and there's 
> nothing JiBX can do to avoid this.
> 
> The easiest way of dealing with this type of arbitrary (but well-formed) 
> content is generally to use a DOM representation. The JiBX extras 
> classes include marshaller/unmarshallers for DOM and a couple of 
> alternatives (JDOM and dom4j). Once you have the content in the form of 
> a DOM, you can work with it directly or use an empty 
> javax.xml.transform.Transformer to convert it to text.
> 
>   - Dennis
> 
> Dennis M. Sosnoski
> SOA and Web Services in Java
> Training and Consulting
> http://www.sosnoski.com - http://www.sosnoski.co.nz
> Seattle, WA +1-425-939-0576 - Wellington, NZ +64-4-298-6117
> 
> 
> 
> Thomas Jones-Low wrote:
>> Nick Stolwijk wrote:
>>   
>>> I have the following XML structure to work with:
>>>
>>> <?xml version="1.0" encoding="UTF-8"?>
>>> <document>
>>>     <content>
>>>         <section>
>>>             <text>
>>>                 <html>
>>>                     <body>
>>>                         <p>This is a test<em>item.</em></p>
>>>                     </body>
>>>                 </html>
>>>             </text>
>>>             <subtitle>subtitel</subtitle>
>>>         </section>
>>>     </content>
>>> </document>
>>>
>>> And a Java class with a property name 'text'. After unmarshalling I want 
>>> this property to contain all text within the body tag (so including the 
>>> elements). Do I need to write a custom unmarshaller for this or is there 
>>> already something which does this?
>>>
>>>     
>>      Looking back through the list archives, Kees de Kooter implemented an 
>> ICharacterEscaper class to write the literal string
>> to the output stream. This would work better than my solution of 
>> modifying the JiBX Code.
>>
>>   
> 
> 
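
The DOM-plus-empty-Transformer suggestion quoted above can be sketched roughly as follows. This is an illustration under assumptions, not JiBX code: the DOM is built here with a plain DocumentBuilder rather than the JiBX extras DOM unmarshaller, and the names DomToText, serialize and bodyMarkup are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class DomToText {

    // Serialize one DOM node to a string with an "empty" (identity) Transformer.
    static String serialize(Node node) throws Exception {
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.METHOD, "xml");
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        identity.transform(new DOMSource(node), new StreamResult(out));
        return out.toString();
    }

    // Return everything inside the first <body> element, child elements included.
    static String bodyMarkup(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Node body = doc.getElementsByTagName("body").item(0);
        StringBuilder markup = new StringBuilder();
        for (Node child = body.getFirstChild(); child != null; child = child.getNextSibling()) {
            markup.append(serialize(child));
        }
        return markup.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<text><html><body><p>This is a test<em>item.</em></p></body></html></text>";
        System.out.println(bodyMarkup(xml));
    }
}
```

The resulting string could then be stored in the 'text' property, which is the outcome the original question asks for.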


_______________________________________________
jibx-users mailing list
jibx-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/jibx-users
