Hi everyone,

 

I'm interested in using the POI package to extract content from an
MS Word document.  I've managed to get it to do this, but the
extracted text is stripped of all style information and comes out as
plain text, e.g.

 

The quick brown fox jumps over the lazy dog.

 

What I'm looking to do is also show which text is in bold or italics.
So for example it would output:

 

The [b]quick[/b] brown fox [i]jumps over[/i] the lazy dog.
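
To make the target format concrete, here is a rough sketch of just the
tagging step in plain Java.  It makes no POI calls at all: the Run class
and the render method are names I made up for illustration, and I'm
assuming the extraction step could hand me each stretch of text together
with its bold/italic flags.

```java
import java.util.List;

class StyleMarkup {
    /** Hypothetical holder for one stretch of text with uniform formatting. */
    static final class Run {
        final String text;
        final boolean bold;
        final boolean italic;

        Run(String text, boolean bold, boolean italic) {
            this.text = text;
            this.bold = bold;
            this.italic = italic;
        }
    }

    /** Wraps each run in [b]...[/b] and/or [i]...[/i] and joins the results. */
    static String render(List<Run> runs) {
        StringBuilder out = new StringBuilder();
        for (Run r : runs) {
            // Open bold first, then italic, and close in reverse order
            // so that the tags always nest properly.
            if (r.bold) out.append("[b]");
            if (r.italic) out.append("[i]");
            out.append(r.text);
            if (r.italic) out.append("[/i]");
            if (r.bold) out.append("[/b]");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<Run> runs = List.of(
            new Run("The ", false, false),
            new Run("quick", true, false),
            new Run(" brown fox ", false, false),
            new Run("jumps over", false, true),
            new Run(" the lazy dog.", false, false));
        System.out.println(render(runs));
    }
}
```

So the missing piece for me is getting the per-run bold/italic flags out
of the Word document in the first place.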

 

Or, failing this, can the document be output as an XML document that
also contains the style information?

 

Is there any way of doing this using the standard POI package?  I
believe it should be possible using POI/HWPF.  I visited the HWPF
project page but couldn't see where to download the source code.
Could someone point me in the right direction?

 

I'd be immensely grateful to anyone who could help me.

 

Many thanks,

 

Martin Burrow


