On Tue, 10 Jan 2012, Andrei Khveras wrote:
I'm trying to use the class org.apache.poi.hwpf.extractor.WordExtractor,
which I downloaded as part of Apache POI
<http://poi.apache.org/download.html>.
*Could somebody, please*, kindly help me to resolve this little issue.
My goal is to get MS Word file contents as one single String, containing
all control characters. I need it for further (hand-made!) splitting
text into paragraphs, words, etc.
Why not fetch the paragraphs directly then? That'd give you full control
over which bit of text is in which paragraph, and will let you decide if
you want to display or hide control characters, etc.
I'd suggest you look at the code for WordExtractor to get an idea of how
to go about doing it, then do your own version that implements your
required logic.
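
As a rough illustration of the "own logic" part, here is a minimal sketch that
takes paragraph strings one at a time and either shows or hides the control
characters in each. It is plain Java with no POI dependency; the class name
ParagraphRenderer and the method render are made up for this example. It
assumes the paragraphs arrive as a String[], for instance from
WordExtractor.getParagraphText() or by calling text() on each Paragraph of
the document's Range, and that control characters are the code points below
0x20 (tab is kept as ordinary text here, which is a choice, not a rule).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Apache POI.
class ParagraphRenderer {

    // Renders one paragraph. Control characters (below 0x20, except tab)
    // are either shown as visible markers such as <0D>, or dropped.
    static String render(String paragraph, boolean showControl) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < paragraph.length(); i++) {
            char c = paragraph.charAt(i);
            if (c < 0x20 && c != '\t') {
                if (showControl) {
                    sb.append(String.format("<%02X>", (int) c));
                }
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Applies render() to every paragraph, keeping paragraph boundaries
    // as separate list entries instead of one long string.
    static List<String> renderAll(String[] paragraphs, boolean showControl) {
        List<String> out = new ArrayList<>();
        for (String p : paragraphs) {
            out.add(render(p, showControl));
        }
        return out;
    }

    public static void main(String[] args) {
        // Sample input standing in for paragraphs fetched from a document.
        String[] paragraphs = { "First paragraph\r", "Cell A\u0007", "Last\r" };
        for (String line : renderAll(paragraphs, true)) {
            System.out.println(line);
        }
    }
}
```

Because each paragraph stays a separate entry, later splitting into words can
work paragraph by paragraph without having to find the boundaries again.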
Nick