I was a bit short with my question then.  Yes, I did try the oleextractor and
managed to open the Word doc and read the text.  I thought I could just
create a new HWPF document and write the updated text to it.

I am currently saving my Word doc as XML and trying to read it.  But it looks
like the actual content is in hex, so I am wondering whether a parser can
handle this.
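
For the XML route, a minimal sketch using only the JDK's DOM parser may help.
It assumes the file was saved as Word 2003 XML, where (as far as I know) the
visible text sits in `w:t` elements under the namespace shown below, and the
hex-looking blocks are more likely encoded binary parts than the text itself.
The class and method names are made up for illustration, and the namespace is
an assumption worth checking against the root element of your own file.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper: reads and edits text runs in a Word 2003 XML file.
final class WordXmlSketch {
    // Assumption: the namespace Word 2003 uses for its "w:" prefix.
    static final String W_NS =
            "http://schemas.microsoft.com/office/word/2003/wordml";

    // Parse the XML string into a namespace-aware DOM tree.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
            return factory.newDocumentBuilder().parse(new ByteArrayInputStream(bytes));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Replace 'from' with 'to' inside each w:t run; returns the number of runs changed.
    static int replaceText(Document doc, String from, String to) {
        NodeList runs = doc.getElementsByTagNameNS(W_NS, "t");
        int changed = 0;
        for (int i = 0; i < runs.getLength(); i++) {
            Node run = runs.item(i);
            String text = run.getTextContent();
            if (text.contains(from)) {
                run.setTextContent(text.replace(from, to));
                changed++;
            }
        }
        return changed;
    }

    // Concatenate the text of all w:t runs in document order.
    static String text(Document doc) {
        NodeList runs = doc.getElementsByTagNameNS(W_NS, "t");
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < runs.getLength(); i++) {
            sb.append(runs.item(i).getTextContent());
        }
        return sb.toString();
    }
}
```

One caveat with this approach: Word can split a phrase across several runs, so
a search term that spans two `w:t` elements will not match in this sketch.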

Hi Gotnoname,


On Thu, 2010-09-09 at 18:25 -0700, gotnoname wrote:
> New to this programming world and to this package.


Ho, that won't help... Note that telling the world you haven't tried at
all to solve the problem yourself won't encourage people to help you.


> Read a word document.
> search and replace text in some paragraphs and footers
> then save it to a new word document.


Then you should have a look at the text extractor code to get an idea of
how to get the text. However, replacing the text and resaving may require
changing some other values (mostly in the FIB, the File Information Block).

As you may (or may not) have seen in the Word file format specs, the
footnotes, textboxes, headers, footers and main document aren't stored
in a single place: you'll probably need to hack POI to get all of them,
as I'm not sure all of it is implemented.

> Simple? Please help.

I'm not sure POI is the easiest way to do that. You may want to have a look
at JODConverter: a combination of OOo (OpenOffice.org) and JODConverter
could provide a far easier way to achieve your goal.
