Hi, as I told you, I have tried it, but with the same result: the resulting
file is corrupted, or at least that is what MS Word says. My next approach is
to create a copy of the file and make the modifications within that copy. My
problem is that I do not know how to save the modifications made to the
CharacterRuns of the paragraphs; that is, how to persist them in the
resulting file without having to copy it by calling
document.write(outputStream).

My code is:

public File processFile(final InputStream is,
        final Map<String, String> replacementText) throws IOException {
        Set<String> keys = replacementText.keySet();
        try {
            // Makes a copy of the file.
            File res = copyfile(is);
            InputStream auxIs = new FileInputStream(res);
            POIFSFileSystem poifs = new POIFSFileSystem(auxIs);
            HWPFDocument document = new HWPFDocument(poifs);
            Range range = document.getRange();

            for (int i = 0; i < range.numParagraphs(); i++) {
                Paragraph paragraph = range.getParagraph(i);
                int numCharRuns = paragraph.numCharacterRuns();
                for (int j = 0; j < numCharRuns; j++) {
                    CharacterRun charRun = paragraph.getCharacterRun(j);
                    for (Iterator<String> it = keys.iterator(); it.hasNext();) {
                        String key = it.next();
                        if (charRun.text().contains(key)) {
                            String value = replacementText.get(key);
                            charRun.replaceText(key, value);
                            range = document.getRange();
                            paragraph = range.getParagraph(i);
                            charRun = paragraph.getCharacterRun(j);
                        }
                    }
                }
            }
            is.close();
            return res;
        } catch (IOException e) {
            logger.error("Error processing the Word file: " + e);
            throw new IOException("Error processing the Word file", e);
        } finally {
            if (is != null) {
                is.close();
            }
        }
    }


Thanks in advance, Fabi.

-----Original Message-----
From: MSB [mailto:[email protected]]
Sent: Tuesday, 24 November 2009 8:43
To: [email protected]
Subject: Re: Modify word document


You have not dug down far enough into the structure of the document yet, I am
afraid. All of the formatting information is stored (encapsulated) within
the CharacterRun class, and you need to perform the replacements at that
level.

I do not have any suitable code at hand as I type this, so what follows will
need to be converted into Java and tested:

Open the Word document.
Get the overall Range for the document.
Get the number of Paragraph objects the Range contains.
Iterate through the Paragraphs and, for each Paragraph:
    Get the CharacterRun(s) the Paragraph contains.
    Call the method to replace the search term with the replacement text
    on each CharacterRun.
Save the modified document away again.

You do, however, face a couple of problems with this. It has been a long time
since I tried to write a search and replace routine using HWPF, and I could
not get it to work if the replacement text was longer than the search term.
In that case, HWPF threw an exception and would not allow me to complete the
process. That problem could well have been addressed by now, as it was
well known and caused by faulty bounds checking within the Range class. Only
testing will prove or disprove this for you, I am afraid.

Secondly, the CharacterRun class encapsulates a piece of text with common
properties. So, imagine that we are searching for the phrase 'search term'
and that the word 'search' has been emboldened whilst the word 'term' has
been left as normal text; then my suggested approach will not work. That is
because the words 'search' and 'term' will be held in different
CharacterRun(s). If you do hit this problem, then I am afraid you will have
to write code that searches for the term at the Paragraph level, identifies
where the search terms can be found and recovers the CharacterRun(s) that
encapsulate them. Once you have these, you can modify the runs or create and
substitute new ones, but I have to admit that I have never tried to do this
myself. Instead I chose to automate Word using OLE and to explore the
possibilities offered by OpenOffice's UNO interface. Both options did work
but threw up other problems that proved more limiting (in terms of
architecture and platform). If you can get it to work, HWPF offers the
better solution IMO.
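
The paragraph-level search described above could be sketched roughly as
follows. This is only a sketch and has not been tested against HWPF itself:
it models each run as a plain String (in HWPF that text would come from
CharacterRun.text()), and the names RunSpanFinder, Span and findSpans are
hypothetical, not part of POI. It joins the run texts into the paragraph
text, finds each occurrence of the term, and maps the match offsets back to
the indices of the first and last run involved.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of POI: locates the runs that a search term spans.
final class RunSpanFinder {

    /** One match: the first and last run it touches and its paragraph offset. */
    static final class Span {
        final int firstRun;
        final int lastRun;
        final int startInParagraph;

        Span(int firstRun, int lastRun, int startInParagraph) {
            this.firstRun = firstRun;
            this.lastRun = lastRun;
            this.startInParagraph = startInParagraph;
        }
    }

    /**
     * runTexts stands in for the text of each CharacterRun of one paragraph,
     * in document order. Returns one Span per non-overlapping match of term.
     */
    static List<Span> findSpans(List<String> runTexts, String term) {
        List<Span> spans = new ArrayList<>();
        if (term.isEmpty()) {
            return spans;
        }

        // Rebuild the paragraph text and remember where each run starts in it.
        StringBuilder paragraph = new StringBuilder();
        int[] runStart = new int[runTexts.size()];
        for (int i = 0; i < runTexts.size(); i++) {
            runStart[i] = paragraph.length();
            paragraph.append(runTexts.get(i));
        }

        // Search at paragraph level, then map both ends of each hit to a run.
        int from = 0;
        int hit;
        while ((hit = paragraph.indexOf(term, from)) >= 0) {
            int lastChar = hit + term.length() - 1;
            spans.add(new Span(runIndexAt(runStart, hit),
                               runIndexAt(runStart, lastChar), hit));
            from = hit + term.length();
        }
        return spans;
    }

    // Index of the last run whose start offset is not beyond the given offset.
    private static int runIndexAt(int[] runStart, int offset) {
        int index = 0;
        for (int i = 0; i < runStart.length; i++) {
            if (runStart[i] <= offset) {
                index = i;
            }
        }
        return index;
    }
}
```

For the runs "Please enter a ", "search", " ", "term", " here" and the term
'search term', findSpans reports a single match covering runs 1 to 3. What to
do with those runs afterwards (modify them, or create and substitute new
ones) is the part that would still need working out against HWPF.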

Yours

Mark B


Fabián Avilés Martínez wrote:
> 
> Hi all,
>       I have a Word document that serves as a template. The template
> contains some tokenized words that have to be replaced, and the result has
> to be saved into another file. The original file has some properties, like
> header and footer, images, etc. The resulting file has to be the same, but
> with the modified words. I am trying to do this with the code below, but it
> does not work.
> 
> public ByteArrayOutputStream processFile(final InputStream is,
>         final Map<String, String> replacementText) throws IOException {
>         Set<String> keys = replacementText.keySet();
>         try {
>             POIFSFileSystem poifs = new POIFSFileSystem(is);
>             HWPFDocument document = new HWPFDocument(poifs);
>             Range range = document.getRange();
> 
>             for (int i = 0; i < range.numParagraphs(); i++) {
>                 String newTxt = range.getParagraph(i).text();
>                 String oldTxt = range.getParagraph(i).text();
>                 for (Iterator<String> it = keys.iterator(); it.hasNext();) {
>                     String key = it.next();
>                     if (newTxt.contains(key)) {
>                         newTxt = replacePlaceholders(key,
>                                 replacementText.get(key), newTxt);
>                     }
>                 }
>                 if (!oldTxt.equals(newTxt)) {
>                     range.getParagraph(i).replaceText(oldTxt, newTxt);
>                 }
>             }
> 
>             // Save the document away.
>             ByteArrayOutputStream bos = new ByteArrayOutputStream();
>             document.write(bos);
>             bos.flush();
>             bos.close();
>             return bos;
>         } catch (IOException e) {
>             logger.error("Error processing the Word file: " + e);
>             throw new IOException("Error processing the Word file", e);
>         } finally {
>             if (is != null) {
>                 is.close();
>             }
>         }
>     }
> 
> Any help, please?
> 
> Thanks in advance, Fabi.
> 
> 
> 
> ______________________
> This message including any attachments may contain confidential 
> information, according to our Information Security Management System,
>  and intended solely for a specific individual to whom they are addressed.
>  Any unauthorised copy, disclosure or distribution of this message
>  is strictly forbidden. If you have received this transmission in error,
>  please notify the sender immediately and delete it.
> 
> ______________________
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
> 
> 
> 

-- 
View this message in context: 
http://old.nabble.com/Modify-word-document-tp26480450p26491636.html
Sent from the POI - User mailing list archive at Nabble.com.

