Hi Mark, version 3.2-FINAL is accessible in public Maven repositories. These are
the dependencies:

<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi</artifactId>
    <version>3.2-FINAL</version>
</dependency>
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-scratchpad</artifactId>
    <version>3.2-FINAL</version>
</dependency>


Thanks, Fabi.

-----Original message-----
From: MSB [mailto:[email protected]]
Sent: Tuesday, 24 November 2009 17:27
To: [email protected]
Subject: RE: Modify word document


You are welcome.

If you do not have access to 3.2 FINAL of the API, it is possible to
download older releases from here:
http://archive.apache.org/dist/poi/release/bin/. I must admit that I do not
know what changes were made to HWPF between 3.2 and 3.5, so I cannot say why
the formatting information is being lost and can only hope that you will be
able to revert to using 3.2 FINAL for this project.

All that you will need to do is to ensure that both the scratchpad and POI
archives are in your classpath and you should be able to successfully
compile and run the code. Any problems, just let me know.
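
One way to confirm that both archives really are on the classpath before
running anything is to try loading one class from each. The sketch below
assumes that POIFSFileSystem ships in the main POI jar and HWPFDocument in the
scratchpad jar; PoiClasspathCheck and isOnClasspath are illustrative names and
are not part of POI.

```java
public class PoiClasspathCheck {

    // Returns true if the named class can be loaded from the current classpath.
    public static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, PoiClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: one class from the main POI jar, one from the scratchpad jar.
        String[] required = {
            "org.apache.poi.poifs.filesystem.POIFSFileSystem",
            "org.apache.poi.hwpf.HWPFDocument"
        };
        for (String name : required) {
            System.out.println(name + (isOnClasspath(name) ? " found" : " MISSING"));
        }
    }
}
```

If either line reports MISSING, the corresponding jar is probably not on the
classpath used to launch the JVM.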

Yours

Mark B



Fabián Avilés Martínez wrote:
>
> Wow, that's great. At least I have a new direction to work with. I have been
> struggling with this for at least three days. I cannot try it today, but
> tomorrow it will be the first thing I do. I will tell you the
> results.
>
> Thank you so much.
>
> -----Original message-----
> From: MSB [mailto:[email protected]]
> Sent: Tuesday, 24 November 2009 16:51
> To: [email protected]
> Subject: RE: Modify word document
>
>
> I have had the chance to play around with some code and I have to admit
> that I was wrong, on two counts.
>
> Firstly, if you do drill down to the level of the CharacterRun and perform
> a replacement operation there, you will not retain the formatting applied
> to the text; furthermore, it seems to fail completely: no replacements
> will be made in the document at all. To have the search term successfully
> replaced, you DO need to operate at the Paragraph level.
>
> Secondly, if the search term is shorter than the replacement term, then
> HWPF will throw an exception. It seems quite happy to work if the
> replacement term is equal to or longer - in terms of the number of
> characters - than the search term.
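
Since the exception appears to depend on how the lengths of the two terms
compare, a small stdlib-only check along the following lines could be run over
the search terms before calling replaceText(), so that each case can be tested
separately. It is only a sketch: TermLengthReport, Relation and compare are
names made up for illustration, and no HWPF call is involved.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TermLengthReport {

    public enum Relation { SHORTER, EQUAL, LONGER }

    // Describes the replacement's length relative to the search term's length.
    public static Relation compare(String searchTerm, String replacement) {
        int diff = replacement.length() - searchTerm.length();
        if (diff < 0) return Relation.SHORTER;
        if (diff > 0) return Relation.LONGER;
        return Relation.EQUAL;
    }

    public static void main(String[] args) {
        Map<String, String> searchTerms = new LinkedHashMap<String, String>();
        searchTerms.put("replace", "tester");        // the pair from the SearchReplace class
        searchTerms.put("token", "a longer value");  // hypothetical extra pair
        for (Map.Entry<String, String> e : searchTerms.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue()
                    + ": replacement is " + compare(e.getKey(), e.getValue()));
        }
    }
}
```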
>
> Please see the code I have attached below;
>
> /* ====================================================================
>    Licensed to the Apache Software Foundation (ASF) under one or more
>    contributor license agreements.  See the NOTICE file distributed with
>    this work for additional information regarding copyright ownership.
>    The ASF licenses this file to You under the Apache License, Version 2.0
>    (the "License"); you may not use this file except in compliance with
>    the License.  You may obtain a copy of the License at
>
>        http://www.apache.org/licenses/LICENSE-2.0
>
>    Unless required by applicable law or agreed to in writing, software
>    distributed under the License is distributed on an "AS IS" BASIS,
>    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
>    See the License for the specific language governing permissions and
>    limitations under the License.
> ==================================================================== */
>
> package newsearchreplace;
>
> import java.io.File;
> import java.io.FileInputStream;
> import java.io.FileOutputStream;
> import java.io.FileNotFoundException;
> import java.io.IOException;
> import java.util.HashMap;
> import java.util.Set;
>
> import org.apache.poi.hwpf.HWPFDocument;
> import org.apache.poi.hwpf.usermodel.Range;
> import org.apache.poi.hwpf.usermodel.Paragraph;
> import org.apache.poi.hwpf.usermodel.CharacterRun;
>
>
> /**
>  *
>  * @author win Mark Beardsley [msb at apache.org]
>  * @version 1.00
>  */
> public class SearchReplace {
>
>     private HashMap<String, String> searchTerms = null;
>     private Set<String> searchKeys = null;
>     private HWPFDocument wordDocument = null;
>
>     public SearchReplace() {
>         searchTerms = new HashMap<String, String>();
>         // The first String is the text that will be searched for, the second
>         // is what will be used to replace it. Of course, it is possible to
>         // create more than one search term, replacement text pairing.
>         searchTerms.put("replace", "tester");
>         searchKeys = searchTerms.keySet();
>     }
>
>     public void openTemplate(String filename)
>             throws FileNotFoundException, IOException {
>         File file = null;
>         FileInputStream fis = null;
>         try {
>             file = new File(filename);
>             fis = new FileInputStream(file);
>             this.wordDocument = new HWPFDocument(fis);
>         }
>         finally {
>             if(fis != null) {
>                 try {
>                     fis.close();
>                     fis = null;
>                 }
>                 catch(Exception ex) {
>                     // I G N O R E
>                 }
>             }
>         }
>     }
>
>     public void searchAndReplace() {
>         Range docRange = this.wordDocument.getRange();
>         int numParas = docRange.numParagraphs();
>         for(int i = 0; i < numParas; i++) {
>             Paragraph para = docRange.getParagraph(i);
>             int numCharRuns = para.numCharacterRuns();
>             for(int j = 0; j < numCharRuns; j++) {
>                 CharacterRun charRun = para.getCharacterRun(j);
>                 String text = charRun.text();
>                 for(String key : this.searchKeys) {
>                     if(text.contains(key)) {
>                         String replacementTerm = this.searchTerms.get(key);
>                         charRun.replaceText(key, replacementTerm);
>                         System.out.println("Found: " + key + " in " + text
>                                 + ". Will replace with: " + replacementTerm);
>                     }
>                 }
>             }
>         }
>
>     }
>
>     public void searchReplace() {
>         Range docRange = this.wordDocument.getRange();
>         int numParas = docRange.numParagraphs();
>         for(int i = 0; i < numParas; i++) {
>             Paragraph para = docRange.getParagraph(i);
>             String text = para.text();
>             for(String key : this.searchKeys) {
>                 if(text.contains(key)) {
>                     String replacementTerm = this.searchTerms.get(key);
>                     para.replaceText(key, replacementTerm);
>                 }
>             }
>         }
>     }
>
>     public void saveResults(String filename)
>             throws FileNotFoundException, IOException {
>         File file = null;
>         FileOutputStream fos = null;
>         try {
>             file = new File(filename);
>             fos = new FileOutputStream(file);
>             this.wordDocument.write(fos);
>         }
>         finally {
>             if(fos != null) {
>                 try {
>                     fos.close();
>                     fos = null;
>                 }
>                 catch(Exception ex) {
>                     // I G N O R E
>                 }
>             }
>         }
>     }
>
>     /**
>      * @param args the command line arguments
>      */
>     public static void main(String[] args) {
>         try {
>             SearchReplace sr = new SearchReplace();
>             sr.openTemplate("C:/temp/Test Document.doc");
>             sr.searchAndReplace();
>             //sr.searchReplace();
>             sr.saveResults("C:/temp/New Updated Document.doc");
>         }
>         catch(Exception ex) {
>             System.out.println("Caught an: " + ex.getClass().getName());
>             System.out.println("Message: " + ex.getMessage());
>             System.out.println("Stacktrace follows............");
>             ex.printStackTrace(System.out);
>         }
>     }
> }
>
> More particularly, look at the main method. If you comment out the
> sr.searchAndReplace() line and un-comment the sr.searchReplace() line, then
> the code will work successfully. But, and this is a BIG but, it will only
> work if you compile and run it against 3.2 FINAL of the API. I have found
> that later versions seem to 'drop' or lose the formatting information
> completely; to convince yourself of this, just modify the main method so
> that it contains only these lines of code:
>
> SearchReplace sr = new SearchReplace();
> sr.openTemplate("C:/temp/Test Document.doc");
> sr.saveResults("C:/temp/New Updated Document.doc");
>
> If you run that against versions later than 3.2 FINAL, you should see that
> the copy of the original document that this produces loses all of its
> formatting.
>
> Yours
>
> Mark B
>
> PS. I guess that it should go without saying, but you will need to replace
> the paths and document names passed to the openTemplate() and saveResults()
> methods so that they point to locations and files that exist on your machine.
>
> PPS. Forgive the lack of comments, please. I hope that it is apparent just
> what the methods do.
>
>
> Fabián Avilés Martínez wrote:
>>
>> Hi, as I told you, I have tried it, but with the same result: the
>> resulting file is corrupted, or at least that is what MS Word says. My
>> next approach is to create a copy of the file and make the modifications
>> within that copy. My problem is that I do not know how to save the
>> modifications made to the charRuns of the paragraphs; what I mean is how
>> to persist the modifications in the resulting file, without having to copy
>> it, calling document.write(outputStream).
>>
>> My code is:
>>
>> public File processFile(final InputStream is, final Map<String, String>
>> replacementText) throws IOException {
>>         Set<String> keys = replacementText.keySet();
>>         try {
>>             // Makes a copy of the file.
>>             File res = copyfile(is);
>>             InputStream auxIs = new FileInputStream(res);
>>             POIFSFileSystem poifs = new POIFSFileSystem(auxIs);
>>             HWPFDocument document = new HWPFDocument(poifs);
>>             Range range = document.getRange();
>>
>>             for (int i = 0; i < range.numParagraphs(); i++) {
>>                 Paragraph paragraph = range.getParagraph(i);
>>                 int numCharRuns = paragraph.numCharacterRuns();
>>                 for (int j = 0; j < numCharRuns; j++) {
>>                     CharacterRun charRun = paragraph.getCharacterRun(j);
>>                     for (Iterator<String> it = keys.iterator();
>> it.hasNext();) {
>>                         String key = it.next();
>>                         if (charRun.text().contains(key)) {
>>                             String value = replacementText.get(key);
>>                             charRun.replaceText(key, value);
>>                             range = document.getRange();
>>                             paragraph = range.getParagraph(i);
>>                             charRun = paragraph.getCharacterRun(j);
>>                         }
>>                     }
>>                 }
>>             }
>>             is.close();
>>             return res;
>>         } catch (IOException e) {
>>             logger.error("Error processing the Word file: " + e);
>>             throw new IOException("Error processing the Word file");
>>         } finally {
>>             if (is != null) {
>>                 is.close();
>>             }
>>         }
>>     }
>>
>>
>> Thanks in advance, Fabi.
>>
>> -----Original message-----
>> From: MSB [mailto:[email protected]]
>> Sent: Tuesday, 24 November 2009 8:43
>> To: [email protected]
>> Subject: Re: Modify word document
>>
>>
>> You have not dug down far enough into the structure of the document yet, I
>> am afraid. All of the formatting information is stored (encapsulated)
>> within the CharacterRun class and you need to perform the replacements at
>> that level.
>>
>> I do not have any suitable code at hand as I type this, so what follows
>> will need to be converted into Java and tested:
>>
>> Open the Word document.
>> Get the overall Range for the document.
>> Get the number of Paragraph objects the Range contains.
>> Iterate through the Paragraphs and for each Paragraph
>>     Get the CharacterRun(s) the Paragraph contains.
>>     Call the method to replace the search term with the replacement text
>>     on the CharacterRun
>> Save the modified document away again.
>>
>> You do however face a couple of problems with this. It has been a long
>> time since I tried to write a search and replace routine using HWPF and I
>> could not get it to work if the replacement text was longer than the
>> search term. In that case, HWPF threw an exception and would not allow me
>> to complete the process; but that problem could well have been addressed
>> by now, as it was well known and caused by faulty bounds checking within
>> the Range class. Only testing will prove or disprove this for you, I am
>> afraid.
>>
>> Secondly, the CharacterRun class encapsulates a piece of text with common
>> properties. So, imagine that we are searching for the phrase 'search
>> term' and that the word 'search' has been emboldened whilst the word
>> 'term' has been left as normal text; then my suggested approach will not
>> work. That is because the words 'search' and 'term' will be held in
>> different CharacterRun(s). If you do hit this problem, then I am afraid
>> you will have to write code that searches for the term at the Paragraph
>> level, identifies where the search terms can be found and recovers the
>> CharacterRun(s) that encapsulate them. Once you have these, you can
>> modify the runs or create and substitute new ones, but I have to admit
>> that I have never tried to do this myself. Instead I chose to automate
>> Word using OLE and to explore the possibilities offered by OpenOffice's
>> UNO interface. Both options did work but threw up other problems that
>> proved more limiting (in terms of architecture and platform). If you can
>> get it to work, HWPF offers the better solution IMO.
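
The "search at the Paragraph level, then recover the runs" idea can be
sketched without touching HWPF at all. In the example below plain strings
stand in for the text of each CharacterRun, in document order; RunLocator and
runsContaining are hypothetical names, and mapping the resulting indices back
to real CharacterRun objects is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RunLocator {

    // Given the text of each run in a paragraph, in order, returns the indices
    // of the runs that overlap the first occurrence of searchTerm in the
    // concatenated paragraph text. Returns an empty list if there is no match.
    public static List<Integer> runsContaining(List<String> runTexts, String searchTerm) {
        StringBuilder paragraph = new StringBuilder();
        for (String run : runTexts) {
            paragraph.append(run);
        }

        List<Integer> hits = new ArrayList<Integer>();
        int matchStart = paragraph.indexOf(searchTerm);
        if (matchStart < 0 || searchTerm.isEmpty()) {
            return hits;
        }
        int matchEnd = matchStart + searchTerm.length(); // exclusive

        int runStart = 0;
        for (int i = 0; i < runTexts.size(); i++) {
            int runEnd = runStart + runTexts.get(i).length(); // exclusive
            // A run is a hit when its character span intersects the match span.
            if (runStart < matchEnd && runEnd > matchStart) {
                hits.add(i);
            }
            runStart = runEnd;
        }
        return hits;
    }

    public static void main(String[] args) {
        // 'search' is assumed to be bold and therefore held in its own run.
        List<String> runs = Arrays.asList("Please enter a ", "search", " term", " here.");
        System.out.println(runsContaining(runs, "search term")); // prints [1, 2]
    }
}
```

A search that looks at each run on its own would miss this phrase, because
neither run 1 nor run 2 contains it in full.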
>>
>> Yours
>>
>> Mark B
>>
>>
>> Fabián Avilés Martínez wrote:
>>>
>>> Hi all,
>>>      I have a Word document that serves as a template. In this template
>>> there are some tokenized words, which have to be modified, and the result
>>> has to be saved into another file. The original file has some properties,
>>> like header and footer, images, etc. The resulting file has to be the
>>> same, but with the modified words. I am trying it with the code below,
>>> but it does not work.
>>>
>>> public ByteArrayOutputStream processFile(final InputStream is, final
>>> Map<String, String> replacementText)
>>>         throws IOException {
>>>         Set<String> keys = replacementText.keySet();
>>>         try {
>>>             POIFSFileSystem poifs = new POIFSFileSystem(is);
>>>             HWPFDocument document = new HWPFDocument(poifs);
>>>             Range range = document.getRange();
>>>
>>>             for (int i = 0; i < range.numParagraphs(); i++) {
>>>                 String newTxt = range.getParagraph(i).text();
>>>                 String oldTxt = range.getParagraph(i).text();
>>>                 for (Iterator<String> it = keys.iterator();
>>> it.hasNext();)
>>> {
>>>                     String key = it.next();
>>>                     if (newTxt.contains(key)) {
>>>                         newTxt = replacePlaceholders(key,
>>> replacementText.get(key), newTxt);
>>>                     }
>>>                 }
>>>                 if (!oldTxt.equals(newTxt)) {
>>>                     range.getParagraph(i).replaceText(oldTxt, newTxt);
>>>                 }
>>>             }
>>>
>>>             // Save the document away.
>>>             ByteArrayOutputStream bos = new ByteArrayOutputStream();
>>>             document.write(bos);
>>>             bos.flush();
>>>             bos.close();
>>>             return bos;
>>>         } catch (IOException e) {
>>>             logger.error("Error processing the Word file: " + e);
>>>             throw new IOException("Error processing the Word file");
>>>         } finally {
>>>             if (is != null) {
>>>                 is.close();
>>>             }
>>>         }
>>>     }
>>>
>>> Any help, please?
>>>
>>> Thanks in advance, Fabi.
>>>
>>>
>>>
>>> ______________________
>>> This message including any attachments may contain confidential
>>> information, according to our Information Security Management System,
>>>  and intended solely for a specific individual to whom they are
>>> addressed.
>>>  Any unauthorised copy, disclosure or distribution of this message
>>>  is strictly forbidden. If you have received this transmission in error,
>>>  please notify the sender immediately and delete it.
>>>
>>> ______________________
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: [email protected]
>>> For additional commands, e-mail: [email protected]
>>>
>>>
>>>
>>
>> --
>> View this message in context:
>> http://old.nabble.com/Modify-word-document-tp26480450p26491636.html
>> Sent from the POI - User mailing list archive at Nabble.com.
>>
>
> --
> View this message in context:
> http://old.nabble.com/Modify-word-document-tp26480450p26498333.html
> Sent from the POI - User mailing list archive at Nabble.com.
>

--
View this message in context: 
http://old.nabble.com/Modify-word-document-tp26480450p26498547.html
Sent from the POI - User mailing list archive at Nabble.com.

