OK, I have managed to make some more progress and have hit a problem; today
has been a case of taking one step forward and two back, I am afraid.

Firstly, this code uses the CharacterRun class to perform the replacements
and DOES preserve the formatting:

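    import java.io.*;
    import java.util.*;
    import org.apache.poi.hwpf.HWPFDocument;
    import org.apache.poi.hwpf.usermodel.CharacterRun;
    import org.apache.poi.hwpf.usermodel.Paragraph;
    import org.apache.poi.hwpf.usermodel.Range;
    import org.apache.poi.poifs.filesystem.POIFSFileSystem;

    public void searchAndReplace(String inputFilename,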
                                 String outputFilename,
                                 HashMap<String, String> replacements) {
        
        File inputFile = null;
        File outputFile = null;
        FileInputStream fileIStream = null;
        FileOutputStream fileOStream = null;
        BufferedInputStream bufIStream = null;
        BufferedOutputStream bufOStream = null;
        POIFSFileSystem fileSystem = null;
        HWPFDocument document = null;
        Range docRange = null;
        Paragraph paragraph = null;
        CharacterRun charRun = null;
        Set<String> keySet = null;
        Iterator<String> keySetIterator = null;
        int numParagraphs = 0;
        int numCharRuns = 0;
        String text = null;
        String key = null;
        String value = null;
        
        try {
            // Create an instance of the POIFSFileSystem class and
            // attach it to the Word document using an InputStream.
            inputFile = new File(inputFilename);
            fileIStream = new FileInputStream(inputFile);
            bufIStream = new BufferedInputStream(fileIStream);
            fileSystem = new POIFSFileSystem(bufIStream);
        
            document = new HWPFDocument(fileSystem);
          
            docRange = document.getRange();
            
            numParagraphs = docRange.numParagraphs();
           
            keySet = replacements.keySet();
            
            for(int i = 0; i < numParagraphs; i++) {
                paragraph = docRange.getParagraph(i);
                numCharRuns = paragraph.numCharacterRuns();
                for(int j = 0; j < numCharRuns; j++) {
                    charRun = paragraph.getCharacterRun(j);
                    text = charRun.text();
                    System.out.println("Character Run text: " + text);
                    keySetIterator = keySet.iterator();
                    while(keySetIterator.hasNext()) {
                        key = keySetIterator.next();
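                        if(text.contains(key)) {
                            value = replacements.get(key);
                            charRun.replaceText(key, value);
                            // The replacement may differ in length from
                            // the key, so recover the range, paragraph
                            // and character run again before carrying on.
                            docRange = document.getRange();
                            paragraph = docRange.getParagraph(i);
                            charRun = paragraph.getCharacterRun(j);
                            text = charRun.text();
                        }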
                    }
                }
            }
            
            bufIStream.close();
            bufIStream = null;
            
            outputFile = new File(outputFilename);
            fileOStream = new FileOutputStream(outputFile);
            bufOStream = new BufferedOutputStream(fileOStream);
            
            document.write(bufOStream);
        
        }
        catch(Exception ex) {
            System.out.println("Caught an: " + ex.getClass().getName());
            System.out.println("Message: " + ex.getMessage());
            System.out.println("Stacktrace follows.............");
            ex.printStackTrace(System.out);
        }
        finally {
            if(bufIStream != null) {
                try {
                    bufIStream.close();
                    bufIStream = null;
                }
                catch(Exception ex) {
                    // I G N O R E //
                }
            }
            if(bufOStream != null) {
                try {
                    bufOStream.flush();
                    bufOStream.close();
                    bufOStream = null;
                }
                catch(Exception ex) {
                    // I G N O R E //
                }
            }
        }
    }


To call it, use something like the following:

HashMap<String, String> replacements = new HashMap<String, String>();
replacements.put("${Search Term One}", "First  Replacement Value");
replacements.put("${Search Term Two}", "Second Replacement Value");
replacements.put("${Search Term Three}", "Third Replacement Value");
replacements.put("${Search Term Four}", "Fourth Replacement Value");
        
new Main().searchAndReplace("C:/temp/test_file3.doc",
                            "C:/temp/test_file 1.doc",
                            replacements);

There is one interesting twist to the code - at least I found it
interesting. After making an insertion, it is necessary to recover the
document's contents again, because the value being inserted can differ in
length from the text you searched for. Failing to allow for this caused
quite a few problems.
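
To illustrate the length problem away from HWPF, here is a minimal sketch
that uses a plain StringBuilder. It is not POI code, and the names
OffsetShiftDemo and replaceAll, as well as the ${project} token, are made up
for the example. The only point is that every position after a replaced
token moves, so the next search has to start from a freshly computed index
rather than from one worked out before the replacement.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo class; plain strings only, no HWPF involved.
class OffsetShiftDemo {

    // Replaces every occurrence of each key in the input with its value.
    static String replaceAll(String input, Map<String, String> replacements) {
        StringBuilder buffer = new StringBuilder(input);
        for (Map.Entry<String, String> entry : replacements.entrySet()) {
            String key = entry.getKey();
            String value = entry.getValue();
            if (key.isEmpty()) {
                continue; // an empty key would match everywhere
            }
            int index = buffer.indexOf(key);
            while (index >= 0) {
                buffer.replace(index, index + key.length(), value);
                // The text after 'index' has shifted by
                // (value.length() - key.length()), so search again
                // from just past the text that was inserted.
                index = buffer.indexOf(key, index + value.length());
            }
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        Map<String, String> replacements = new LinkedHashMap<>();
        replacements.put("${organization}", "Apache Software Foundation");
        replacements.put("${project}", "POI");
        System.out.println(
            replaceAll("${organization} hosts ${project}.", replacements));
    }
}
```

In the HWPF code the equivalent of recomputing the index is getting the
Range, Paragraph and CharacterRun again after each call to replaceText().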

The BIG problem is that the files the code produces can be opened by Word
but are somehow corrupted. If you try to modify the file using Word, or even
to save it again under a different name, the application (Word, that is)
seems to hang. I found that the same problem could be reproduced even if I
simply 'copied' the file using this code:

inputFile = new File(inputFilename);
fileIStream = new FileInputStream(inputFile);
bufIStream = new BufferedInputStream(fileIStream);
fileSystem = new POIFSFileSystem(bufIStream);
document = new HWPFDocument(fileSystem);
outputFile = new File(outputFilename);
fileOStream = new FileOutputStream(outputFile);
bufOStream = new BufferedOutputStream(fileOStream);
document.write(bufOStream);

Currently, I am using 3.2 final, which is getting old now. After I have
posted this message, I am going to get the most recent build and try with
that to see if the same issue occurs. I will let you know.

PS The first chunk of code I posted also has the corrupted file issue.




bfavro-2 wrote:
> Can anyone offer help on modifying Word 97-2003 documents in HWPF using
> POI 3.2?
>
> I am able to use insertBefore(text) and insertAfter(text) without any
> issues but my requirement is to change/replace tokens in the Word file
> with dynamic text much like the unit tests, e.g. ${organization} with
> "Apache Software Foundation"
>
> When I do this with Range.replaceText("${organization}", "Apache Software
> Foundation") the doc file keeps getting corrupted.  In viewing the file in
> a text editor I can see the replacement text but it is not compressed like
> the rest of the file.


