If you just need the text from the document, try the ExtractorFactory:

============================
import java.io.File;
import org.apache.poi.extractor.ExtractorFactory;
import org.apache.poi.POITextExtractor;

public class GetTextExample {

        public static void main(String[] args) {
                try {
                        // Path to the document; adjust for your machine
                        File inputFile = new File("c:\\test\\docs\\test.docx");
                        // The factory picks an extractor that matches the file format
                        POITextExtractor extractor = ExtractorFactory.createExtractor(inputFile);
                        System.out.println("Word Document Text: ");
                        System.out.println("====================");
                        // getText() returns the document's plain text as one String
                        System.out.println(extractor.getText());
                }
                catch (Exception ex) {
                        // Unreadable file or unsupported format
                        ex.printStackTrace();
                }
        }

}

============================
My classpath:
        poi-3.5-beta5/poi-3.5-beta5-20090219.jar
        poi-3.5-beta5/poi-contrib-3.5-beta5-20090219.jar
        poi-3.5-beta5/poi-ooxml-3.5-beta5-20090219.jar
        poi-3.5-beta5/poi-scratchpad-3.5-beta5-20090219.jar
        poi-3.5-beta5/lib/log4j-1.2.13.jar
        poi-3.5-beta5/ooxml-lib/*.jar
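
For context on what an extractor has to deal with: a .docx is a ZIP package, and Word stores the main body in word/document.xml, with the text in w:t elements inside w:p paragraphs. Below is a rough, POI-free sketch built on those facts, using only the JDK. The class name DocxTextSketch and the method extractText are made up for illustration. This is not how ExtractorFactory is implemented, and it ignores headers, footers, footnotes, tabs and line breaks, so treat it as a way to see the file layout rather than a replacement for the example above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of POI.
public class DocxTextSketch {

    // Namespace of the WordprocessingML elements (w:p, w:t, ...)
    private static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Reads a .docx stream and returns one line of text per paragraph.
    public static String extractText(InputStream docx) throws Exception {
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                // Assumption: the main body lives at this conventional path
                if ("word/document.xml".equals(entry.getName())) {
                    return textOf(readAll(zip));
                }
            }
        }
        throw new IOException("word/document.xml not found in package");
    }

    // Copies the current ZIP entry into memory.
    private static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    // Concatenates the w:t text of each w:p, ending each paragraph with '\n'.
    // Paragraphs nested inside other paragraphs (text boxes) would be repeated.
    private static String textOf(byte[] xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml));
        StringBuilder sb = new StringBuilder();
        NodeList paragraphs = doc.getElementsByTagNameNS(W_NS, "p");
        for (int i = 0; i < paragraphs.getLength(); i++) {
            Element p = (Element) paragraphs.item(i);
            NodeList texts = p.getElementsByTagNameNS(W_NS, "t");
            for (int j = 0; j < texts.getLength(); j++) {
                sb.append(texts.item(j).getTextContent());
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java DocxTextSketch <file.docx>");
            return;
        }
        try (InputStream in = new FileInputStream(args[0])) {
            System.out.print(extractText(in));
        }
    }
}
```

It takes the path to the .docx as its only argument and needs nothing on the classpath beyond the JDK. For anything beyond a quick look, the ExtractorFactory route is the one I would use.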

HTH
Leigh

> pof wrote:
> > 
> > Hi, I was wondering if someone could provide an example how to parse out
> > the plain text from a docx using poi 3.5 beta5?
> > 
> > Cheers, Brett.
> > 



      


