On Thu, 22 Sep 2011, Jebarlin Robertson wrote:
I am using Apache POI for reading MS Office 2007 documents, so I need some sample code to read the text from all the MS Office documents.

If you just want plain text, POI already has code for this. You'd want classes such as org.apache.poi.xssf.extractor.XSSFExcelExtractor (for .xlsx) and org.apache.poi.xslf.extractor.XSLFPowerPointExtractor (for .pptx).
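As a rough sketch of the Excel case (not tested against every POI release): the class name XlsxTextDump and the helper extractText are made up for illustration, and it assumes a POI version where the extractor can be built from an XSSFWorkbook and used in try-with-resources. The PowerPoint extractor follows the same general pattern, though its constructors and availability vary between POI versions.

```java
import java.io.FileInputStream;
import java.io.InputStream;

import org.apache.poi.xssf.extractor.XSSFExcelExtractor;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

// Hypothetical helper class, for illustration only.
public class XlsxTextDump {

    // Reads an .xlsx workbook from the stream and returns its cell text.
    static String extractText(InputStream in) throws Exception {
        // Closing the extractor is assumed to release the underlying workbook too.
        try (XSSFExcelExtractor extractor =
                 new XSSFExcelExtractor(new XSSFWorkbook(in))) {
            return extractor.getText();
        }
    }

    public static void main(String[] args) throws Exception {
        // args[0] is the path to an .xlsx file.
        try (InputStream in = new FileInputStream(args[0])) {
            System.out.println(extractText(in));
        }
    }
}
```

You need the poi-ooxml jar and its dependencies on the classpath for this to compile and run.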

Apache Tika may also help you here. It has code to extract content from a wide variety of file formats (including Microsoft Office ones, via POI), all through a common interface.
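With Tika, a minimal sketch might look like the following. It uses the org.apache.tika.Tika facade class; the class name TikaTextDump and the helper extractText are made up for illustration, and it assumes the Tika parsers (not just tika-core) are on the classpath so that Office formats are actually handled.

```java
import java.io.FileInputStream;
import java.io.InputStream;

import org.apache.tika.Tika;

// Hypothetical helper class, for illustration only.
public class TikaTextDump {

    // Lets Tika detect the format and returns the document's plain text.
    static String extractText(InputStream in) throws Exception {
        Tika tika = new Tika();
        // parseToString truncates long output by default; -1 disables the limit.
        tika.setMaxStringLength(-1);
        return tika.parseToString(in);
    }

    public static void main(String[] args) throws Exception {
        // args[0] is the path to any supported file (.docx, .xlsx, .pptx, ...).
        try (InputStream in = new FileInputStream(args[0])) {
            System.out.println(extractText(in));
        }
    }
}
```

The same call is used whatever the file type, so you don't have to pick an extractor class per format yourself.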

Nick
