Thank you very much, Nick!

-----Original Message-----
From: Nick Burch [mailto:[email protected]]
Sent: Thursday, August 26, 2010 9:55 AM
To: POI Users List
Subject: Re: detect format using POI

On Thu, 26 Aug 2010, [email protected] wrote:
> Are there utilities in POI to detect a file's format (suppose the file
> has no dos extension), at least for office files? If so can somebody
> please point me to the spot?

If you're really not sure at all of the format, use Apache Tika:
http://tika.apache.org/0.7/detection.html

POI has org.apache.poi.extractor.ExtractorFactory, which will pick the
correct kind of text extractor for any supported file format, which may
get you close. From a POIOLE2TextExtractor or POIXMLTextExtractor you can
get at the underlying open document.

Nick

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
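
For anyone finding this thread later: below is a rough sketch of the kind of check involved when a file has no extension. It is not POI's or Tika's API, just a hand-rolled look at the leading bytes using only the JDK. The class name MagicSniffer, the Kind enum, and the detect methods are made up for illustration. It relies on two well-known signatures: OLE2 compound documents (legacy .doc/.xls/.ppt) start with D0 CF 11 E0 A1 B1 1A E1, and OOXML files (.docx/.xlsx/.pptx) are ZIP packages, which normally start with "PK" 03 04.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper, not part of POI or Tika.
final class MagicSniffer {

    /** Coarse container kinds; names are illustrative only. */
    enum Kind { OLE2, ZIP, UNKNOWN }

    // Signature of an OLE2 compound document (legacy binary Office files).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // ZIP local file header signature; OOXML files are ZIP packages.
    private static final byte[] ZIP_MAGIC = { 'P', 'K', 0x03, 0x04 };

    /** Classifies a buffer holding the first bytes of a file. */
    static Kind detect(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) return Kind.OLE2;
        if (startsWith(header, ZIP_MAGIC)) return Kind.ZIP;
        return Kind.UNKNOWN;
    }

    /** Reads up to 8 leading bytes of the file and classifies them. */
    static Kind detect(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            byte[] header = in.readNBytes(8); // requires Java 11 or later
            return detect(header);
        }
    }

    // True if data begins with prefix; false if data is shorter than prefix.
    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) return false;
        return Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }
}
```

Note the limits of this approach: ZIP only says the file is some ZIP archive (it could be a .jar or an OpenDocument file as easily as a .docx), and OLE2 does not tell Word from Excel from PowerPoint. Telling those apart means looking inside the container, which is what Tika or POI's ExtractorFactory, as Nick suggests, are for.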
