On Thu, 26 Aug 2010, [email protected] wrote:
> Are there utilities in POI to detect a file's format (suppose the file
> has no dos extension), at least for office files? If so can somebody
> please point me to the spot?
If you're really not sure at all of the format, use Apache Tika:
http://tika.apache.org/0.7/detection.html
POI itself has org.apache.poi.extractor.ExtractorFactory, which picks the
correct kind of text extractor for any supported file format, so that may
get you close. From a POIOLE2TextExtractor or POIXMLTextExtractor you can
then get at the underlying open document.
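
If all you need is a rough first cut before handing the file to POI or
Tika, a minimal sketch along the following lines may help. It does not use
POI or Tika at all: it only compares the first bytes of the file against
the well-known OLE2 signature (used by .doc/.xls/.ppt) and the ZIP
local-file signature (used by .docx/.xlsx/.pptx, but also by any other zip
archive). The class name FormatSniffer and its methods are hypothetical
names made up for this illustration, and the check cannot tell a Word file
from an Excel file, so treat it as a coarse filter only.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of POI or Tika: classifies a file by its
// leading "magic" bytes instead of its extension.
public class FormatSniffer {
    // OLE2 compound document signature (legacy binary Office formats).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    // ZIP local file header signature (OOXML files are zip packages).
    private static final byte[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    // Returns "OLE2", "ZIP" or "UNKNOWN" for the given leading bytes.
    public static String sniff(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) {
            return "OLE2";
        }
        if (startsWith(header, ZIP_MAGIC)) {
            return "ZIP";
        }
        return "UNKNOWN";
    }

    // Reads up to 8 bytes from the file and classifies them.
    public static String sniffFile(Path file) throws IOException {
        byte[] header = new byte[8];
        int total = 0;
        try (InputStream in = Files.newInputStream(file)) {
            while (total < header.length) {
                int n = in.read(header, total, header.length - total);
                if (n < 0) {
                    break; // file is shorter than 8 bytes
                }
                total += n;
            }
        }
        byte[] actual = new byte[total];
        System.arraycopy(header, 0, actual, 0, total);
        return sniff(actual);
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data == null || data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java FormatSniffer <file>");
            return;
        }
        System.out.println(sniffFile(Paths.get(args[0])));
    }
}
```

A result of "ZIP" only says the file is some zip archive; to confirm it is
really an Office document you would still need to open it with POI or run
it through Tika.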
Nick