Mark Kerzner
Wed, 11 Nov 2009 18:43:13 -0800
Hi, I tried to extract text from Office 2007 Word and Excel files, and Tika thinks they are XML files. The "file" command in Linux thinks they are "zip" files.
Where should I look for the current list of supported formats? What are the plans for Office 2007 support? Thank you, Mark