tika-user  

Office 2007?

Mark Kerzner
Wed, 11 Nov 2009 18:43:13 -0800

Hi,

I tried to extract text from Office 2007 Word and Excel files, and Tika thinks
they are XML files. The "file" command in Linux thinks they are "zip" files.

Where should I look for the current list of supported formats? What are the
plans for Office 2007 support?

Thank you,
Mark