tika-user  

parsing old Excel files

Tomas Fernandez Lobbe
Wed, 16 Dec 2009 11:27:00 -0800

Hi, I'm trying to parse a large set of Microsoft Word and Microsoft Excel 
files. I'm having a problem with some old Excel files: they are not being 
parsed (both the metadata and the content are empty after parsing them).


For example, if I run a test similar to ExcelParserTest with my old Excel file, 
the parsing doesn't return any data. 
Debugging the parser code (OfficeParser) a little, I found that there is no 
entry named "Workbook" in this Excel file; there is an entry named "Book" 
instead. In any case, the ExcelExtractor won't work with this file either 
(I tried it).
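
To make the check concrete, here is a minimal sketch of the test described above, written without any Tika or POI dependency. It assumes the top-level entry names of the file have already been collected into a collection of strings. The class name ExcelStreamCheck, the enum Kind and the method classify are hypothetical, and matching the names case-insensitively is my own assumption, not something taken from OfficeParser.

```java
import java.util.Collection;

/** Hypothetical helper for illustration; not part of Tika or POI. */
final class ExcelStreamCheck {

    /** What the top-level entry names suggest about the file. */
    enum Kind { WORKBOOK, OLD_BOOK, NOT_EXCEL }

    private ExcelStreamCheck() {}

    /**
     * Classifies a file from the names of its top-level entries.
     * "Workbook" wins if both names are present; "Book" alone is
     * reported as the old layout that is not being parsed.
     * Case-insensitive matching is an assumption of this sketch.
     */
    static Kind classify(Collection<String> entryNames) {
        boolean sawBook = false;
        for (String name : entryNames) {
            if ("Workbook".equalsIgnoreCase(name)) {
                return Kind.WORKBOOK;
            }
            if ("Book".equalsIgnoreCase(name)) {
                sawBook = true;
            }
        }
        return sawBook ? Kind.OLD_BOOK : Kind.NOT_EXCEL;
    }
}
```

For my problem file, the entry list contains "Book" but not "Workbook", so a check like this would return OLD_BOOK, which could at least be used to report the file as unsupported instead of silently returning empty results.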

Has anyone faced this problem before? Does anybody know the earliest Excel 
version that can be parsed with Tika?

Thanks


Tomás

