Hello,

I'm looking at using Tika to extract information from office documents (among others) and using that information to build a Lucene index. However, I need to be able to extract information from both OLE2 and OOXML office docs. The website says that OOXML support is awaiting a 3.5 release; however, the ticket suggests it is already working on the head. Is this the case? If so, I'd be keen to take the head and 'give it a go' (reporting any problems I find back to the dev group, of course).

Thanks for your time.

Cheers,

Neil
