Hi,

On Sat, Feb 14, 2009 at 12:59 PM, Akil Ali <[email protected]> wrote:
> i can see that there are numbers of filters available in the latest version.
> But will it be able to extract the contents of office 2007 documents. is
> anyone tested with indexing contents of office 2007 documents.

See JCR-1887 [1] for a patch that adds support for indexing Office 2007
documents. Alternatively, the latest trunk of Apache Tika [2] also
supports Office 2007, and the jackrabbit-tika sandbox component [3]
allows you to set up Tika as a text extractor in Jackrabbit.

We will most likely have Office 2007 support built in when Jackrabbit
1.6 is released.

[1] https://issues.apache.org/jira/browse/JCR-1887
[2] http://lucene.apache.org/tika/
[3] http://svn.apache.org/repos/asf/jackrabbit/sandbox/jackrabbit-tika/

BR,

Jukka Zitting
