The Apache Lucene project is pleased to announce the immediate availability of Apache Tika 0.2.

Apache Tika, a subproject of Apache Lucene, is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries.

Apache Tika 0.2 contains a number of improvements and bug fixes. Details can be found in the changes file:

http://www.apache.org/dist/lucene/tika/CHANGES-0.2.txt

Apache Tika is available in source form from the following download page:

http://www.apache.org/dyn/closer.cgi/lucene/tika/apache-tika-0.2-src.tar.gz

Apache Tika is also available in binary form, or for use with Maven 2, from the Central Maven Repositories:

http://repo1.maven.org/maven2/org/apache/tika/0.2/
http://mirrors.ibiblio.org/pub/mirrors/maven2/org/apache/tika/0.2/

During the first 48 hours, the release may not be available on all mirrors. When downloading from a mirror site, please remember to verify the downloads using signatures found on the Apache site:

http://www.apache.org/dist/lucene/tika/KEYS

For more information on Apache Tika, visit the project home page:

http://lucene.apache.org/tika

-- 
Dave Meikle (on behalf of the Apache Lucene community)