The Apache Tika project is pleased to announce the release of Apache
Tika 4.0.0-beta-1. The release contents have been pushed out to the main
Apache release site and to the Maven Central sync.

Apache Tika is a toolkit for detecting and extracting metadata and
structured text content from various documents using existing parser
libraries.

Apache Tika 4.0.0-beta-1 includes a switch to Markdown as
the default content handler, new runnable zip distributions for
tika-app, tika-server and tika-eval (with drop-in pf4j pipes plugins)
in place of shaded jars, and a new maxPages option in PDFParserConfig
to cap PDF page processing. This release also includes dependency
upgrades, including Jetty 12.x and CXF 4.1.x.

Details can be found in the changes file:
https://www.apache.org/dist/tika/4.0.0-beta-1/CHANGES-4.0.0-beta-1.txt
and on our draft 4.x docs site:
https://tika.apache.org/docs/4.0.0-SNAPSHOT/

Apache Tika is available on the download page:
https://tika.apache.org/download.html

Apache Tika will also be available shortly in binary form, or for use
with Maven, from the Central Repository:
https://repo1.maven.org/maven2/org/apache/tika/

When downloading, please remember to verify the downloads using the
release signatures and the keys found at: https://www.apache.org/dist/tika/KEYS
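
Signature verification itself needs an OpenPGP tool such as gpg together
with the KEYS file. As a complementary integrity check, and assuming a
SHA-512 checksum is published alongside the artifact you downloaded, the
digest can be recomputed locally. The following is a minimal sketch using
only the JDK; the class name ChecksumCheck and the command-line arguments
are illustrative and not part of Tika, and a matching checksum does not
replace checking the signature.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;

// Hypothetical helper, not part of Apache Tika.
public class ChecksumCheck {

    // Streams the file through SHA-512 and returns the digest as lowercase hex.
    static String sha512Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-512");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Compares the file's digest with an expected hex digest,
    // ignoring case and any whitespace in the expected value.
    // Assumption: expectedHex holds only the digest, with no file name.
    static boolean matches(Path file, String expectedHex)
            throws IOException, NoSuchAlgorithmException {
        String normalized = expectedHex.replaceAll("\\s+", "").toLowerCase(Locale.ROOT);
        return sha512Hex(file).equals(normalized);
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: java ChecksumCheck <downloaded-file> <expected-sha512-hex>");
            System.exit(2);
        }
        boolean ok = matches(Paths.get(args[0]), args[1]);
        System.out.println(ok ? "checksum matches" : "checksum MISMATCH");
        if (!ok) {
            System.exit(1);
        }
    }
}
```

Run it with the path of the downloaded file and the published hex digest
as the two arguments; a non-zero exit status indicates a mismatch.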

For more information on Apache Tika, visit the project home page:
https://tika.apache.org/


-- Tim Allison, on behalf of the Apache Tika community
