[ANNOUNCE] Apache Tika 3.0.0-BETA2 released
The Apache Tika project is pleased to announce the release of Apache Tika 3.0.0-BETA2. The release contents have been pushed out to the main Apache release site and to the Maven Central sync. Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. Apache Tika 3.0.0-BETA2 includes numerous bug fixes and dependency upgrades. The biggest change in the 3.x branch is that it requires >= Java 11. Details can be found in the changes file: https://www.apache.org/dist/tika/3.0.0-BETA2/CHANGES-3.0.0-BETA2.txt Apache Tika is available on the download page: https://tika.apache.org/download.html Apache Tika will be available shortly in binary form or for use using Maven 2 from the Central Repository: https://repo1.maven.org/maven2/org/apache/tika/ When downloading, please remember to verify the downloads using signatures found: https://www.apache.org/dist/tika/KEYS For more information on Apache Tika, visit the project home page: https://tika.apache.org/ NOTE: This release requires Java 11. We plan to support the 2.x branch (which requires Java 8) for six months after the release of 3.0.0. -- Tim Allison, on behalf of the Apache Tika community
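As a complement to the signature check mentioned above, a downloaded artifact can also be compared against a published SHA-512 checksum. The following is a minimal sketch using only the JDK; it assumes a checksum value has been obtained separately from the release site, and the class and method names (Sha512Check, sha512Hex, matches) are hypothetical. It does not replace verifying the PGP signature against the KEYS file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;

// Hypothetical helper: compares a file's SHA-512 digest with a published value.
final class Sha512Check {

    private static MessageDigest newDigest() {
        try {
            return MessageDigest.getInstance("SHA-512");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-512 is not available", e);
        }
    }

    private static String toHex(byte[] bytes) {
        StringBuilder hex = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Digest of an in-memory byte array, as lowercase hex.
    static String sha512Hex(byte[] data) {
        return toHex(newDigest().digest(data));
    }

    // Digest of a file, streamed in chunks so large archives are not loaded whole.
    static String sha512Hex(Path file) {
        MessageDigest digest = newDigest();
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return toHex(digest.digest());
    }

    // Ignores whitespace and letter case in the published value before comparing.
    static boolean matches(String actualHex, String publishedHex) {
        String normalized = publishedHex.replaceAll("\\s+", "").toLowerCase(Locale.ROOT);
        return actualHex.equals(normalized);
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: Sha512Check <downloaded-file> <published-sha512-hex>");
            return;
        }
        String actual = sha512Hex(Paths.get(args[0]));
        System.out.println(matches(actual, args[1]) ? "checksum OK" : "checksum MISMATCH");
    }
}
```

The two arguments are the path of the downloaded file and the checksum string copied from the release site; a mismatch means the download should be discarded.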
[ANNOUNCE] Apache Tika 2.9.2 released
The Apache Tika project is pleased to announce the release of Apache Tika 2.9.2. The release contents have been pushed out to the main Apache release site and to the Maven Central sync. Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. Apache Tika 2.9.2 includes numerous bug fixes and dependency upgrades. Details can be found in the changes file: https://www.apache.org/dist/tika/2.9.2/CHANGES-2.9.2.txt Apache Tika is available on the download page: https://tika.apache.org/download.html Apache Tika is also available in binary form or for use using Maven 2 from the Central Repository: https://repo1.maven.org/maven2/org/apache/tika/ When downloading, please remember to verify the downloads using signatures found: https://www.apache.org/dist/tika/KEYS For more information on Apache Tika, visit the project home page: https://tika.apache.org/ -- Tim Allison, on behalf of the Apache Tika community
[ANNOUNCE] Apache Tika 3.0.0-BETA released
The Apache Tika project is pleased to announce the release of Apache Tika 3.0.0-BETA. The release contents have been pushed out to the main Apache release site and to the Maven Central sync. Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. Apache Tika 3.0.0-BETA includes numerous bug fixes and dependency upgrades. The biggest change in the 3.x branch is that it requires >= Java 11. Details can be found in the changes file: https://www.apache.org/dist/tika/3.0.0-BETA/CHANGES-3.0.0-BETA.txt Apache Tika is available on the download page: https://tika.apache.org/download.html Apache Tika will be available shortly in binary form or for use using Maven 2 from the Central Repository: https://repo1.maven.org/maven2/org/apache/tika/ When downloading, please remember to verify the downloads using signatures found: https://www.apache.org/dist/tika/KEYS For more information on Apache Tika, visit the project home page: https://tika.apache.org/ NOTE: Users of the tika-pipes Solr jars (tika-emitter-solr and tika-pipes-iterator-solr) should take steps to mitigate the risks of logback related CVEs: CVE-2023-6481/CVE-2023-6378. NOTE: This release requires Java 11. We plan to support the 2.x branch (which requires Java 8) for six months after the release of 3.0.0. -- Tim Allison, on behalf of the Apache Tika community
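Because the 3.x branch requires Java 11 or later, an application that may still be launched on an older runtime can check the runtime version at startup before touching any Tika classes. The sketch below is only an illustration: the class name JavaVersionCheck and its methods are hypothetical, and it reads the standard java.specification.version system property so that it also runs on Java 8 (where the value has the form "1.8").

```java
// Hypothetical startup guard; not part of Apache Tika.
final class JavaVersionCheck {

    // Maps "1.8" to 8 and "11" or "17" to 11 or 17.
    static int majorVersion(String specVersion) {
        String[] parts = specVersion.trim().split("\\.");
        int first = Integer.parseInt(parts[0]);
        if (first == 1 && parts.length > 1) {
            // Legacy scheme used up to Java 8: "1.<major>"
            return Integer.parseInt(parts[1]);
        }
        return first;
    }

    static boolean meetsMinimum(String specVersion, int minimumMajor) {
        return majorVersion(specVersion) >= minimumMajor;
    }

    public static void main(String[] args) {
        String spec = System.getProperty("java.specification.version");
        if (!meetsMinimum(spec, 11)) {
            System.err.println("Java " + spec + " detected; the Tika 3.x branch requires Java 11 or later.");
            System.exit(1);
        }
        System.out.println("Java " + spec + " satisfies the Java 11 requirement.");
    }
}
```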
[ANNOUNCE] Apache Tika 2.9.1 released
The Apache Tika project is pleased to announce the release of Apache Tika 2.9.1. The release contents have been pushed out to the main Apache release site and to the Maven Central sync. Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. Apache Tika 2.9.1 includes numerous bug fixes and dependency upgrades. Details can be found in the changes file: https://www.apache.org/dist/tika/2.9.1/CHANGES-2.9.1.txt Apache Tika is available on the download page: https://tika.apache.org/download.html Apache Tika is also available in binary form or for use using Maven 2 from the Central Repository: https://repo1.maven.org/maven2/org/apache/tika/ When downloading, please remember to verify the downloads using signatures found: https://www.apache.org/dist/tika/KEYS For more information on Apache Tika, visit the project home page: https://tika.apache.org/ -- Tim Allison, on behalf of the Apache Tika community
[ANNOUNCE] Apache Tika 2.8.0 released
The Apache Tika project is pleased to announce the release of Apache Tika 2.8.0. The release contents have been pushed out to the main Apache release site and to the Maven Central sync. Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. Apache Tika 2.8.0 includes optional handling of incremental updates in PDFs, a fix for a bug that had prevented the running of exiftool and ffmpeg by default, and the move of the GeoTopic parser back to its 1.x namespace: o.a.t.parser.geo.topic. There are several other improvements, bug fixes and dependency upgrades. Details can be found in the changes file: https://www.apache.org/dist/tika/2.8.0/CHANGES-2.8.0.txt Apache Tika is available on the download page: https://tika.apache.org/download.html Apache Tika is also available in binary form or for use using Maven 2 from the Central Repository: https://repo1.maven.org/maven2/org/apache/tika/ When downloading, please remember to verify the downloads using signatures found: https://www.apache.org/dist/tika/KEYS For more information on Apache Tika, visit the project home page: https://tika.apache.org/ -- Tim Allison, on behalf of the Apache Tika community
[ANNOUNCE] Apache Tika 2.7.0 released
The Apache Tika project is pleased to announce the release of Apache Tika 2.7.0. The release contents have been pushed out to the main Apache release site and to the Maven Central sync. Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. Apache Tika 2.7.0 includes a new optional integration with Siegfried, a bug fix for the OpenSearch emitter and several dependency upgrades. Details can be found in the changes file: https://www.apache.org/dist/tika/2.7.0/CHANGES-2.7.0.txt Apache Tika is available on the download page: https://tika.apache.org/download.html Apache Tika is also available in binary form or for use using Maven 2 from the Central Repository: https://repo1.maven.org/maven2/org/apache/tika/ When downloading, please remember to verify the downloads using signatures found: https://www.apache.org/dist/tika/KEYS For more information on Apache Tika, visit the project home page: https://tika.apache.org/ -- Tim Allison, on behalf of the Apache Tika community
[ANNOUNCE] Apache Tika 2.6.0 released
The Apache Tika project is pleased to announce the release of Apache Tika 2.6.0. The release contents have been pushed out to the main Apache release site and to the Maven Central sync. Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. Apache Tika 2.6.0 includes a new optional integration with Siegfried, a bug fix for the OpenSearch emitter and several dependency upgrades. Details can be found in the changes file: https://www.apache.org/dist/tika/2.6.0/CHANGES-2.6.0.txt Apache Tika is available on the download page: https://tika.apache.org/download.html Apache Tika is also available in binary form or for use using Maven 2 from the Central Repository: https://repo1.maven.org/maven2/org/apache/tika/ When downloading, please remember to verify the downloads using signatures found: https://www.apache.org/dist/tika/KEYS For more information on Apache Tika, visit the project home page: https://tika.apache.org/ -- Tim Allison, on behalf of the Apache Tika community
[ANNOUNCE] Apache Tika 2.5.0 released
The Apache Tika project is pleased to announce the release of Apache Tika 2.5.0. The release contents have been pushed out to the main Apache release site and to the Maven Central sync. Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. Apache Tika 2.5.0 includes several security upgrades in dependencies. Details can be found in the changes file: https://www.apache.org/dist/tika/2.5.0/CHANGES-2.5.0.txt Apache Tika is available on the download page: https://tika.apache.org/download.html Apache Tika is also available in binary form or for use using Maven 2 from the Central Repository: https://repo1.maven.org/maven2/org/apache/tika/ When downloading, please remember to verify the downloads using signatures found: https://www.apache.org/dist/tika/KEYS For more information on Apache Tika, visit the project home page: https://tika.apache.org/ -- Tim Allison, on behalf of the Apache Tika community
[ANNOUNCE] Apache Tika 1.28.5 released
The Apache Tika project is pleased to announce the release of Apache Tika 1.28.5. The release contents have been pushed out to the main Apache release site and to the Maven Central sync. Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. Apache Tika 1.28.5 contains a security-related fix and dependency upgrades. Details can be found in the changes file: https://www.apache.org/dist/tika/1.28.5/CHANGES-1.28.5.txt NOTE: The 1.x branch is now in security-fixes-only mode. The formal EoL for the 1.x branch is 30 September 2022: https://lists.apache.org/thread/yq6n7o01kw544dvj1jsoqk29g6yqjkp3 If there are no security issues identified by 30 September 2022, this will be the last 1.x version released. Please upgrade to 2.4.x at your earliest convenience. For guidance on this upgrade: https://cwiki.apache.org/confluence/display/TIKA/Migrating+to+Tika+2.0.0 Apache Tika is available on the download page: https://tika.apache.org/download.html Apache Tika is also available in binary form or for use using Maven 2 from the Central Repository: https://repo1.maven.org/maven2/org/apache/tika/ When downloading, please remember to verify the downloads using signatures found: https://www.apache.org/dist/tika/KEYS For more information on Apache Tika, visit the project home page: https://tika.apache.org/ -- Tim Allison, on behalf of the Apache Tika community
CVE-2022-33879: Apache Tika: Incomplete fix and new regex DoS in StandardsExtractingContentHandler
Severity: low Description: The initial fixes in CVE-2022-30126 and CVE-2022-30973 for regexes in the StandardsExtractingContentHandler were insufficient, and we found a separate, new regex DoS in a different regex in the StandardsExtractingContentHandler. These are now fixed in 1.28.4 and 2.4.1 (download: https://tika.apache.org/download.html). See https://tika.apache.org/security.html for a full list of known security issues. Credit: The incomplete fix was discovered and reported by the CodeQL team members [@atorralba (Tony Torralba)](https://github.com/atorralba) and [@jarlob (Jaroslav Lobačevski)](https://github.com/jarlob) from GitHub Security Lab. The new ReDoS was discovered by the Apache Tika team.
[ANNOUNCE] Apache Tika 1.28.4 released
The Apache Tika project is pleased to announce the release of Apache Tika 1.28.4. The release contents have been pushed out to the main Apache release site and to the Maven Central sync. Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. Apache Tika 1.28.4 contains security-related fixes and dependency upgrades. Details can be found in the changes file: https://www.apache.org/dist/tika/1.28.4/CHANGES-1.28.4.txt NOTE: The 1.x branch is now in security-fixes-only mode. The PMC has decided the formal EoL for the 1.x branch is 30 September 2022: https://lists.apache.org/thread/yq6n7o01kw544dvj1jsoqk29g6yqjkp3 Please upgrade to 2.4.1 at your earliest convenience. For guidance on this upgrade: https://cwiki.apache.org/confluence/display/TIKA/Migrating+to+Tika+2.0.0 Apache Tika is available on the download page: https://tika.apache.org/download.html Apache Tika is also available in binary form or for use using Maven 2 from the Central Repository: https://repo1.maven.org/maven2/org/apache/tika/ When downloading, please remember to verify the downloads using signatures found: https://www.apache.org/dist/tika/KEYS For more information on Apache Tika, visit the project home page: https://tika.apache.org/ -- Tim Allison, on behalf of the Apache Tika community
[ANNOUNCE] Apache Tika 2.4.1 released
The Apache Tika project is pleased to announce the release of Apache Tika 2.4.1. The release contents have been pushed out to the main Apache release site and to the Maven Central sync. Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. Apache Tika 2.4.1 includes improved customization and configuration and several upgrades in dependencies. Details can be found in the changes file: https://www.apache.org/dist/tika/2.4.1/CHANGES-2.4.1.txt Apache Tika is available on the download page: https://tika.apache.org/download.html Apache Tika is also available in binary form or for use using Maven 2 from the Central Repository: https://repo1.maven.org/maven2/org/apache/tika/ When downloading, please remember to verify the downloads using signatures found: https://www.apache.org/dist/tika/KEYS For more information on Apache Tika, visit the project home page: https://tika.apache.org/ -- Tim Allison, on behalf of the Apache Tika community
CVE-2022-30973: Apache Tika: Missing fix for CVE-2022-30126 in 1.28.2
Description: We failed to apply the fix for CVE-2022-30126 to the 1.x branch in the 1.28.2 release. In Apache Tika, a regular expression in the StandardsText class, used by the StandardsExtractingContentHandler, could lead to a denial of service caused by backtracking on a specially crafted file. This only affects users who are running the StandardsExtractingContentHandler, which is a non-standard handler. This is fixed in 1.28.3. Mitigation: Avoid using the StandardsExtractingContentHandler or upgrade to Tika 1.28.3 or 2.4.0. Credit: This issue was reported by Cathy Hu, SUSE Software Solutions Germany GmbH.
[ANNOUNCE] Apache Tika 1.28.3 released
The Apache Tika project is pleased to announce the release of Apache Tika 1.28.3. The release contents have been pushed out to the main Apache release site and to the Maven Central sync. Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. Apache Tika 1.28.3 contains security-related fixes and dependency upgrades. Details can be found in the changes file: https://www.apache.org/dist/tika/1.28.3/CHANGES-1.28.3.txt NOTE: The 1.x branch is now in security-fixes-only mode. The PMC has decided the formal EoL for the 1.x branch is 30 September 2022: https://lists.apache.org/thread/yq6n7o01kw544dvj1jsoqk29g6yqjkp3 Please upgrade to 2.4.0 at your earliest convenience. For guidance on this upgrade: https://cwiki.apache.org/confluence/display/TIKA/Migrating+to+Tika+2.0.0 Apache Tika is available on the download page: https://tika.apache.org/download.html Apache Tika is also available in binary form or for use using Maven 2 from the Central Repository: https://repo1.maven.org/maven2/org/apache/tika/ When downloading, please remember to verify the downloads using signatures found: https://www.apache.org/dist/tika/KEYS For more information on Apache Tika, visit the project home page: https://tika.apache.org/ -- Tim Allison, on behalf of the Apache Tika community
CVE-2022-25169: Apache Tika BPGParser Memory Usage DoS
Description: The BPG parser in versions of Tika before 1.28.2 and 2.4.0 may allocate an unreasonable amount of memory on carefully crafted files.
CVE-2022-30126: Apache Tika Regular Expression Denial of Service in Standards Extractor
Severity: low Description: A regular expression in our StandardsText class, used by the StandardsExtractingContentHandler, could lead to a denial of service caused by backtracking on a specially crafted file. This only affects users who are running the StandardsExtractingContentHandler, which is a non-standard handler. This is fixed in 1.28.2 and 2.4.0. Mitigation: Upgrade to 1.28.2 or 2.4.0. Credit: This issue was discovered and reported by the CodeQL team members [@atorralba (Tony Torralba)](https://github.com/atorralba) and [@joefarebrother (Joseph Farebrother)](https://github.com/joefarebrother).
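For readers unfamiliar with this class of issue: a backtracking regex engine can spend a very long time on certain pattern and input combinations. One generic defensive technique in Java, sketched below, is to wrap the input in a CharSequence that stops the match once a time budget is exhausted. This is an illustration only and is not how the Tika fix was implemented; the class names (DeadlineCharSequence, RegexBudget) and the sample pattern are hypothetical and unrelated to the regex in StandardsText.

```java
import java.util.regex.Pattern;

// Hypothetical wrapper: every character read checks a deadline, so a
// runaway backtracking match is aborted instead of running unbounded.
final class DeadlineCharSequence implements CharSequence {
    private final CharSequence inner;
    private final long deadlineNanos;

    DeadlineCharSequence(CharSequence inner, long deadlineNanos) {
        this.inner = inner;
        this.deadlineNanos = deadlineNanos;
    }

    @Override
    public char charAt(int index) {
        if (System.nanoTime() - deadlineNanos > 0) {
            throw new IllegalStateException("regex match exceeded its time budget");
        }
        return inner.charAt(index);
    }

    @Override
    public int length() {
        return inner.length();
    }

    @Override
    public CharSequence subSequence(int start, int end) {
        return new DeadlineCharSequence(inner.subSequence(start, end), deadlineNanos);
    }

    @Override
    public String toString() {
        return inner.toString();
    }
}

final class RegexBudget {
    // Runs Matcher.find() on the input, aborting with IllegalStateException
    // once roughly budgetMillis of wall-clock time has passed.
    static boolean find(Pattern pattern, CharSequence input, long budgetMillis) {
        long deadline = System.nanoTime() + budgetMillis * 1_000_000L;
        return pattern.matcher(new DeadlineCharSequence(input, deadline)).find();
    }

    public static void main(String[] args) {
        // Generic textbook pattern with ambiguous alternation; NOT Tika's regex.
        Pattern risky = Pattern.compile("(a|aa)+$");
        StringBuilder input = new StringBuilder();
        for (int i = 0; i < 40; i++) {
            input.append('a');
        }
        input.append('!');
        try {
            System.out.println("found: " + find(risky, input, 200));
        } catch (IllegalStateException e) {
            System.out.println("aborted: " + e.getMessage());
        }
    }
}
```

The budget is checked on character reads, so it bounds matching time only approximately; upgrading to a fixed release remains the recommended mitigation.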
[ANNOUNCE] Apache Tika 1.28.2 released
The Apache Tika project is pleased to announce the release of Apache Tika 1.28.2. The release contents have been pushed out to the main Apache release site and to the Maven Central sync. Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. Apache Tika 1.28.2 contains security-related and general dependency upgrades. This release also includes a non-trivial upgrade to Apache POI 5.2.0 (TIKA-3164); users will observe significantly more logging from the POI parsers. Details can be found in the changes file: https://www.apache.org/dist/tika/1.28.2/CHANGES-1.28.2.txt NOTE: The 1.x branch is now in security-fixes-only mode. The PMC has decided the formal EoL for the 1.x branch is 30 September 2022: https://lists.apache.org/thread/yq6n7o01kw544dvj1jsoqk29g6yqjkp3 Please upgrade to 2.4.0 at your earliest convenience. For guidance on this upgrade: https://cwiki.apache.org/confluence/display/TIKA/Migrating+to+Tika+2.0.0 Apache Tika is available on the download page: https://tika.apache.org/download.html Apache Tika is also available in binary form or for use using Maven 2 from the Central Repository: https://repo1.maven.org/maven2/org/apache/tika/ When downloading, please remember to verify the downloads using signatures found: https://www.apache.org/dist/tika/KEYS For more information on Apache Tika, visit the project home page: https://tika.apache.org/ -- Tim Allison, on behalf of the Apache Tika community
[ANNOUNCE] Apache Tika 2.4.0 released
The Apache Tika project is pleased to announce the release of Apache Tika 2.4.0. The release contents have been pushed out to the main Apache release site and to the Maven Central sync. Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. Apache Tika 2.4.0 includes several security upgrades in dependencies. Note that we no longer bundle the deeplearning4j dependencies in our tika-dl jar; users must provide those on their own. Details can be found in the changes file: https://www.apache.org/dist/tika/2.4.0/CHANGES-2.4.0.txt Apache Tika is available on the download page: https://tika.apache.org/download.html Apache Tika is also available in binary form or for use using Maven 2 from the Central Repository: https://repo1.maven.org/maven2/org/apache/tika/ When downloading, please remember to verify the downloads using signatures found: https://www.apache.org/dist/tika/KEYS For more information on Apache Tika, visit the project home page: https://tika.apache.org/ -- Tim Allison, on behalf of the Apache Tika community
[ANNOUNCE] Apache Tika 1.28.1 released
The Apache Tika project is pleased to announce the release of Apache Tika 1.28.1. The release contents have been pushed out to the main Apache release site and to the Maven Central sync. Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. Apache Tika 1.28.1 contains security-related and general dependency upgrades. Details can be found in the changes file: https://www.apache.org/dist/tika/1.28.1/CHANGES-1.28.1.txt NOTE: The 1.x branch is now in security-fixes-only mode. The PMC has decided the formal EoL for the 1.x branch is 30 September 2022: https://lists.apache.org/thread/yq6n7o01kw544dvj1jsoqk29g6yqjkp3 Please upgrade to 2.3.0 at your earliest convenience. For guidance on this upgrade: https://cwiki.apache.org/confluence/display/TIKA/Migrating+to+Tika+2.0.0 Apache Tika is available on the download page: https://tika.apache.org/download.html Apache Tika is also available in binary form or for use using Maven 2 from the Central Repository: https://repo1.maven.org/maven2/org/apache/tika/ When downloading, please remember to verify the downloads using signatures found: https://www.apache.org/dist/tika/KEYS For more information on Apache Tika, visit the project home page: https://tika.apache.org/ -- Tim Allison, on behalf of the Apache Tika community
[ANNOUNCE] Apache Tika 1.x End-Of-Life (EOL) announcement
The Apache Tika Project Team would like to inform you that the Apache Tika 1.x branch is now in security-only maintenance until September 30, 2022. After that date, we will not make updates or releases from our 1.x branch. We will continue to make security fixes and security-related dependency upgrades in our 1.x branch as necessary until September 30, 2022. We initially announced this on our website on December 16, 2021 with the release of Tika 2.2.0: https://tika.apache.org/
Questions and Answers:
With the announcement of Tika 1.x EoL, what happens to Tika 1.x resources?
All resources will stay where they are. Users will still be able to download source code from our branch_1x branch via GitHub [1], and published artifacts will remain available on Maven Central and in the Apache archives [2].
[1] https://github.com/apache/tika/tree/branch_1x
[2] https://archive.apache.org/dist/tika/
Is there an immediate need to upgrade to Tika 2.x in my projects?
As of today, there are no known critical vulnerabilities affecting the soon-to-be-released Tika 1.28.1. However, considering that there are several breaking changes in the 2.x branch, we encourage making the migration soon to allow time to adjust your client code as necessary. For up-to-date documentation on migrating to 2.x, see [3].
[3] https://cwiki.apache.org/confluence/display/TIKA/Migrating+to+Tika+2.0.0
My friends / colleagues and I would like to see Tika 1.x being maintained after September 30, 2022. What can we do?
You may fork the existing source and support it on your own.
Kind regards - The Apache Tika Team
[ANNOUNCE] Apache Tika 2.3.0 released
The Apache Tika project is pleased to announce the release of Apache Tika 2.3.0. The release contents have been pushed out to the main Apache release site and to the Maven Central sync. Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. Apache Tika 2.3.0 includes several security upgrades in dependencies, including an upgrade to log4j2 (version 2.17.1). This release also includes a non-trivial upgrade to Apache POI 5.2.0 (TIKA-3164); users will observe significantly more logging from the POI parsers. Details can be found in the changes file: https://www.apache.org/dist/tika/2.3.0/CHANGES-2.3.0.txt Apache Tika is available on the download page: https://tika.apache.org/download.html Apache Tika is also available in binary form or for use using Maven 2 from the Central Repository: https://repo1.maven.org/maven2/org/apache/tika/ When downloading, please remember to verify the downloads using signatures found: https://www.apache.org/dist/tika/KEYS For more information on Apache Tika, visit the project home page: https://tika.apache.org/ -- Tim Allison, on behalf of the Apache Tika community
[ANNOUNCE] Apache Tika 2.2.1 released
The Apache Tika project is pleased to announce the release of Apache Tika 2.2.1. The release contents have been pushed out to the main Apache release site and to the Maven Central sync. Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. Apache Tika 2.2.1 contains an upgrade to log4j2 2.17.0, a critical fix to an OOXML parser regression that was introduced in 2.2.0, and upgrades to other dependencies. Details can be found in the changes file: https://www.apache.org/dist/tika/2.2.1/CHANGES-2.2.1.txt Apache Tika is available on the download page: https://tika.apache.org/download.html Apache Tika is also available in binary form or for use using Maven 2 from the Central Repository: https://repo1.maven.org/maven2/org/apache/tika/ When downloading, please remember to verify the downloads using signatures found: https://www.apache.org/dist/tika/KEYS For more information on Apache Tika, visit the project home page: https://tika.apache.org/ -- Tim Allison, on behalf of the Apache Tika community
[ANNOUNCE] Apache Tika 1.28 released
The Apache Tika project is pleased to announce the release of Apache Tika 1.28. The release contents have been pushed out to the main Apache release site and to the Maven Central sync. Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. Apache Tika 1.28 contains a migration to log4j2 (2.17.0) from log4j as well as several other dependency upgrades. Details can be found in the changes file: https://www.apache.org/dist/tika/1.28/CHANGES-1.28.txt Apache Tika is available on the download page: https://tika.apache.org/download.html Apache Tika is also available in binary form or for use using Maven 2 from the Central Repository: https://repo1.maven.org/maven2/org/apache/tika/ When downloading, please remember to verify the downloads using signatures found: https://www.apache.org/dist/tika/KEYS For more information on Apache Tika, visit the project home page: https://tika.apache.org/ -- Tim Allison, on behalf of the Apache Tika community
[ANNOUNCE] Apache Tika 2.2.0 released
The Apache Tika project is pleased to announce the release of Apache Tika 2.2.0. The release contents have been pushed out to the main Apache release site and to the Maven Central sync. Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. Apache Tika 2.2.0 contains a mitigation to log4j's CVE-2021-44228 by upgrading to log4j 2.15.0 as well as a number of improvements and bug fixes. Details can be found in the changes file: https://www.apache.org/dist/tika/2.2.0/CHANGES-2.2.0.txt Apache Tika is available on the download page: https://tika.apache.org/download.html Apache Tika is also available in binary form or for use using Maven 2 from the Central Repository: https://repo1.maven.org/maven2/org/apache/tika/ When downloading, please remember to verify the downloads using signatures found: https://www.apache.org/dist/tika/KEYS For more information on Apache Tika, visit the project home page: https://tika.apache.org/ -- Tim Allison, on behalf of the Apache Tika community
[ANNOUNCE] Apache Tika 2.1.0 released
The Apache Tika project is pleased to announce the release of Apache Tika 2.1.0. The release contents have been pushed out to the main Apache release site and to the Maven Central sync, so the releases should be available as soon as the mirrors get the syncs. Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. Apache Tika 2.1.0 contains a number of improvements and bug fixes. Details can be found in the changes file: https://www.apache.org/dist/tika/2.1.0/CHANGES-2.1.0.txt Apache Tika is available on the download page: https://tika.apache.org/download.html Apache Tika is also available in binary form or for use using Maven 2 from the Central Repository: https://repo1.maven.org/maven2/org/apache/tika/ In the initial 48 hours, the release may not be available on all mirrors. When downloading from a mirror site, please remember to verify the downloads using signatures found: https://www.apache.org/dist/tika/KEYS For more information on Apache Tika, visit the project home page: https://tika.apache.org/ -- Tim Allison, on behalf of the Apache Tika community
[ANNOUNCE] Apache Tika 2.0.0 released
The Apache Tika project is pleased to announce the release of Apache Tika 2.0.0. The release contents have been pushed out to the main Apache release site and to the Maven Central sync, so the releases should be available as soon as the mirrors get the syncs. Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. Apache Tika 2.0.0 contains a number of improvements and bug fixes. Details can be found in the changes file: https://www.apache.org/dist/tika/2.0.0/CHANGES-2.0.0.txt Apache Tika is available on the download page: https://tika.apache.org/download.html Apache Tika is also available in binary form or for use using Maven 2 from the Central Repository: https://repo1.maven.org/maven2/org/apache/tika/ In the initial 48 hours, the release may not be available on all mirrors. When downloading from a mirror site, please remember to verify the downloads using signatures found: https://www.apache.org/dist/tika/KEYS For more information on Apache Tika, visit the project home page: https://tika.apache.org/ -- Tim Allison, on behalf of the Apache Tika community
[ANNOUNCE] Apache Tika 1.27 released
The Apache Tika project is pleased to announce the release of Apache Tika 1.27. The release contents have been pushed out to the main Apache release site and to the Maven Central sync, so the releases should be available as soon as the mirrors get the syncs. Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. Apache Tika 1.27 contains a number of improvements and bug fixes. Details can be found in the changes file: https://www.apache.org/dist/tika/1.27/CHANGES-1.27.txt Apache Tika is available on the download page: https://tika.apache.org/download.html Apache Tika is also available in binary form or for use using Maven 2 from the Central Repository: https://repo1.maven.org/maven2/org/apache/tika/ In the initial 48 hours, the release may not be available on all mirrors. When downloading from a mirror site, please remember to verify the downloads using signatures found: https://www.apache.org/dist/tika/KEYS For more information on Apache Tika, visit the project home page: https://tika.apache.org/ -- Tim Allison, on behalf of the Apache Tika community
[ANNOUNCE] Apache Tika 2.0.0-BETA released
The Apache Tika project is pleased to announce the release of Apache Tika 2.0.0-BETA. The release contents have been pushed out to the main Apache release site and to the Maven Central sync, so the releases should be available as soon as the mirrors get the syncs. Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. Apache Tika 2.0.0-BETA contains a number of improvements and bug fixes. Details can be found in the changes file: https://www.apache.org/dist/tika/2.0.0-BETA/CHANGES-2.0.0-BETA.txt Apache Tika is available on the download page: https://tika.apache.org/download.html Apache Tika is also available in binary form or for use using Maven 2 from the Central Repository: https://repo1.maven.org/maven2/org/apache/tika/ In the initial 48 hours, the release may not be available on all mirrors. When downloading from a mirror site, please remember to verify the downloads using signatures found: https://www.apache.org/dist/tika/KEYS For more information on Apache Tika, visit the project home page: https://tika.apache.org/ -- Tim Allison, on behalf of the Apache Tika community
CVE-2021-28657: Infinite loop in Apache Tika's MP3 parser
Description: A carefully crafted or corrupt file may trigger an infinite loop in Tika's MP3Parser in versions up to and including Tika 1.25. Mitigation: Apache Tika users should upgrade to 1.26 or later. Credit: Apache Tika would like to thank Khaled Nassar for reporting this issue.
[ANNOUNCE] Apache Tika 1.26 released
The Apache Tika project is pleased to announce the release of Apache Tika 1.26. The release contents have been pushed out to the main Apache release site and to the Maven Central sync, so the releases should be available as soon as the mirrors get the syncs. Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. Apache Tika 1.26 contains a number of improvements and bug fixes. Details can be found in the changes file: https://www.apache.org/dist/tika/CHANGES-1.26.txt Apache Tika is available on the download page: https://tika.apache.org/download.html Apache Tika is also available in binary form or for use using Maven 2 from the Central Repository: https://repo1.maven.org/maven2/org/apache/tika/ In the initial 48 hours, the release may not be available on all mirrors. When downloading from a mirror site, please remember to verify the downloads using signatures found: https://www.apache.org/dist/tika/KEYS For more information on Apache Tika, visit the project home page: https://tika.apache.org/ -- Tim Allison, on behalf of the Apache Tika community
[ANNOUNCE] Apache Tika 2.0.0-ALPHA released
The Apache Tika project is pleased to announce the release of Apache Tika 2.0.0-ALPHA. The release contents have been pushed out to the main Apache release site and to the Maven Central sync, so the releases should be available as soon as the mirrors get the syncs. Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. Apache Tika 2.0.0-ALPHA contains a number of improvements and bug fixes. Details can be found in the changes file: https://www.apache.org/dist/tika/CHANGES-2.0.0-ALPHA.txt Apache Tika is available on the download page: https://tika.apache.org/download.html Apache Tika is also available in binary form or for use using Maven 2 from the Central Repository: https://repo1.maven.org/maven2/org/apache/tika/ In the initial 48 hours, the release may not be available on all mirrors. When downloading from a mirror site, please remember to verify the downloads using signatures found: https://www.apache.org/dist/tika/KEYS For more information on Apache Tika, visit the project home page: https://tika.apache.org/ -- Tim Allison, on behalf of the Apache Tika community
[ANNOUNCE] Apache Tika 1.25 released
The Apache Tika project is pleased to announce the release of Apache Tika 1.25. The release contents have been pushed out to the main Apache release site and to the Maven Central sync, so the releases should be available as soon as the mirrors get the syncs. Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. Apache Tika 1.25 contains a number of improvements and bug fixes. Details can be found in the changes file: https://www.apache.org/dist/tika/CHANGES-1.25.txt Apache Tika is available on the download page: https://tika.apache.org/download.html Apache Tika is also available in binary form or for use using Maven 2 from the Central Repository: https://repo1.maven.org/maven2/org/apache/tika/ In the initial 48 hours, the release may not be available on all mirrors. When downloading from a mirror site, please remember to verify the downloads using signatures found: https://www.apache.org/dist/tika/KEYS For more information on Apache Tika, visit the project home page: https://tika.apache.org/ -- Tim Allison, on behalf of the Apache Tika community
[CVE-2020-9489] Denial of Service (DOS) Vulnerabilities in Some of Apache Tika's Parsers
Severity: Medium Vendor: The Apache Software Foundation Versions Affected: Apache Tika 1.24 Description: A carefully crafted or corrupt file may trigger a System.exit in Tika's OneNote Parser. Crafted or corrupted files can also cause out of memory errors and/or infinite loops in Tika's ICNSParser, MP3Parser, MP4Parser, SAS7BDATParser, OneNoteParser and ImageParser. Mitigation: Apache Tika users should upgrade to 1.24.1 or later. The vulnerabilities in the MP4Parser were partially fixed by upgrading the com.googlecode:isoparser:1.1.22 dependency to org.tallison:isoparser:1.9.41.2. For unrelated security reasons, we upgraded org.apache.cxf to 3.3.6 as part of the 1.24.1 release. We also upgraded openjson to 1.0.10, org.ow2.asm to 8.0.1, zstd-jni to 1.4.4-9, bouncycastle to 1.65, commons-lang3 to 3.10, lucene to 8.5.0 and mockito to 3.3.3 as part of the 1.24.1 release. Credit: These vulnerabilities were discovered by Tim Allison on the Apache Tika team.
[ANNOUNCE] Apache Tika 1.24.1 released
The Apache Tika project is pleased to announce the release of Apache Tika 1.24.1. The release contents have been pushed out to the main Apache release site and to the Maven Central sync, so the releases should be available as soon as the mirrors get the syncs. Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. Apache Tika 1.24.1 contains a number of improvements and bug fixes. Details can be found in the changes file: https://www.apache.org/dist/tika/CHANGES-1.24.1.txt Apache Tika is available on the download page: https://tika.apache.org/download.html Apache Tika is also available in binary form or for use using Maven 2 from the Central Repository: https://repo1.maven.org/maven2/org/apache/tika/ In the initial 48 hours, the release may not be available on all mirrors. When downloading from a mirror site, please remember to verify the downloads using signatures found: https://www.apache.org/dist/tika/KEYS For more information on Apache Tika, visit the project home page: https://tika.apache.org/ -- Tim Allison, on behalf of the Apache Tika community
[CVE-2020-1951] Infinite Loop (DoS) vulnerability in Apache Tika's PSDParser
Title: [CVE-2020-1951] Infinite Loop (DoS) vulnerability in Apache Tika's PSDParser Severity: Medium Vendor: The Apache Software Foundation Versions Affected: Apache Tika 1.0 to 1.23 Description: A carefully crafted or corrupt PSD file can cause an infinite loop in Apache Tika's PSDParser in versions 1.0-1.23. Mitigation: Apache Tika users should upgrade to 1.24 or later. Credit: This issue was discovered by Tim Allison on the Apache Tika team.
[CVE-2020-1950] Excessive memory usage (DoS) vulnerability in Apache Tika's PSDParser
Title: [CVE-2020-1950] Excessive memory usage (DoS) vulnerability in Apache Tika's PSDParser Severity: Medium Vendor: The Apache Software Foundation Versions Affected: Apache Tika 1.0 to 1.23 Description: A carefully crafted or corrupt PSD file can cause excessive memory usage in Apache Tika's PSDParser in versions 1.0-1.23. Mitigation: Apache Tika users should upgrade to 1.24 or later. Credit: This issue was discovered by Pierre Ernst at Elastic.
[ANNOUNCE] Apache Tika 1.24 released
The Apache Tika project is pleased to announce the release of Apache Tika 1.24. The release contents have been pushed out to the main Apache release site and to the Maven Central sync, so the releases should be available as soon as the mirrors get the syncs. Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. Apache Tika 1.24 contains a number of improvements and bug fixes. Details can be found in the changes file: https://www.apache.org/dist/tika/CHANGES-1.24.txt Apache Tika is available on the download page: https://tika.apache.org/download.html Apache Tika is also available in binary form or for use using Maven 2 from the Central Repository: https://repo1.maven.org/maven2/org/apache/tika/ In the initial 48 hours, the release may not be available on all mirrors. When downloading from a mirror site, please remember to verify the downloads using signatures found: https://www.apache.org/dist/tika/KEYS For more information on Apache Tika, visit the project home page: https://tika.apache.org/ -- Tim Allison, on behalf of the Apache Tika community
[ANNOUNCE] Apache Tika 1.23 released
The Apache Tika project is pleased to announce the release of Apache Tika 1.23. The release contents have been pushed out to the main Apache release site and to the Maven Central sync, so the releases should be available as soon as the mirrors get the syncs. Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. Apache Tika 1.23 contains a number of improvements and bug fixes. Details can be found in the changes file: https://www.apache.org/dist/tika/CHANGES-1.23.txt Apache Tika is available on the download page: https://tika.apache.org/download.html Apache Tika is also available in binary form or for use using Maven 2 from the Central Repository: https://repo1.maven.org/maven2/org/apache/tika/ In the initial 48 hours, the release may not be available on all mirrors. When downloading from a mirror site, please remember to verify the downloads using signatures found: https://www.apache.org/dist/tika/KEYS For more information on Apache Tika, visit the project home page: https://tika.apache.org/ -- Tim Allison, on behalf of the Apache Tika community
[CVE-2019-10094] StackOverflow from Crafted Package/Compressed Files in Apache Tika's RecursiveParserWrapper
Title: [CVE-2019-10094] StackOverflow from Crafted Package/Compressed Files in Apache Tika's RecursiveParserWrapper Severity: Medium Vendor: The Apache Software Foundation Versions Affected: Apache Tika 1.7 to 1.21 Description: A carefully crafted package/compressed file that, when unzipped/uncompressed yields the same file (a quine), causes a StackOverflowError in Apache Tika's RecursiveParserWrapper in versions 1.7-1.21 of Apache Tika. Mitigation: Apache Tika users should upgrade to 1.22 or later. Credit: This issue was discovered by Tim Allison on the Apache Tika team. Many thanks to Matthew Barber and Erling Ellingson for crafting examples and contributing these files to Tika's unit tests.
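A common guard against self-containing input of this kind is to cap how deeply embedded files may nest, so that a quine fails with a controlled error instead of exhausting the stack. The following is a minimal sketch of that idea only; it is not taken from Tika's code and does not claim to be the fix shipped in 1.22. The names `Container`, `TooDeepException`, `walk`, and `quine`, as well as the limit of 100, are hypothetical.

```java
import java.util.Collections;
import java.util.List;

final class DepthLimitedWalker {

    /** Hypothetical stand-in for a package/compressed file with embedded files. */
    interface Container {
        List<Container> embedded();
    }

    /** Thrown instead of letting the recursion run until the stack overflows. */
    static final class TooDeepException extends RuntimeException {
        TooDeepException(int maxDepth) {
            super("Embedded depth exceeds limit of " + maxDepth);
        }
    }

    /** Visits a container and everything embedded in it; returns how many containers were visited. */
    static int walk(Container container, int depth, int maxDepth) {
        if (depth > maxDepth) {
            throw new TooDeepException(maxDepth);
        }
        int visited = 1;
        for (Container child : container.embedded()) {
            visited += walk(child, depth + 1, maxDepth);
        }
        return visited;
    }

    /** Models a quine: a container whose only embedded file is itself. */
    static Container quine() {
        return new Container() {
            @Override
            public List<Container> embedded() {
                return Collections.singletonList(this);
            }
        };
    }

    public static void main(String[] args) {
        try {
            walk(quine(), 0, 100);
        } catch (TooDeepException e) {
            System.out.println("stopped: " + e.getMessage());
        }
    }
}
```

The depth check runs before any child is visited, so the recursion never goes more than one frame past the configured limit regardless of what the input claims to contain.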
[CVE-2019-10093] Denial of Service in Apache Tika's 2003ml and 2006ml Parsers
Title: [CVE-2019-10093] Denial of Service in Apache Tika's 2003ml and 2006ml Parsers Severity: Medium Vendor: The Apache Software Foundation Versions Affected: Apache Tika 1.19 to 1.21 Description: A carefully crafted 2003ml or 2006ml file could consume all available SAXParsers in the pool and lead to very long hangs. Mitigation: Apache Tika users should upgrade to 1.22 or later. Credit: This issue was discovered by Tim Allison on the Apache Tika team.
[CVE-2019-10088] OOM from a crafted Zip File in Apache Tika's RecursiveParserWrapper
Title: [CVE-2019-10088] OOM from a crafted Zip File in Apache Tika's RecursiveParserWrapper Severity: Medium Vendor: The Apache Software Foundation Versions Affected: Apache Tika 1.7 to 1.21 Description: A carefully crafted or corrupt zip file can cause an OOM in Apache Tika's RecursiveParserWrapper in versions 1.7-1.21. Mitigation: Apache Tika users should upgrade to 1.22 or later. Credit: This issue was discovered by RunningSnail.
[ANNOUNCE] Apache Tika 1.22 released
The Apache Tika project is pleased to announce the release of Apache Tika 1.22. The release contents have been pushed out to the main Apache release site and to the Maven Central sync, so the releases should be available as soon as the mirrors get the syncs. Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. Apache Tika 1.22 contains a number of improvements and bug fixes. Details can be found in the changes file: https://www.apache.org/dist/tika/CHANGES-1.22.txt Apache Tika is available on the download page: https://tika.apache.org/download.html In the initial 48 hours, the release may not be available on all mirrors. When downloading from a mirror site, please remember to verify the downloads using signatures found: https://www.apache.org/dist/tika/KEYS For more information on Apache Tika, visit the project home page: https://tika.apache.org/ -- Tim Allison, on behalf of the Apache Tika community
[ANNOUNCE] Apache Tika 1.21 released
The Apache Tika project is pleased to announce the release of Apache Tika 1.21. The release contents have been pushed out to the main Apache release site and to the Maven Central sync, so the releases should be available as soon as the mirrors get the syncs. Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. Apache Tika 1.21 contains a number of improvements and bug fixes. Details can be found in the changes file: https://www.apache.org/dist/tika/CHANGES-1.21.txt Apache Tika is available on the download page: https://tika.apache.org/download.html Apache Tika is also available in binary form or for use using Maven 2 from the Central Repository: https://repo1.maven.org/maven2/org/apache/tika/ In the initial 48 hours, the release may not be available on all mirrors. When downloading from a mirror site, please remember to verify the downloads using signatures found: https://www.apache.org/dist/tika/KEYS For more information on Apache Tika, visit the project home page: https://tika.apache.org/ -- Tim Allison, on behalf of the Apache Tika community
[CVE-2018-17197] Apache Tika Denial of Service -- Infinite Loop in Tika's SQLite3Parser
[CVE-2018-17197] Apache Tika Denial of Service -- Infinite Loop in Tika's SQLite3Parser Severity: Medium Vendor: The Apache Software Foundation Versions Affected: Apache Tika 1.8 to 1.19.1 Description: A carefully crafted or corrupt sqlite file can cause an infinite loop in Apache Tika's SQLite3Parser in versions 1.8-1.19.1 of Apache Tika. Mitigation: Apache Tika users should upgrade to 1.20 or later. Credit: This issue was discovered by Tim Allison on the Apache Tika Team.
[ANNOUNCE] Apache Tika 1.20 released
The Apache Tika project is pleased to announce the release of Apache Tika 1.20. The release contents have been pushed out to the main Apache release site and to the Maven Central sync, so the releases should be available as soon as the mirrors get the syncs. Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. Apache Tika 1.20 contains a number of improvements and bug fixes. Details can be found in the changes file: https://www.apache.org/dist/tika/CHANGES-1.20.txt Apache Tika is available on the download page: https://tika.apache.org/download.html Apache Tika is also available in binary form or for use using Maven 2 from the Central Repository: https://repo1.maven.org/maven2/org/apache/tika/ In the initial 48 hours, the release may not be available on all mirrors. When downloading from a mirror site, please remember to verify the downloads using signatures found: https://www.apache.org/dist/tika/KEYS For more information on Apache Tika, visit the project home page: https://tika.apache.org/ -- Tim Allison, on behalf of the Apache Tika community
[CVE-2018-11796] Apache Tika Denial of Service via XML Entity Expansion Vulnerability
CVE-2018-11796: Apache Tika Denial of Service via XML Entity Expansion Vulnerability Severity: Medium Vendor: The Apache Software Foundation Versions Affected: Apache Tika 0.1 to 1.19 Description: In Apache Tika 1.19 (CVE-2018-11761), we added an entity expansion limit for XML parsing. However, Tika reuses SAXParsers and calls reset() after each parse, which, for Xerces2 parsers, as per the documentation, removes the user-specified SecurityManager and thus removes entity expansion limits after the first parse. Apache Tika 1.19 is therefore still vulnerable to entity expansions which can lead to a denial of service attack. Mitigation: Apache Tika users should upgrade to 1.19.1 or later. Credit: This issue was discovered by Slava Gorelik of CloudAlly.
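The general lesson, that any per-parser hardening must be re-applied after reset() when a SAXParser is reused, can be sketched as follows. This is an illustration under stated assumptions, not Tika's actual code: the class `ReusableSaxParser` and its methods are hypothetical, and the setting that is re-applied (rejecting DOCTYPE declarations through the `disallow-doctype-decl` feature, which the JDK's built-in parser is assumed to recognize) is a stand-in for the Xerces2 SecurityManager and entity expansion limit described above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class ReusableSaxParser {

    // Stand-in hardening setting; assumed to be supported by the parser in use.
    private static final String DISALLOW_DOCTYPE =
            "http://apache.org/xml/features/disallow-doctype-decl";

    private final SAXParser parser;

    private ReusableSaxParser(SAXParser parser) {
        this.parser = parser;
    }

    static ReusableSaxParser create() {
        try {
            return new ReusableSaxParser(SAXParserFactory.newInstance().newSAXParser());
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("Could not create SAXParser", e);
        }
    }

    /** Returns true if the document parsed, false if the parser rejected it. */
    boolean tryParse(String xml) {
        try {
            // reset() returns the parser to its original configuration, which
            // may discard settings applied after creation, so harden again.
            parser.reset();
            parser.getXMLReader().setFeature(DISALLOW_DOCTYPE, true);
        } catch (SAXException e) {
            throw new IllegalStateException("Parser does not support the hardening feature", e);
        }
        try {
            parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                    new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ReusableSaxParser reusable = create();
        String withDoctype = "<!DOCTYPE doc [<!ENTITY a \"aaaa\">]><doc>&a;</doc>";
        System.out.println(reusable.tryParse("<doc/>"));
        System.out.println(reusable.tryParse(withDoctype));
        System.out.println(reusable.tryParse(withDoctype)); // second use of the same parser
    }
}
```

Applying the setting immediately before every parse, rather than once at construction, means the protection does not depend on what a particular parser's reset() happens to preserve.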
[ANNOUNCE] Apache Tika 1.19.1 released
The Apache Tika project is pleased to announce the release of Apache Tika 1.19.1. The release contents have been pushed out to the main Apache release site and to the Maven Central sync, so the releases should be available as soon as the mirrors get the syncs. Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. Apache Tika 1.19.1 contains two critical bug fixes to the MP3Parser and the handling of SAX parsing. Details can be found in the changes file: https://www.apache.org/dist/tika/CHANGES-1.19.1.txt Apache Tika is available on the download page: https://tika.apache.org/download.html Apache Tika is also available in binary form or for use using Maven 2 from the Central Repository: https://repo1.maven.org/maven2/org/apache/tika/ In the initial 48 hours, the release may not be available on all mirrors. When downloading from a mirror site, please remember to verify the downloads using signatures found on the Apache site: https://www.apache.org/dist/tika/KEYS For more information on Apache Tika, visit the project home page: https://tika.apache.org/ -- Tim Allison, on behalf of the Apache Tika community
[ANNOUNCE] Apache Tika 1.19 released
The Apache Tika project is pleased to announce the release of Apache Tika 1.19. The release contents have been pushed out to the main Apache release site and to the Maven Central sync, so the releases should be available as soon as the mirrors get the syncs. Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. Apache Tika 1.19 contains a number of improvements and bug fixes. Details can be found in the changes file: http://www.apache.org/dist/tika/CHANGES-1.19.txt Apache Tika is available on the download page: http://tika.apache.org/download.html Apache Tika is also available in binary form or for use using Maven 2 from the Central Repository: http://repo1.maven.org/maven2/org/apache/tika/ In the initial 48 hours, the release may not be available on all mirrors. When downloading from a mirror site, please remember to verify the downloads using signatures found on the Apache site: https://www.apache.org/dist/tika/KEYS For more information on Apache Tika, visit the project home page: http://tika.apache.org/ -- Tim Allison, on behalf of the Apache Tika community
[CVE-2018-8017] Apache Tika Denial of Service Vulnerability -- Potential Infinite Loop in IptcAnpaParser
CVE-2018-8017: Apache Tika Denial of Service Vulnerability -- Potential Infinite Loop in IptcAnpaParser Severity: Medium Vendor: The Apache Software Foundation Versions Affected: Apache Tika 1.2 to 1.18 Description: A carefully crafted file can trigger an infinite loop in Apache Tika's IptcAnpaParser. Mitigation: Apache Tika users should upgrade to 1.19 or later. Credit: This issue was discovered by Tobias Ospelt of modzero AG.
[CVE-2018-11762] Zip Slip Vulnerability in Apache Tika's tika-app
CVE-2018-11762: Zip Slip Vulnerability in Apache Tika's tika-app Severity: Low Vendor: The Apache Software Foundation Versions Affected: Apache Tika 0.9 to 1.18 Description: In a rare edge case where a user does not specify an extract directory on the command line (--extract-dir=) and the input file has an embedded file with an absolute path, such as "C:/evil.bat", tika-app would overwrite that file. Mitigation: Apache Tika users should upgrade to 1.19 or later. Credit: This issue was discovered by Tim Allison on the Apache Tika team.
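The usual defence against Zip Slip is to resolve each embedded file name against the extract directory, normalize the result, and refuse anything that ends up outside that directory. The sketch below illustrates that check with `java.nio.file.Path`; it is not tika-app's actual implementation, and `ExtractPathGuard` and `resolveInside` are hypothetical names. Which names count as absolute is platform-dependent: "C:/evil.bat" is absolute on Windows but an ordinary relative name on Unix-like systems.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class ExtractPathGuard {

    /**
     * Resolves an embedded file's name against the extract directory and
     * rejects any result that would land outside of it.
     */
    static Path resolveInside(Path extractDir, String embeddedName) {
        Path base = extractDir.toAbsolutePath().normalize();
        // resolve() returns the embedded name unchanged if it is absolute;
        // normalize() collapses any ".." segments before the containment check.
        Path target = base.resolve(embeddedName).normalize();
        if (!target.startsWith(base)) {
            throw new IllegalArgumentException(
                    "Embedded file escapes extract directory: " + embeddedName);
        }
        return target;
    }

    public static void main(String[] args) {
        Path dir = Paths.get("extracted");
        System.out.println(resolveInside(dir, "docs/image1.png"));
        try {
            resolveInside(dir, "../evil.bat");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

`Path.startsWith` compares whole path elements, so a sibling directory such as `extracted-other` is not mistaken for a child of `extracted`, which a plain string prefix check would get wrong.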
[CVE-2018-11761] Apache Tika DoS XML Entity Expansion Vulnerability
CVE-2018-11761: Apache Tika Denial of Service via XML Entity Expansion Vulnerability Severity: Medium Vendor: The Apache Software Foundation Versions Affected: Apache Tika 0.1 to 1.18 Description: Apache Tika's XML parsers were not configured to limit entity expansion. They were therefore vulnerable to an entity expansion vulnerability which can lead to a denial of service attack. Mitigation: Apache Tika users should upgrade to 1.19 or later. Credit: This issue was discovered by Renfei (Brian) Wang of Amazon.
Fwd: [ANNOUNCE] Apache Tika 1.18 released
The Apache Tika project is pleased to announce the release of Apache Tika 1.18. The release contents have been pushed out to the main Apache release site and to the Maven Central sync, so the releases should be available as soon as the mirrors get the syncs. Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. Apache Tika 1.18 contains a number of improvements as well as security and bug fixes. Details can be found in the changes file: http://www.apache.org/dist/tika/CHANGES-1.18.txt Apache Tika is available on the download page: http://tika.apache.org/download.html Apache Tika is also available in binary form or for use using Maven 2 from the Central Repository: http://repo1.maven.org/maven2/org/apache/tika/ In the initial 48 hours, the release may not be available on all mirrors. When downloading from a mirror site, please remember to verify the downloads using signatures found on the Apache site: http://www.apache.org/dist/tika/KEYS For more information on Apache Tika, visit the project home page: http://tika.apache.org/ -- Tim Allison, on behalf of the Apache Tika community
[CVE-2018-1335] Command Injection Vulnerability in Apache Tika’s tika-server module
CVE-2018-1335 – Command Injection Vulnerability in Apache Tika’s tika-server module Severity: High Vendor: The Apache Software Foundation Versions Affected: <1.18 Description: Before Tika 1.18, clients could send carefully crafted headers to tika-server that could be used to inject commands into the command line of the server running tika-server. This vulnerability only affects those running tika-server on a server that is open to untrusted clients. Mitigation: Ensure that untrusted users don't have access to tika-server and/or upgrade to Apache Tika >=1.18. Credit: Tim Allison, a member of the Apache Tika team, discovered this.
[CVE-2018-1339] DoS (Infinite Loop) Vulnerability in Apache Tika’s ChmParser
CVE-2018-1339 – DoS (Infinite Loop) Vulnerability in Apache Tika’s ChmParser Severity: Important Vendor: The Apache Software Foundation Versions Affected: <1.18 Description: A carefully crafted (or fuzzed) file can trigger an infinite loop in Apache Tika's ChmParser. Mitigation: Turn off the ChmParser or upgrade to Apache Tika >=1.18. Credit: Tobias Ospelt of modzero AG discovered this issue by fuzzing with Kelinci (https://github.com/isstac/kelinci).
[CVE-2018-1338] DoS (Infinite Loop) Vulnerability in Apache Tika’s BPGParser
CVE-2018-1338 – DoS (Infinite Loop) Vulnerability in Apache Tika’s BPGParser Severity: Important Vendor: The Apache Software Foundation Versions Affected: <1.18 Description: A carefully crafted (or fuzzed) file can trigger an infinite loop in Apache Tika's BPGParser. Mitigation: Turn off the BPGParser or upgrade to Apache Tika >=1.18. Credit: Tobias Ospelt of modzero AG discovered this issue by fuzzing with Kelinci (https://github.com/isstac/kelinci).
CVE-2017-12626 – Denial of Service Vulnerabilities in Apache POI < 3.17
Title: CVE-2017-12626 – Denial of Service Vulnerabilities in Apache POI < 3.17 Severity: Important Vendor: The Apache Software Foundation Versions affected: versions prior to version 3.17 Description: Apache POI versions prior to release 3.17 are vulnerable to Denial of Service Attacks: * Infinite Loops while parsing specially crafted WMF, EMF, MSG and macros (POI bugs 61338 [0] and 61294 [1]) * Out of Memory Exceptions while parsing specially crafted DOC, PPT and XLS (POI bugs 52372 [2] and 61295 [3]) Mitigation: Users with applications which accept content from external or untrusted sources are advised to upgrade to Apache POI 3.17 or newer. -Tim Allison on behalf of the Apache POI PMC [0] https://bz.apache.org/bugzilla/show_bug.cgi?id=61338 [1] https://bz.apache.org/bugzilla/show_bug.cgi?id=61294 [2] https://bz.apache.org/bugzilla/show_bug.cgi?id=52372 [3] https://bz.apache.org/bugzilla/show_bug.cgi?id=61295
[ANNOUNCE] Apache Tika 1.17 released
The Apache Tika project is pleased to announce the release of Apache Tika 1.17. The release contents have been pushed out to the main Apache release site and to the Maven Central sync, so the releases should be available as soon as the mirrors get the syncs. Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. Apache Tika 1.17 contains a number of improvements and bug fixes. Details can be found in the changes file: http://www.apache.org/dist/tika/CHANGES-1.17.txt Apache Tika is available in source form from the following download page: http://www.apache.org/dyn/closer.cgi/tika/apache-tika-1.17-src.zip Apache Tika is also available in binary form or for use using Maven 2 from the Central Repository: http://repo1.maven.org/maven2/org/apache/tika/ In the initial 48 hours, the release may not be available on all mirrors. When downloading from a mirror site, please remember to verify the downloads using signatures found on the Apache site: https://people.apache.org/keys/group/tika.asc For more information on Apache Tika, visit the project home page: http://tika.apache.org/ -- Tim Allison, on behalf of the Apache Tika community
[ANNOUNCE] Apache Tika 1.16 released
The Apache Tika project is pleased to announce the release of Apache Tika 1.16. The release contents have been pushed out to the main Apache release site and to the Maven Central sync, so the releases should be available as soon as the mirrors get the syncs. Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. Apache Tika 1.16 contains a number of improvements and bug fixes. Details can be found in the changes file: http://www.apache.org/dist/tika/CHANGES-1.16.txt Apache Tika is available in source form from the following download page: http://www.apache.org/dyn/closer.cgi/tika/apache-tika-1.16-src.zip Apache Tika is also available in binary form or for use using Maven 2 from the Central Repository: http://repo1.maven.org/maven2/org/apache/tika/ In the initial 48 hours, the release may not be available on all mirrors. When downloading from a mirror site, please remember to verify the downloads using signatures found on the Apache site: https://people.apache.org/keys/group/tika.asc For more information on Apache Tika, visit the project home page: http://tika.apache.org/ -- Tim Allison, on behalf of the Apache Tika community
[ANNOUNCE] Apache Tika 1.15 released
The Apache Tika project is pleased to announce the release of Apache Tika 1.15. The release contents have been pushed out to the main Apache release site and to the Maven Central sync, so the releases should be available as soon as the mirrors get the syncs. Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. Apache Tika 1.15 contains a number of improvements and bug fixes. Details can be found in the changes file: http://www.apache.org/dist/tika/CHANGES-1.15.txt Apache Tika is available in source form from the following download page: http://www.apache.org/dyn/closer.cgi/tika/apache-tika-1.15-src.zip Apache Tika is also available in binary form or for use using Maven 2 from the Central Repository: http://repo1.maven.org/maven2/org/apache/tika/ In the initial 48 hours, the release may not be available on all mirrors. When downloading from a mirror site, please remember to verify the downloads using signatures found on the Apache site: https://people.apache.org/keys/group/tika.asc For more information on Apache Tika, visit the project home page: http://tika.apache.org/ -- Tim Allison, on behalf of the Apache Tika community
[CVE-2016-4434] Apache Tika XML External Entity vulnerability
CVE-2016-4434: Apache Tika XML External Entity vulnerability Severity: Important Vendor: The Apache Software Foundation Versions Affected: Apache Tika 0.10 to 1.12 Description: Apache Tika parses XML within numerous file formats. In some instances[1], the initialization of the XML parser or the choice of handlers did not protect against XML External Entity (XXE) vulnerabilities. According to www.owasp.org [2]: "This attack may lead to the disclosure of confidential data, denial of service, server side request forgery, port scanning from the perspective of the machine where the parser is located, and other system impacts." Mitigation: Upgrade to Apache Tika 1.13. Credit: This issue was discovered by Arthur Khashaev (https://khashaev.ru), Seulgi Kim, Mesut Timur, and Microsoft Vulnerability Research. [1] Spreadsheets in OOXML files and XMP in PDF and other file formats. [2] https://www.owasp.org/index.php/XML_External_Entity_(XXE)_Processing
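For readers who initialize their own XML parsers alongside Tika, the following sketch shows one widely used way to configure a JAXP SAXParserFactory so that external entities are not resolved. It does not reproduce the change made in Tika 1.13. The class `XxeSafeSax` and its methods are hypothetical, and the feature URIs are assumed to be recognized by the parser implementation in use (they are by the JDK's built-in parser); a different implementation may reject them at configuration time.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class XxeSafeSax {

    /** Builds a parser that refuses DOCTYPE declarations and external entities. */
    static SAXParser newHardenedParser() throws ParserConfigurationException, SAXException {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        // Rejecting DOCTYPE outright removes the place where entities are declared.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        // Defence in depth in case DOCTYPE has to be allowed later.
        factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
        factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        return factory.newSAXParser();
    }

    /** Returns true if the document parsed, false if the parser rejected it. */
    static boolean parses(String xml) {
        SAXParser parser;
        try {
            parser = newHardenedParser();
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("Parser does not support the hardening features", e);
        }
        try {
            parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                    new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String xxe = "<!DOCTYPE doc [<!ENTITY xxe SYSTEM \"file:///etc/passwd\">]>"
                + "<doc>&xxe;</doc>";
        System.out.println(parses("<doc>hello</doc>"));
        System.out.println(parses(xxe));
    }
}
```

Rejecting DOCTYPE declarations is the strictest option and breaks documents that legitimately carry a DTD; where that is unacceptable, the two external-entity features alone are the narrower setting.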