[ https://issues.apache.org/jira/browse/TIKA-3094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17099485#comment-17099485 ]
Hudson commented on TIKA-3094:
------------------------------
SUCCESS: Integrated in Jenkins build Tika-trunk #1811 (See
[https://builds.apache.org/job/Tika-trunk/1811/])
TIKA-3094: Add SparseBitSet and xmpcore-shaded to tika-bundle. (tallison:
[https://github.com/apache/tika/commit/e9623650a37039286604e8ed3a17fdcc0ab04fc1])
* (edit) tika-bundle/src/test/java/org/apache/tika/bundle/BundleIT.java
* (edit) tika-bundle/pom.xml
* (add) tika-bundle/src/test/resources/testPPT.pptx
TIKA-3094 add ignored unit test that runs the bundle against all of the
(tallison:
[https://github.com/apache/tika/commit/5a1ee00e64ec812574ba7be8e48f637e01fa018c])
* (edit) tika-bundle/src/test/java/org/apache/tika/bundle/BundleIT.java
* (edit) tika-bundle/pom.xml
> Apache Tika fails to extract text for pptx extension.
> -----------------------------------------------------
>
> Key: TIKA-3094
> URL: https://issues.apache.org/jira/browse/TIKA-3094
> Project: Tika
> Issue Type: Bug
> Affects Versions: 1.24, 1.24.1
> Reporter: Abhishek Chauhan
> Assignee: Bob Paulin
> Priority: Critical
> Attachments: Sample PPT.pptx
>
>
> This is a regression from Apache Tika 1.23. Text extraction for the .pptx
> extension, which worked in Apache Tika 1.23, no longer works in 1.24.
> For the .ppt extension, extraction works in both 1.23 and 1.24.
>
> According to the release notes [https://tika.apache.org/1.24/index.html], POI
> was updated to 4.1.2, which might be the root cause of this problem.
> POI requires [https://mvnrepository.com/artifact/com.zaxxer/SparseBitSet/1.2],
> which I suspect is not present in the bundle.
>
>
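
The missing-dependency hypothesis in the report above can be checked with a small class-loading probe. The sketch below is not part of Tika or POI: ClasspathProbe is a hypothetical helper, and the class name com.zaxxer.sparsebits.SparseBitSet is an assumption about what the com.zaxxer:SparseBitSet artifact ships. It only inspects the class loader that loaded the probe itself, so inside an OSGi container it would have to run from within the bundle in question to be meaningful.

```java
// Hypothetical helper, not part of Tika or POI.
final class ClasspathProbe {

    // Assumed fully qualified name of the class provided by com.zaxxer:SparseBitSet.
    static final String SPARSE_BIT_SET = "com.zaxxer.sparsebits.SparseBitSet";

    /** Returns true if the named class can be located by this class's own loader. */
    static boolean isPresent(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(SPARSE_BIT_SET + " present: " + isPresent(SPARSE_BIT_SET));
    }
}
```

If the probe reports the class as absent in the environment where .pptx extraction fails, that would be consistent with the reporter's guess, though it would not by itself prove that this is the only missing dependency.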