[
https://issues.apache.org/jira/browse/TIKA-3094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Abhishek Chauhan updated TIKA-3094:
-----------------------------------
Description:
This is a regression from Apache Tika 1.23. Text extraction for .pptx
files, which worked with Apache Tika 1.23, no longer works in 1.24.
For the .ppt extension it works fine in both 1.23 and 1.24.
According to the release notes [https://tika.apache.org/1.24/index.html], you
have updated POI to 4.1.2. That might be the root cause of this problem.
POI requires [https://mvnrepository.com/artifact/com.zaxxer/SparseBitSet/1.2],
which I guess is not present in the bundle.
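A quick way to test the guess about the missing dependency is to check whether the SparseBitSet class can be loaded from the same classpath (or bundle class loader) that runs Tika. The sketch below is only a diagnostic, not a fix. It assumes the class in that artifact is named com.zaxxer.sparsebits.SparseBitSet; the SparseBitSetCheck class and its isOnClasspath helper are hypothetical names introduced for illustration.

```java
public class SparseBitSetCheck {

    // Returns true if the named class can be located by the class loader
    // that loaded this class. The class is not initialized.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, SparseBitSetCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed fully qualified name of the class shipped in com.zaxxer:SparseBitSet.
        String name = "com.zaxxer.sparsebits.SparseBitSet";
        System.out.println(name + (isOnClasspath(name) ? " is available" : " is MISSING"));
    }
}
```

If this reports the class as missing when run with the same classpath as the failing extraction, adding the SparseBitSet jar next to the Tika bundle would be the next thing to try.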
was:
This is a regression from Apache Tika 1.23.
For the .ppt extension it works fine in both 1.23 and 1.24.
Text extraction for .pptx files, which worked with earlier Apache Tika,
no longer works in 1.24.
> Apache Tika fails to extract text for pptx extension.
> -----------------------------------------------------
>
> Key: TIKA-3094
> URL: https://issues.apache.org/jira/browse/TIKA-3094
> Project: Tika
> Issue Type: Bug
> Affects Versions: 1.24
> Reporter: Abhishek Chauhan
> Priority: Major
>
> This is a regression from Apache Tika 1.23. Text extraction for .pptx
> files, which worked with Apache Tika 1.23, no longer works in 1.24.
> For the .ppt extension it works fine in both 1.23 and 1.24.
>
> According to the release notes [https://tika.apache.org/1.24/index.html], you
> have updated POI to 4.1.2. That might be the root cause of this problem.
> POI requires [https://mvnrepository.com/artifact/com.zaxxer/SparseBitSet/1.2],
> which I guess is not present in the bundle.