José Borges Ferreira created TIKA-2507:
------------------------------------------
Summary: xlsx takes more than 5 mins to parse in 1.16
Key: TIKA-2507
URL: https://issues.apache.org/jira/browse/TIKA-2507
Project: Tika
Issue Type: Bug
Components: server
Affects Versions: 1.16
Environment: started server with
{noformat}
java -jar tika-server-1.16.jar
{noformat}
Reporter: José Borges Ferreira
When sending an xlsx file with a lot of charts, the Tika server takes more than 5
minutes to process it on my 2.2 GHz MacBook Pro.
In version 1.15 this takes less than a second. Looking at the changelog, I'm
guessing this is related to some features introduced in 1.16, namely:
# Extract text from charts in .docx, .pptx, .xlsx and .xlsb (TIKA-2254).
# Extract text from diagrams in .docx, .pptx, .xlsx and .xlsb (TIKA-1945).
I'm attaching the file.
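For anyone who wants to compare timings between 1.15 and 1.16, here is a minimal sketch of how the request could be timed. It assumes the server was started with its defaults and listens on port 9998, and that the `/tika` endpoint is called with a PUT. The helper names `build_request` and `timed_parse` are hypothetical and only serve this illustration.

```python
import time
import urllib.request

# Assumption: tika-server was started with its defaults and listens on port 9998.
TIKA_URL = "http://localhost:9998/tika"


def build_request(data: bytes, url: str = TIKA_URL) -> urllib.request.Request:
    """Build a PUT request that asks the server for plain-text output."""
    return urllib.request.Request(
        url,
        data=data,
        method="PUT",
        headers={"Accept": "text/plain"},
    )


def timed_parse(path: str, url: str = TIKA_URL, timeout: float = 600.0):
    """Send the file at `path` to the server; return (seconds, extracted text)."""
    with open(path, "rb") as fh:
        request = build_request(fh.read(), url)
    start = time.perf_counter()
    # The generous timeout leaves room for the multi-minute parse described above.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        text = response.read().decode("utf-8", errors="replace")
    return time.perf_counter() - start, text
```

Calling `timed_parse("charts.xlsx")` once against each server version, where `charts.xlsx` is a placeholder name for the attached workbook, should give two directly comparable elapsed times.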