Ashish Basran created TIKA-2180:
---
Summary: Multiple requests on Tika to extract text slows down
Key: TIKA-2180
URL: https://issues.apache.org/jira/browse/TIKA-2180
Project: Tika
Issue Type: Bug
[
https://issues.apache.org/jira/browse/TIKA-2180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ashish Basran updated TIKA-2180:
Affects Version/s: 1.14
> Multiple requests on Tika to extract text slows down
>
[
https://issues.apache.org/jira/browse/TIKA-2180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15687660#comment-15687660
]
Ashish Basran commented on TIKA-2180:
-
I tested with Word document and Excel. I observe
[
https://issues.apache.org/jira/browse/TIKA-2180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15687795#comment-15687795
]
Ashish Basran commented on TIKA-2180:
-
I am calling Tika (http://localhost:8080/tika) u
[
https://issues.apache.org/jira/browse/TIKA-2180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15702912#comment-15702912
]
Ashish Basran commented on TIKA-2180:
-
Thanks Tim. I tried with 4 concurrent requests o
[
https://issues.apache.org/jira/browse/TIKA-2180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15702912#comment-15702912
]
Ashish Basran edited comment on TIKA-2180 at 11/28/16 7:45 PM:
--
[
https://issues.apache.org/jira/browse/TIKA-2180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15713116#comment-15713116
]
Ashish Basran commented on TIKA-2180:
-
[~talli...@mitre.org], I will try with bigger sa
[
https://issues.apache.org/jira/browse/TIKA-2180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15713218#comment-15713218
]
Ashish Basran commented on TIKA-2180:
-
With 100 documents it looks like the following. I abo
[
https://issues.apache.org/jira/browse/TIKA-2180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15716609#comment-15716609
]
Ashish Basran commented on TIKA-2180:
-
Thanks [~talli...@mitre.org]. I will try new SAX
[
https://issues.apache.org/jira/browse/TIKA-2180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ashish Basran updated TIKA-2180:
Attachment: screenshot-2.png
> Multiple requests on Tika to extract text slows down
> ---
[
https://issues.apache.org/jira/browse/TIKA-2180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ashish Basran updated TIKA-2180:
Attachment: screenshot-1.png
> Multiple requests on Tika to extract text slows down
> ---
[
https://issues.apache.org/jira/browse/TIKA-2180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15730782#comment-15730782
]
Ashish Basran commented on TIKA-2180:
-
Hello [~talli...@mitre.org],
With the same doc
[
https://issues.apache.org/jira/browse/TIKA-2180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ashish Basran updated TIKA-2180:
Attachment: screenshot-3.png
> Multiple requests on Tika to extract text slows down
> ---
[
https://issues.apache.org/jira/browse/TIKA-2180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ashish Basran updated TIKA-2180:
Attachment: with new experimental SAX docx parser.png
> Multiple requests on Tika to extract text slo
[
https://issues.apache.org/jira/browse/TIKA-2180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15735961#comment-15735961
]
Ashish Basran commented on TIKA-2180:
-
I tried with new experimental parser and with th