[ https://issues.apache.org/jira/browse/TIKA-3684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500173#comment-17500173 ]
Tim Allison edited comment on TIKA-3684 at 3/3/22, 12:41 PM:
-------------------------------------------------------------

I attached an example for turning off the WMFParser and the EMFParser. When calling tika-server with docker, add {{--config tika-config-no-xmf.xml}}


was (Author: talli...@mitre.org):
    I attached an example for turning off the WMFParser and the EMFParser. When calling tika-server in docker, add {{-c tika-config-no-xmf.xml}}

> Extract text returns the text multiple times
> --------------------------------------------
>
>                 Key: TIKA-3684
>                 URL: https://issues.apache.org/jira/browse/TIKA-3684
>             Project: Tika
>          Issue Type: Bug
>          Components: docker
>    Affects Versions: 2.1.0
>            Reporter: Naama Hophstatder
>            Priority: Major
>         Attachments: example.docx, example.json, tika-config-no-xmf.xml
>
>
> We are using the Tika Docker container as a Linux service. When I extract
> text from a Word document, e.g.:
> curl -T example.docx http://localhost:9998/tika --header "Accept: text/plain"
> we get the text 3 times.
> Note: We also have Tika server v1.14, and that version returns the text
> just as expected.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
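
The attachment tika-config-no-xmf.xml mentioned in the comment is not reproduced in this notification. As a rough sketch only, a config that turns off the two parsers might look like the following. It assumes Tika's standard parser-exclude mechanism on DefaultParser and that both parsers live in the org.apache.tika.parser.microsoft package; the actual attached file may differ.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical reconstruction; the real tika-config-no-xmf.xml attachment is not shown in this thread. -->
<properties>
  <parsers>
    <!-- Keep the default parser set, minus the two metafile parsers. -->
    <parser class="org.apache.tika.parser.DefaultParser">
      <!-- Assumed fully qualified class names for the WMF and EMF parsers. -->
      <parser-exclude class="org.apache.tika.parser.microsoft.WMFParser"/>
      <parser-exclude class="org.apache.tika.parser.microsoft.EMFParser"/>
    </parser>
  </parsers>
</properties>
```

With a file like this made available inside the container, the {{--config tika-config-no-xmf.xml}} argument from the comment would point tika-server at it. How the file is made available (for example, a volume mount) depends on the local setup and is not specified in the thread.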