[jira] [Commented] (TIKA-3684) Extract text returns the text multiple times

2022-03-10 Thread Nick Burch (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17504150#comment-17504150 ] Nick Burch commented on TIKA-3684: Same as Tika 2.x - pass a {{--config}} flag when you start the server

[jira] [Commented] (TIKA-3684) Extract text returns the text multiple times

2022-03-09 Thread Naama Hophstatder (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17504036#comment-17504036 ] Naama Hophstatder commented on TIKA-3684: I don't know how I should configure the service as I'm

[jira] [Commented] (TIKA-3684) Extract text returns the text multiple times

2022-03-09 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17503637#comment-17503637 ] Tim Allison commented on TIKA-3684: That configuration should work with 1.24 as well. Is it not

[jira] [Commented] (TIKA-3684) Extract text returns the text multiple times

2022-03-09 Thread Naama Hophstatder (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17503397#comment-17503397 ] Naama Hophstatder commented on TIKA-3684: Hi [~tallison], could you help us using the config

[jira] [Commented] (TIKA-3684) Extract text returns the text multiple times

2022-03-03 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17500720#comment-17500720 ] Tim Allison commented on TIKA-3684: Oops. Thank you!

[jira] [Commented] (TIKA-3684) Extract text returns the text multiple times

2022-03-02 Thread Naama Hophstatder (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17500554#comment-17500554 ] Naama Hophstatder commented on TIKA-3684: Thanks for your efforts, I took your xml config file

[jira] [Commented] (TIKA-3684) Extract text returns the text multiple times

2022-03-02 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17500173#comment-17500173 ] Tim Allison commented on TIKA-3684: I attached an example for turning off the WMFParser and the

[jira] [Commented] (TIKA-3684) Extract text returns the text multiple times

2022-03-02 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17500168#comment-17500168 ] Tim Allison commented on TIKA-3684: We could also parameterize the WMF and EMF parsers to turn off text

[jira] [Commented] (TIKA-3684) Extract text returns the text multiple times

2022-03-02 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17500170#comment-17500170 ] Tim Allison commented on TIKA-3684: Sorry, didn't see your response. bq. has no "text" meaning in

[jira] [Commented] (TIKA-3684) Extract text returns the text multiple times

2022-03-02 Thread Naama Hophstatder (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17500143#comment-17500143 ] Naama Hophstatder commented on TIKA-3684: I see the results of the /rmeta endpoint, understand

[jira] [Commented] (TIKA-3684) Extract text returns the text multiple times

2022-03-02 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17500066#comment-17500066 ] Tim Allison commented on TIKA-3684: If you use the /rmeta endpoint (attached), you can see that there's