[jira] [Commented] (TIKA-2911) Add new parsers

Nathan Davies (Jira) Tue, 02 Aug 2022 07:12:06 -0700


    [ 
https://issues.apache.org/jira/browse/TIKA-2911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17574256#comment-17574256
 ]


Nathan Davies commented on TIKA-2911:
-------------------------------------

Hopefully this is the right ticket to comment on; it seems related.

I'm trying to use Tika to extract plain text from doc, docx, and odt files. The 
occasional docx file fails due to Strict OOXML. This seems related to the POI 
issue: [https://bz.apache.org/bugzilla/show_bug.cgi?id=57699] The 
recommendation there was to ask here and see if it is something Tika can 
support in some way.
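
In case it helps with triage, here is a minimal sketch of how the affected 
files could be spotted before parsing. It uses only the JDK (no Tika or POI 
calls) and rests on some assumptions: the main part lives at 
word/document.xml, the part is not UTF-16 encoded, and Strict files declare 
namespaces under http://purl.oclc.org/ooxml/ on the root element. The class 
and method names are made up for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of Tika or POI.
public class StrictOoxmlCheck {
    // Assumption: Strict parts use namespaces under this prefix, whereas
    // Transitional parts use http://schemas.openxmlformats.org/...
    static final String STRICT_NS_PREFIX = "http://purl.oclc.org/ooxml/";

    /** Returns true if the main document part appears to declare a Strict namespace. */
    static boolean looksStrict(Path docx) throws IOException {
        try (ZipFile zip = new ZipFile(docx.toFile())) {
            // Assumption: the main part sits at the conventional location.
            ZipEntry entry = zip.getEntry("word/document.xml");
            if (entry == null) {
                return false;
            }
            try (InputStream in = zip.getInputStream(entry)) {
                // Namespace declarations sit on the root element, so only the
                // first few KB of the part are inspected.
                byte[] head = in.readNBytes(8192);
                String text = new String(head, StandardCharsets.ISO_8859_1);
                return text.contains(STRICT_NS_PREFIX);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Prints one line per file given on the command line.
        for (String arg : args) {
            String label = looksStrict(Path.of(arg)) ? "strict" : "not strict";
            System.out.println(arg + "\t" + label);
        }
    }
}
```

This is only a heuristic for sorting files into "probably Strict" and 
"probably not"; it does not extract any text itself.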

I appreciate this is an older ticket, so I can make a new one if need be.

> Add new parsers
> ---------------
>
>                 Key: TIKA-2911
>                 URL: https://issues.apache.org/jira/browse/TIKA-2911
>             Project: Tika
>          Issue Type: Task
>            Reporter: Tim Allison
>            Priority: Major
>
> Let's use this ticket as the parent for adding new parsers.  This will allow 
> us to have a single point of reference for requests/plans for new parsers.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
