[ 
https://issues.apache.org/jira/browse/TIKA-2911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16909023#comment-16909023
 ] 

Tim Allison commented on TIKA-2911:
-----------------------------------

Some other formats that come to mind:
* OOXML strict (probably best handled at the POI level, but we should be able 
to add that to the streaming docx/pptx parsers fairly easily)
* Newer Apple iWork file formats
* OneNote
* Binary plist
* Serif PagePlus
* zipx
* ...and?


Then there's a separate category of need: an unpacker for "large container 
files" such as:
* warc
* parquet
* ...and?

See some discussion: 
https://twitter.com/_tallison/status/1149035618321743878?s=20

> Add new parsers
> ---------------
>
>                 Key: TIKA-2911
>                 URL: https://issues.apache.org/jira/browse/TIKA-2911
>             Project: Tika
>          Issue Type: Task
>            Reporter: Tim Allison
>            Priority: Major
>
> Let's use this ticket as the parent for adding new parsers.  This will allow 
> us to have a single point of reference for requests/plans for new parsers.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
