[ https://issues.apache.org/jira/browse/TIKA-2911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16909023#comment-16909023 ]
Tim Allison commented on TIKA-2911:
-----------------------------------
Some other formats that come to mind:
* OOXML strict (probably best handled at the POI level, but we should be able
to add that to the streaming docx/pptx parsers fairly easily)
* Newer Apple iWork file formats
* OneNote
* Binary plist
* Serif PagePlus
* zipx
* ...and?
Then there's a separate category: the need for an unpacker for "large
container files" such as:
* warc
* parquet
* ...and?
See some discussion:
https://twitter.com/_tallison/status/1149035618321743878?s=20
> Add new parsers
> ---------------
>
> Key: TIKA-2911
> URL: https://issues.apache.org/jira/browse/TIKA-2911
> Project: Tika
> Issue Type: Task
> Reporter: Tim Allison
> Priority: Major
>
> Let's use this ticket as the parent for adding new parsers. This will allow
> us to have a single point of reference for requests/plans for new parsers.