[
https://issues.apache.org/jira/browse/TIKA-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15770713#comment-15770713
]
Tim Allison edited comment on TIKA-2224 at 12/22/16 6:19 PM:
-------------------------------------------------------------
Looks like we only have [one
OneNote|http://162.242.228.174/docs/commoncrawl2_likely_broken/CH/CHMSXRBWSRMZHRXVPWTQZMOXALIAMY35.one]
file in our regression corpus, and it is truncated/corrupt.
was (Author: [email protected]):
Looks like we only have [one
OneNote|http://162.242.228.174/docs/commoncrawl2_likely_broken/CH/CHMSXRBWSRMZHRXVPWTQZMOXALIAMY35.one]
file our regression corpus, and it is truncated/corrupt.
> Mime magic for OneNote formats
> ------------------------------
>
> Key: TIKA-2224
> URL: https://issues.apache.org/jira/browse/TIKA-2224
> Project: Tika
> Issue Type: Improvement
> Components: mime
> Affects Versions: 1.14
> Reporter: Nick Burch
> Attachments: Sample1.one
>
>
> As raised at
> http://stackoverflow.com/questions/41272195/onenote-support-for-apache-tika-parsers,
> we don't have any magic for the OneNote formats. Several years ago we dug
> out the file format specs (see
> http://lucene.472066.n3.nabble.com/Tika-OneNote-Support-td4020393.html), but
> didn't have volunteer energy to implement a parser. However, armed with those
> specs, we should be able to come up with some mime magic for detection.
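
As a rough illustration of what header-based detection along these lines could look like, here is a minimal Python sketch. It assumes that OneNote files begin with a 16-byte file-type GUID stored in little-endian field order, and that the two GUID values below (recalled from the [MS-ONESTORE] spec, not taken from this issue) are correct; both should be checked against the spec and against Sample1.one before anything like this goes into the mime magic. The function name and the returned type labels are hypothetical.

```python
import uuid
from typing import Optional

# ASSUMPTION: file-type GUIDs as recalled from the [MS-ONESTORE] header
# description; verify against the spec and real sample files before relying on them.
ONE_GUID = uuid.UUID("7B5C52E4-D88C-4DA7-AEB1-5378D02996D3")      # .one section file
ONETOC2_GUID = uuid.UUID("43FF2FA1-EFD9-4C76-9EE2-10EA5722765F")  # .onetoc2 table of contents

# GUIDs are assumed to be stored on disk with the first three fields
# little-endian, which is what uuid.UUID.bytes_le produces.
# The type labels are illustrative only, not agreed Tika names.
MAGIC = {
    ONE_GUID.bytes_le: "application/onenote; format=one",
    ONETOC2_GUID.bytes_le: "application/onenote; format=onetoc2",
}


def detect_onenote(header: bytes) -> Optional[str]:
    """Return an illustrative type label if the first 16 bytes match, else None."""
    return MAGIC.get(bytes(header[:16]))
```

Only the first 16 bytes are examined, so a check of this kind does not depend on the rest of the file being intact.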