[
https://issues.apache.org/jira/browse/TIKA-3634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17472053#comment-17472053
]
Tim Allison commented on TIKA-3634:
-----------------------------------
Thank you for submitting the bug and sharing triggering files.
A couple of items unrelated to the problem:
* AppleSingleFileParser does not handle iWork files. That parser is for a
completely unrelated file format:
[https://en.wikipedia.org/wiki/AppleSingle_and_AppleDouble_formats]
* You shouldn't need to add tika-parser-zip-commons or tika-parser-apple-module.
These should be included in tika-parsers-standard-package. If they're not,
that's a serious problem; please open a separate ticket for it.
I regret I'm still not clear on what we need to fix.
With Tika 1.28, I get {{application/vnd.apple.unknown.13}} for the *.numbers
and *.pages files; I get {{application/vnd.apple.keynote.13}} for the *.key
file. No attachments or text are extracted from any of those.
With Tika 2.2.1, I get {{application/vnd.apple.unknown.13}} for all three
(*.pages, *.key, *.numbers files), but then the PackageParser parses all
embedded files that Tika supports.
What is the desired behavior?
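In case it helps frame the answer: below is a minimal sketch for listing what is inside each attached file, i.e. the embedded entries a package-level parser would have to walk. It assumes the attachments are ordinary zip containers, which is my understanding of the current iWork format but is not something verified in this ticket. It uses only java.util.zip, not Tika, and the class name {{PackageEntryLister}} is made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of Tika: lists the entry names of a zip container.
final class PackageEntryLister {

    // Returns the entry names in the order ZipFile reports them.
    // Throws an IOException (typically a ZipException) if the file is not a readable zip.
    static List<String> listEntries(Path file) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipFile zip = new ZipFile(file.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
        }
        return names;
    }

    // Prints each file given on the command line, followed by its entries.
    public static void main(String[] args) throws IOException {
        for (String arg : args) {
            System.out.println(arg);
            for (String name : listEntries(Paths.get(arg))) {
                System.out.println("  " + name);
            }
        }
    }
}
```

Running it against brochure.pages, keynotecreated.key and mortgagecalculator.numbers would show which embedded entries exist in each, which may make it easier to say which of them should yield text or attachments.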
> Failed to Parser Apple related files
> ------------------------------------
>
> Key: TIKA-3634
> URL: https://issues.apache.org/jira/browse/TIKA-3634
> Project: Tika
> Issue Type: Bug
> Components: parser
> Affects Versions: 2.2.1
> Reporter: Tika User
> Assignee: Tim Allison
> Priority: Blocker
> Attachments: brochure.pages, keynotecreated.key,
> mortgagecalculator.numbers
>
>
> Unable to parse '.numbers', '.key', and '.pages' files using the class below in the XML
> file (org.apache.tika.parser.apple.AppleSingleFileParser).
> Getting unknown mimetype: application/vnd.apple.unknown.13
> Using all these modules :
> tika-core,tika-parsers-standard-package,tika-parser-microsoft-module,tika-parser-sqlite3-package,tika-parser-scientific-module,tika-parser-zip-commons,tika-parser-apple-module