[ https://issues.apache.org/jira/browse/TIKA-1358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17351066#comment-17351066 ]
Tim Allison commented on TIKA-1358:
-----------------------------------
I abandoned this for a few reasons:
1) Option 1 was to use the equivalent of POI's ooxml-schemas -- Java classes
automatically generated from the protobuf definitions. This library was HUGE,
and I couldn't justify that much weight given the silence on this issue.
2) Apple's iWork file formats are a crazily moving target. Even if we got the
lighter-weight scrapers working without the full library, we'd have to adjust
at least annually to keep up with changes to the underlying file format, and
we'd have to detect the changing versions.
3) Frankly, my $dayjob wasn't interested in supporting work on this.
So, as a Mac user, it breaks my heart, but there are some serious hurdles.
> Add support for newer iWork file formats
> ----------------------------------------
>
> Key: TIKA-1358
> URL: https://issues.apache.org/jira/browse/TIKA-1358
> Project: Tika
> Issue Type: Wish
> Components: parser
> Affects Versions: 1.5
> Reporter: Jelle Kastelein
> Priority: Major
> Labels: new-parser, newbie
> Attachments: 666.pages, budget.txt, connors_20040127.txt,
> iwork13-testdocs-zips.zip, iwork13-testfiles-2014-11.zip, pages.txt
>
>
> iWork 2013 uses a revised file format that replaces the XML files holding
> the content with .iwa files (a binary format). This file format is becoming
> increasingly relevant as more and more people use Apple products.
> However, it does not appear to work with the current IWorkPackageParser
> (tested with several of the example .pages files available from iCloud).