[ 
https://issues.apache.org/jira/browse/TIKA-1358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14066338#comment-14066338
 ] 

Andreas Ebbert-Karroum commented on TIKA-1358:
----------------------------------------------

Looks like the .iwa files are using [Google Protocol 
Buffers|https://developers.google.com/protocol-buffers/docs/overview?csw=1]. 
The GitHub project 
[iWorkFileFormat|https://github.com/obriensp/iWorkFileFormat] might help 
with getting the content out of them.
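
As a first step toward inspecting such files, here is a minimal sketch that lists the .iwa entries in a document. It assumes the document is a single-file ZIP package with the .iwa entries stored directly in the outer archive; nested archives and directory bundles are not handled. The class name IwaEntryLister is hypothetical and not part of Tika, and the sketch does not decode the entries themselves.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper for illustration only; not part of Tika.
final class IwaEntryLister {

    private IwaEntryLister() {
    }

    // Returns the names of all entries ending in ".iwa".
    // Assumption: the file at 'path' is a plain ZIP archive.
    static List<String> listIwaEntries(String path) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipFile zip = new ZipFile(path)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (!entry.isDirectory() && entry.getName().endsWith(".iwa")) {
                    names.add(entry.getName());
                }
            }
        }
        return names;
    }
}
```

Calling IwaEntryLister.listIwaEntries with the path of a .pages file would show whether it contains .iwa entries at all, which is a cheap check before attempting any Protocol Buffers decoding.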

> Add support for newer iWork file formats
> ----------------------------------------
>
>                 Key: TIKA-1358
>                 URL: https://issues.apache.org/jira/browse/TIKA-1358
>             Project: Tika
>          Issue Type: Wish
>          Components: parser
>    Affects Versions: 1.5
>            Reporter: Jelle Kastelein
>              Labels: newbie
>
> iWork 2013 uses a revised file format that replaces the XML files holding 
> the content with .iwa files (a binary format). This file format is becoming 
> increasingly relevant as more and more people use Apple products. 
> However, it does not appear to work with the current IWorkPackageParser 
> (tested with several of the example .pages files available from 
> iCloud). 



--
This message was sent by Atlassian JIRA
(v6.2#6252)
