[ 
https://issues.apache.org/jira/browse/TIKA-402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn van Groningen updated TIKA-402:
---------------------------------------

    Attachment: iwork.patch
                testKeynote.key

I couldn't find a Java library that parses Keynote presentations, so I have 
made an initial patch that does this. It is a work in progress, and I was 
hoping to get some feedback. The attached presentation was created with 
Keynote version 5 (but uses Keynote format version 2.x). 

The patch is working; I have tested it via the Tika CLI. Two tests are also 
included in the patch: one tests the parsing and one the auto-detection.

I have added the test file separately, because binary files can't be included 
in a patch. The Keynote file should be placed in the test-documents package in 
the parsers module's resource directory.

Older Keynote format versions (1.x) are not supported yet, because the format 
is different. Also, if I remember correctly, a Keynote file in that older 
format is a directory and not a compressed file. Support for Pages is not yet 
included.
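
To illustrate the directory-versus-compressed-file distinction above, here is a 
minimal sketch that is not taken from iwork.patch. It assumes the compressed 
2.x file is a zip archive (the message only says "compressed"), and the class 
and method names are hypothetical. It uses only the JDK (Java 9+ for 
readNBytes).

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of Tika or of the attached patch.
class KeynoteContainerSketch {

    enum Container { DIRECTORY_BUNDLE, ZIP_ARCHIVE, UNKNOWN }

    // Classifies a path as a directory bundle (as recalled for the 1.x format)
    // or as a zip archive (assumed here for the compressed 2.x format).
    static Container classify(Path path) throws IOException {
        if (Files.isDirectory(path)) {
            return Container.DIRECTORY_BUNDLE;
        }
        if (!Files.isRegularFile(path)) {
            return Container.UNKNOWN;
        }
        byte[] magic = new byte[4];
        int read;
        try (InputStream in = Files.newInputStream(path)) {
            read = in.readNBytes(magic, 0, magic.length);
        }
        // A non-empty zip archive starts with the local file header
        // signature "PK" followed by the bytes 0x03 0x04.
        boolean zip = read == 4
                && magic[0] == 'P' && magic[1] == 'K'
                && magic[2] == 3 && magic[3] == 4;
        return zip ? Container.ZIP_ARCHIVE : Container.UNKNOWN;
    }

    // Lists the entry names of a zip archive, e.g. to locate the XML
    // content that a parser would then process.
    static List<String> listEntries(Path zipPath) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipFile zip = new ZipFile(zipPath.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
        }
        return names;
    }
}
```

Calling classify on the path of testKeynote.key and then listEntries on it 
would show which entries the archive contains, provided the zip assumption 
holds for that file.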

> Support for Keynote and Pages documents
> ---------------------------------------
>
>                 Key: TIKA-402
>                 URL: https://issues.apache.org/jira/browse/TIKA-402
>             Project: Tika
>          Issue Type: New Feature
>          Components: parser
>            Reporter: Jukka Zitting
>         Attachments: iwork.patch, testKeynote.key
>
>
> It would be nice to have support for documents created by Apple's Keynote and 
> Pages applications. Both file formats are described in 
> http://developer.apple.com/mac/library/documentation/AppleApplications/Conceptual/iWork2-0_XML/Chapter01/Introduction.html.
>  I'm not sure if there already are open source parser libraries for these 
> formats or if we'd need to directly process the XML content.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
