[ 
https://issues.apache.org/jira/browse/TIKA-152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guillermo Arribas updated TIKA-152:
-----------------------------------

    Attachment: testEXCEL-formats.xlsx
                testWORD.docx
                TIKA-152.patch

The attached patch adds a parser with support for structured text extraction
from OOXML formats. It requires a new dependency on artifactId "poi-ooxml",
version 3.5-beta4.
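
For reference, the dependency noted above might be declared in a Maven
pom.xml roughly as follows. This is only a sketch: the groupId
"org.apache.poi" is an assumption, since the comment names only the
artifactId and version, so check it against TIKA-152.patch.

```xml
<!-- Sketch only: groupId is assumed, artifactId and version are from the comment -->
<dependency>
  <groupId>org.apache.poi</groupId>
  <artifactId>poi-ooxml</artifactId>
  <version>3.5-beta4</version>
</dependency>
```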

> Support for Office XML files
> ----------------------------
>
>                 Key: TIKA-152
>                 URL: https://issues.apache.org/jira/browse/TIKA-152
>             Project: Tika
>          Issue Type: New Feature
>          Components: parser
>            Reporter: Jukka Zitting
>             Fix For: 0.3
>
>         Attachments: testEXCEL-formats.xlsx, testEXCEL.xlsx, testPPT.pptx, 
> testWORD.docx, TIKA-152.patch
>
>
> Apache POI has recently released the first betas of their support for Office 
> XML file formats. We should use that in Tika.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
