[ 
https://issues.apache.org/jira/browse/TIKA-262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daan de Wit updated TIKA-262:
-----------------------------

    Attachment: OfficeParser.java.patch

New patch. Summary entries may not exist; all tests pass now.

> ParsingReader does not parse metadata for larger MS Office documents
> --------------------------------------------------------------------
>
>                 Key: TIKA-262
>                 URL: https://issues.apache.org/jira/browse/TIKA-262
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 0.3
>            Reporter: Daan de Wit
>         Attachments: lipsum.doc, OfficeParser.java.patch, 
> OfficeParser.java.patch, OfficeParser.java.patch, 
> tika-0.3_large-ms-office-metadata.patch
>
>
> The ParsingReader should cause the metadata to be extracted before anything 
> is read from the reader. This does not happen for certain MS Office files; the 
> problem seems to be related to the size of the document.
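
The expected behaviour described above (metadata available before the first
character is read) can be illustrated with a small stand-alone model. This is
only a sketch under stated assumptions: MetadataFirstReader, ToyParser and the
metadataDone callback are hypothetical names invented for illustration, and
nothing here is Tika's API or its actual implementation. It uses only
java.io piped streams and a latch.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.io.PipedReader;
import java.io.PipedWriter;
import java.io.Reader;
import java.io.Writer;
import java.util.Map;
import java.util.concurrent.CountDownLatch;

/** Hypothetical model of a "metadata first" reader; not Tika code. */
class MetadataFirstReader extends Reader {

    /** Hypothetical parser: fill metadata, call metadataDone, then write text. */
    interface ToyParser {
        void parse(Map<String, String> metadata, Runnable metadataDone, Writer out)
                throws IOException;
    }

    private final PipedReader pipe = new PipedReader();
    private final CountDownLatch metadataReady = new CountDownLatch(1);

    MetadataFirstReader(ToyParser parser, Map<String, String> metadata) throws IOException {
        PipedWriter writer = new PipedWriter(pipe);
        // The parse runs in a background thread and streams text into the pipe.
        Thread worker = new Thread(() -> {
            try (Writer out = writer) {
                parser.parse(metadata, metadataReady::countDown, out);
            } catch (IOException e) {
                // A real implementation would surface this error on read().
            } finally {
                // Release the constructor even if the parser failed early.
                metadataReady.countDown();
            }
        });
        worker.setDaemon(true);
        worker.start();
        try {
            // Block until the parser reports that metadata is complete.
            metadataReady.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new InterruptedIOException("interrupted while waiting for metadata");
        }
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        return pipe.read(cbuf, off, len);
    }

    @Override
    public void close() throws IOException {
        pipe.close();
    }
}
```

In this model the guarantee holds only if the parser signals metadataDone
before it writes text. A parser that wrote more text than the pipe can buffer
before signalling would leave the constructor blocked, which is a limitation of
the sketch itself.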

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
