[ https://issues.apache.org/jira/browse/TIKA-438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nick Burch resolved TIKA-438.
-----------------------------

       Resolution: Fixed
    Fix Version/s: 1.0

Fixed as part of TIKA-652

> Parse and return the complete set of custom document properties from MS 
> Office documents
> ----------------------------------------------------------------------------------------
>
>                 Key: TIKA-438
>                 URL: https://issues.apache.org/jira/browse/TIKA-438
>             Project: Tika
>          Issue Type: Improvement
>          Components: parser
>    Affects Versions: 0.7
>            Reporter: Mads Hansen
>            Priority: Minor
>              Labels: metadata, office, parser
>             Fix For: 1.0
>
>         Attachments: SummaryExtractor.java
>
>
> All MS Office document custom properties should be parsed and returned in the 
> Metadata set.  This would be consistent with how all HTML meta tags are 
> parsed and returned.
> Custom properties are already parsed to produce the Metadata.LANGUAGE 
> property when document properties are normalized into the Dublin Core 
> metadata set.  With minor modifications to the 
> org.apache.tika.parser.microsoft.SummaryExtractor class, the entire set of 
> custom properties could be obtained and set in the document metadata.
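
For illustration only, the idea in the description above can be sketched in
plain Java. This is not the code from SummaryExtractor or from TIKA-652. The
class name CustomPropertySketch, the method toMetadata, and the "custom:" key
prefix are all assumptions made for this example, and plain Map objects stand
in for the document's custom property set and for the Tika Metadata object.

```java
import java.util.Date;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch; not the actual SummaryExtractor implementation. */
final class CustomPropertySketch {

    // Assumed prefix that keeps custom names from colliding with standard keys.
    static final String PREFIX = "custom:";

    /** Copies every non-null custom property into a flat String-to-String map. */
    static Map<String, String> toMetadata(Map<String, Object> customProperties) {
        Map<String, String> metadata = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : customProperties.entrySet()) {
            Object value = entry.getValue();
            if (value == null) {
                continue; // skip properties that carry no value
            }
            String text;
            if (value instanceof Date) {
                // Render dates as ISO-8601 instants in UTC.
                text = ((Date) value).toInstant().toString();
            } else {
                // Strings, numbers, and booleans use their default text form.
                text = String.valueOf(value);
            }
            metadata.put(PREFIX + entry.getKey(), text);
        }
        return metadata;
    }

    public static void main(String[] args) {
        // Stand-in for the custom properties read from an Office document.
        Map<String, Object> props = new LinkedHashMap<>();
        props.put("Department", "Legal");
        props.put("Reviewed", Boolean.TRUE);
        props.put("Revision", 7);
        toMetadata(props).forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

Prefixing the keys is one way to return every custom property without
overwriting normalized entries such as Metadata.LANGUAGE. Whether the real
fix uses a prefix is not stated in this message.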

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
