[ https://issues.apache.org/jira/browse/TIKA-438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nick Burch resolved TIKA-438.
-----------------------------
Resolution: Fixed
Fix Version/s: 1.0
Fixed as part of TIKA-652
> Parse and return the complete set of custom document properties from MS
> Office documents
> ----------------------------------------------------------------------------------------
>
> Key: TIKA-438
> URL: https://issues.apache.org/jira/browse/TIKA-438
> Project: Tika
> Issue Type: Improvement
> Components: parser
> Affects Versions: 0.7
> Reporter: Mads Hansen
> Priority: Minor
> Labels: metadata, office, parser
> Fix For: 1.0
>
> Attachments: SummaryExtractor.java
>
>
> All MS Office document custom properties should be parsed and returned in the
> Metadata set. This would be consistent with how all HTML meta tags are
> parsed and returned.
> CustomProperties are already being parsed to produce the Metadata.LANGUAGE
> property when normalizing document properties into the Dublin Core metadata
> set. With minor modifications to the
> org.apache.tika.parser.microsoft.SummaryExtractor class, the entire set of
> custom properties could be obtained and set in the document metadata.
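
The request above amounts to copying every custom property into the metadata set instead of picking out a single one. A minimal sketch of that idea follows. It is not the code from the attached SummaryExtractor.java or from TIKA-652: plain Java maps stand in for the parsed custom properties and for the Tika Metadata object, and the class name CustomPropertiesSketch, the method toMetadata and the "custom:" key prefix are assumptions made for illustration only.

```java
import java.util.Date;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only: plain maps stand in for the parsed custom
// properties and for the document metadata. All names here are hypothetical.
class CustomPropertiesSketch {

    // Assumed prefix that keeps user-defined keys apart from standard ones.
    static final String PREFIX = "custom:";

    // Copies every custom property into a string-valued metadata map.
    static Map<String, String> toMetadata(Map<String, Object> customProperties) {
        Map<String, String> metadata = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : customProperties.entrySet()) {
            String name = entry.getKey();
            Object value = entry.getValue();
            if (name == null || value == null) {
                continue; // skip entries that carry no usable name or value
            }
            String text;
            if (value instanceof Date) {
                // Render dates as ISO 8601 instants in UTC.
                text = ((Date) value).toInstant().toString();
            } else {
                // Strings, numbers and booleans use their default text form.
                text = String.valueOf(value);
            }
            metadata.put(PREFIX + name, text);
        }
        return metadata;
    }

    public static void main(String[] args) {
        Map<String, Object> props = new LinkedHashMap<>();
        props.put("Department", "Legal");
        props.put("Revision", 7);
        props.put("Approved", Boolean.TRUE);
        props.put("Reviewed", new Date(0L));
        toMetadata(props).forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

Prefixing the keys is one possible way to keep user-defined names from colliding with standard metadata names such as the Dublin Core ones; whether and how the real parser does this is not stated in this message.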
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira