[ https://issues.apache.org/jira/browse/TIKA-2227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15771026#comment-15771026 ]
David Pilato commented on TIKA-2227:
------------------------------------
Sorry. The answer is {{TikaCoreProperties.KEYWORDS}}.
Don't know how I missed it... :)
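
For code that has to cope with keywords landing under different metadata names (as described in the quoted issue below), here is a minimal sketch of a fallback lookup. It is only an illustration: a plain {{Map}} stands in for Tika's {{Metadata}} object, {{KeywordLookup}} and {{firstPresent}} are hypothetical names, and the sample value is made up. The only names taken from this issue are {{"meta:keyword"}} and {{"Keyword"}}.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class KeywordLookup {

    /**
     * Returns the first non-empty value stored under any of the given
     * metadata names, checked in order, or null if none of them is set.
     */
    public static String firstPresent(Map<String, String> metadata, String... names) {
        for (String name : names) {
            String value = metadata.get(name);
            if (value != null && !value.isEmpty()) {
                return value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Hypothetical metadata for an RTF file: keywords stored under "Keyword"
        // rather than "meta:keyword", as reported in this issue.
        Map<String, String> rtf = new LinkedHashMap<>();
        rtf.put("Keyword", "keyword1, keyword2");

        // Try the expected name first, then fall back to the legacy one.
        System.out.println(firstPresent(rtf, "meta:keyword", "Keyword"));
    }
}
```

With Tika on the classpath, {{metadata.get(TikaCoreProperties.KEYWORDS)}} should make this kind of fallback unnecessary, since that is the property that resolved the problem here.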
> Replacement of MSOffice#KEYWORDS for RTF and ODT docs
> -----------------------------------------------------
>
> Key: TIKA-2227
> URL: https://issues.apache.org/jira/browse/TIKA-2227
> Project: Tika
> Issue Type: Bug
> Components: parser
> Affects Versions: 1.14
> Reporter: David Pilato
> Priority: Minor
>
> I'm trying to extract metadata from different types of documents.
> For that I was using {{metadata.get(MSOffice.KEYWORDS)}}, but it's marked as
> {{Deprecated}} in favor of the {{Office}} class.
> So I changed my code to use {{metadata.get(Office.KEYWORDS)}} instead.
> It does not work for 2 types of docs:
> * RTF:
> https://github.com/dadoonet/fscrawler/blob/master/src/test/resources/documents/test.rtf
> * ODT:
> https://github.com/dadoonet/fscrawler/blob/master/src/test/resources/documents/test.odt
> It seems that RTF and ODT keywords are extracted to a {{"Keyword"}} metadata
> name, although they should probably be written to {{"meta:keyword"}}.
> If needed, you can reuse the documents I linked to here as test cases.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)