Better handling of content type metadata
----------------------------------------
Key: TIKA-759
URL: https://issues.apache.org/jira/browse/TIKA-759
Project: Tika
Issue Type: Improvement
Components: metadata, mime
Reporter: Jukka Zitting
Assignee: Jukka Zitting
Priority: Minor
Currently we use the "Content-Type" metadata key for storing (and looking up)
the media type of a document. This is simple and works well, especially
with HTTP, but it does not align well with XMP or other metadata standards
such as Dublin Core. As an improvement, I propose the following:
* Switch to "dc:format" as the standard metadata key for the content type
* Keep the existing "Content-Type" key for backwards compatibility with
existing clients
* Make the Metadata class aware of such aliases
* Add getFormat() and setFormat() utility methods to Metadata to simplify
client code and to make the exact metadata key more of an implementation detail
in Tika
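
A minimal sketch of how the alias handling proposed above might look. This is
an illustration only, not Tika's actual Metadata class: the class name
AliasAwareMetadata, the ALIASES table, and the canonical() helper are all
hypothetical, and the sketch stores a single String value per key for brevity.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; NOT the real org.apache.tika.metadata.Metadata. */
class AliasAwareMetadata {
    // Proposed standard key and the legacy key kept for compatibility.
    static final String FORMAT = "dc:format";
    static final String CONTENT_TYPE = "Content-Type";

    // Maps each legacy alias to its canonical key (assumed design).
    private static final Map<String, String> ALIASES = new HashMap<>();
    static {
        ALIASES.put(CONTENT_TYPE, FORMAT);
    }

    private final Map<String, String> values = new HashMap<>();

    // Resolves an alias to its canonical key; other keys pass through.
    private static String canonical(String key) {
        return ALIASES.getOrDefault(key, key);
    }

    String get(String key) {
        return values.get(canonical(key));
    }

    void set(String key, String value) {
        values.put(canonical(key), value);
    }

    // Utility methods that hide the exact key from client code.
    String getFormat() {
        return get(FORMAT);
    }

    void setFormat(String mediaType) {
        set(FORMAT, mediaType);
    }

    public static void main(String[] args) {
        AliasAwareMetadata metadata = new AliasAwareMetadata();
        // An existing client that still writes the legacy key...
        metadata.set("Content-Type", "text/html");
        // ...is visible through the new accessor and the new key.
        System.out.println(metadata.getFormat());      // prints "text/html"
        System.out.println(metadata.get("dc:format")); // prints "text/html"
    }
}
```

Storing the value once under the canonical key, and resolving aliases on every
read and write, keeps the two keys from drifting apart. Whether the real
implementation should do this or store both keys is left open here.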