[ https://issues.apache.org/jira/browse/TIKA-759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13132808#comment-13132808 ]
Chris A. Mattmann commented on TIKA-759:
----------------------------------------
+1 to this, Jukka!
In OODT-ville, for many years we've had something called a "Profile", see:
http://svn.apache.org/repos/asf/oodt/trunk/profile/src/main/java/org/apache/oodt/profile/Profile.java
A Profile is a metadata description of a resource, with three sets of
attributes:
* housekeeping information about the Profile itself (its ID, creation time,
etc.)
* information about the resource that the Profile points to (the Dublin Core
element set plus some modifications), housed in
http://svn.apache.org/repos/asf/oodt/trunk/profile/src/main/java/org/apache/oodt/profile/ResourceAttributes.java
* domain-specific metadata, which we keep as a set of ProfileElements, housed
in
http://svn.apache.org/repos/asf/oodt/trunk/profile/src/main/java/org/apache/oodt/profile/ProfileElement.java
and its subclasses, RangedProfileElement.java and
EnumeratedProfileElement.java. ProfileElements correspond to ISO/IEC
11179-style data elements and carry information such as valid values, ranges,
and min/max.
Not saying we should adopt the above. Our OODT stuff is bloated in some areas,
and could be reduced, but just thought I'd pass it along for some inspiration!
:-)
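To make the three attribute sets a bit more concrete, here is a rough sketch
of the shape I mean. It is a simplified, hypothetical model and not the actual
org.apache.oodt.profile classes: the class names, fields, and the accepts()
check are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified, hypothetical model of the three attribute sets described above.
// NOT the real org.apache.oodt.profile API: every name and field is an assumption.
final class ProfileSketch {

    // 1. Housekeeping information about the profile itself.
    static final class Housekeeping {
        final String id;
        final String created;

        Housekeeping(String id, String created) {
            this.id = id;
            this.created = created;
        }
    }

    // 2. Dublin Core style description of the resource the profile points to.
    static final class ResourceInfo {
        final String title;
        final String format;

        ResourceInfo(String title, String format) {
            this.title = title;
            this.format = format;
        }
    }

    // 3. Domain-specific element; subclasses constrain the values it accepts.
    abstract static class Element {
        final String name;

        Element(String name) {
            this.name = name;
        }

        abstract boolean accepts(String value);
    }

    // Element whose values must be numbers within [min, max].
    static final class RangedElement extends Element {
        final double min;
        final double max;

        RangedElement(String name, double min, double max) {
            super(name);
            this.min = min;
            this.max = max;
        }

        @Override
        boolean accepts(String value) {
            if (value == null) {
                return false;
            }
            try {
                double v = Double.parseDouble(value);
                return v >= min && v <= max;
            } catch (NumberFormatException e) {
                return false; // not a number, so it cannot be in range
            }
        }
    }

    // Element whose values must come from a fixed list.
    static final class EnumeratedElement extends Element {
        final List<String> validValues;

        EnumeratedElement(String name, String... validValues) {
            super(name);
            this.validValues = Arrays.asList(validValues);
        }

        @Override
        boolean accepts(String value) {
            return validValues.contains(value);
        }
    }

    final Housekeeping housekeeping;
    final ResourceInfo resource;
    final List<Element> elements = new ArrayList<>();

    ProfileSketch(Housekeeping housekeeping, ResourceInfo resource) {
        this.housekeeping = housekeeping;
        this.resource = resource;
    }
}
```

A profile would then be built from one Housekeeping, one ResourceInfo, and any
number of ranged or enumerated elements added to its elements list.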
> Better handling of content type metadata
> ----------------------------------------
>
> Key: TIKA-759
> URL: https://issues.apache.org/jira/browse/TIKA-759
> Project: Tika
> Issue Type: Improvement
> Components: metadata, mime
> Reporter: Jukka Zitting
> Assignee: Jukka Zitting
> Priority: Minor
>
> Currently we use the "Content-Type" metadata key for storing (and looking up)
> the media type of a document. This is simple and works well, especially
> with HTTP, but it is not well aligned with XMP or other metadata standards
> such as Dublin Core. As an improvement, I propose the following:
> * Switch to "dc:format" as the standard metadata key for the content type
> * Keep the existing "Content-Type" key for backwards compatibility with
> existing clients
> * Make the Metadata class aware of such aliases
> * Add getFormat() and setFormat() utility methods to Metadata to simplify
> client code and to make the exact metadata key more of an implementation
> detail in Tika
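The alias idea in the proposal above could look roughly like the following.
This is only a minimal sketch under stated assumptions: AliasAwareMetadata is
a hypothetical stand-in and not Tika's actual Metadata class, it stores a
single value per key, and the alias table is hard-coded.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the proposal; this is NOT Tika's Metadata class.
final class AliasAwareMetadata {

    static final String FORMAT = "dc:format";          // proposed standard key
    static final String CONTENT_TYPE = "Content-Type"; // legacy key, kept as an alias

    // Maps each alias to the canonical key under which the value is stored.
    private static final Map<String, String> ALIASES = new HashMap<>();
    static {
        ALIASES.put(CONTENT_TYPE, FORMAT);
    }

    private final Map<String, String> values = new HashMap<>();

    // Resolve an alias to its canonical key; unknown keys map to themselves.
    private static String canonical(String key) {
        return ALIASES.getOrDefault(key, key);
    }

    public String get(String key) {
        return values.get(canonical(key));
    }

    public void set(String key, String value) {
        values.put(canonical(key), value);
    }

    // Utility methods that hide the exact key from client code.
    public String getFormat() {
        return get(FORMAT);
    }

    public void setFormat(String mediaType) {
        set(FORMAT, mediaType);
    }
}
```

With this shape, an existing client that calls set("Content-Type", ...) and a
new client that calls getFormat() see the same value, so the exact key stays
an implementation detail.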