Hi,

On Mon, Feb 9, 2009 at 3:11 PM, Jonathan Koren <jonat...@soe.ucsc.edu> wrote:
> On Feb 8, 2009, at 10:59 AM, Jukka Zitting wrote:
>> Note that I'm only proposing that we change the keys of the six
>> metadata entries I listed.
>
> But why only those six?

Because they are useful pieces of metadata that are already accurately defined in the respective XMP schemas. For example, I didn't propose changing the MIDI metadata key "patches", as AFAIK there is no standard schema that covers that piece of information.

> You're not proposing to support all of XMP, just the bare minimum that you
> need this week. At some point you're going to want to add more metadata
> and then you're going to have to deal with the ontology mismatch
> problem.

I'm not proposing that we try to map all the metadata we support into the XMP schemas. All I'm trying to do is avoid using custom keys for information where a well-defined and widely used standard alternative already exists. If there's an ontology mismatch, then we can use custom keys. But I don't see why we should invent new keys when standard alternatives with the exact same semantics already exist.

A Tika-specific client shouldn't care whether the metadata key is "width", "tiff:ImageWidth", "xyzzy" or even "the return value of javax.imageio.ImageReader.getWidth(0)"; it should just use a constant like Metadata.IMAGE_WIDTH.

The metadata key "tiff:ImageWidth" is well documented and makes life easier when your application needs to interact with existing XMP infrastructure (or other metadata tools that already know how to import XMP metadata), and I don't see why the key would be any worse than the alternatives.

> You create a new class that takes the raw key-value pairs that are stored in
> Tika::Metadata and translates them to something else. Call it Metadata2XMP
> or whatever. That can be packaged within Tika as a convenient class
> that does least common denominator mapping in a well defined way.

Having such a mapping class within Tika is an alternative, but as discussed in the Dublin Core thread [1] in December, I'm not sure if it's worth the added complexity. My proposal covers the use case with much less extra code or documentation.
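
To make the point about constants concrete, here is a minimal sketch of a client that reads the image width through a constant rather than a literal key string. The classes SketchMetadata and MetadataKeyDemo and the method readWidth are hypothetical stand-ins invented for illustration and are not the actual Tika Metadata API; only the constant name IMAGE_WIDTH and the key "tiff:ImageWidth" come from the discussion above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a metadata container; NOT the real Tika class.
class SketchMetadata {
    // The key string is the XMP-style name discussed in the thread.
    // Changing this value would not affect clients that use the constant.
    static final String IMAGE_WIDTH = "tiff:ImageWidth";

    private final Map<String, String> values = new HashMap<>();

    void set(String key, String value) {
        values.put(key, value);
    }

    String get(String key) {
        return values.get(key);
    }
}

class MetadataKeyDemo {
    // Client code refers only to the constant, never to the literal key.
    static int readWidth(SketchMetadata metadata) {
        return Integer.parseInt(metadata.get(SketchMetadata.IMAGE_WIDTH));
    }

    public static void main(String[] args) {
        SketchMetadata metadata = new SketchMetadata();
        // A parser would populate this; the value here is made up.
        metadata.set(SketchMetadata.IMAGE_WIDTH, "640");
        System.out.println(readWidth(metadata));
    }
}
```

In this sketch a Tika-specific client keeps working whatever string the constant holds, while a tool that already understands XMP could look up the same value directly under "tiff:ImageWidth".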

[1] http://markmail.org/message/zjsjslaelx6acf6z

BR,

Jukka Zitting