Hi,

Following up on the Dublin Core discussion we had earlier, now with something a bit more concrete:

The current image and audio parsers use hardcoded strings like "width", "height", "encoding" and "samplerate" for extracted metadata. The semantics of these metadata keys are not documented anywhere, and little thought has been given to interoperability with external metadata applications.

To improve things I'd like to replace these custom metadata keys with keys defined in part 2 of the XMP specification [1]. More specifically, I'd like to start using the following keys for image and audio metadata:

* "tiff:ImageWidth" instead of "width"
* "tiff:ImageHeight" instead of "height"
* "xmpDM:audioCompressor" instead of "encoding"
* "xmpDM:audioSampleRate" instead of "samplerate"
* "xmpDM:audioSampleType" instead of "bits"
* "xmpDM:audioChannelType" instead of "channels"

The semantics of these metadata keys would be as documented in the XMP spec. Since we don't support namespacing of metadata keys (yet, see TIKA-61), these keys would simply use the preferred "tiff" and "xmpDM" prefixes embedded in the metadata key strings.

[1] http://www.adobe.com/devnet/xmp/pdfs/XMPSpecificationPart2.pdf

BR,

Jukka Zitting
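
PS. To make the rename concrete, here is a rough sketch of the mapping proposed above. It is illustrative only: the class XmpKeyMigration and its migrate() method are hypothetical names, not existing Tika code, and a plain java.util.Map stands in for whatever metadata object the parsers actually populate. The key strings are copied as listed in this mail.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Tika: renames the legacy parser keys
// to the prefixed XMP-style keys proposed in this mail and leaves all
// other keys untouched.
final class XmpKeyMigration {

    // Legacy key -> proposed key, exactly as listed in this mail.
    static final Map<String, String> RENAMES = new LinkedHashMap<>();
    static {
        RENAMES.put("width", "tiff:ImageWidth");
        RENAMES.put("height", "tiff:ImageHeight");
        RENAMES.put("encoding", "xmpDM:audioCompressor");
        RENAMES.put("samplerate", "xmpDM:audioSampleRate");
        RENAMES.put("bits", "xmpDM:audioSampleType");
        RENAMES.put("channels", "xmpDM:audioChannelType");
    }

    // Returns a copy of the metadata with legacy keys renamed.
    // The prefix is simply part of the key string, since there is no
    // real namespace support yet.
    static Map<String, String> migrate(Map<String, String> metadata) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : metadata.entrySet()) {
            String key = RENAMES.getOrDefault(e.getKey(), e.getKey());
            result.put(key, e.getValue());
        }
        return result;
    }
}
```

In practice the parsers would emit the new keys directly; a mapping like this would only be useful for clients that still look up the old strings during a transition.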