On Sat, 28 Feb 2015, Gautham Shankar wrote:
My progress has been updated on the below link.
https://wiki.apache.org/tika/TikaGeographicInformationParser
I would like you guys to comment on the key names that I have come up with
for the customized metadata; these could certainly be shortened.
Ideally, we try not to invent our own metadata keys, but instead re-use
definitions/standards from elsewhere. We also try to map format-specific
keys onto general ones, to keep things consistent between different file
types.
From a quick glance, it looks like a few of the metadata entries you have
are ones which could be mapped onto an existing key, and a few could be
mapped onto new metadata properties from external standards.
It might also be worth looking at some of the other scientific formats, to
see whether any commonality can be found with those, or whether their keys
can be changed to be common. Where there's a concept that's the same, the
different formats should try to use the same metadata key.
(As an example, as a user, you don't need to know if a file format uses
Created On, Created At, First Created At, Created, or anything like that,
you just fetch dc:created and it's the same thing across all formats, and
you can go look up the Dublin Core specification if you want to check what
it means semantically)
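
To make that concrete, here is a rough sketch of the mapping idea in plain
Java. It deliberately does not use the Tika API: the class name, the alias
table and the sample date are all made up for illustration, and only the
dc:created key and the four "Created ..." names come from the example above.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Tika: maps format-specific key names
// onto one shared key so callers only ever need to know the shared one.
class CommonKeySketch {
    // Assumed alias table: format-specific name -> common key.
    private static final Map<String, String> ALIASES = new HashMap<>();
    static {
        ALIASES.put("Created On", "dc:created");
        ALIASES.put("Created At", "dc:created");
        ALIASES.put("First Created At", "dc:created");
        ALIASES.put("Created", "dc:created");
    }

    // Returns the common key if one is known, otherwise the key unchanged.
    static String toCommonKey(String formatKey) {
        return ALIASES.getOrDefault(formatKey, formatKey);
    }

    // Rewrites every key of a raw metadata map onto its common key.
    static Map<String, String> normalize(Map<String, String> raw) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : raw.entrySet()) {
            out.put(toCommonKey(e.getKey()), e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        // Two imaginary formats that name the same concept differently.
        Map<String, String> formatA = new HashMap<>();
        formatA.put("Created On", "2015-01-01");
        Map<String, String> formatB = new HashMap<>();
        formatB.put("First Created At", "2015-01-01");

        // The caller fetches the same key regardless of the source format.
        System.out.println(normalize(formatA).get("dc:created"));
        System.out.println(normalize(formatB).get("dc:created"));
    }
}
```

Keys with no known alias pass through unchanged, so format-specific
metadata that has no common equivalent isn't lost.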
Nick