On Sat, 28 Feb 2015, Gautham Shankar wrote:
I have updated my progress at the link below.

https://wiki.apache.org/tika/TikaGeographicInformationParser

I would like you to comment on the key names that I have come up with for the
custom metadata; these could certainly be shortened.

Ideally, we try not to invent our own metadata keys, and instead reuse definitions and standards from elsewhere. We also try to map format-specific keys onto general ones, to keep things consistent across file types.

From a quick glance, it looks like a few of the metadata entries you have
could be mapped onto existing keys, and a few others could be mapped onto new metadata properties taken from external standards.

It might also be worth looking at some of the other scientific formats to see whether any commonality can be found with those, or whether their keys can be changed to match. Where a concept is the same, the different formats should use the same metadata key.

(For example, as a user you don't need to know whether a file format calls it Created On, Created At, First Created At, Created, or anything else. You just fetch dc:created and it means the same thing across all formats, and you can look up the Dublin Core specification if you want to check its exact semantics.)
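
To make the idea concrete, here is a minimal sketch of mapping format-specific names onto one shared key. It uses only the Java standard library and is not Tika's actual API: the class MetadataKeyMapper, its normalize method, and the alias table are hypothetical, and the key string "dc:created" simply follows the name used above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Tika.
final class MetadataKeyMapper {

    // Format-specific names (taken from the example above) mapped onto one shared key.
    private static final Map<String, String> ALIASES = Map.of(
            "Created On", "dc:created",
            "Created At", "dc:created",
            "First Created At", "dc:created",
            "Created", "dc:created");

    // Returns a copy of the metadata with known aliases renamed to the shared key.
    // Keys with no alias are passed through unchanged; if two input keys map to
    // the same shared key, the first one encountered wins.
    static Map<String, String> normalize(Map<String, String> raw) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : raw.entrySet()) {
            String key = ALIASES.getOrDefault(entry.getKey(), entry.getKey());
            out.putIfAbsent(key, entry.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        // Two imaginary formats that name the same concept differently.
        Map<String, String> formatA = normalize(Map.of("Created On", "2015-02-28"));
        Map<String, String> formatB = normalize(Map.of("First Created At", "2015-02-28"));

        // Callers read the same key regardless of the source format.
        System.out.println(formatA.get("dc:created"));
        System.out.println(formatB.get("dc:created"));
    }
}
```

The point of the sketch is only that the renaming happens once, inside the parser, so that code consuming the metadata never has to know which format the file came from.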

Nick
