Hi Perian,

Here are my two bits about metadata -- I'm not as terribly informed as I ought to be, but hopefully that means the explanation comes out simpler as a result.
First off, metadata isn't a term that originated in the cultural heritage community, so its usage within the library, museum and archives communities is almost grafted in, and misunderstandings abound. In fact, it comes from the database science community (I think Bo Sundgren coined the term but I'm not entirely certain at this point), where it makes the most sense. Simply put, it is data about data which, as explanations go, does not explain a lot. However, within the context of databases, it does make a lot of sense. Since databases are really just big tables of information, what you can do with a specific row and column is determined by what you know of that particular row or column. Consider a column in a database called "registration date". If a computer is programmed to recognize any value within that column as a piece of text, then there's not a lot you can do with it. But if that same value is a number, suddenly you can do a lot more. For instance, you could determine whether a value fell within a range of values. So identifying that a column in a database is a column of integer values is an example of metadata.

In the cultural heritage community, the term has been applied to a number of areas, some of which existed there long before the word "metadata" did. Take the library community for example -- most library people simply assume that the majority of metadata applied in this new era of digital objects is what already existed as cataloging. The traditional library catalogue entry would not contain information about who owned the book prior to the library, however -- even if the book came from special collections. Museums, on the other hand, would have great interest in the object's provenance. The result is that each domain typically assumes that what it considers important about an object represents the entirety of what should be known about an object.
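Going back to the "registration date" column for a moment, here is a rough sketch in Python of why knowing the column's type matters. Everything in it is made up for illustration: the sample values (dates written as YYYYMMDD), the column_metadata dictionary and the in_range function are assumptions of mine, not how any particular database does it.

```python
# Hypothetical column values: registration dates written as YYYYMMDD.
raw_values = ["20070315", "20060928", "20080102"]

# The metadata: what we know about the column, kept apart from the values.
column_metadata = {"name": "registration date", "type": "integer"}


def in_range(values, metadata, low, high):
    """Return the values between low and high (inclusive).

    Only allowed when the metadata says the column holds integers.
    """
    if metadata["type"] != "integer":
        raise TypeError("range queries need a numeric column")
    numbers = [int(v) for v in values]
    return [n for n in numbers if low <= n <= high]


registered_in_2007 = in_range(raw_values, column_metadata, 20070101, 20071231)
print(registered_in_2007)
```

If the metadata instead said the column was "text", the same call would be refused, even though the stored values had not changed at all. That is the sense in which the type declaration is data about the data.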
In the connected world, this view is changing, hence the use of "metadata" as a catch-all term covering every piece of data that describes a digital object, regardless of the domain it comes from. Broadly, an object can have information about what it is (descriptive metadata), how it's made (technical metadata), the process used to make it (administrative metadata), what it's made up of (structural metadata), what you need to know to protect it (preservation metadata) and how to use it (audience metadata, behavioral metadata, intent metadata).

Within each of these categories of metadata, there are rules for the specific attributes of an object. Dublin Core, VRA, MIX and MARC are all examples of this -- they provide a list of attributes, within the context of the category of metadata, that can be applied to an object (e.g. title, creator, description). Generally, these rules do not dictate a specific format -- MARC being a notable exception, as its origins can be traced back to the era of the punch card. Standards like SGML and XML, along with specific DTDs/schemas for each of these attribute lists, provide the rules for formatting the information in a way that machines can use. There are also rules, like CCO and AACR2, for how the information within a given attribute is formatted and what should be included -- for instance, what to do with the initial article in a title, or what constitutes a title as opposed to a subtitle.

Because we'd all like to do things in the same way if possible (so that it's easier to cooperate), people will put out lists of standardized values for a given attribute so that everyone's working on the same page -- LCSH/LCC/DDC for subject headings, for instance, and things like AAT and ULAN. Finally, we want our system to talk to other people's systems, so we need a list of rules for how systems talk -- OAI, Z39.50 and SOAP are examples of how systems can bundle everything together and send it to another system.
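To make the layers above a bit more concrete, here is a small sketch in Python using the standard library's xml.etree.ElementTree. It takes three attributes from the Dublin Core element set (title, creator, subject) and writes them out as XML. The object being described, the plain "record" wrapper element and the subject value are assumptions for illustration only; in practice the subject would be pulled from a standardized list, and the wrapper would be whatever the receiving system expects.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# Hypothetical object description: the attributes come from the element set,
# the values are invented for this example.
description = {
    "title": "Portrait of a Woman",
    "creator": "Unknown",
    "subject": "portraits",  # stands in for a term from a controlled vocabulary
}

# "record" is an assumed wrapper element, not part of Dublin Core itself.
record = ET.Element("record")
for element, value in description.items():
    child = ET.SubElement(record, f"{{{DC_NS}}}{element}")
    child.text = value

# The formatting rules (XML) turn the attribute list into something
# another machine can parse.
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

The split is the point: the element set says which attributes exist, the vocabulary says which values are allowed for an attribute, and XML only says how to write the result down.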
Also of note is how the bundles sent between systems are structured -- that's the job of standards like METS, SCORM and MPEG-21 DIDL.

Tim

Perian Sully wrote:
> Hi list of smart people much more knowledgeable than me:
>
> I'm trying to wrap my brain around the technical aspects of metadata
> sharing and structures, reading though (and not entirely comprehending)
> a lot of different sources. As I am a visual, hands-on type learner, I'm
> trying to put everything I'm reading into non-technical language this
> neophyte can understand. I'm pretty sure I've got #'s 2-4 wrong, but can
> anyone help me unravel this....?
>
> 1) You have objects. You apply vocabularies to the objects in order to
> describe them. The vocabularies facilitate how your object information
> is seen by other computers. Examples of Vocabularies are: AAT, ULAN,
> Chenhall's
>
> (I understand #1 pretty well. Here's where I start to get lost...)
>
> 2) In order for the other computers to understand what you're giving
> them, the information needs to be arranged in a specific way. These are
> the element sets...? these are MARC, LOC, VRA, Dublin Core
>
> 3) Because very few institutions have "pure" collections that fit into
> one of the Vocabularies, we can use multiple Vocabularies. Do we use
> multiples of #2 as well? These are defined and plugged into the element
> sets. They are tagged as belonging to a specific Vocabulary
>
> (I think there's a middle piece in here I'm missing)
>
> 4) There is an umbrella structure, the Harvester, which can read #2 and
> serve it to the user in readable form. Examples: OAI, MARC (also fits as
> a #2), XML
>
> So as you can see, I'm dreadfully muddled. I know it's important to
> understand it, but I'm just not able to wrap my head around the various
> resources out there. I'm starting to think that Ask A Ninja is more my
> level...
>
> Help! and thanks in advance
