Having done research, and now working in a very varied metadata role, I don't 
quite understand this discussion about data that is or isn't metadata. 
Scientific data is a great example of structured data, but it can still be 
distinguished from metadata that purely describes a dataset.

However, if you have scientific research data created during the experiments, 
even if it's "operational", it's clearly part of "the" data. This doesn't mean 
there can't be metadata describing *that data*. Just because it's not glamorous 
data doesn't mean it's not essential to the scientific process. Similarly, being 
about mundane or procedural things doesn't turn data into metadata.

You're absolutely right: the contextual information is certainly part of the 
experimental outcome in this example; otherwise it would be abstract data, such 
as one might use in a textbook example.

Metadata would describe the dataset itself, not the scientific research. 
There's always a certain ambiguity involved in identifying "the data" as 
distinct from the metadata, and it's a false dichotomy to suggest metadata is 
not useful at all for the domain expert. It's contextual, and the definition is 
always at least partly based on your use case for the data and its description.

-----Original Message-----
From: Code for Libraries [mailto:[email protected]] On Behalf Of Nate Vack
Sent: 14 February 2012 14:45
To: [email protected]
Subject: Re: [CODE4LIB] Metadata

On Tue, Feb 14, 2012 at 1:22 AM, Graham Triggs <[email protected]> wrote:

> That's an interesting distinction though. Do you need all that data in 
> order to make sense of the results? You don't [necessarily] need to 
> know who conducted some research, or when they conducted it in order 
> to analyse and make sense of the data. In the context of having the 
> data, this other information becomes irrelevant in terms of 
> understanding what that data says.

It is *essential* to understanding what the data says. Perhaps you find out 
your sensor was on the fritz during a time period -- you need to be able to 
know what datasets are suspect. Maybe the blood pressure effect you're looking 
at is mediated by circadian rhythms, and hence, times of day.

Not all of your data is necessary in every analysis, but a bunch of blood 
pressure measurements in the absence of contextual information is universally 
useless.

The metadata is part of the data.

-n
