We have been discussing here the use of the Provenance definition for detailing the source of data used to build datasets.
Currently Provenance provides information on the author, the function creation date, and links to reference and modification definitions. We have been pondering the best technique for detailing where in a reference the data was sourced from (the page, paragraph, figure or table of the reference document). One consideration was to include this information as a standard XML comment; however, the information would then only be available to someone visually reading the dataset file. Another consideration was to add attribute tags to Provenance to allow <page number>, <paragraph number>, <figure number>, <table number>, etc. to be defined explicitly, although this adds many tags and overcomplicates the Provenance definition. The other option discussed was to include a description attribute to store the additional information.
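To make the comparison concrete, here is a rough Python sketch of how the three options might look to software that reads the file. All element and attribute names in it (documentRef with page/table fields, a description child) are assumptions made up for illustration, not taken from the DTD. It relies on the fact that Python's default ElementTree parser discards XML comments.

```python
import xml.etree.ElementTree as ET

# All element and attribute names below are illustrative assumptions,
# not the actual DAVE-ML DTD.

# Option 1: source location recorded as an XML comment.
AS_COMMENT = """<provenance>
  <!-- page 42, table 3.1 -->
  <documentRef docID="REF01"/>
</provenance>"""

# Option 2: source location recorded in explicit (hypothetical) fields.
AS_FIELDS = """<provenance>
  <documentRef docID="REF01" page="42" table="3.1"/>
</provenance>"""

# Option 3: source location recorded in a free-text (hypothetical) description.
AS_DESCRIPTION = """<provenance>
  <documentRef docID="REF01"/>
  <description>page 42, table 3.1</description>
</provenance>"""


def source_location(xml_text):
    """Return the location detail visible to a default parser, else None."""
    root = ET.fromstring(xml_text)  # the default parser drops comments
    ref = root.find("documentRef")
    # Anything on documentRef other than the document ID is location detail.
    fields = {k: v for k, v in ref.attrib.items() if k != "docID"}
    if fields:
        return fields
    # Fall back to the free-text description; None if it is absent.
    return root.findtext("description")


print(source_location(AS_COMMENT))
print(source_location(AS_FIELDS))
print(source_location(AS_DESCRIPTION))
```

With the comment form the function finds nothing, since the comment never reaches the parsed tree. The explicit fields come back as structured values, and the description comes back as a single string that a tool would still have to interpret.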
The Provenance definition is only associated with tabular data and function tables. We were also thinking that it would be useful to (optionally) associate provenance with variableDefs and check data. We commonly use variableDefs to store constants (wing area, span, etc.) as well as MathML expressions. We can store the information on the data source in the description attribute, but would prefer to use a technique like the Provenance definition, as it would provide a consistent way of defining source information for all data and equations included in a dataset.
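As a rough sketch of what that consistency could buy, the Python below assumes a hypothetical layout in which a variableDef may carry the same provenance child as a table definition. The element names, attribute names and values are placeholders for illustration only, not the current DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical dataset fragment: provenance under variableDef is the
# proposal being discussed, not something the current DTD is known to allow.
# Names and values are placeholders.
DATASET = """<DAVEfunc>
  <variableDef name="wingArea" varID="SWING" units="m2" initialValue="1.0">
    <provenance>
      <documentRef docID="REF01"/>
    </provenance>
  </variableDef>
  <griddedTableDef name="liftCurveTable">
    <provenance>
      <documentRef docID="REF02"/>
    </provenance>
  </griddedTableDef>
</DAVEfunc>"""


def provenance_index(xml_text):
    """Map (element tag, name) to the referenced document ID."""
    root = ET.fromstring(xml_text)
    index = {}
    for item in root:
        # The same lookup works for every element type that carries provenance.
        ref = item.find("provenance/documentRef")
        if ref is not None:
            index[(item.tag, item.get("name"))] = ref.get("docID")
    return index


for key, doc_id in provenance_index(DATASET).items():
    print(key, "->", doc_id)
```

The point of the sketch is that one lookup serves constants and tables alike, whereas source details kept in each element's description would need separate free-text handling.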
I am keen to hear others' ideas.
Aircraft Performance and Flight Dynamics
Air Vehicles Division
Defence Science and Technology Organisation
Ph: +61 (0) 3 9626 7318