The term 'convenience feature' is mentioned in the conventions document:
'The new scalar coordinate variable is a convenience feature which avoids
adding size one dimensions to variables.'
Data creators have seen the benefits of not encoding size one dimensions and
have made use of this feature; it has proved very convenient. The conventions
go on to say:
'Scalar coordinate variables have the same information content and can be
used in the same contexts as a size one coordinate variable.'
But this statement is not quite true: the ordering of dimensions is not
encoded, and the ability to link many coordinates to the same dimension is
lost. The assumption in this statement is an aspiration which I think cannot
be delivered without particularly strict limitations on the use of scalars
during encoding.
Nowhere do the conventions state that, if more than one single-valued
coordinate is related to the same degree of freedom, a dimension must be
declared for them and this relationship explicitly encoded.
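To make the lost information concrete, here is a minimal sketch in Python. It
is not taken from the conventions: the variable names (air_temperature,
forecast_period) and the dict-based model of a file are hypothetical, chosen
only for illustration. It compares a size one encoding with a scalar encoding
of the same two single-valued coordinates.

```python
from collections import defaultdict

# Hypothetical, simplified model of a file: variable name -> dimensions spanned.
# All names here are illustrative assumptions, not taken from any real file.
SIZE_ONE = {
    "air_temperature": ("time", "lat", "lon"),
    "time": ("time",),             # size one coordinate variable
    "forecast_period": ("time",),  # size one auxiliary coordinate, same dimension
    "lat": ("lat",),
    "lon": ("lon",),
}
SCALAR = {
    "air_temperature": ("lat", "lon"),
    "time": (),                    # scalar coordinate: spans no dimension
    "forecast_period": (),         # scalar coordinate: spans no dimension
    "lat": ("lat",),
    "lon": ("lon",),
}


def coordinates_by_dimension(variables, data_var):
    """Group the coordinates by the dimensions they span, skipping the data variable."""
    groups = defaultdict(list)
    for name, dims in variables.items():
        if name == data_var:
            continue
        for dim in dims:
            groups[dim].append(name)
    return {dim: sorted(names) for dim, names in groups.items()}


print(coordinates_by_dimension(SIZE_ONE, "air_temperature"))
print(coordinates_by_dimension(SCALAR, "air_temperature"))
```

In the size one encoding both time and forecast_period are grouped under the
"time" dimension, and that dimension has a position in the data variable. In
the scalar encoding neither coordinate appears in any group, so both the link
between them and the position of the dimension are absent from the file.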
Later, the case of character strings is addressed:
'If a character variable has only one dimension (the maximum length of the
string), it is regarded as a string-valued scalar coordinate variable,
analogous to a numeric scalar coordinate variable (see Section 5.7, “Scalar
Coordinate Variables”) '
This is a required feature, but the NUG only allows numeric data arrays as
Coordinate Variables, so a further entry is added in the Terminology:
'scalar coordinate variable
A scalar variable that contains coordinate data. Functionally equivalent to
either a size one coordinate variable or a size one auxiliary coordinate
variable. '
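For a reader, the two quoted passages amount to a small test on a variable's
type and dimensions. The following is a sketch only: the function name, the
"char" type label and the dimension names are hypothetical, and the check
looks at shape alone, not at whether the variable is actually referenced as a
coordinate.

```python
def is_scalar_coordinate(dtype, dims):
    """Return True if a variable has the shape of a scalar coordinate variable.

    Assumption: for a character variable the last dimension is the maximum
    string length and does not count as a data dimension.
    """
    if dtype == "char" and dims:
        dims = dims[:-1]
    return len(dims) == 0


print(is_scalar_coordinate("float64", ()))                  # numeric scalar
print(is_scalar_coordinate("char", ("strlen",)))            # string-valued scalar
print(is_scalar_coordinate("char", ("station", "strlen")))  # one string per station
```

Note that the test says nothing about which kind of size one coordinate the
scalar is equivalent to.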
These statements together provide information on how to write files, but they
give limited help with reading and interpreting files.
The conventions are not clear on how, or whether, to make a distinction for a
particular scalar coordinate: they do not say that a scalar coordinate is a
Coordinate Variable or an Auxiliary Coordinate Variable; they say it is
functionally equivalent to either one or the other.
I have read these sections to mean that by encoding a scalar coordinate the
data creator is not providing information about how the coordinate is related
to the dimensions in the file, other than to say it applies to all of the cells
currently in the file.
As such, I disagree with the statement that
'Scalar coordinate variables have the same information content and can be
used in the same contexts as a size one coordinate variable.'
In many cases this will turn out to be a valid interpretation but it is not the
only one, and this nuance is a really useful feature, which many data creators
have benefited from.
From one point of view, a third type of Coordinate exists in CF, the Scalar
Coordinate, which is neither a Coordinate Variable nor an Auxiliary
Coordinate. From another point of view, a Scalar Coordinate is an Auxiliary
Coordinate which has the potential to be an emergent Coordinate Variable, if
required and consistent for the data consumer. (I am sure there are other
useful perspectives we can consider.)
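The 'emergent Coordinate Variable' view can be sketched as an explicit step
taken by the data consumer rather than something read from the file. This is
an illustration under assumptions: promote_scalar, its arguments and the
variable names are hypothetical and not part of any CF library.

```python
def promote_scalar(data_dims, coord_dims, scalar_name, position=0):
    """Promote a scalar coordinate to a size one dimension coordinate.

    The position of the new dimension is the consumer's choice, because the
    file does not encode it. The inputs are not modified.
    """
    if coord_dims[scalar_name] != ():
        raise ValueError(f"{scalar_name!r} is not a scalar coordinate")
    new_data_dims = list(data_dims)
    new_data_dims.insert(position, scalar_name)
    new_coord_dims = dict(coord_dims)
    new_coord_dims[scalar_name] = (scalar_name,)
    return tuple(new_data_dims), new_coord_dims


# Hypothetical file content: two scalar coordinates and two dimension coordinates.
coords = {"time": (), "height": (), "lat": ("lat",), "lon": ("lon",)}
data_dims, promoted = promote_scalar(("lat", "lon"), coords, "time")
print(data_dims)
print(promoted["time"], promoted["height"])
```

Here only time is promoted, because the consumer asked for it. The height
coordinate stays a scalar, with no claim made about a degree of freedom.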
We have come across many data sets from other data creators where a considered
reading of the data suggests that they have taken an interpretation such as
this as well. No distinction has been made between scalars which represent a
degree of freedom and scalars which do not.
The scalar coordinate is a convenient feature, allowing metadata to be encoded
simply and clearly, and I feel that the conventions document should adapt to
reflect the usage some sections of the community have adopted. This usage is
not ambiguous: it provides sufficient information to work with the file, and
the data and metadata are well specified.
Indeed, when converting from other formats (such as GRIB and BUFR) to CF, it is
the logical way to encode the available metadata.
I am concerned about the implications for these data sets if the interpretation
of scalar coordinates is tightened in a future version of the conventions
document to explicitly disallow this useful and well-used point of view. I
would like to stress again Jonathan's point that all of this data is CF
compliant; the question is how consumers interpret the semantics of the data
set.
I think the utility of the scalar coordinate variable is significantly
diminished if Option B or some derivative of it is pursued for the next version
of the conventions. Option A keeps all of the interpretations of Option B
intact, but requires care on loading and interpretation so as not to read too
much information into any scalar coordinates present.
_______________________________________________
CF-metadata mailing list
[email protected]
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata