Re: [CF Metadata] #104: Clarify the interpretation of scalar coordinate variables

John Caron Tue, 20 Aug 2013 16:51:44 -0700

This message came from the CF Trac system.  Do not reply.  Instead, enter your 
comments in the CF Trac system at https://cf-pcmdi.llnl.gov/trac/.


#104: Clarify the interpretation of scalar coordinate variables
-----------------------------+----------------------------------------------
  Reporter:  jonathan        |       Owner:  [email protected]
      Type:  enhancement     |      Status:  new                          
  Priority:  medium          |   Milestone:                               
 Component:  cf-conventions  |     Version:                               
Resolution:                  |    Keywords:                               
-----------------------------+----------------------------------------------
Comment (by caron):

 Hi Jonathan and all:

 Well, if we had to make a decision here, I would say that "coordinate
 variables" = "independent" and "auxiliary coordinates" = "dependent" would
 be the correct one. It is obviously a very simple and powerful idea.

 For all the examples I can think of, this interpretation seems reasonable
 to me. It would be good if others look at their data files and see if
 there is an important exception to this rule.

 For gridded data, I agree that aux coordinates lat(x,y) and lon(x,y) are
 best thought of as dependent on coordinate variables x(x) and y(y), even
 when x(x) and y(y) are missing.

 For point data, I agree that aux coordinates lat(time), lon(time) are best
 thought of as dependent on the coordinate variable time(time).
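
 As a rough illustration of this reading (a sketch only: the dictionary
 layout and the function name classify are made up for this message and
 are not part of CF or of any netCDF library), the role of each variable
 can be read off from its name and its dimensions. The sketch assumes
 every variable listed is a coordinate of some kind; in a real file the
 auxiliary coordinates would be named by the "coordinates" attribute.

```python
# Hypothetical, simplified picture of a file:
# variable name -> tuple of dimension names. Illustrative only.
gridded = {
    "x": ("x",),
    "y": ("y",),
    "lat": ("x", "y"),
    "lon": ("x", "y"),
}
point = {
    "time": ("time",),
    "lat": ("time",),
    "lon": ("time",),
}


def classify(name, dims):
    """Return 'independent' for a coordinate variable (1D and named after
    its own dimension); anything else is treated as an auxiliary
    coordinate and therefore 'dependent'."""
    if dims == (name,):
        return "independent"
    return "dependent"


gridded_roles = {name: classify(name, dims) for name, dims in gridded.items()}
point_roles = {name: classify(name, dims) for name, dims in point.items()}
print(gridded_roles)
print(point_roles)
```

 Here lat and lon come out as dependent in both cases, and x, y and time
 as independent.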

 However, under this interpretation, it seems that a scalar auxiliary
 coordinate should be thought of as dependent, not independent. For
 example, if "lat(time)" is actually constant, so one uses a scalar
 auxiliary coordinate "lat" in its place, it seems that it is still a
 dependent variable, and should not be thought of as adding another degree
 of freedom to the data, i.e. with some implicit dimension lat(1).

 Conversely, if you want to indicate that the domain has another
 independent coordinate, you should dimension your data with it, even if
 that dimension happens to only be of length 1, and you should add a
 coordinate variable of length 1. A good example I run across a lot in GRIB
 data is the vertical coordinate. If a variable could have multiple
 vertical levels, say on pressure levels, I will add a vertical dimension
 even if there's only one such level in the file. But for things like
 "Temperature at the tropopause" I won't add a vertical dimension.
 Obviously one could do it differently, but I think that's reasonable.

 This interpretation has the advantage that one can add as many auxiliary
 coordinates as one needs, without increasing the dimensionality of the
 domain. I think that's the essence of what it means to have n independent
 coordinates.
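
 To make the counting concrete, here is a small sketch under the
 interpretation proposed above. The variable names, the two example
 encodings and the helper independent_coordinates are all hypothetical
 and only model the idea; they do not use a real netCDF API.

```python
def independent_coordinates(data_dims, variables):
    """List the dimensions of the data variable that have a coordinate
    variable, i.e. a 1D variable with the same name as the dimension."""
    return [d for d in data_dims if variables.get(d) == (d,)]


# Encoding A: data(time) with scalar auxiliary coordinates lat and lon.
# A scalar variable has no dimensions, written here as an empty tuple.
a_data_dims = ("time",)
a_vars = {"time": ("time",), "lat": (), "lon": ()}

# Encoding B: data(time, lev) with a length-1 coordinate variable lev(lev).
# The length of lev does not matter for the count, only its presence.
b_data_dims = ("time", "lev")
b_vars = {"time": ("time",), "lev": ("lev",)}

n_a = len(independent_coordinates(a_data_dims, a_vars))
n_b = len(independent_coordinates(b_data_dims, b_vars))
print(n_a, n_b)
```

 Encoding A has one independent coordinate however many scalar auxiliary
 coordinates are attached, while encoding B has two, even though lev has
 only one value.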

 What do you think?

 John





 Replying to [comment:45 jonathan]:
 > Dear John
 >
 > Yes, we agree, this whole ticket is really about clarifying the data
 model. The clarifications being discussed are concerned with how CF-netCDF
 metadata are interpreted, and do not affect the legality of CF-netCDF
 files, although they do imply that some ways are better than others for
 encoding a given dataset. The part you quote last is the crucial issue in
 this debate.
 >
 > I think you have made a good point, and thanks for that. In the
 [https://cf-pcmdi.llnl.gov/trac/ticket/95#comment:52 draft data model as it
 stands], we distinguish dimension (Unidata, COARDS) coordinates and
 auxiliary coordinates purely on formal grounds (uniqueness, monotonicity,
 dimensionality, data type). That's because we based the data model on the
 CF-netCDF convention, of course. But I think you have correctly identified
 the conceptual distinction, and we should put that in the data model
 document as well, namely that the dimension coordinate variables are
 independent, and the auxiliary coordinate variables are dependent. It's
 because the dimension coordinates are independent that they must be unique
 and one-dimensional. Those formal properties don't help with a scalar
 coordinate, as you say, but the idea that it's an independent variable is
 still valid.
 >
 > In some situations, a CF-netCDF file might have auxiliary coordinate
 variables of dimensions which do not have dimension coordinate variables.
 One situation is an axis for an unordered collection, such as an ensemble
 axis. In that case, I suppose the index along the ensemble dimension is
 the independent variable, in a sense, but that is not useful information
 since the ordering is arbitrary, and there's no need for explicit
 independent coordinates. In the case you mentioned earlier, of 2D lat and
 lon auxiliary coordinate variables if the 1D projection coordinates are
 not given, I would say that the auxiliary coordinates are still dependent
 on the projection coordinates. Even though the latter are absent, there
 are formulae which define the relationship.
 >
 > Will this ticket be OK, do you think, if we add some text to insert a
 couple of sentences in the CF-netCDF standard, when the two kinds of
 coordinate variable are introduced, to point out their distinction in role
 of independence/dependence? If so, I'll draft some extra text for this
 ticket. Do you think we really need to ''define'' what independence and
 dependence mean in the CF-netCDF standard, or can we assume that people
 will understand them in their usual mathematical sense?
 >
 > Cheers
 >
 > Jonathan

-- 
Ticket URL: <https://cf-pcmdi.llnl.gov/trac/ticket/104#comment:46>
CF Metadata <http://cf-pcmdi.llnl.gov/>

