Dear Mark,

We still differ in our interpretation of scalar coordinates. Maybe this is irreconcilable, but I am reluctant to reach that conclusion. It is hard to settle because, as we have agreed, it has no implication at all for the netCDF file: it is purely a matter of interpretation, so the file format does not constrain us. The constraints come from what you do with the data, and from the design you would like for data analysis software.
Our difference is about whether scalar coordinate variables represent a CF construct of their own, or whether they are a convenient way of representing size-one dimension coordinate constructs and size-one string-valued auxiliary coordinate constructs, which are normally stored respectively as (Unidata convention) coordinate variables and (CF) string-valued auxiliary coordinate variables. As discussed, I don't recall an intention, when scalar coordinates were introduced, that they would be a new kind of thing; they were intended to be a handy way of encoding an existing kind of thing, and that is what the words in the standard document still suggest to me. Therefore I think we need only two concepts, and scalar coordinates are not a third concept.

I argue for this because it's simpler (applying Occam's razor, if you like). We need only those two concepts to describe coordinate variables in CF-netCDF files. That means you have to decide which kind a scalar coordinate variable is representing. I think the convention (up to CF 1.5 - see DSG email) implies that if it's a numeric scalar, it's a size-one dimension coordinate construct, and if it's a single string, it's a size-one auxiliary coordinate construct.

You are concerned that this restricts your flexibility. I see why you say that, of course, but practically speaking I don't think it does. In my view you can't leave it undecided what a scalar coordinate means, but nothing prevents you from following a different interpretation from the one above when you read the file. That may mean you have adopted a different view from the person who created the file, but if that person did not *want* to make it clear what sort it was (which is what you suggest), then surely he or she does not mind which interpretation you adopt.
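To make the default reading concrete, here is a minimal Python sketch. The function name, the `override` argument and the returned labels are all hypothetical, invented for illustration; they are not part of CF or of any library. The sketch only assumes the rule described above: a numeric scalar is read as a size-one dimension coordinate construct, a single string as a size-one auxiliary coordinate construct, and the reader may choose otherwise.

```python
from numbers import Number

# Hypothetical labels for the two concepts discussed above.
DIMENSION = "size-one dimension coordinate construct"
AUXILIARY = "size-one auxiliary coordinate construct"


def interpret_scalar(value, override=None):
    """Return the construct a scalar coordinate value is taken to represent.

    `override` lets the reader of the file adopt a different interpretation
    from the default one (an assumption of this sketch, not a CF rule).
    """
    if override is not None:
        return override
    if isinstance(value, str):
        # A single string: size-one auxiliary coordinate.
        return AUXILIARY
    if isinstance(value, Number):
        # A numeric scalar: size-one dimension coordinate.
        return DIMENSION
    raise TypeError(f"unsupported scalar coordinate value: {value!r}")


print(interpret_scalar(500.0))
print(interpret_scalar("historical"))
print(interpret_scalar(500.0, override=AUXILIARY))
```

The point of the `override` argument is only that a decision is always made; it need not be the same decision as the writer's.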
You can also read the file with one interpretation, and then change your mind when it's in memory, by converting dimension coordinate constructs to auxiliaries or vice-versa, creating or dropping size-one dimensions. This is all easy to do in memory. The data is completely unaffected, apart from possibly being reshaped by the insertion or removal of size-one dimensions. It just shuffles the metadata around.

The interpretation matters when you come to aggregate different variables, within a file or from different files. For instance, you can't aggregate two variables that have different scalar values of both experiment_id and ensemble_member_number unless you decide that these are both auxiliary coordinate variables of the same (omitted) size-one dimension. On the other hand, if you have four data variables, showing all the possible combinations of two scalars,

  (experiment_id 1, ensemble_member_number 1)
  (experiment_id 1, ensemble_member_number 2)
  (experiment_id 2, ensemble_member_number 1)
  (experiment_id 2, ensemble_member_number 2)

you may wish to aggregate them with two size-2 dimensions. Thus, you may need to change the interpretation of scalars, in order to enable the aggregation or determine how it's done. But getting that flexibility doesn't require that you were initially undecided about what the scalars mean. It only requires that you be able to change your mind about what they mean, which is easy to do, as described above.

Aggregation of data variables is *not* part of the CF standard (at the moment). Therefore what you do about this lies in the realm of software design, which can adopt its own rules, and which the writer of the data can't influence. Once the data is in memory, you can do what you like.

You might wish to be flexible about the interpretation of multi-valued coordinates, not just scalars.
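The two aggregation cases for scalar coordinates can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the dictionary layout of a "field", the function names, the axis name "index" and the placeholder data values are all hypothetical, and real software would hold arrays rather than single numbers.

```python
def aggregate_along_one_axis(fields, names):
    """Read the named scalars as auxiliary coordinates of one shared
    size-one dimension, then concatenate the fields along that dimension."""
    return {
        "dims": ["index"],  # hypothetical name for the shared dimension
        "aux": {n: [f["scalars"][n] for f in fields] for n in names},
        "data": [f["data"] for f in fields],
    }


def aggregate_on_grid(fields, names):
    """Read each of the two named scalars as a size-one dimension coordinate,
    then arrange the fields on the grid spanned by their distinct values."""
    first, second = names
    axis0 = sorted({f["scalars"][first] for f in fields})
    axis1 = sorted({f["scalars"][second] for f in fields})
    lookup = {(f["scalars"][first], f["scalars"][second]): f["data"]
              for f in fields}
    # Every combination must occur exactly once, otherwise there is no grid.
    if len(lookup) != len(fields) or len(lookup) != len(axis0) * len(axis1):
        raise ValueError("fields do not fill a complete grid")
    return {
        "dims": [first, second],
        "coords": {first: axis0, second: axis1},
        "data": [[lookup[(a, b)] for b in axis1] for a in axis0],
    }


NAMES = ("experiment_id", "ensemble_member_number")

# Four fields with placeholder data, one per combination of the two scalars.
fields = [
    {"scalars": {"experiment_id": e, "ensemble_member_number": m},
     "data": 10.0 * e + m}
    for e in (1, 2) for m in (1, 2)
]

grid = aggregate_on_grid(fields, NAMES)                        # two size-2 dimensions
line = aggregate_along_one_axis([fields[0], fields[3]], NAMES)  # one size-2 dimension
```

The same stored scalars feed both functions; only the interpretation chosen at aggregation time differs. The two fields passed to `aggregate_along_one_axis` differ in both scalars, so they cannot fill a grid and `aggregate_on_grid` would reject them.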
With multi-valued coordinates, for instance, you might want to combine two data variables (lat,lon) that have different lat and lon coordinate variables and dimensions, by flattening them both, so that lat and lon become auxiliary coordinates of a discrete (index) axis. That's not how the data was written, but so what? It might be convenient for the use of the data. It doesn't imply that the coordinate variables had no specific meaning in the first place.

Best wishes

Jonathan
_______________________________________________
CF-metadata mailing list
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
