#147: clarification of standard and correction of conformance doc: formula_terms
-----------------------------+------------------------------
Reporter: taylor13 | Owner: cf-conventions@…
Type: defect | Status: new
Priority: high | Milestone:
Component: cf-conventions | Version:
Resolution: | Keywords:
-----------------------------+------------------------------
Comment (by taylor13):
Dear Jonathan and all,
As you know, we are about to write petabytes of hopefully CF-compliant
CMIP6 data. There is an urgent need to agree on how to proceed on this
ticket. If possible, I would like to squeeze it into CF 1.7.
To summarize this ticket: Data stored on model levels for CMIP5 was non-
compliant with the standard because formula_terms was attached to the
variable providing bounds for the vertical parametric coordinate, and
currently CF forbids this. (A formula_terms attribute can only be attached
to a coordinate variable.) We plan to include formula_terms for bounds in
CMIP6 too, so it will also be non-compliant unless we change the standard.
I proposed that formula_terms should be allowed to be attached to
variables containing bounds of coordinates as well as being attached to
variables containing the coordinates themselves.
You thought that was a good idea, but also wanted to go further and allow
the bounds to be attached to the (parameter) variables pointed to by
formula_terms, even though these variables cannot in general be considered
coordinates. You thought this was “implicitly allowed by section 7.1”.
But that section is introduced with:
“To represent cells we add the attribute bounds to the appropriate
coordinate variable(s). The value of bounds is the name of the variable
that contains the vertices of the cell boundaries. We refer to this type
of variable as a "boundary variable." A boundary variable will have one
more dimension than its associated coordinate or auxiliary coordinate
variable.”
This would seem to explicitly rule out use of bounds with any variable
other than a coordinate variable (and the parameters appearing in
formula_terms aren’t generally coordinate variables).
So, I think that however we record the values of the parameters needed to
convert the bounds of a parametric coordinate to a vertical location in
physical space, we will have to modify the current convention.
You have also argued that “it's useful to have a direct link from the
formula terms to its bounds for some calculations, because otherwise you
have to search the file to find it.” Earlier you expanded on this:
“… I think it's useful. Although "a" [one of the parameters needed to
define hybrid sigma coordinates] is not a coordinate, you might wish to do
coordinate-like things with it. If I give "a" its bounds, it makes it
self-contained, which I feel is a naturally CF way to go. For instance, I
could hand the varid of "a" to a subroutine with the request to compute
the width of the intervals in "a", in just the same way as I might do with
eta in your example, or with sigma in the previous example. Under your
scheme, the subroutine won't be able to process "a", however, because
there is no pointer from "a" to eta, without which you can't find the
bounds of "a".”
You say you might want to compute the width of “a”, but I can’t think of
any reason to do that. (I noted earlier that the so-called “width” can
turn out to be 0 for some parameters.) I can’t think of any use for operating
on the values of parameters at cell bounds other than to compute the
position in space of the vertical coordinate. I would note that both eta
and sigma are actual parametric coordinates, and it clearly is sometimes
useful to compute the width of coordinate cells.
In any case, I can see no added convenience in attaching a bounds
attribute to the parameters themselves, rather than attaching
formula_terms to the variable containing the bounds of the coordinate
variable. When you come across a variable that is a function of a
parametric vertical coordinate, you would presumably look at
formula_terms to determine which “containers” need to be defined.
defining. At that time you could note whether or not there were bounds
defined for that parametric vertical coordinate, and if there were, you
could easily extract and associate the variables containing the parameter
values at the bounds of the vertical coordinate with the variables
containing the parameter values at the coordinate nodes. This would be
quite straightforward, I should think.
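
As a minimal sketch of that lookup, here is some Python in which the file
metadata is mocked up as plain dictionaries rather than read through any
real netCDF library. The variable names (lev, lev_bnds, a_bnds, b_bnds) and
both helper functions are hypothetical, and the layout assumes
formula_terms is allowed on the bounds variable, as I proposed above.

```python
# Hypothetical metadata for a hybrid sigma-pressure coordinate.
# All variable names here are illustrative, not taken from a real file.
variables = {
    "lev": {
        "standard_name": "atmosphere_hybrid_sigma_pressure_coordinate",
        "formula_terms": "a: a b: b p0: p0 ps: ps",
        "bounds": "lev_bnds",
    },
    "lev_bnds": {
        # Assumed layout: formula_terms attached to the bounds variable.
        "formula_terms": "a: a_bnds b: b_bnds p0: p0 ps: ps",
    },
}


def parse_formula_terms(text):
    """Turn 'a: a b: b' into {'a': 'a', 'b': 'b'}."""
    tokens = text.split()
    # Tokens alternate between 'term:' and the name of the variable holding it.
    return {tokens[i].rstrip(":"): tokens[i + 1] for i in range(0, len(tokens), 2)}


def pair_terms_with_bounds(variables, coord_name):
    """Map each term to (variable at nodes, variable at cell bounds or None)."""
    coord = variables[coord_name]
    node_terms = parse_formula_terms(coord["formula_terms"])
    bounds_terms = {}
    bounds_name = coord.get("bounds")
    if bounds_name is not None:
        bounds_attrs = variables.get(bounds_name, {})
        if "formula_terms" in bounds_attrs:
            bounds_terms = parse_formula_terms(bounds_attrs["formula_terms"])
    return {term: (var, bounds_terms.get(term)) for term, var in node_terms.items()}


pairs = pair_terms_with_bounds(variables, "lev")
print(pairs["a"])  # ('a', 'a_bnds')
```

Nothing here needs a pointer from “a” to its values at the cell bounds: the
association falls out of reading the two formula_terms strings side by side.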
Note that I don’t think we can interpret the “value of the parameter at
the bounds of a parametric vertical coordinate’s grid cell” as the “bounds
of the parameter” because cells can’t intrinsically be defined by the
parameter. The cells are defined by the parametric vertical coordinate
(which therefore has bounds). Like other variables (e.g., temperature,
humidity, etc.) that can be defined both at the coordinate locations and
the cell bounds, the parameter values can be defined at both places. But
the cell (along with its bounds) is defined by the coordinate, not the
variables that are a function of that coordinate. Do you agree?
The reason I have been so forceful in arguing against your position is
that I think it requires us to redefine what we’ve meant by a “cell”. Up
to now, a cell has been defined by the bounds attached to a variable used
as a coordinate variable. This meant that a grid cell bound would always
have a value between the coordinate values of the two cells it separated.
The concept of a physical cell (like the intervals on the number line
taught to us in elementary school) is easy to grasp. If we modify this simple
concept and allow bounds for the parameters associated with parametric
vertical coordinates, I think we make it much harder for novices to
understand what we’re talking about. How can the bounds defining
contiguous cells in 1 dimension not be monotonic? That is what would be
required if we allowed bounds to be attached to parameters rather than
limiting their use to coordinates.
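
To make the monotonicity point concrete, here is a small Python sketch
using the CF formula for the hybrid sigma-pressure coordinate,
p = a*p0 + b*ps. The interface values, p0, ps, and the helper function are
all made up for illustration and are not taken from any model.

```python
# Made-up values at five cell interfaces, ordered from the surface upward.
a_at_bounds = [0.0, 0.05, 0.15, 0.15, 0.05]
b_at_bounds = [1.0, 0.85, 0.50, 0.20, 0.00]
p0 = 100000.0  # reference pressure in Pa (assumed)
ps = 100000.0  # surface pressure in Pa (assumed)


def is_strictly_monotonic(values):
    """True if values are strictly increasing or strictly decreasing."""
    diffs = [hi - lo for lo, hi in zip(values, values[1:])]
    return all(d > 0 for d in diffs) or all(d < 0 for d in diffs)


# Pressure at each interface: this is what actually locates the cell bounds.
p_at_bounds = [a * p0 + b * ps for a, b in zip(a_at_bounds, b_at_bounds)]

print(is_strictly_monotonic(p_at_bounds))  # True
print(is_strictly_monotonic(a_at_bounds))  # False

# The "width" of the third cell measured in "a" alone is zero.
print(a_at_bounds[3] - a_at_bounds[2] == 0.0)  # True
```

With these numbers the pressure at the interfaces decreases steadily with
height, while “a” rises, stalls, and falls again, so treating the values of
“a” as cell bounds would give non-monotonic “bounds” and a zero-width cell.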
I guess if you still don’t see why I’m so opposed to allowing both
options, and there are no other opinions expressed, we have two choices:
1) We allow both options
2) We remain unable to reach consensus, and CMIP6, like CMIP5, will
produce non-CF-compliant files
I anxiously await your thoughts.
best wishes,
Karl
--
Ticket URL: <https://cf-trac.llnl.gov/trac/ticket/147#comment:18>
CF Metadata <http://cf-convention.github.io/>