Re: [CF-metadata] [cf-convention/cf-conventions] Reference UGRID conventions in CF (#153)

David Hassell Tue, 27 Jul 2021 12:18:16 -0700

Hi @ChrisBarker-NOAA and @pp-mo,

I have been away and am just catching up with the conversation. Thank you for 
an interesting read!


> From the data model perspective, there needs to be SOME way to define the 
> connectivity. how it's done is a matter of the "encoding", yes?

Absolutely.

Patrick's description of symmetry 
(https://github.com/cf-convention/cf-conventions/issues/153#issuecomment-882630042) 
is right - it is symmetric in the square matrix sense. This is indeed not at 
all how UGRID actually encodes this information, but the CF data model is 
independent of the encoding, and the symmetric matrix is logically what is 
going on here. Whilst there is no expectation that anyone should encode it in 
this manner, it is tempting (to me!) because the square connectivity matrix can 
be easily updated in subspacing operations.

It's a good point about what to do on this matrix's diagonal - I think that the 
option of these values having 'no meaning regardless of value' is sufficient 
for the data model. Whether you use booleans, integers, strings, etc. to denote 
connected/not connected is entirely an encoding choice and has no impact on the 
data model.
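
To make the square-matrix view above concrete, here is a minimal sketch in plain Python. The four-cell mesh, the boolean encoding and the function names are all assumptions made for illustration; nothing here is how UGRID or CF-netCDF actually stores connectivity.

```python
# Hypothetical 4-cell mesh in which cells 0-1, 1-2 and 2-3 are connected.
# Booleans are an arbitrary encoding choice for this sketch.
connectivity = [
    [False, True, False, False],
    [True, False, True, False],
    [False, True, False, True],
    [False, False, True, False],
]


def is_symmetric(matrix):
    """Return True if matrix[i][j] == matrix[j][i] for every off-diagonal pair."""
    n = len(matrix)
    return all(
        matrix[i][j] == matrix[j][i] for i in range(n) for j in range(i + 1, n)
    )


def subspace(matrix, keep):
    """Keep only the rows and columns of the cells listed in `keep`."""
    return [[matrix[i][j] for j in keep] for i in keep]


def neighbours(matrix, i):
    """Cells connected to cell i; the diagonal entry is skipped whatever its value."""
    return [j for j, connected in enumerate(matrix[i]) if connected and j != i]


# Subspacing to cells 1, 2 and 3 is just a row-and-column selection.
sub = subspace(connectivity, [1, 2, 3])
```

The subspaced matrix is still square and symmetric, and `neighbours` never reads the diagonal, so whatever is stored there has no effect on the result.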


>  we really do need to capture the concept that the mesh is not unrelated 
> pieces, it is one thing -- you can't really know what any of the data fully 
> means without the full mesh description.

> (2) likewise, the CF description has no place for the cross-location 
> connectivities (like face-edge-connectivity)

The CF model does connect (e.g.) faces with edges and nodes, but in a 
different manner to UGRID. In CF, a "cell" is typically defined as the "space" 
enclosed by bounds, and the edges of the cell are the connections between 
adjacent cell bounds. This space may be 0-d, in which case it is just a "node"; 
or 1-d, in which case it is an "edge" connecting two "nodes"; or 2-d, in which 
case it is a "face" defined by "edges" and "nodes"; (etc, but we don't 
generalise to 3-d and beyond, yet). The nodes (i.e. bounds) and edges do not 
have an independent existence - they are elements of the CF cell definition. 
The new domain topology construct makes explicit the cell connectivities.
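
As a rough illustration of the cell definition above, the following sketch treats each 2-d cell as an ordered list of bounds (node indices), takes its edges to be the connections between adjacent bounds, and derives a cell connectivity matrix from them. It assumes that two faces count as "connected" when they share an edge; that rule, the example mesh and the function names are assumptions of this sketch, not something fixed by the data model text.

```python
def face_edges(face):
    """Edges of a face as unordered node pairs joining adjacent bounds."""
    n = len(face)
    return {frozenset((face[k], face[(k + 1) % n])) for k in range(n)}


def face_connectivity(faces):
    """Square matrix whose [i][j] entry is True if faces i and j share an edge."""
    edges = [face_edges(face) for face in faces]
    n = len(faces)
    return [
        [i != j and bool(edges[i] & edges[j]) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical mesh: two triangles sharing the edge between nodes 1 and 2,
# plus a third triangle that touches neither of them.
faces = [[0, 1, 2], [1, 3, 2], [4, 5, 6]]
matrix = face_connectivity(faces)
```

Note that the edges and nodes never appear as objects in their own right: they are only ever derived from the bounds of each cell.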


> (1) Firstly, a UGRID mesh with multiple locations relates to multiple 
> domain-axes in CF
> That can "work", because any data-variable can only reference one of them, as 
> described:
> 
> >     we have no use case for two or more topology constructs, each of which 
> > applies to a single unique domain axis, and in fact we have no way of 
> > encoding it, so that case should indeed be excluded.
> 
> So, the CF 'topology construct' can only be attached to a single domain axis 
> of a domain.
> That means that UGRID data which maps to a different mesh location is 
> modelled as belonging to a separate, independent "domain".
> This seems okay for now, but it means of course that the CF description has 
> no concept of the intercoupled nature of the different locations.

Yes. A CF field construct contains a domain that is limited to describing just 
the parent field's data. Be careful, though, not to confuse "domain axis" and 
"domain". A "domain axis" is essentially a dimension of the domain. We restrict 
the new domain topology constructs to apply to a single domain axis simply 
because there is no current way of encoding a domain topology construct that 
applies to multiple domain axes. UGRID only describes a mesh with a 1-d 
discrete axis. If this ever changed, it would be described by a simple 
generalisation of the data model text. The CF data model mustn't provide 
capabilities that are not allowed by the CF conventions.

If there are no data variables in a file - i.e. just the mesh is stored in a 
dataset - then the entire mesh definition is captured by a CF domain 
construct. However, if one or more data variables are defined on the mesh, 
then their CF domains are only allowed to contain those elements of the mesh 
that are in use. For example, if a temperature variable is stored on faces and 
a U-wind is stored at nodes, then the domain of the former will include the 
UGRID faces, edges and nodes, but the domain of the latter will only know about 
the nodes. In both cases, a connectivity matrix will retain the required 
connectivities.
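
The temperature and U-wind example can be sketched as a simple lookup from a data variable's mesh location to the mesh elements its domain knows about. The table below is an assumption that follows the cell description earlier in this message (a face is defined by edges and nodes, an edge by nodes); the variable names and the dictionary layout are purely illustrative.

```python
# Mesh elements implied by a cell at each location (assumed for this sketch).
ELEMENTS_BY_LOCATION = {
    "node": ("node",),
    "edge": ("edge", "node"),
    "face": ("face", "edge", "node"),
}


def domain_elements(location):
    """Mesh elements that the domain of a variable at `location` includes."""
    return ELEMENTS_BY_LOCATION[location]


# Hypothetical data variables and the mesh locations they are stored on.
data_variables = {"temperature": "face", "u_wind": "node"}

# Each data variable gets its own domain, limited to what it uses.
domains = {
    name: domain_elements(location)
    for name, location in data_variables.items()
}
```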

I just wrote _"If there are no data variables in a file - i.e. just the mesh is 
stored in a dataset - then the entire mesh definition is captured by a CF 
domain construct."_, however I realise that that's not necessarily the case if 
there are edge coordinates as part of a mesh with faces. In this case, the edge 
coordinates can only be represented by the CF data model in a second domain 
that comprises 1-d cells defining each edge and how it's connected to others. I 
can't decide right now if this is a problem for the data model (i.e. should 
edge coordinates be a new feature of coordinate constructs?). The only issue I 
might have is that the "round trip" of reading a mesh variable into a CF domain 
construct and then writing it back to disk might not give an exactly comparable 
result to what you started with, but that isn't a promise of the data model, so 
perhaps not an issue? @JonathanGregory, it would be interesting to know your 
thoughts on this.
 
Neither CF nor UGRID promises that multiple data variables defined by the same 
mesh variable are in some way combinable. That is a decision 
or assumption that has to be made by the user. Therefore, separate domain 
definitions for each data variable are an appropriate view for the CF data 
model. This becomes clear when you consider, say, multiple datasets from the 
same model simulation - each dataset contains the same mesh, but we only know 
it is the same by inspection or by the promise of non-standardised metadata.

When formal connections between data variables are possible in CF we'll need 
some carefully thought out extensions to the data model, but that's not 
something for this discussion (fortunately!).
 


> If values on the nodes are treated as their own thing, you can plot them fine 
> -- but if you need to interpolate the values, you need to know how the faces 
> are defined.

If values are only defined at nodes, then the cells of the domain are defined 
by single points with a connectivity array that implicitly defines the edges 
and faces, so the data model is storing everything we need for interpolation.
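
As a small sketch of the point above, the edges can be recovered from a node connectivity matrix alone and then used for a simple interpolation. The three-node mesh, the integer encoding, the midpoint interpolation and the function names are all assumptions chosen for illustration; a real application would pick its own interpolation scheme.

```python
def edges_from_connectivity(matrix):
    """Node pairs (i, j) with i < j that the connectivity matrix marks as connected."""
    n = len(matrix)
    return [(i, j) for i in range(n) for j in range(i + 1, n) if matrix[i][j]]


def midpoint_values(values, matrix):
    """Linearly interpolate node values to the midpoint of each implied edge."""
    return {
        (i, j): 0.5 * (values[i] + values[j])
        for i, j in edges_from_connectivity(matrix)
    }


# Hypothetical triangle: three nodes, each connected to the other two.
# Integers are used here instead of booleans; only truthiness matters.
node_connectivity = [
    [0, 1, 1],
    [1, 0, 1],
    [1, 1, 0],
]
u_wind = [2.0, 4.0, 6.0]

edge_values = midpoint_values(u_wind, node_connectivity)
```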

> "In CF-netCDF a **domain** topology can only be provided for a domain defined 
> by a UGRID mesh topology variable"

Yes indeed - thanks for spotting my mistake.

All the best,
David


