# Introducing a CF domain variable

# Moderator
TBD

# Moderator Status Review [last updated: YYYY-MM-DD]
Brief comment on current status, update periodically

# Requirement Summary

The concept of a domain that describes data locations and cell properties is 
not currently mentioned in the CF conventions, because it does not correspond 
to any single entity in the netCDF file. Instead, the domain is stored 
implicitly in a number of other variables and attributes that are linked to the 
data variable in various ways defined by the conventions.

The domain is, however, well defined in the CF data model as an abstract 
concept (as opposed to a data model construct) that provides the linkage 
between the field construct and the metadata constructs that describe the 
relevant data locations and cell properties. There is currently no "domain 
construct" in the data model, since there is no corresponding CF-netCDF entity.

There is a need to be able to describe a domain independently of any data 
variables, which is currently not possible. Use cases include:

* Curated data streaming services for which it is impractical to send very 
large domain descriptions with every file.

* Storing time-dependent coordinates from remote sensing applications.

* Storing geometries without any timeseries data.

For such use cases, it is not satisfactory to try to locate an appropriate 
multidimensional data variable that describes the required domain, nor to 
create a dummy data variable for this purpose, which has no physical meaning.

Therefore, the inclusion of CF-netCDF domain variables that can encode a domain 
independently of any data, and a corresponding data model domain construct, 
will enhance CF by meeting these use cases.

# Technical Proposal Summary

## NetCDF encoding

A new "domain variable" will be introduced that is of arbitrary type since it 
contains no data. This variable will act as a container to bind together other 
variables that collectively define a domain, in a similar manner to how a data 
variable performs the same task.

It will support the same CF attributes as are allowed on the data variable for 
describing a domain, with exactly the same meanings and syntaxes: 
`cell_measures`, `coordinates`, `geometry`, and `grid_mapping`. These will be 
indicated as domain variable attributes by the additional "Do" indicator (short 
for Domain) in the "Use" column of [Appendix A: 
Attributes](https://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html#attribute-appendix).

Any future CF attributes that a data variable may use to describe its domain 
will be similarly transferred to the domain variable, meaning that keeping the 
domain variable up to date with other enhancements will be a well defined and 
easy task.

There is no mechanism for referencing a domain variable from a data variable, 
i.e. a data variable must still encode its domain in the current, implicit 
manner. This preserves backwards compatibility with all existing software 
libraries that understand the current structure of a data variable, and avoids 
the redundancy or incompatibility issues that could arise if a data variable 
encoded its own domain _and_ referenced a domain variable.

A domain variable may exist in a file with or without data variables.
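
As a rough illustration of the container role described above, the following Python sketch models a file's variables as plain dictionaries and collects the variables that a domain variable binds together. The variable names (`domain`, `lat`, `lon`, `areacella`, `crs`) and the helper `referenced_variables` are hypothetical and do not come from the proposed conventions text. The parsing is deliberately simplified: it assumes `coordinates` is a blank-separated list of names, `cell_measures` is a list of `measure: name` pairs, and `grid_mapping` and `geometry` each hold a single variable name. Dimensions are left out of the sketch.

```python
# Attributes that, per this proposal, a domain variable shares with data variables.
DOMAIN_ATTRIBUTES = ("cell_measures", "coordinates", "geometry", "grid_mapping")


def referenced_variables(attributes):
    """Return the variable names referenced by the domain-describing attributes.

    Hypothetical helper; assumes the simplified attribute syntaxes noted above.
    """
    names = []
    for attr in DOMAIN_ATTRIBUTES:
        value = attributes.get(attr)
        if value is None:
            continue
        tokens = value.split()
        if attr == "cell_measures":
            # "area: areacella" -> keep the variable names, drop the "measure:" keys
            tokens = [t for t in tokens if not t.endswith(":")]
        names.extend(tokens)
    return names


# Hypothetical file contents: variable name -> attributes.
variables = {
    "domain": {
        "coordinates": "lat lon",
        "cell_measures": "area: areacella",
        "grid_mapping": "crs",
    },
    "lat": {"standard_name": "latitude", "units": "degrees_north"},
    "lon": {"standard_name": "longitude", "units": "degrees_east"},
    "areacella": {"standard_name": "cell_area", "units": "m2"},
    "crs": {"grid_mapping_name": "latitude_longitude"},
}

# Variables bound together by the domain variable, and any dangling references.
bound = referenced_variables(variables["domain"])
missing = [name for name in bound if name not in variables]
```

Note that no data variable appears in `variables`: the domain is described entirely by the domain variable and the variables it references.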

## Data model

The domain in the data model will be transformed from an abstract concept into 
a "top-level" construct, i.e. one that can exist in the absence of any other 
constructs. Currently, the field construct (corresponding to a CF-netCDF data 
variable) is the only top-level construct.

The new domain construct will replace the current domain concept, replicating 
it in every way except that it will be related to the field construct via an 
aggregation relationship, rather than by the current composition relationship 
of the abstract domain concept. This makes it clear that the domain construct 
can exist independently of the field construct.

It is of no consequence to the data model that a CF-netCDF data variable will 
not be able to explicitly reference a CF-netCDF domain variable. That is an 
encoding choice that does not affect the logical structure.
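
The difference between composition and aggregation can be sketched in a few lines of Python. The class names `Domain` and `Field` and their members are hypothetical simplifications, not the data model's actual definitions. The only point illustrated is that the domain object is created on its own and the field merely refers to it.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Domain:
    """Hypothetical stand-in for the proposed domain construct."""
    coordinates: Dict[str, List[float]] = field(default_factory=dict)
    cell_measures: Dict[str, List[float]] = field(default_factory=dict)


@dataclass
class Field:
    """Hypothetical stand-in for the field construct."""
    data: List[float]
    # Aggregation: the field refers to a domain that has its own lifetime,
    # rather than owning a domain that cannot exist without it.
    domain: Domain


# A domain can be created as a top-level object, with no field present.
grid = Domain(coordinates={"latitude": [-45.0, 45.0]})

# A field is then related to the existing domain.
pressure = Field(data=[1000.0, 1010.0], domain=grid)
field_domain_is_grid = pressure.domain is grid

# Discarding the field leaves the domain intact.
del pressure
remaining_latitudes = grid.coordinates["latitude"]
```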

## Location in the conventions document

* The domain variable will be described in a new section: **5.8 Domain 
Variables**

* The following appendices and documents will be updated:

  * Appendix A: Attributes

  * Appendix I: The CF data model

  * CF Conformance Requirements and Recommendations

# Benefits

All those with the use cases described in the **Requirement Summary** will 
benefit from the new domain variable.

# Status Quo

At present, a domain can only be encoded implicitly via a data variable, 
leading to ambiguities when retrieving a domain from a dataset.

# Associated pull request

#302

# Detailed Proposal

Conventions text has been proposed for chapter 5, appendices A and I, and the 
conformance document in pull request #302.
