Re: [CF-metadata] [cf-convention/cf-conventions] Lossy compression through coordinate sampling (#326)

2021-07-01 Thread AndersMS
@AndersMS pushed 1 commit.

839b49c63365a2f27a5bf47434def05636feaee3  Improve description of 
non-overlapping interpolation subareas


-- 
You are receiving this because you are subscribed to this thread.
View it on GitHub:
https://urldefense.us/v3/__https://github.com/cf-convention/cf-conventions/pull/326/files/577fbcc7996d85157ebd5d829240b121c8ca2a76..839b49c63365a2f27a5bf47434def05636feaee3__;!!G2kpM7uM-TzIFchu!hF4g0shse5P50Ebgp_p2goaVVo5mZREFKScVzvWHmvGfBwUmf-65UT9ITlu6zq3SRi4d7oEP9nI$
 

This list forwards relevant notifications from Github.  It is distinct from 
cf-metad...@cgd.ucar.edu, although if you do nothing, a subscription to the 
UCAR list will result in a subscription to this list.
To unsubscribe from this list only, send a message to 
cf-metadata-unsubscribe-requ...@listserv.llnl.gov.


Re: [CF-metadata] [cf-convention/cf-conventions] Lossy compression through coordinate sampling (#326)

2021-07-01 Thread AndersMS
@AndersMS pushed 1 commit.

577fbcc7996d85157ebd5d829240b121c8ca2a76  Improve description of 
non-overlapping interpolation subareas


-- 
You are receiving this because you are subscribed to this thread.
View it on GitHub:
https://urldefense.us/v3/__https://github.com/cf-convention/cf-conventions/pull/326/files/35019927e4218044d09d1cc6a61cf727dc4cf58f..577fbcc7996d85157ebd5d829240b121c8ca2a76__;!!G2kpM7uM-TzIFchu!nwVKTLLQcfXPwV75rk_481Y37NWwNZYCwqy2ProdGoxDkt4ZTGUgeN0sRyliqv6K0--Nfci1Zdo$
 



Re: [CF-metadata] [cf-convention/cf-conventions] Lossy compression through coordinate sampling (#326)

2021-07-01 Thread AndersMS
@AndersMS pushed 2 commits.

e5feea3947e9958b79ba26da642a2fd1da66921b  Combine the tie_point_dimensions and 
tie_point_indices attributes (Change 1)
35019927e4218044d09d1cc6a61cf727dc4cf58f  Update figures to match new terms


-- 
You are receiving this because you are subscribed to this thread.
View it on GitHub:
https://urldefense.us/v3/__https://github.com/cf-convention/cf-conventions/pull/326/files/db0eb4e0a75f20d0733996ab8579b26404b148fb..35019927e4218044d09d1cc6a61cf727dc4cf58f__;!!G2kpM7uM-TzIFchu!hNcBgpjWX988On1gRsmE2Vpqo3eYTW_U6nZsrMr4Y_ydj06svlcWKzxDo59xy3f6qm3YhlsYGnE$
 



Re: [CF-metadata] [cf-convention/cf-conventions] Reference UGRID conventions in CF (#153)

2021-07-01 Thread David Hassell
Dear Patrick,

Thank you for bringing up "location index set" variables. I agree that in the 
absence of the `cf_role` attribute it is not always possible to distinguish one 
from a data variable, so I would be happy with retaining it on these variables.

By extension, I think that we should drop the suggestion (it is just that at 
this stage) of removing `cf_role` from the mesh topology variable. This is 
because a location index set variable is logically identical to a mesh 
variable, so having a common mechanism of identification would be nice.

> ( This problem is entirely analogous to an existing problem with 
> auxiliary-coordinates in standard CF : If the data-variable which references 
> them is removed, then they are not distinguishable from data-variables -- so 
> they "become" data-variables )

I wouldn't call this a problem, rather a feature! When we read a dataset, we 
tend not to cast variables that have been identified as auxiliary coordinates 
(or other roles) as data variables _as well_, but that is only a default 
behaviour, and it is what most of us want most of the time.

All the best,
David


-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://urldefense.us/v3/__https://github.com/cf-convention/cf-conventions/issues/153*issuecomment-872289997__;Iw!!G2kpM7uM-TzIFchu!m3MvmAJbcTaPU6Ztlb1aEbCnoUt3-uDR3R1pG6HQ9fufA_aTTj_7DV37GTYArWT6nxp6IECmCAg$
 


Re: [CF-metadata] [cf-convention/cf-conventions] Lossy Compression by Coordinate Sampling (#327)

2021-07-01 Thread OceanDataLab
@AndersMS: yes I think replacing "sample/sampled" with "subsample/subsampled" 
would make the text more consistent.

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://urldefense.us/v3/__https://github.com/cf-convention/cf-conventions/issues/327*issuecomment-872180730__;Iw!!G2kpM7uM-TzIFchu!hmExYad1E2EtKOBPK4kRHaiCFUQtV6xN0mKWNJOIVto5CipFkc4srlxWqVd6kdMyo8HqVdXfM5s$
 


Re: [CF-metadata] [cf-convention/cf-conventions] Lossy Compression by Coordinate Sampling (#327)

2021-07-01 Thread OceanDataLab
Hi,

Here is a new take on the computational precision paragraph:

 8.3.8 Computational Precision

The accuracy of the reconstituted coordinates will depend on the degree of 
subsampling, the choice of interpolation method and the choice of the 
floating-point arithmetic precision used in the interpolation method 
computations.

Implementation details of the interpolation methods and hardware can also have 
an impact on the accuracy of the reconstituted coordinates.

The creator of the compressed dataset must check that the coordinates 
reconstituted using the interpolation parameters specified in the file have 
sufficient accuracy compared to the coordinates at full resolution.

Although it may depend on the software and hardware used by the creator, the 
floating-point arithmetic precision used during this validation step must be 
specified in the `computational_precision` attribute of the interpolation 
method as an indication of potential floating-point precision issues during the 
interpolation computations.

The `computational_precision` attribute is mandatory and accepts the following 
values:

| Value | Meaning |
|-------|---------|
| "32" | 32-bit floating-point arithmetic, comparable to the binary32 standard in [IEEE_754] |
| "64" | 64-bit floating-point arithmetic, comparable to the binary64 standard in [IEEE_754] |

For the coordinate reconstitution process, using a floating-point arithmetic 
precision matching or exceeding the precision specified by 
`computational_precision` is likely to produce results with an accuracy similar 
to what the creator obtained during the validation of the dataset, but this 
cannot be guaranteed because software and hardware also affect the results.

As an example, `computational_precision = "64"` would specify that, using the 
same software and hardware as the creator of the compressed dataset, sufficient 
accuracy could not be reached when using a floating-point precision lower than 
64-bit floating-point arithmetic in the interpolation computations required to 
reconstitute the coordinates.
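
To make the validation step concrete, here is a minimal sketch of how a creator might pick the attribute value. It is only an illustration under stated assumptions, not part of the proposed text: it uses plain linear interpolation between evenly spaced tie points rather than any interpolation method defined in the proposal, the names `to_f32`, `reconstitute` and `choose_precision` are hypothetical, and 32-bit arithmetic is emulated by rounding every intermediate result to binary32 with the standard `struct` module.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def identity(x: float) -> float:
    """Python floats are already binary64, so no extra rounding is applied."""
    return x


def reconstitute(tie_points, step, n_full, rnd):
    """Linearly interpolate between tie points, rounding each operation with rnd.

    Assumes the last full-resolution point is itself a tie point.
    """
    out = []
    for i in range(n_full):
        k = min(i // step, len(tie_points) - 2)  # index of the interval
        a, b = rnd(tie_points[k]), rnd(tie_points[k + 1])
        s = rnd((i - k * step) / step)           # position inside the interval
        out.append(rnd(a + rnd(s * rnd(b - a))))
    return out


def choose_precision(full, step, tolerance):
    """Return "32" or "64": the lowest precision meeting the tolerance."""
    tie_points = full[::step]
    for label, rnd in (("32", to_f32), ("64", identity)):
        recon = reconstitute(tie_points, step, len(full), rnd)
        max_err = max(abs(r - f) for r, f in zip(recon, full))
        if max_err <= tolerance:
            return label
    raise ValueError("tolerance not met even with 64-bit arithmetic")


# Hypothetical longitudes near 100 degrees, one tie point every 4 samples.
longitudes = [100.0 + 0.001 * i for i in range(17)]
print(choose_precision(longitudes, step=4, tolerance=1e-6))
print(choose_precision(longitudes, step=4, tolerance=1e-3))
```

With these made-up longitudes the first call selects "64", because binary32 cannot resolve values near 100 to within 1e-6, while the looser tolerance is already met with "32". A real check would use the creator's actual interpolation method and accuracy requirement.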

**Bibliography**
**References**

[IEEE_754] [IEEE Standard for Floating-Point 
Arithmetic](https://urldefense.us/v3/__https://ieeexplore.ieee.org/stamp/stamp.jsp?tp==8766229=8766228__;!!G2kpM7uM-TzIFchu!h6g5VosoxJPZrjMnqKldnCSEYC-DjpPbYuBWm5Jd1jE2UXrJvx-8VuSHQj4VI5g_zrcX2-BLYGI$
 ), in IEEE Std 754-2019 (Revision of IEEE 754-2008), pp. 1-84, 22 
July 2019, doi: 10.1109/IEEESTD.2019.8766229.


---

 Rationale:

The accuracy of the interpolation methods depends not only on the choices made
by the data producer (tie point density, area subdivisions, interpolation
method parameters, etc.) but also on the software (programming language,
libraries) and on the hardware (CPU/FPU) used by the data consumers.

The data producers only know about their own software and hardware, so the
computational_precision attribute can only mean that the data producer used
this floating point precision when they validated these data using their
implementation of the interpolation method, not that using this floating point
precision on any software/hardware combination will produce exactly the same
results.

I think the computational_precision attribute can only be considered as a hint
provided by the data producer regarding numerical issues they encountered when
trying to reconstruct the target variables at their full resolution with their
implementation of the interpolation method: if the computational_precision
exceeds the precision of the data type (e.g. a "64" computational_precision
used when interpolating a float variable), then users know that the data
producer did not obtain satisfactory results when using a lower precision, hence
they should be wary of underflow/overflow errors when they interpolate these
data. So computational_precision is more of an informational hint than a
compulsory instruction given to the users (unless @erget's CF police becomes
a reality), and it is not a reproducibility guarantee either.

Yet it is still a useful piece of information and no one except the data
producer can provide it, since you need access to the original data at their
native resolution to make actual checks on the accuracy of the interpolation
method. As the information cannot be derived from the content of the file, it
makes sense to require that data producers include this attribute
systematically: the computational_precision attribute should be mandatory.

Sylvain

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://urldefense.us/v3/__https://github.com/cf-convention/cf-conventions/issues/327*issuecomment-872178234__;Iw!!G2kpM7uM-TzIFchu!h6g5VosoxJPZrjMnqKldnCSEYC-DjpPbYuBWm5Jd1jE2UXrJvx-8VuSHQj4VI5g_zrcX6hbMb2s$
 

Re: [CF-metadata] [cf-convention/cf-conventions] Lossy Compression by Coordinate Sampling (#327)

2021-07-01 Thread AndersMS
Dear All,

Considering that we have now renamed the term _tie point interpolation 
dimension_ to _subsampled dimension_, should we possibly change the title

**Lossy Compression by Coordinate Sampling**

to 

**Lossy Compression by Coordinate Subsampling**

and replace the occurrences of sample/sampled in the text with 
subsample/subsampled?

Anders

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://urldefense.us/v3/__https://github.com/cf-convention/cf-conventions/issues/327*issuecomment-872151596__;Iw!!G2kpM7uM-TzIFchu!mbL15pJsqDLSvUSrR8qOXHUG8lT1kAbisJUrSAsSPsLm6yt1yiL1zDzJJ_2j-MJAXmUDxRzC2Cc$
 


Re: [CF-metadata] [cf-convention/cf-conventions] Lossy compression through coordinate sampling (#326)

2021-07-01 Thread AndersMS
@AndersMS pushed 1 commit.

db0eb4e0a75f20d0733996ab8579b26404b148fb  Rename terms to: subsampled 
dimension, interpolated dimension and non-interpolated dimension


-- 
You are receiving this because you are subscribed to this thread.
View it on GitHub:
https://urldefense.us/v3/__https://github.com/cf-convention/cf-conventions/pull/326/files/485d3b8a762d074844611b96fbfd5bbedb228d22..db0eb4e0a75f20d0733996ab8579b26404b148fb__;!!G2kpM7uM-TzIFchu!m14dVSsGzOwHi1q5ejmIuQLgnreHsF4dbAZKMBFjF76H1cQQiz_NBPKnGXawlUTj8Tgik7i6t78$
 



Re: [CF-metadata] [cf-convention/cf-conventions] Reference UGRID conventions in CF (#153)

2021-07-01 Thread Patrick Peglar
Hi, sorry to be late to the table with this issue, but I have been listening in 
here a while, in hopes of understanding better and maybe contributing.

I think that I may have spotted a potential problem with the removal of 
'cf_role' :
As stated (above), the role of a mesh-variable is identifiable by its having a 
'topology_dimension' attribute, but **_the same is not true of a 
location-index-set_** :  
So, if we include an index set in a "mesh only file" (aka "meshfile" or 
"gridfile" in some quarters), then we will be unable to distinguish it from a 
data-variable.
This in turn implies that when loading a file, we would need to "know" whether 
it was intended to be a "mesh file" or a "normal datafile" as you can't 
determine that by inspection. 
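
To illustrate the ambiguity, here is a rough sketch of role detection from attributes alone. It is not Iris code: `classify_variable` is a hypothetical helper, and variables are represented as plain attribute dictionaries rather than read from a netCDF file. The attribute names (`cf_role`, `topology_dimension`, `mesh`, `location`) are the ones UGRID uses.

```python
def classify_variable(attrs: dict) -> str:
    """Guess the UGRID role of a variable from its attributes only."""
    role = attrs.get("cf_role")
    if role in ("mesh_topology", "location_index_set"):
        return role                      # explicit, unambiguous label
    if "topology_dimension" in attrs:
        return "mesh_topology"           # still identifiable without cf_role
    # A location index set and a data variable on a mesh can both carry
    # only 'mesh' and 'location', so nothing else separates them here.
    return "data_or_unknown"


mesh = {"topology_dimension": 2, "node_coordinates": "node_x node_y"}
index_set_labelled = {"cf_role": "location_index_set",
                      "mesh": "mesh2d", "location": "face"}
index_set_unlabelled = {"mesh": "mesh2d", "location": "face"}
data_var = {"mesh": "mesh2d", "location": "face", "units": "K"}

for name, attrs in [("mesh", mesh),
                    ("labelled index set", index_set_labelled),
                    ("unlabelled index set", index_set_unlabelled),
                    ("data variable", data_var)]:
    print(name, "->", classify_variable(attrs))
```

In this sketch the unlabelled index set falls into the same bucket as the data variable, which is the case described above; the mesh variable is still recognised through `topology_dimension`.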

( This problem is entirely analogous to an existing problem with 
auxiliary-coordinates in standard CF :  If the data-variable which references 
them is removed, then they are not distinguishable from data-variables -- so 
they "become" data-variables )

Our context : Here at the UK MetOffice, we are working to support unstructured 
data within Iris, [using UGRID as the template for our internal 
data-model](https://urldefense.us/v3/__https://scitools-iris.readthedocs.io/en/mesh-data-model/generated/api/iris/experimental/ugrid.html__;!!G2kpM7uM-TzIFchu!gI2poPff_xwfIpnQMVw6F_zs4v2r3s3F4jWKcIS2Wef7scNBlJ8EEWlts5nRZ3b4KwzlqWjLQmM$
 ) (as we already do for CF).
Locally, we have a particular interest in the use of location-index-sets, and 
we are also intending to use "mesh files" i.e. files with only the mesh 
structure and no data.

Also, our practical experience in tools development and support shows that 
files that do not fully comply with conventions are, sadly, just not that 
uncommon even in the respected international archives.
Thus, just as Iris has to somehow handle files with invalid standard-names and 
units, or mis-specified grid-mappings, so our trial UGRID files suffer from 
problems like missing optional connectivity links, the odd misspelling and so 
on.
What this tells us is that the "robustness" of the format is also an important 
consideration.

From the point of view of a generic code library developer, the unambiguous 
identification of the 'role' of elements within a file will definitely make 
writing parsing code more straightforward -- and not least because dealing 
with _incorrect_ input in a helpful way is an important usability factor (just 
as it is in compiler design).
So, I must confess that I personally was _preferring_ the way that UGRID 
labels each component unambiguously, instead of relying on links from other 
components to infer the role of a variable.

Solutioneering maybe, but ... could we instead **_allow the attribute to be 
named 'ugrid_role', and simultaneously deprecate the older 'cf_role' usage ?_**


-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://urldefense.us/v3/__https://github.com/cf-convention/cf-conventions/issues/153*issuecomment-872063284__;Iw!!G2kpM7uM-TzIFchu!gI2poPff_xwfIpnQMVw6F_zs4v2r3s3F4jWKcIS2Wef7scNBlJ8EEWlts5nRZ3b4Kwzl0-z6CLg$
 