Re: [CF-metadata] [cf-convention/cf-conventions] Introducing a CF domain variable (#301)

OceanDataLab Wed, 28 Oct 2020 13:51:20 -0700

Hi @JonathanGregory 

> I think that allowing data variables to refer to the domain with a single 
> reference _instead_ of providing the domain information by various references 
> on the data variable would be a drastic change to the convention. Although 
> not backwards incompatible in the sense that it wouldn't invalidate existing 
> conventions or data, it would require all software to be rewritten to support 
> this different method. I think that would be a bad decision.


I never suggested using a single reference _instead_ of the usual data 
variable attributes. This is something that is only mentioned in 
https://github.com/cf-convention/cf-conventions/issues/301#issuecomment-709406316
 and I think it was just due to a misunderstanding or a typo. So we agree that 
breaking backward compatibility would be a bad idea.

> Allowing a domain reference _in addition_ to the other means of describing 
> the domain by the data variable would be redundant, and therefore potentially 
> inconsistent, which also doesn't sound good to me.

Here we disagree:

 1. without this proposal, if a single domain is shared by several data 
variables and each data variable describes this domain with the usual 
attributes, you already have redundancy: the same domain is described on 
each data variable, and there is no way to check that the description is 
consistent among these variables, so the risk of inconsistency is quite high.

 2. with this proposal but _without_ the domain variable reference that I 
mentioned, we gain the ability to access domain information directly (which is 
a very good thing) but we create an additional description of the domain, which 
could conflict with the description provided on data variables (that may 
already conflict with each other as seen in 1.). So the risk of inconsistency 
is slightly higher than in 1.

 3. with this proposal and _with_ the domain variable reference, you still have 
all the redundant descriptions that were already there in 1. and 2. but now you 
have a tool that allows you to automatically detect consistency issues that 
arise from the pre-existing redundancy problem. So from my point of view you 
not only get a clearer description of the data but also a way to validate 
domain information across redundant definitions (automatic validation that 
software could implement _or not_).
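
To make point 3 more concrete, here is a minimal sketch of the kind of check 
the reference would make possible. It is only an illustration under several 
assumptions: variables are modelled as plain Python dicts rather than read 
from a real file, the `domain` attribute name on data variables is 
hypothetical (it stands for the reference discussed here), and the comparison 
only looks at dimensions and the `coordinates` attribute.

```python
# Sketch only: variables are plain dicts, not objects from a netCDF library.
# The "domain" attribute on data variables is a hypothetical name for the
# reference discussed in this thread.


def domain_signature(dims, attrs):
    """Return a comparable summary of the domain described by a variable."""
    coords = frozenset(attrs.get("coordinates", "").split())
    return (tuple(dims), coords)


def find_inconsistencies(variables):
    """List data variables whose own description differs from their domain's.

    `variables` maps a variable name to {"dims": [...], "attrs": {...}}.
    """
    problems = []
    for name, var in variables.items():
        ref = var["attrs"].get("domain")
        if ref is None:
            continue  # no domain reference: nothing to check automatically
        domain = variables.get(ref)
        if domain is None:
            problems.append((name, f"unknown domain variable {ref!r}"))
            continue
        # Assumption: the domain variable lists its dimensions in a
        # "dimensions" attribute instead of having dimensions itself.
        expected = domain_signature(
            domain["attrs"].get("dimensions", "").split(), domain["attrs"]
        )
        actual = domain_signature(var["dims"], var["attrs"])
        if actual != expected:
            problems.append((name, f"differs from domain {ref!r}"))
    return problems


example = {
    "grid": {"dims": [],
             "attrs": {"dimensions": "time lat lon", "coordinates": "height"}},
    "tas": {"dims": ["time", "lat", "lon"],
            "attrs": {"coordinates": "height", "domain": "grid"}},
    # This variable omits "coordinates", so it disagrees with "grid".
    "pr": {"dims": ["time", "lat", "lon"],
           "attrs": {"domain": "grid"}},
}

print(find_inconsistencies(example))
```

Without the reference, a tool has no way of knowing that `tas` and `pr` are 
meant to share the same domain, so it cannot tell a mistake from two 
intentionally different domains.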

> I understand your argument that you want to use the domain reference as a way 
> to identify the domain uniquely, but I would argue that you can't really 
> depend on that method. It will only work within a single file (within which 
> one can depend on variable names as references) and netCDF datasets aren't 
> necessarily contained in single files. Hence you still need to be able to 
> decide whether domains are equal by inspecting the metadata and coordinates. 
> You would have to be able to do that also if assembling a dataset from 
> various sources.

I admit I have no experience with multi-file netCDF datasets so I may not fully 
grasp all the implications that adding the domain variable reference would have 
on this data structure. I quickly browsed the NcML documentation and it seems 
to allow the creation and modification of attributes on the variables of the 
multi-file dataset, so someone who wants to aggregate files from several 
sources could write an NcML file that correctly defines the domains and their 
references in the view offered by the multi-file dataset. But again, I have 
never worked with this kind of dataset so I may be completely wrong.

Cheers,

Sylvain

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/cf-convention/cf-conventions/issues/301#issuecomment-718200791
