Hi @JonathanGregory

> I think that allowing data variables to refer to the domain with a single
> reference _instead_ of providing the domain information by various references
> on the data variable would be a drastic change to the convention. Although
> not backwards incompatible in the sense that it wouldn't invalidate existing
> conventions or data, it would require all software to be rewritten to support
> this different method. I think that would be a bad decision.

I never suggested using a single reference _instead_ of the usual data variable attributes. This is something that is only mentioned in https://github.com/cf-convention/cf-conventions/issues/301#issuecomment-709406316 and I think it was just due to a misunderstanding or a typo. So we agree that breaking backward compatibility would be a bad idea.

> Allowing a domain reference _in addition_ to the other means of describing
> the domain by the data variable would be redundant, and therefore potentially
> inconsistent, which also doesn't sound good to me.

Here we disagree:

1. Without this proposal, if a single domain is shared by several data variables and each data variable describes this domain with the usual attributes, you already have redundancy: the same domain is described on every data variable, and there is no way to check that these descriptions are consistent with each other, so the risk of inconsistency is quite high.
2. With this proposal but _without_ the domain variable reference that I mentioned, we gain the ability to access domain information directly (which is a very good thing), but we create an additional description of the domain that could conflict with the descriptions provided on the data variables (which may already conflict with each other, as seen in 1.). So the risk of inconsistency is slightly higher than in 1.
3. With this proposal and _with_ the domain variable reference, you still have all the redundant descriptions that were already there in 1. and 2., but now you have a tool that allows you to automatically detect the consistency issues that arise from the pre-existing redundancy. So from my point of view you get not only a clearer description of the data but also a way to validate domain information across redundant definitions (which software could implement _or not_ as an automatic validation step).

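
As a rough illustration of point 3, here is a minimal Python sketch of such a consistency check. It makes several assumptions that are not part of the proposal: variables are modelled as plain dictionaries rather than read from a netCDF file, the reference attribute is called `domain` (a hypothetical name), and only `coordinates` and `grid_mapping` are compared alongside the dimensions.

```python
# Attributes compared between a data variable and its domain variable.
# This is an illustrative subset chosen for the sketch, not an exhaustive list.
DOMAIN_ATTRS = ("coordinates", "grid_mapping")


def domain_description(var):
    """Return an order-independent description of a variable's domain."""
    desc = {"dimensions": frozenset(var.get("dimensions", ()))}
    for name in DOMAIN_ATTRS:
        # Attribute values are blank-separated lists of variable names.
        desc[name] = frozenset(var.get("attrs", {}).get(name, "").split())
    return desc


def find_inconsistencies(variables):
    """Return (data variable, domain variable, problem) for each mismatch."""
    problems = []
    for name, var in variables.items():
        # "domain" is a hypothetical name for the reference attribute.
        ref = var.get("attrs", {}).get("domain")
        if ref is None:
            continue  # no domain reference, nothing to validate
        if ref not in variables:
            problems.append((name, ref, "missing domain variable"))
            continue
        own = domain_description(var)
        shared = domain_description(variables[ref])
        for key in own:
            if own[key] != shared[key]:
                problems.append((name, ref, key))
    return problems


# Two data variables share the domain "grid"; "pr" describes it differently.
variables = {
    "grid": {"dimensions": ("time", "lat", "lon"),
             "attrs": {"coordinates": "height"}},
    "tas": {"dimensions": ("time", "lat", "lon"),
            "attrs": {"domain": "grid", "coordinates": "height"}},
    "pr": {"dimensions": ("time", "lat", "lon"),
           "attrs": {"domain": "grid", "coordinates": "depth"}},
}

print(find_inconsistencies(variables))  # [('pr', 'grid', 'coordinates')]
```

Without the reference, a checker would first have to guess which data variables are meant to share a domain; with it, the comparison reduces to a lookup per data variable.
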
> I understand your argument that you want to use the domain reference as a way
> to identify the domain uniquely, but I would argue that you can't really
> depend on that method. It will only work within a single file (within which
> one can depend on variable names as references) and netCDF datasets aren't
> necessarily contained in single files. Hence you still need to be able to
> decide whether domains are equal by inspecting the metadata and coordinates.
> You would have to be able to do that also if assembling a dataset from
> various sources.

I admit I have no experience with multi-file netCDF datasets, so I may not fully grasp all the implications that adding the domain variable reference would have for this kind of data structure. I quickly browsed the NcML documentation, and it seems to allow attributes to be created and modified on the variables of a multi-file dataset. So someone who wants to aggregate files from several sources could write an NcML file that correctly defines the domains and their references in the view offered by the multi-file dataset. But again, I have never worked with this kind of dataset, so I may be completely wrong.

Cheers,

Sylvain

--
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/cf-convention/cf-conventions/issues/301#issuecomment-718200791
