Mark,

It seems to me that we have quite a few examples of coordinate variables that have extra attributes that further define the contents. Time coordinate variables, for example, have the calendar attribute. There are many standard names that direct the developer to specify an attribute (almost always a comment attribute or flag_values/flag_meanings attributes) that further defines the contents. Any variable can validly have attributes associated with it.

If it is too troubling for a standard name definition to call for a specific attribute name that's not mentioned in the CF Conventions document, the standard name definition could instead call for putting a string of the form 'ensemble size N' in a comment attribute. Or it could call for putting the ensemble size in a comment section in the cell_methods attribute on the data variable, as Jonathan and Karl suggested. Jonathan and Karl's suggestions imply a change to the conventions, since they propose new standardized cell method comment names.
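
As a minimal sketch of what software would have to do with the comment approach: the pattern below assumes the exact wording 'ensemble size N', and both the function name and the pattern are hypothetical, not anything defined by CF.

```python
import re

# Hypothetical pattern for the free-text form 'ensemble size N' in a comment
# attribute. The exact wording is an assumption; CF does not define it.
ENSEMBLE_SIZE_PATTERN = re.compile(r"ensemble size (\d+)")


def ensemble_size_from_comment(comment):
    """Return the ensemble size found in a comment string, or None if absent."""
    match = ENSEMBLE_SIZE_PATTERN.search(comment)
    return int(match.group(1)) if match else None
```

Any variation in the wording of the comment would defeat the match, which is the weakness of carrying the number in free text.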

In all of these cases the information will be available to a human who reads a file dump, but none of them makes the information immediately available to software automation. The addition to the cell_methods attribute grammar is likely the least intrusive way to make it something that people can write general software for. The downside of this approach is that the information is not held with the coordinate variable, which is the most natural place for it.

Another alternative is to add a new section to Chapter 5 that defines an ensemble or sample pool coordinate type (or whatever name you prefer). It may be worth the extra trouble to go ahead and give it formal recognition instead of trying to work it into existing forms that, in my opinion, don't fit it too well. I appreciate the desire to find the least intrusive way to modify the conventions, but we can end up painting ourselves into corners in the process.

Grace and peace,

Jim

On 7/29/15 3:24 AM, Hedley, Mark wrote:
Hello Jim

this is a really neat alternative approach

I agree that the information about the ensemble_size is closely related to the realization coordinate and less closely related to the data variable, so this method encapsulates the metadata nicely.

Whilst the solution is elegant, I cannot see a previous example of a coordinate variable within CF defining extra attributes, so I'm a bit wary that this approach will require a change to the conventions document, not just a new standard_name.

Is there a neat way to use CF to provide metadata about a coordinate, rather than about a data variable?

I think it's well worth considering, but it may be a path of some resistance

many thanks
mark

------------------------------------------------------------------------
*From:* CF-metadata [[email protected]] on behalf of Jim Biard [[email protected]]
*Sent:* 23 July 2015 13:11
*To:* [email protected]
*Subject:* Re: [CF-metadata] original_ensemble_size

Hi.

It seems to me that you would want a coordinate variable with the standard name 'realization' (whether scalar or multi-valued) and give it an attribute with the name 'ensemble_size'. You can store the realization number in the variable and the ensemble size in the attribute.
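
A rough sketch of that layout, using plain Python dictionaries to stand in for netCDF variables (no netCDF library is involved). The 'ensemble_size' attribute is the proposal under discussion, not part of CF, and the helper name and values are made up for illustration.

```python
# A dictionary stands in for the 'realization' coordinate variable.
# The 'ensemble_size' attribute is the proposed (not standardized) addition.
realization = {
    "data": [0, 5, 7],  # realization numbers present in this file
    "attributes": {"standard_name": "realization", "ensemble_size": 23},
}


def member_labels(coord):
    """Describe each stored realization relative to the full ensemble."""
    size = coord["attributes"]["ensemble_size"]
    return [f"member {number} of {size}" for number in coord["data"]]
```

The point of the layout is that the size travels with the coordinate it describes, so nothing on the data variable has to change.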

Grace and peace,

Jim

On 7/23/15 6:11 AM, Hedley, Mark wrote:
I use the 'coordinates' attribute on my data variable, referencing the scalar 'ensemble_size' variable, thus defining this ensemble_size as a scalar coordinate variable for the temperature dataset
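
A minimal sketch of that association, again with dictionaries standing in for the variables in a file. The variable names and the value 10 follow Karl's example; the helper function is hypothetical, and 'ensemble_size' is the proposed standard name rather than an adopted one.

```python
# Dictionaries stand in for variables in a file; names are illustrative.
variables = {
    "temperature": {"attributes": {"coordinates": "ensemble_size"}},
    "ensemble_size": {
        "data": 10,
        "attributes": {"standard_name": "ensemble_size"},  # proposed name
    },
}


def scalar_coordinates(variables, name):
    """Return {coordinate name: value} for each name listed in the
    blank-separated 'coordinates' attribute of the given data variable."""
    listed = variables[name]["attributes"].get("coordinates", "").split()
    return {coord: variables[coord]["data"] for coord in listed}
```

Here the link is made by name from the data variable, so the scalar is tied to the temperature field by more than sharing a file with it.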

mark

------------------------------------------------------------------------
*From:* CF-metadata [[email protected]] on behalf of Karl Taylor [[email protected]]
*Sent:* 22 July 2015 22:53
*Cc:* CF Metadata List
*Subject:* Re: [CF-metadata] original_ensemble_size

Hi all,

I'm still curious about something:

Suppose we have the temperature field stored from one member of an ensemble of size 10. We want to make the size of the ensemble known to the user. We store 10 as a scalar variable with standard name "ensemble_size", but how does that scalar get associated with our temperature variable (other than by being stored in the same file)?

cheers,
Karl

On 7/22/15 1:59 AM, Hedley, Mark wrote:
Hello John, Karl et al

I'm not sure I agree with John's last statement. I think that an ensemble is a defined collection of members, so my need is the need for ensemble size to be defined explicitly. The distinction that not all members may be present characterises the need for this metadata descriptor, rather than just using the dimension size of realization, which does not meet my requirement.

On reflection, I think that I prefer Karl's name of 'ensemble_size'

To restate my use case, I have a data set from an ensemble, where there is a coordinate variable called 'realization'. Let's say there are 23 members, so this dimension is size 23.

I want to reference the number of members in the ensemble, whilst sub-setting the data variable in various ways.

The suggestion is to add a scalar coordinate to my original dataset, which contains the number of members in the ensemble. Then any sub-setting operation will retain this coordinate, and I will always be able to state that this member is member 0 of 23, 5 of 23 etc

One requirement I have is to slice this variable, resulting in a 2D data array and two 1D coordinate variables (latitude and longitude), with all other coordinates as scalars.
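
A small sketch of that slicing requirement, using nested lists and dictionaries rather than any real netCDF or array library. The dataset layout, the function name, and the values are all made up for illustration.

```python
def select_member(dataset, index):
    """Pick one realization: the result keeps the 2D field and the 1D
    latitude/longitude coordinates, and carries everything else as scalars."""
    scalars = dict(dataset["scalars"])  # copy so the original is untouched
    scalars["realization"] = dataset["realization"][index]
    return {
        "data": dataset["data"][index],
        "latitude": dataset["latitude"],
        "longitude": dataset["longitude"],
        "scalars": scalars,
    }


# Illustrative dataset: two of the 23 members, on a 2 x 2 grid.
dataset = {
    "data": [[[280.0, 281.0], [282.0, 283.0]],
             [[290.0, 291.0], [292.0, 293.0]]],
    "realization": [0, 5],
    "latitude": [10.0, 20.0],
    "longitude": [0.0, 90.0],
    "scalars": {"ensemble_size": 23},
}
```

Because the ensemble size rides along as a scalar, the sliced result can still be described as member 5 of 23 even though only one member remains.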

If it is reasonable to talk about an ensemble as a defined collection of members, then I agree with Karl, that a standard_name of 'ensemble_size' fits the bill. The description fits my use case nicely

many thanks
mark


------------------------------------------------------------------------
*From:* CF-metadata [[email protected]] on behalf of John Graybeal [[email protected]]
*Sent:* 22 July 2015 05:52
*To:* Karl Taylor
*Cc:* CF Metadata List
*Subject:* Re: [CF-metadata] original_ensemble_size

Karl,

To my understanding (then and now), the use case is explicitly not what your definition describes. The entire point of the request was to provide a label that was clearly distinguished from the typical concept of ensemble size.

John



On Jul 21, 2015, at 16:36, Karl Taylor <[email protected]> wrote:

Dear all,

I wonder if the following might also meet requirements of the use case:

name: *ensemble_size*
description: The number of member realizations in an ensemble. This name provides context for any specific realization, which might not be co-located with the other members of the ensemble.

Karl

On 7/20/15 9:49 PM, John Graybeal wrote:
To save others the lookup, the use case phrasing that Mark signed on to was these words: "In my use case, the whole ensemble is not present, I only have a subset of the members. I have a metadata element telling me how many members there were at the time the ensemble was created, which I would like to encode." The entire thread is titled 'realization | x of n', but it is pretty, umm, rich with detail.

The last email before discussion went silent appears to be mine:

Modified to fit Mark's use case, I think suitable text is:

name: *original_ensemble_size*
description: The number of member realizations in the originally constituted ensemble. This provides context for any specific realization, for example orienting a member relative to its original group (even if the group is no longer intact).

This does not mention forecasting, preserves the origination concept, and gives a bit of context, without constraining the application. It could even be an ensemble of observations, or cat videos, or ... you get the idea.

I will let someone else provide the example of how that is associated with the variable, it will be more authoritative!

John


On Jul 20, 2015, at 14:42, Karl Taylor <[email protected]> wrote:

Hi Mark,

I didn't quite understand how the standard name gets associated with a variable (containing 1 or more realizations from the ensemble). Someone said it was through a scalar coordinate variable, but I don't see how the ensemble member is a function of the ensemble size, so why would this be appropriate?

Could you supply an example?

Also, I didn't follow why "original" was included in "original ensemble size". Surely, you wouldn't report this number unless you thought the ensemble size was pretty much set and wouldn't change. In that case there shouldn't be a need for a "modified ensemble size", so wouldn't "ensemble size" suffice?

thanks,
Karl


On 7/20/15 9:24 AM, Hedley, Mark wrote:
Hello CF

Late last year we had a discussion about storing original_ensemble_size in a CF file
http://mailman.cgd.ucar.edu/pipermail/cf-metadata/2014/thread.html#57756

There were a few options discussed, with John Graybeal making the suggestion
original_ensemble_size
/description: The number of members constituting an ensemble./
for a new standard_name definition, which seemed to fit the case very well

It does not seem to have been adopted into the standard names list as yet.

Please may this name and definition be adopted, or the reasons not to adopt it be detailed here?

thank you
mark




_______________________________________________
CF-metadata mailing list
[email protected]
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata


--
*Jim Biard*
*Research Scholar*
CICS-NC <http://www.cicsnc.org/> Visit us on Facebook <http://www.facebook.com/cicsnc>
Cooperative Institute for Climate and Satellites NC <http://cicsnc.org/>
North Carolina State University <http://ncsu.edu/>
NOAA National Centers for Environmental Information <http://ncdc.noaa.gov/>
/formerly NOAA’s National Climatic Data Center/
151 Patton Ave, Asheville, NC 28801
e: [email protected]
o: +1 828 271 4900

/Connect with us on Facebook for climate <https://www.facebook.com/NOAANCEIclimate> and ocean and geophysics <https://www.facebook.com/NOAANCEIoceangeo> information, and follow us on Twitter at @NOAANCEIclimate <https://twitter.com/NOAANCEIclimate> and @NOAANCEIocngeo <https://twitter.com/NOAANCEIocngeo>. /


