Dear Martin

I agree with adding to the definition of region, and also area_type (for which this approach has also been advocated), that it may be convenient to store such variables as numbers with a flag_values and flag_meanings attribute. However I don't think it should be regarded as a different quantity, so I don't think it needs a different standard name.

I don't think we should define standard numbers for regions or area_types, because this would be against the usual CF principle that files should describe their contents without need for reference to external tables. The representation of strings as numbers is not standardised, and is defined in the file by the flag attributes. That is why I regard it as an issue of encoding, more like scale and offset, and not a different quantity.
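To illustrate the encoding point, here is a minimal sketch in plain Python, not tied to any particular netCDF library. It assumes the attribute values from the CMIP5 basin example quoted further down this thread, and it assumes flag_meanings is held as a single blank-separated string, as in Section 3.5. The helper names (build_tables, decode_regions, encode_regions) are made up for the illustration.

```python
# Attributes as they might appear on an integer-valued "region" variable.
# Values and names follow the CMIP5 basin example quoted in this thread;
# holding flag_meanings as one blank-separated string is an assumption
# based on CF Section 3.5.
flag_values = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
flag_meanings = (
    "global_land southern_ocean atlantic_ocean pacific_ocean arctic_ocean "
    "indian_ocean mediterranean_sea black_sea hudson_bay baltic_sea red_sea"
)


def build_tables(values, meanings):
    """Build number->string and string->number lookups from the attributes."""
    names = meanings.split()
    if len(names) != len(values):
        raise ValueError("flag_values and flag_meanings differ in length")
    decode = dict(zip(values, names))
    encode = {name: value for value, name in decode.items()}
    return decode, encode


def decode_regions(data, decode):
    """Turn stored integers back into region strings."""
    return [decode[v] for v in data]


def encode_regions(names, encode):
    """Turn region strings into the integers that would be stored."""
    return [encode[n] for n in names]


decode_table, encode_table = build_tables(flag_values, flag_meanings)

stored = [2, 2, 3, 0, 10]                       # hypothetical data values
regions = decode_regions(stored, decode_table)  # strings recovered from the file alone
roundtrip = encode_regions(regions, encode_table)
```

Both lookup tables come only from the two attributes, so nothing outside the file is needed, and the mapping is reversible in the same way that unpacking with scale and offset is.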
Best wishes

Jonathan

----- Forwarded message from [email protected] -----

> Date: Thu, 26 May 2016 09:05:54 +0000
> From: [email protected]
> To: [email protected]
> Subject: [CF-metadata] Use of CF standard name region
>
> Dear All,
>
> I can see some sense in Jonathan's interpretation that a string valued
> concept could be represented in a file by integers and
> flag_values/flag_meanings. If we want to follow that interpretation, however,
> the current standard name definition, which states that the variable contains
> strings, is unhelpful. If we adopt this approach, it would help to modify
> the standard name definition slightly:
> From: A variable with the standard name of region contains strings which
> indicate geographical regions. These strings must be chosen from the standard
> region list.
> To: A variable with the standard name of region contains strings which
> indicate geographical regions, or refers to them through the
> flag_values/flag_meanings construct. These strings must be chosen from the
> standard region list.
>
> I have a slight preference for Karl's approach, as I feel that the above is
> putting too many technical requirements into the standard name definition.
> However, rather than use "index", it might be clearer to have a standard name
> modifier "flag":
>
> basin: standard_name = "region flag"
> basin: flag_values = "....."
>
> Using a variable "region_flag" or "region_index" would have the advantage
> that we could keep the standard name definitions reasonably clear and
> transparent.
>
> There is a more general question here about the treatment of formatting
> constraints which are expressed in standard name definitions but not
> explicitly represented in the convention text or the corresponding
> conformance document. Would it be helpful to add an appendix of requirements
> associated with specific standard names, so that the implications of
> whichever option is chosen can be spelt out with an example?
>
> regards,
> Martin
>
> #####################################
>
> Dear Karl
>
> > This is why I suggested defining a new name modifier, "index". We
> > could then write:
> >
> > basin: standard_name = "region index"
> >
> > alternatively we could just define a new standard name:
> > standard_name="region_index"
> >
> > You suggest that we should simply allow the standard name "region"
> > be used for both string variables or for integer variables when they
> > are associated with strings with the flag_meanings attribute.
>
> Yes, that's right. We have previously recommended this treatment for area
> type variables too. The flag attributes provide a self-describing encoding
> mechanism that doesn't alter the intention of the data.
>
> > That would be fine, but I think we'll need to make this explicit.
>
> We could certainly do that. I wouldn't restrict it to this case, but point it
> out as a generally possible use of the flag attributes.
>
> > I don't think many folks view indexes as "encodings of strings as
> > numbers".
>
> The difference is only that if you defined a new variable of basin index, you
> would need an external table to translate its numerical values into basin
> names. That would not be self-describing metadata and it would not be
> CF-like, I feel.
>
> Best wishes
>
> Jonathan
>
> ----- Forwarded message from Karl Taylor <taylor13 at llnl.gov> -----
>
> > Date: Sat, 21 May 2016 10:35:41 -0700
> > From: Karl Taylor <taylor13 at llnl.gov>
> > To: cf-metadata at cgd.ucar.edu
> > Subject: Re: [CF-metadata] Use of CF standard name region
> >
> > Hi Jonathan and Martin,
> >
> > I think the issue pertains to the following variable and metadata (I
> > *think* this is how we did it in CMIP5):
> >
> > int basin(lon, lat)
> >     basin: standard_name = "region";
> >     basin: flag_values = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10;
> >     basin: flag_meanings = "global_land", "southern_ocean",
> >         "atlantic_ocean", "pacific_ocean", "arctic_ocean", "indian_ocean",
> >         "mediterranean_sea", "black_sea", "hudson_bay", "baltic_sea",
> >         "red_sea";
> >     [and there were additional attributes]
> >
> > The construct is fine, I think, but according to the standard name
> > table, "region" is supposed to be reserved for string variables.
> > Here it is attached to the "basin" variable, which is an integer
> > index (or I guess we could call it a "flag").
> >
> > This is why I suggested defining a new name modifier, "index". We
> > could then write:
> >
> > basin: standard_name = "region index"
> >
> > alternatively we could just define a new standard name:
> > standard_name="region_index"
> >
> > You suggest that we should simply allow the standard name "region"
> > be used for both string variables or for integer variables when they
> > are associated with strings with the flag_meanings attribute. That
> > would be fine, but I think we'll need to make this explicit. I
> > don't think many folks view indexes as "encodings of strings as
> > numbers".
> >
> > So I think we have a few options. Perhaps others might weigh in.
> >
> > best regards,
> > Karl
> >
> > On 5/21/16 2:05 AM, Jonathan Gregory wrote:
> > >Dear Martin and Karl
> > >
> > >Actually I think the way it's done in CMIP5 is consistent with the
> > >convention. It is correct that region is the standard name for a
> > >string-valued variable, and flag_values and flag_meanings supply a
> > >method to encode the strings as numbers. This is very much like
> > >Example 3.3 in Section 3.5, where string-valued status flags are
> > >encoded as numbers. On this email list we have advised people from
> > >time to time to use flag_values and flag_meanings in this way to
> > >encode strings as numbers.
> > >
> > >You could argue that it is a bit different in principle. The intention
> > >of Sect 3.5 is to supply a way to decode numbers in a data variable
> > >into strings. That is arguably not identical with an intention of
> > >providing a way to encode strings as numbers in a data variable, but
> > >since the process is reversible the mechanism works both ways! If you
> > >think that this use of the convention is not obvious as it stands,
> > >then I would propose that we insert an extra sentence in Sect 3.5
> > >pointing out the use of this mechanism to encode strings. We could
> > >include the CMIP5 basins as an example of it.
> > >
> > >Best wishes
> > >
> > >Jonathan
> > >
> > >----- Forwarded message from Karl Taylor <taylor13 at llnl.gov> -----
> > >
> > >>Date: Fri, 20 May 2016 15:16:23 -0700
> > >>From: Karl Taylor <taylor13 at llnl.gov>
> > >>To: cf-metadata at cgd.ucar.edu
> > >>Subject: Re: [CF-metadata] Use of CF standard name region
> > >>
> > >>Hi all,
> > >>
> > >>Perhaps we should define a new standard_name: e.g., basin_index (or
> > >>region_index) to replace the misused "region" standard_name.
> > >>
> > >>I would note that in the conventions document in example 3.3 there
> > >>is a standard name: "sea_water_speed status_flag"
> > >>
> > >>"status_flag" is a standard "name modifier" (see appendix C).
> > >>
> > >>So, if we want to modify the convention, we could define a new name
> > >>modifier (say "index") and explicitly indicate that flag_values can
> > >>be used as indexes (when they are integers).
> > >>
> > >>regards,
> > >>Karl
> > >>
> > >>On 5/20/16 12:44 PM, martin.juckes at stfc.ac.uk wrote:
> > >>>Hello All,
> > >>>
> > >>>In CMIP5 the variable "basin" was used as a fixed spatial field with
> > >>>integer values and the CF Standard Name "region", which has the
> > >>>definition "A variable with the standard name of region contains strings
> > >>>which indicate geographical regions. These strings must be chosen from
> > >>>the standard region list."
> > >>>
> > >>>The integer valued CMIP5 variable is clearly not consistent with this
> > >>>definition.
> > >>>The CMIP5 variable was defined with flag_values and
> > >>>flag_meanings, such that the flag_meanings were from the CF standard
> > >>>region list.
> > >>>
> > >>>The question is, should we redefine the CMIP5 variable somehow, or would
> > >>>it be acceptable to adjust the CF Standard Name definition for region to
> > >>>accept this usage, which appears clear enough and is presumably much
> > >>>easier for plotting packages to handle than a spatial array of string
> > >>>values?
> > >>>
> > >>>regards,
> > >>>Martin
> > >>>
> > >>>_______________________________________________
> > >>>CF-metadata mailing list
> > >>>CF-metadata at cgd.ucar.edu
> > >>>http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
> > >
> > >----- End forwarded message -----
>
> ----- End forwarded message -----

----- End forwarded message -----
