Dear Jim

> I think deprecation for variables with a flag_values attribute is
> going too far. We have quite a few variables where the values are
> numeric and have units, but we encode specific error conditions with
> values (outside the valid range) using flag_values and
> flag_meanings.
OK. Would you favour deprecating units for string/char-valued data variables?

By the way, the standard_name of region indicates a variable with string
values. I'd forgotten that one before, although we've had it for longer than
area_type. I notice that the standard_name table gives the canonical unit as
"string" in this case.

Best wishes

Jonathan

> On 11/6/14, 12:38 PM, Jonathan Gregory wrote:
> >One further thought: we could deprecate the units attribute for variables of
> >string and char type, or having a flag_values att. These conditions can be
> >detected automatically, so the CF_checker would give a warning, for instance.
> >Jonathan
> >
> >----- Forwarded message from Jonathan Gregory <[email protected]> -----
> >
> >Dear Mark
> >
> >>>If this is a problem only for string-valued quantities, or string-valued
> >>>quantities which have been encoded as numbers using flag_values, I'm not
> >>>sure there is a need to modify udunits.
> >>This is not the limit of the issue.
> >>
> >>There are a number of situations where numerical labels are used but are
> >>not to be interpreted in any way as dimensionless quantities.
> >>
> >>A pertinent example is a 'realization' coordinate. The realizations are
> >>numbered 1,2,3,4,5 but these are labels, they are not numeric values. I
> >>would never interpolate my data to estimate what realization 3.6
> >>represented, nor would I multiply my air_pressure data values by the
> >>realization number and expect them to still be in units of Pa.
> >Jim mentions raw binary data from instruments as another example. These are
> >numbers, like the realization number. I can't see the harm in their being
> >regarded as dimensionless.
> >
> >You can't interpolate between realizations because the ensemble axis is a
> >discrete axis, not because realization number is not dimensionless. You
> >might, for example, have a collection of timeseries in a single data
> >variable, with aux coord vars of lat and lon to locate the stations.
> >These have units, but you can't meaningfully interpolate between them.
> >
> >If it doesn't make sense to multiply the realization number by air_pressure,
> >it doesn't really matter what units the product would have if you did it, I
> >would say.
> >
> >I'm not arguing this point out of perversity (honestly!) but since this is
> >a backward-incompatible change you are suggesting I think there has to be a
> >very strong reason why we need to make it. It may be that starting from
> >scratch you might decide to define a category of number which is neither
> >dimensionless nor dimensional, but we are not starting from scratch.
> >
> >Best wishes
> >
> >Jonathan
> >
> >>From: CF-metadata [[email protected]] on behalf of Jonathan
> >>Gregory [[email protected]]
> >>Sent: 31 October 2014 16:23
> >>To: [email protected]
> >>Subject: [CF-metadata] Fwd: Re: string valued coordinates
> >>
> >>Dear Mark
> >>
> >>If this is a problem only for string-valued quantities, or string-valued
> >>quantities which have been encoded as numbers using flag_values, I'm not
> >>sure there is a need to modify udunits. Could we not just say that the
> >>notions of units and physical dimensions (or being dimensionless) do not
> >>apply to string-valued quantities? In that case the units, whether given or
> >>not given, can be ignored. We would declare that the units attribute is
> >>standardised only for numerical quantities; it is always legal to attach
> >>non-standardised attributes, so no harm would be done by giving units anyway.
> >>
> >>Best wishes
> >>
> >>Jonathan
> >>
> >>
> >>>Thank you for all the responses, it sounds like 'all of the above' is the
> >>>preferred response to my suggestions of plausible next steps. I will
> >>>pursue all of these.
> >>>
> >>>Eizi's point about having no_unit in udunits is sound; I suggest we
> >>>request udunits use
> >>> 'no_unit'
> >>>as a representation of
> >>> '?'
> >>>such that the behaviour is consistent; 'no_unit' should always raise an
> >>>exception when used in the udunits processing rules, exactly as '?' does.
> >>>
> >>>With regard to meaning, I have found the wikipedia entry useful:
> >>>http://en.wikipedia.org/wiki/Dimensionless_quantity
> >>>'In dimensional
> >>>analysis<http://en.wikipedia.org/wiki/Dimensional_analysis>, a
> >>>dimensionless quantity or quantity of dimension one is a
> >>>quantity<http://en.wikipedia.org/wiki/Quantity> without an associated
> >>>physical dimension<http://en.wikipedia.org/wiki/Dimensional_analysis>. It
> >>>is thus a "pure" number, and as such always has a dimension of
> >>>1.[1]<http://en.wikipedia.org/wiki/Dimensionless_quantity#cite_note-1>'
> >>>which it has sourced from
> >>>"1.8 (1.6) quantity of dimension one dimensionless
> >>>quantity"<http://www.iso.org/sites/JCGM/VIM/JCGM_200e_FILES/MAIN_JCGM_200e/01_e.html#L_1_8>.
> >>>International vocabulary of metrology - Basic and general concepts and
> >>>associated terms (VIM).
> >>>ISO<http://en.wikipedia.org/wiki/International_Organization_for_Standardization>.
> >>>2008. Retrieved 2011-03-22.
> >>>
> >>>This is a good enough source for me.
> >>>
> >>>I will wait to give space for more comments, then, if people are content,
> >>>I will raise a change request with udunits.
> >>>Assuming this is accepted and processed I will raise a change request for
> >>>CF to add some text to 3.1.
> >>>Finally I will request a change for any standard_names which appear not to
> >>>follow this approach (I have only 'area_type' so far).
> >>>
> >>>I hope this seems like a reasonable response.
> >>>
> >>>________________________________
> >>>From: Eizi TOYODA [[email protected]]
> >>>Sent: 31 October 2014 08:44
> >>>To: John Graybeal
> >>>Cc: Hedley, Mark; CF Metadata List
> >>>Subject: Re: [CF-metadata] string valued coordinates
> >>>
> >>>Hi John
> >>>
> >>>>I think '?'
> >>>>is not a definition that is helpful to most users -- it is
> >>>>more like an indication that the string -- the empty string in this case
> >>>>for example -- has not provided a meaningful indication of what the units
> >>>>are.
> >>>I share the same impression. I was thinking it would be nicer for the
> >>>maintainer of udunits. We should ask for udunits to be modified so that it
> >>>would refuse processing "no_units"; otherwise
> >>>ut_multiply("no_units", "no_units") returns "no_units 2". If I remember
> >>>right the unit string "?" causes an immediate error, so we don't have to
> >>>change udunits.
> >>>
> >>>But I'm okay if the majority here agrees that sort of thing is not a
> >>>responsibility of udunits.
> >>>
> >>>Best,
> >>>Eizi
> >>>
> >>>Best Regards,
> >>>--
> >>>Eiji (aka Eizi) TOYODA
> >>>http://www.google.com/profiles/toyoda.eizi
> >>>
> >>>On Fri, Oct 31, 2014 at 9:45 AM, John Graybeal <[email protected]> wrote:
> >>>Thanks for summing this up so neatly Mark!
> >>>
> >>>We could take the view that the conventions would benefit from the
> >>>addition of some text into 3.1 to explicitly make the point about
> >>>quantities which are not dimensioned or dimensionless.
> >>>We could alternatively defer to udunits as most unit questions do, which
> >>>already exhibits this behaviour, and just patch the 'area_type' and any
> >>>similar names with erroneous canonical units.
> >>>We could also request that udunits be updated with a clearer string for
> >>>this case, given the need for it, such as including the term 'no_units' as
> >>>a valid udunits term to mean there are no units here: this is not
> >>>dimensionless, this is not dimensioned.
> >>>
> >>>Why is the first option exclusive to the others? Seems useful to improve
> >>>the documentation regardless.
> >>>
> >>>So I agree that '1' makes no sense for area_type.
> >>>I'm wondering if someone can crisply describe what is meant when we (or
> >>>UDUNITS) say a unit is dimensionless? I'm not entirely sure I get it.
> >>>
> >>>In any case, I think '?' is not a definition that is helpful to most users
> >>>-- it is more like an indication that the string -- the empty string in
> >>>this case for example -- has not provided a meaningful indication of what
> >>>the units are.
> >>>
> >>>So my ideal solution has CF well aligned with UDUNITS, and a clear concept
> >>>and definition. Which I think suggests asking UDUNITS for a term
> >>>'no_units', defined as "the values do not have units; values are neither
> >>>dimensioned nor dimensionless."
> >>>
> >>>John
> >>>
> >>>
> >>>On Oct 30, 2014, at 11:06, Hedley, Mark <[email protected]> wrote:
> >>>
> >>>>The unit of '1' is generally used to indicate fractions and the like. In
> >>>>cases where I am storing a raw binary value, I leave off the units
> >>>>attribute, as the 'number' isn't something that should be treated as a
> >>>>decimal quantity.
> >>>This is the same behaviour as I was looking to adopt, but CF 3.1 makes
> >>>this incorrect, I believe, as a lack of a units attribute is to be
> >>>interpreted as a unit of '1'.
> >>>
> >>>I think a clear way to define that a quantity is not dimensioned and is
> >>>not dimensionless is required. I would have liked to use the lack of a
> >>>unit for this purpose, but this has already been taken, so something else
> >>>is needed.
> >>>
> >>>>My preference is that one explicitly puts in the units. For
> >>>>dimensionless, "1" or "" is ok for udunits.
> >>>udunits2 treats '1' and '' differently.
> >>>
> >>> a unit of '1' has a definition of '1'
> >>> a unit of '' has a definition of '?'
> >>>
> >>>The CF conventions description of units (3.1) states that an absence of a
> >>>units attribute is deemed to be equivalent to dimensionless, a unit of
> >>>'1'.
> >>>This is the convention, and it has been in force a long time.
> >>>
> >>>However CF makes no statement that I can find regarding a unit of ''.
> >>>Thus I believe we defer back to udunits, which CF states is how units are
> >>>defined. Udunits states that a unit of '' is undefined, the quantity is
> >>>not dimensioned and is not dimensionless. We could adopt this to use for
> >>>the cases in question.
> >>>
> >>>>area_type is given in the standard_name table as having a unit of 1. It
> >>>>is a categorical string-valued quantity.
> >>>On the basis of the discussion, I would suggest that this is an error. If
> >>>area_type is a categorical string-valued quantity, it is not
> >>>dimensionless, it is not continuous and numerical, it is not a pure number
> >>>and should not be treated as such. I think we should fix this.
> >>>
> >>>We could take the view that the conventions would benefit from the
> >>>addition of some text into 3.1 to explicitly make the point about
> >>>quantities which are not dimensioned or dimensionless.
> >>>We could alternatively defer to udunits as most unit questions do, which
> >>>already exhibits this behaviour, and just patch the 'area_type' and any
> >>>similar names with erroneous canonical units.
> >>>We could also request that udunits be updated with a clearer string for
> >>>this case, given the need for it, such as including the term 'no_units' as
> >>>a valid udunits term to mean there are no units here: this is not
> >>>dimensionless, this is not dimensioned.
> >>>I don't mind which route is preferred, I'm happy to put a change together
> >>>and pursue it; whichever way seems better to people.
> >>>
> >>>cheers
> >>>mark
> >>>
> >>>________________________________
> >>>From: CF-metadata [[email protected]] on behalf of Jim Biard
> >>>[[email protected]]
> >>>Sent: 30 October 2014 16:12
> >>>To: [email protected]
> >>>Subject: Re: [CF-metadata] string valued coordinates
> >>>
> >>>CF says that if the units attribute is missing, then the quantity has no
> >>>units.
> >>>
> >>>The Conventions document, section 3.1 says:
> >>>
> >>>The units attribute is required for all variables that represent
> >>>dimensional quantities (except for boundary variables defined in Section
> >>>7.1, "Cell Boundaries"
> >>><http://cfconventions.org/Data/cf-conventions/cf-conventions-1.6/build/cf-conventions.html#cell-boundaries>
> >>>and climatology variables defined in Section 7.4, "Climatological
> >>>Statistics"
> >>><http://cfconventions.org/Data/cf-conventions/cf-conventions-1.6/build/cf-conventions.html#climatological-statistics>).
> >>>
> >>>and
> >>>
> >>>Units are not required for dimensionless quantities. A variable with no
> >>>units attribute is assumed to be dimensionless. However, a units attribute
> >>>specifying a dimensionless unit may optionally be included. The Udunits
> >>>package defines a few dimensionless units, such as percent, but is
> >>>lacking commonly used units such as ppm (parts per million). This
> >>>convention does not support the addition of new dimensionless units that
> >>>are not udunits compatible. The conforming unit for quantities that
> >>>represent fractions, or parts of a whole, is "1". The conforming unit for
> >>>parts per million is "1e-6". Descriptive information about dimensionless
> >>>quantities, such as sea-ice concentration, cloud fraction, probability,
> >>>etc., should be given in the long_name or standard_name attributes (see
> >>>below) rather than the units.
> >>>
> >>>The unit of '1' is generally used to indicate fractions and the like. In
> >>>cases where I am storing a raw binary value, I leave off the units
> >>>attribute, as the 'number' isn't something that should be treated as a
> >>>decimal quantity.
> >>>
> >>>Grace and peace,
> >>>
> >>>Jim
> >>>
> >>>On 10/30/14, 11:35 AM, John Caron wrote:
> >>>My preference is that one explicitly puts in the units. For dimensionless,
> >>>"1" or "" is ok for udunits. If the units attribute isn't there, I assume
> >>>that the user forgot to specify it, so the units are unknown.
> >>>
> >>>I'm not sure what CF actually says, but it would be good to clarify.
> >>>
> >>>John
> >>>
> >>>On Thu, Oct 30, 2014 at 2:37 AM, Hedley, Mark <[email protected]> wrote:
> >>>Hello CF
> >>>
> >>>>From: CF-metadata [[email protected]] on behalf of Jonathan Gregory
> >>>>[[email protected]]
> >>>>Yes, there are some standard names which imply string values, as Karl
> >>>>says. If the standard_name table says 1, that means the quantity is
> >>>>dimensionless, so it's also fine to omit the units, as Jim says.
> >>>I would like to raise a question about this statement. Omitting the units
> >>>and stating that the units are '1' are two very different things;
> >>> dimensionless != no_unit
> >>>is an important statement which should be clear to data consumers and
> >>>producers.
> >>>
> >>>If the standard name table defines a canonical unit for a standard_name of
> >>>'1' then I expect this quantity to be dimensionless, with a unit of '1' or
> >>>some multiple thereof.
> >>>If the standard name states that the canonical unit for a standard_name is
> >>>'' then I expect that quantity to have no unit stated.
> >>>Any deviation from this behaviour is a break with the conventions. I have
> >>>code which explicitly checks this for data sets.
> >>>
> >>>Are people aware of examples of the pattern of use described by Jonathan,
> >>>such as categorical quantities identified by a standard_name with a
> >>>canonical unit of '1'?
> >>>
> >>>thank you
> >>>mark
> >>>
> >>>_______________________________________________
> >>>CF-metadata mailing list
> >>>[email protected]
> >>>http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
> >>>
> >>>--
> >>><http://www.cicsnc.org/> Visit us on
> >>>Facebook <http://www.facebook.com/cicsnc>
> >>>Jim Biard
> >>>Research Scholar
> >>>Cooperative Institute for Climate and Satellites NC <http://cicsnc.org/>
> >>>North Carolina State University <http://ncsu.edu/>
> >>>NOAA's National Climatic Data Center <http://ncdc.noaa.gov/>
> >>>151 Patton Ave, Asheville, NC 28801
> >>>e: [email protected]
> >>>o: +1 828 271 4900
> >>
> >>----- End forwarded message -----
> >
> >----- End forwarded message -----
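
As a rough illustration of the narrower proposal at the top of this message (warn about a units attribute only on string/char-valued variables, and leave numeric variables that use flag_values alone), a checker rule could be sketched in Python along the following lines. The Variable class, the type names in STRING_TYPES, and the units_warnings function are hypothetical stand-ins invented for this sketch; they are not the CF checker's actual code or API, and the warning wording is only an assumption.

```python
from dataclasses import dataclass, field


# Hypothetical in-memory stand-in for a netCDF variable (not a real library type).
@dataclass
class Variable:
    name: str
    dtype: str  # assumed type labels, e.g. "char", "string", "float32"
    attrs: dict = field(default_factory=dict)


# Assumed labels for string-like types; a real checker would inspect the file.
STRING_TYPES = {"char", "string"}


def units_warnings(variables):
    """Return one warning per string/char variable that carries a units attribute."""
    warnings = []
    for var in variables:
        if var.dtype in STRING_TYPES and "units" in var.attrs:
            warnings.append(
                f"{var.name}: units attribute is deprecated for "
                f"{var.dtype}-valued variables"
            )
    return warnings


variables = [
    # String-valued variable with a units attribute: should be flagged.
    Variable("region", "char", {"standard_name": "region", "units": "1"}),
    # Numeric variable with units plus flag_values for error conditions,
    # as in Jim's use case: should NOT be flagged.
    Variable(
        "air_temperature",
        "float32",
        {"units": "K", "flag_values": [-9999.0], "flag_meanings": "sensor_fault"},
    ),
    # Ordinary dimensional variable: should NOT be flagged.
    Variable("air_pressure", "float32", {"units": "Pa"}),
]

for message in units_warnings(variables):
    print(message)
```

With these sample variables only region is reported; the numeric variable carrying flag_values keeps its units without a warning, which is the distinction Jim asked for.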
