> If this is a problem only for string-valued quantities, or string-valued
> quantities which have been encoded as numbers using flag_values, I'm not sure
> there is a need to modify udunits.

This is not the limit of the issue. There are a number of situations where numerical labels are used but are not to be interpreted in any way as dimensionless quantities. A pertinent example is a 'realization' coordinate. The realizations are numbered 1, 2, 3, 4, 5, but these are labels, not numeric values. I would never interpolate my data to estimate what realization 3.6 represented, nor would I multiply my air_pressure data values by the realization number and expect them to still be in units of Pa.

mark

________________________________________
From: CF-metadata [[email protected]] on behalf of Jonathan Gregory [[email protected]]
Sent: 31 October 2014 16:23
To: [email protected]
Subject: [CF-metadata] Fwd: Re: string valued coordinates

Dear Mark

If this is a problem only for string-valued quantities, or string-valued quantities which have been encoded as numbers using flag_values, I'm not sure there is a need to modify udunits. Could we not just say that the notions of units and physical dimensions (or being dimensionless) do not apply to string-valued quantities? In that case the units, whether given or not given, can be ignored. We would declare that the units attribute is standardised only for numerical quantities; it is always legal to attach non-standardised attributes, so no harm would be done by giving units anyway.

Best wishes

Jonathan

> Thank you for all the responses, it sounds like 'all of the above' is the
> preferred response to my suggestions of plausible next steps. I will pursue
> all of these.
>
> Eizi's point about having no_unit in udunits is sound; I suggest we request
> udunits use
> 'no_unit'
> as a representation of
> '?'
> such that the behaviour is consistent; 'no_unit' should always raise an
> exception when used in the udunits processing rules, exactly as '?' does.
>
> With regard to meaning, I have found the Wikipedia entry useful:
> http://en.wikipedia.org/wiki/Dimensionless_quantity
> 'In dimensional analysis<http://en.wikipedia.org/wiki/Dimensional_analysis>,
> a dimensionless quantity or quantity of dimension one is a
> quantity<http://en.wikipedia.org/wiki/Quantity> without an associated
> physical dimension<http://en.wikipedia.org/wiki/Dimensional_analysis>. It is
> thus a "pure" number, and as such always has a dimension of
> 1.[1]<http://en.wikipedia.org/wiki/Dimensionless_quantity#cite_note-1>'
> which it has sourced from
> "1.8 (1.6) quantity of dimension one dimensionless
> quantity"<http://www.iso.org/sites/JCGM/VIM/JCGM_200e_FILES/MAIN_JCGM_200e/01_e.html#L_1_8>.
> International vocabulary of metrology - Basic and general concepts and
> associated terms (VIM).
> ISO<http://en.wikipedia.org/wiki/International_Organization_for_Standardization>.
> 2008. Retrieved 2011-03-22.
>
> This is a good enough source for me.
>
> I will wait to give space for more comments, then, if people are content, I
> will raise a change request with udunits.
> Assuming this is accepted and processed I will raise a change request for CF
> to add some text to 3.1.
> Finally I will request a change for any standard_names which appear not to
> follow this approach (I have only 'area_type' so far).
>
> I hope this seems like a reasonable response.
>
> ________________________________
> From: Eizi TOYODA [[email protected]]
> Sent: 31 October 2014 08:44
> To: John Graybeal
> Cc: Hedley, Mark; CF Metadata List
> Subject: Re: [CF-metadata] string valued coordinates
>
> Hi John
>
> > I think '?' is not a definition that is helpful to most users -- it is more
> > like an indication that the string -- the empty string in this case for
> > example -- has not provided a meaningful indication of what the units are.
>
> I share the same impression. I was thinking it would be nicer for the
> maintainers of udunits.
> We should ask for udunits to be modified so that it refuses to process
> "no_units"; otherwise ut_multiply("no_units", "no_units") returns
> "no_units 2". If I remember right, the unit string "?" causes an immediate
> error, so we don't have to change udunits.
>
> But I'm okay if the majority here agrees that sort of thing is not a
> responsibility of udunits.
>
> Best,
> Eizi
>
> Best Regards,
> --
> Eiji (aka Eizi) TOYODA
> http://www.google.com/profiles/toyoda.eizi
>
> On Fri, Oct 31, 2014 at 9:45 AM, John Graybeal
> <[email protected]<mailto:[email protected]>> wrote:
> Thanks for summing this up so neatly Mark!
>
> We could take the view that the conventions would benefit from the addition
> of some text into 3.1 to explicitly make the point about quantities which are
> not dimensioned or dimensionless.
> We could alternatively defer to udunits as most unit questions do, which
> already exhibits this behaviour, and just patch the 'area_type' and any
> similar names with erroneous canonical units.
> We could also request that udunits be updated with a clearer string for this
> case, given the need for it, such as including the term 'no_units' as a valid
> udunits term to mean there are no units here: this is not dimensionless, this
> is not dimensioned.
>
> Why is the first option exclusive to the others? Seems useful to improve the
> documentation regardless.
>
> So I agree that '1' makes no sense for area_type. I'm wondering if someone
> can crisply describe what is meant when we (or UDUNITS) say a unit is
> dimensionless? I'm not entirely sure I get it.
>
> In any case, I think '?' is not a definition that is helpful to most users --
> it is more like an indication that the string -- the empty string in this
> case for example -- has not provided a meaningful indication of what the
> units are.
>
> So my ideal solution has CF well aligned with UDUNITS, and a clear concept
> and definition.
> Which I think suggests asking UDUNITS for a term 'no_units',
> defined as "the values do not have units; values are neither dimensioned nor
> dimensionless."
>
> John
>
> On Oct 30, 2014, at 11:06, Hedley, Mark
> <[email protected]<mailto:[email protected]>> wrote:
>
> > The unit of '1' is generally used to indicate fractions and the like. In
> > cases where I am storing a raw binary value, I leave off the units
> > attribute, as the 'number' isn't something that should be treated as a
> > decimal quantity.
>
> This is the same behaviour as I was looking to adopt, but CF 3.1 makes this
> incorrect, I believe, as a lack of a units attribute is to be interpreted as
> a units of '1'.
>
> I think a clear way to define that a quantity is not dimensioned and is not
> dimensionless is required. I would have liked to use the lack of a unit for
> this purpose, but this has already been taken, so something else is needed.
>
> > My preference is that one explicitly puts in the units. For dimensionless,
> > "1" or "" is ok for udunits.
>
> udunits2 treats '1' and '' differently.
>
> a unit of '1' has a definition of '1'
> a unit of '' has a definition of '?'
>
> The CF conventions description of units (3.1) states that an absence of a
> units attribute is deemed to be equivalent to dimensionless, a unit of '1'.
> This is the convention, and it has been in force a long time.
>
> However CF makes no statement that I can find regarding a unit of ''. Thus I
> believe we defer back to udunits, which CF states is how units are defined.
> Udunits states that a unit of '' is undefined, the quantity is not
> dimensioned and is not dimensionless. We could adopt this to use for the
> cases in question.
>
> > area_type is given in the standard_name table as having a unit of 1. It is
> > a categorical string-valued quantity.
>
> On the basis of the discussion, I would suggest that this is an error.
> If area_type is a categorical string-valued quantity, it is not dimensionless,
> it is not continuous and numerical, it is not a pure number and should not be
> treated as such. I think we should fix this.
>
> We could take the view that the conventions would benefit from the addition
> of some text into 3.1 to explicitly make the point about quantities which are
> not dimensioned or dimensionless.
> We could alternatively defer to udunits as most unit questions do, which
> already exhibits this behaviour, and just patch the 'area_type' and any
> similar names with erroneous canonical units.
> We could also request that udunits be updated with a clearer string for this
> case, given the need for it, such as including the term 'no_units' as a valid
> udunits term to mean there are no units here: this is not dimensionless, this
> is not dimensioned.
> I don't mind which route is preferred, I'm happy to put a change together and
> pursue it; whichever way seems better to people.
>
> cheers
> mark
>
> ________________________________
> From: CF-metadata
> [[email protected]<mailto:[email protected]>]
> on behalf of Jim Biard [[email protected]<mailto:[email protected]>]
> Sent: 30 October 2014 16:12
> To: [email protected]<mailto:[email protected]>
> Subject: Re: [CF-metadata] string valued coordinates
>
> CF says that if the units attribute is missing, then the quantity has no
> units.
>
> The Conventions document, section 3.1 says:
>
> The units attribute is required for all variables that represent dimensional
> quantities (except for boundary variables defined in Section 7.1, "Cell
> Boundaries"
> <http://cfconventions.org/Data/cf-conventions/cf-conventions-1.6/build/cf-conventions.html#cell-boundaries>
> and climatology variables defined in Section 7.4, "Climatological
> Statistics"
> <http://cfconventions.org/Data/cf-conventions/cf-conventions-1.6/build/cf-conventions.html#climatological-statistics>
> ).
>
> and
>
> Units are not required for dimensionless quantities. A variable with no units
> attribute is assumed to be dimensionless. However, a units attribute
> specifying a dimensionless unit may optionally be included. The Udunits
> package defines a few dimensionless units, such as percent, but is lacking
> commonly used units such as ppm (parts per million). This convention does not
> support the addition of new dimensionless units that are not udunits
> compatible. The conforming unit for quantities that represent fractions, or
> parts of a whole, is "1". The conforming unit for parts per million is
> "1e-6". Descriptive information about dimensionless quantities, such as
> sea-ice concentration, cloud fraction, probability, etc., should be given in
> the long_name or standard_name attributes (see below) rather than the units.
>
> The unit of '1' is generally used to indicate fractions and the like. In
> cases where I am storing a raw binary value, I leave off the units attribute,
> as the 'number' isn't something that should be treated as a decimal quantity.
>
> Grace and peace,
>
> Jim
>
> On 10/30/14, 11:35 AM, John Caron wrote:
> My preference is that one explicitly puts in the units. For dimensionless,
> "1" or "" is ok for udunits. If the units attribute isn't there, I assume that
> the user forgot to specify it, so the units are unknown.
>
> I'm not sure what CF actually says, but it would be good to clarify.
>
> John
>
> On Thu, Oct 30, 2014 at 2:37 AM, Hedley, Mark
> <[email protected]<mailto:[email protected]>> wrote:
> Hello CF
>
> > From: CF-metadata
> > [[email protected]<mailto:[email protected]>]
> > on behalf of Jonathan Gregory
> > [[email protected]<mailto:[email protected]>]
>
> > Yes, there are some standard names which imply string values, as Karl says.
> > If the standard_name table says 1, that means the quantity is
> > dimensionless, so it's also fine to omit the units, as Jim says.
>
> I would like to raise a question about this statement. Omitting the units and
> stating that the units are '1' are two very different things;
> dimensionless != no_unit
> is an important statement which should be clear to data consumers and
> producers.
>
> If the standard name table defines a canonical unit for a standard_name of
> '1' then I expect this quantity to be dimensionless, with a unit of '1' or
> some multiple thereof.
> If the standard name states that the canonical unit for a standard_name is ''
> then I expect that quantity to have no unit stated.
> Any deviation from this behaviour is a break with the conventions. I have
> code which explicitly checks this for data sets.
>
> Are people aware of examples of the pattern of use described by Jonathan,
> such as categorical quantities identified by a standard_name with a
> canonical unit of '1'?
>
> thank you
> mark
>
> _______________________________________________
> CF-metadata mailing list
> [email protected]<mailto:[email protected]>
> http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
>
> --
> <http://www.cicsnc.org/> Visit us on Facebook <http://www.facebook.com/cicsnc>
> Jim Biard
> Research Scholar
> Cooperative Institute for Climate and Satellites NC <http://cicsnc.org/>
> North Carolina State University <http://ncsu.edu/>
> NOAA's National Climatic Data Center <http://ncdc.noaa.gov/>
> 151 Patton Ave, Asheville, NC 28801
> e: [email protected]<mailto:[email protected]>
> o: +1 828 271 4900

----- End forwarded message -----
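
As a minimal sketch of the distinction argued for in this thread (dimensioned, dimensionless and 'no unit' are three different things), the Python below models a unit whose multiplication refuses to proceed when either operand has no unit. The Unit class, NoUnitError and the 'no_unit' string are assumptions made for illustration only; this is not the udunits API, and the thread only proposes 'no_unit' as a possible term.

```python
from dataclasses import dataclass


class NoUnitError(ValueError):
    """Raised when arithmetic is attempted on a quantity that has no unit."""


@dataclass(frozen=True)
class Unit:
    # Hypothetical stand-in for a units library; not the udunits API.
    symbol: str

    @property
    def is_no_unit(self) -> bool:
        # 'no_unit' is the illustrative marker for labels such as realization numbers.
        return self.symbol == "no_unit"

    @property
    def is_dimensionless(self) -> bool:
        # '1' is the dimensionless unit used for fractions and pure numbers.
        return self.symbol == "1"

    def __mul__(self, other: "Unit") -> "Unit":
        # Labels are neither dimensioned nor dimensionless, so refuse arithmetic.
        if self.is_no_unit or other.is_no_unit:
            raise NoUnitError("cannot multiply a quantity that has no unit")
        # Multiplying by a dimensionless unit leaves the other unit unchanged.
        if self.is_dimensionless:
            return other
        if other.is_dimensionless:
            return self
        # Otherwise build a naive product string (no simplification attempted).
        return Unit(f"{self.symbol} {other.symbol}")


PA = Unit("Pa")            # e.g. air_pressure
ONE = Unit("1")            # e.g. a cloud fraction
NO_UNIT = Unit("no_unit")  # e.g. a realization label

print(PA * ONE)
try:
    PA * NO_UNIT
except NoUnitError as err:
    print(f"refused: {err}")
```

With this model, multiplying air_pressure by a fraction keeps the result in Pa, while multiplying it by a realization label is rejected outright rather than silently treated as multiplication by a pure number.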
