> As I read the CF Conventions document, my conclusion is that CF currently 
> conflates the two concepts 'doesn't have units because the concept is 
> inapplicable' and 'doesn't have units because the quantity is a pure number'.

I read the conventions slightly differently, which is part of the reason I 
think we need a little clarification in the text.

My reading is that CF does not explicitly recognise 'doesn't have units because 
the concept is inapplicable' in the conventions.  However it does recognise 
udunits' role in the definition of applicable units.

udunits has this concept, which can be utilised by
 units = ''
meaning that we can already achieve my aims with the capabilities we have.

So, I would like to adapt your rules to read:

  *   If the variable contains pure numerical values, such as fractions, for 
which no other applicable unit exists, put units = '1'.
  *   If the variable contains strings, flags, or non-numerical binary 
quantities, put units = ''
  *   Never leave off the units attribute, as this is always interpreted as 
units = '1'.

and state this explicitly in the CF conventions.
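
As a rough illustration of the three rules above, here is a minimal Python sketch. The Kind enum and the units_attribute function are hypothetical names invented for this example; they are not part of CF or udunits. The function always returns a string, so the units attribute is never left off.

```python
from enum import Enum


class Kind(Enum):
    """Hypothetical classification of what a variable holds."""
    DIMENSIONED = "dimensioned"        # e.g. a temperature in kelvin
    PURE_NUMBER = "pure_number"        # e.g. a fraction
    NOT_A_QUANTITY = "not_a_quantity"  # strings, flags, raw bit patterns


def units_attribute(kind, physical_unit=None):
    """Return the units string to write for a variable of the given kind."""
    if kind is Kind.DIMENSIONED:
        if not physical_unit:
            raise ValueError("a dimensioned variable needs a physical unit")
        return physical_unit
    if kind is Kind.PURE_NUMBER:
        # Pure numerical values with no other applicable unit.
        return "1"
    # Strings, flags and non-numerical binary quantities.
    return ""
```

For example, units_attribute(Kind.PURE_NUMBER) gives '1' and units_attribute(Kind.NOT_A_QUANTITY) gives ''.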

My further comments about the lack of clarity in the syntax
  units = ''
are an additional conversation, which may be useful but is not, in my mind, 
central to the discussion.  We can decide to request and adopt
  units = 'no_units'
  units = 'None'
if we feel this would add clarity, but it does not change the behaviour.

all the best
mark
________________________________
From: Jim Biard [[email protected]]
Sent: 04 November 2014 17:45
To: Hedley, Mark; [email protected]
Subject: Re: [CF-metadata] string valued coordinates

Mark,

As I read the CF Conventions document, my conclusion is that CF currently 
conflates the two concepts 'doesn't have units because the concept is 
inapplicable' and 'doesn't have units because the quantity is a pure number'. 
Current practice, as evidenced by the standard_names table, has been to 
sometimes specify units = '1' for cases where units are inapplicable, so 
neither lack of a units attribute, nor the presence of a units attribute with a 
value of 1 can be assumed to unambiguously mean only one thing.

In data products that I have authored or guided others in authoring, the 
(personal) rule I have followed is:

  *   If the variable contains pure numerical values, such as fractions, for 
which no other applicable unit exists, put units = '1'.
  *   If the variable contains strings, flags, or non-numerical binary 
quantities, don't give it a units attribute.

I find this to be unambiguous and compatible with the standard. I think we can 
reword the standard to reflect something like this, and there won't be any 
backward-compatibility issues. I don't find any need to add an explicit unit 
that means there isn't a unit.

Grace and peace,

Jim

On 11/4/14, 11:53 AM, Hedley, Mark wrote:
Hello Jim

I want to be really clear on this, as this is crucial.  If I am interpreting 
this wrong I would really like to know.

> as backward compatibility will pretty much require that having no units 
> attribute be interpretable as having a units attribute saying 'no_unit'.

I think this is incorrect.  Backwards compatibility requires that an absence of 
a units attribute is exactly the same as units='1'.

This is what CF mandates, as I read it.  This is very different from your 
comments.

Please may you consider
http://cfconventions.org/Data/cf-conventions/cf-conventions-1.6/build/cf-conventions.html#units
and let us know if your position remains the same?
I am afraid I do not think it is borne out by the specification.


all the best
mark

________________________________
From: Jim Biard [[email protected]<mailto:[email protected]>]
Sent: 04 November 2014 16:45
To: Hedley, Mark; [email protected]<mailto:[email protected]>
Subject: Re: [CF-metadata] string valued coordinates

Mark,

I agree that CF is currently ambiguous on this front, and I'm fine with 
improving definitions going forward, but 'no_unit' smacks of the classic 'this 
page intentionally left blank' found in government documents. I think it's 
overkill, as backward compatibility will pretty much require that having no 
units attribute be interpretable as having a units attribute saying 'no_unit'.

Grace and peace,

Jim

On 11/4/14, 11:38 AM, Hedley, Mark wrote:
Hello Jim

> A variable with no units attribute at all is also pretty unambiguously a 
> marker for something that isn't intended to be even a pure number.

If only this were the case.  CF conventions state that:
Units are not required for dimensionless quantities. A variable with no units 
attribute is assumed to be dimensionless. However, a units attribute specifying 
a dimensionless unit may optionally be included.
http://cfconventions.org/Data/cf-conventions/cf-conventions-1.6/build/cf-conventions.html#units

Thus, the absence of a unit is to be interpreted identically to a statement that
units = '1'

This is the current situation and it is likely that there is lots of data like 
this around.
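
A minimal Python sketch of how a reader would apply this reading of CF 3.1 follows. The function name effective_units and the use of a plain dict to stand in for a variable's attributes are assumptions made for the example only.

```python
def effective_units(attributes):
    """Return the units a reader should assume under this reading of CF 3.1.

    `attributes` is a plain dict standing in for a variable's attributes.
    """
    # A variable with no units attribute is assumed to be dimensionless,
    # which is the same as an explicit units = '1'.
    return attributes.get("units", "1")
```

Under this sketch a variable with no units attribute and a variable with units = '1' are indistinguishable, while an explicit units = '' is passed through unchanged.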

> Do we really need something more than a disambiguation of units = '1' vs no 
> units attribute present?

Yes, I think we do: this situation is not ambiguous in CF, they are the same 
thing.

What I believe we require is a udunits entity which is clearly 'there is no 
unit of measure here, this is not dimensioned and not dimensionless'

The udunits value
''
delivers this functionality (I think), but it does not read very well, hence my 
suggestion that we ask for a new entry in udunits,
'no_unit'
which is hopefully clear in its meaning and interpretation
and which behaves the same as '' : failing all udunits processing attempts and 
operating as 'not a unit'

all the best
mark

________________________________
From: CF-metadata 
[[email protected]<mailto:[email protected]>] on 
behalf of Jim Biard [[email protected]<mailto:[email protected]>]
Sent: 31 October 2014 15:18
To: [email protected]<mailto:[email protected]>
Subject: Re: [CF-metadata] string valued coordinates

Mark,

I'm not clear on what you are suggesting that udunits do with 'no_unit' or '?'.

I had thought that the desire was to be able to differentiate between a pure 
number (as you mention below) and a value (whether a string or a bit pattern) 
that should not be interpreted as any number at all.

As the situation stands, a units value of '1' is pretty unambiguously a marker 
for a pure number. We may need to modify docs to make this clearer, but I don't 
think that poses a problem. A variable with no units attribute at all is also 
pretty unambiguously a marker for something that isn't intended to be even a 
pure number. Again, we may need to modify docs to make this clearer. Because 
these two concepts are somewhat conflated in the current documentation and 
usage (area_type being an example), there is the issue of other places where 
cleanup would be good going forward, but even if you have a units value of '1' 
on a non-number, it doesn't hurt anything in practice.

Do we really need something more than a disambiguation of units = '1' vs no 
units attribute present?

Grace and peace,

Jim

On 10/31/14, 11:04 AM, Hedley, Mark wrote:
Thank you for all the responses, it sounds like 'all of the above' is the 
preferred response to my suggestions of plausible next steps.  I will pursue 
all of these.

Eizi's point about having no_unit in udunits is sound; I suggest we request 
udunits use
  'no_unit'
as a representation of
'?'
such that the behaviour is consistent; 'no_unit' should always raise an 
exception when used in the udunits processing rules, exactly as '?' does.
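
To make the requested behaviour concrete, here is a toy Python sketch. It does not use or imitate the udunits API: the Unit class and NoUnitError are hypothetical names, and the only point is that any arithmetic involving 'no_unit' raises an exception instead of producing a new unit string.

```python
class NoUnitError(ValueError):
    """Raised when a 'no_unit' marker is used in unit arithmetic."""


class Unit:
    """Toy unit wrapper (hypothetical, not the udunits API)."""

    def __init__(self, spec):
        self.spec = spec

    def __mul__(self, other):
        # Refuse any arithmetic that involves the 'no_unit' marker.
        for unit in (self, other):
            if unit.spec == "no_unit":
                raise NoUnitError("'no_unit' cannot take part in unit arithmetic")
        # For ordinary units, just join the two specifications.
        return Unit(f"{self.spec} {other.spec}")
```

Multiplying Unit("m") by Unit("s") succeeds in this sketch, while multiplying anything by Unit("no_unit") raises NoUnitError.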

With regard to meaning, I have found the wikipedia entry useful:
http://en.wikipedia.org/wiki/Dimensionless_quantity
'In dimensional analysis, a dimensionless quantity or quantity of dimension 
one is a quantity without an associated physical dimension. It is thus a 
"pure" number, and as such always has a dimension of 1. [1]'
which it has sourced from
"1.8 (1.6) quantity of dimension one dimensionless quantity"
<http://www.iso.org/sites/JCGM/VIM/JCGM_200e_FILES/MAIN_JCGM_200e/01_e.html#L_1_8>.
International vocabulary of metrology - Basic and general concepts and 
associated terms (VIM). ISO. 2008. Retrieved 2011-03-22.

This is a good enough source for me.

I will wait to give space for more comments, then,  if people are content, I 
will raise a change request with udunits.
Assuming this is accepted and processed I will raise a change request for CF to 
add some text to 3.1.
Finally I will request a change for any standard_names which appear not to 
follow this approach (I have only 'area_type' so far).

I hope this seems like a reasonable response.

________________________________
From: Eizi TOYODA [[email protected]<mailto:[email protected]>]
Sent: 31 October 2014 08:44
To: John Graybeal
Cc: Hedley, Mark; CF Metadata List
Subject: Re: [CF-metadata] string valued coordinates

Hi John

> I think '?' is not a definition that is helpful to most users -- it is more 
> like an indication that the string -- the empty string in this case for 
> example -- has not provided a meaningful indication of what the units are.

I share the same impression.   I was thinking it would be nicer for the 
maintainer of udunits.  We would have to ask for udunits to be modified so that 
it refuses to process "no_units"; otherwise ut_multiply("no_units", "no_units") 
returns "no_units 2".  If I remember right, the unit string "?" causes an 
immediate error, so we would not have to change udunits.

But I'm okay if the majority here agrees that sort of thing is not a 
responsibility of udunits.

Best,
Eizi



Best Regards,
--
Eiji (aka Eizi) TOYODA
http://www.google.com/profiles/toyoda.eizi

On Fri, Oct 31, 2014 at 9:45 AM, John Graybeal 
<[email protected]<mailto:[email protected]>> wrote:
Thanks for summing this up so neatly Mark!

> We could take the view that the conventions would benefit from the addition 
> of some text into 3.1 to explicitly make the point about quantities which are 
> not dimensioned or dimensionless.
> We could alternatively defer to udunits as most unit questions do, which 
> already exhibits this behaviour, and just patch the 'area_type' and any 
> similar names with erroneous canonical units.
> We could also request that udunits be updated with a clearer string for this 
> case, given the need for it, such as including the term 'no_units' as a valid 
> udunits term to mean there are no units here: this is not dimensionless, this 
> is not dimensioned.

Why is the first option exclusive to the others? Seems useful to improve the 
documentation regardless.

So I agree that '1' makes no sense for area_type. I'm wondering if someone can 
crisply describe what is meant when we (or UDUNITS) say a unit is 
dimensionless? I'm not entirely sure I get it.

In any case, I think '?' is not a definition that is helpful to most users -- 
it is more like an indication that the string -- the empty string in this case 
for example -- has not provided a meaningful indication of what the units are.

So my ideal solution has CF well aligned with UDUNITS, and a clear concept and 
definition. Which I think suggests asking UDUNITS for a term 'no_units', 
defined as "the values do not have units; values are neither dimensioned nor 
dimensionless."

John


On Oct 30, 2014, at 11:06, Hedley, Mark 
<[email protected]<mailto:[email protected]>> wrote:

> The unit of '1' is generally used to indicate fractions and the like. In 
> cases where I am storing a raw binary value, I leave off the units attribute, 
> as the 'number' isn't something that should be treated as a decimal quantity.

This is the same behaviour as I was looking to adopt, but CF 3.1 makes this 
incorrect, I believe, as a lack of a units attribute is to be interpreted as a 
unit of '1'.

I think a clear way to define that a quantity is not dimensioned and is not 
dimensionless is required.  I would have liked to use the lack of a unit for 
this purpose, but this has already been taken, so something else is needed.

> My preference is that one explicitly puts in the units. For dimensionless, 
> "1" or "" is ok for udunits.

udunits2 treats '1' and '' differently.

  a unit of '1' has a definition of '1'
  a unit of '' has a definition of '?'

The CF conventions description of units (3.1) states that an absence of a units 
attribute is deemed to be equivalent to dimensionless, a unit of '1'.  This is 
the convention, and it has been in force a long time.

However CF makes no statement that I can find regarding a unit of ''.  Thus I 
believe we defer back to udunits, which CF states is how units are defined.  
Udunits states that a unit of '' is undefined, the quantity is not dimensioned 
and is not dimensionless.  We could adopt this to use for the cases in question.

> area_type is given in the standard_name table as having a unit of 1. It is a 
> categorical string-valued quantity.

On the basis of the discussion, I would suggest that this is an error.  If 
area_type is a categorical string-valued quantity, it is not dimensionless, it 
is not continuous and numerical, it is not a pure number and should not be 
treated as such.  I think we should fix this.

We could take the view that the conventions would benefit from the addition of 
some text into 3.1 to explicitly make the point about quantities which are not 
dimensioned or dimensionless.
We could alternatively defer to udunits as most unit questions do, which 
already exhibits this behaviour, and just patch the 'area_type' and any similar 
names with erroneous canonical units.
We could also request that udunits be updated with a clearer string for this 
case, given the need for it, such as including the term 'no_units' as a valid 
udunits term to mean there are no units here: this is not dimensionless, this 
is not dimensioned.
I don't mind which route is preferred, I'm happy to put a change together and 
pursue it; whichever way seems better to people.

cheers
mark

________________________________
From: CF-metadata 
[[email protected]<mailto:[email protected]>] on 
behalf of Jim Biard [[email protected]<mailto:[email protected]>]
Sent: 30 October 2014 16:12
To: [email protected]<mailto:[email protected]>
Subject: Re: [CF-metadata] string valued coordinates

CF says that if the units attribute is missing, then the quantity has no units.

The Conventions document, section 3.1 says:

The units attribute is required for all variables that represent dimensional 
quantities (except for boundary variables defined in Section 7.1, “Cell 
Boundaries” 
<http://cfconventions.org/Data/cf-conventions/cf-conventions-1.6/build/cf-conventions.html#cell-boundaries>
and climatology variables defined in Section 7.4, “Climatological Statistics” 
<http://cfconventions.org/Data/cf-conventions/cf-conventions-1.6/build/cf-conventions.html#climatological-statistics>).

and

Units are not required for dimensionless quantities. A variable with no units 
attribute is assumed to be dimensionless. However, a units attribute specifying 
a dimensionless unit may optionally be included. The Udunits package defines a 
few dimensionless units, such as percent , but is lacking commonly used units 
such as ppm (parts per million). This convention does not support the addition 
of new dimensionless units that are not udunits compatible. The conforming unit 
for quantities that represent fractions, or parts of a whole, is "1". The 
conforming unit for parts per million is "1e-6". Descriptive information about 
dimensionless quantities, such as sea-ice concentration, cloud fraction, 
probability, etc., should be given in the long_name or standard_name attributes 
(see below) rather than the units.

The unit of '1' is generally used to indicate fractions and the like. In cases 
where I am storing a raw binary value, I leave off the units attribute, as the 
'number' isn't something that should be treated as a decimal quantity.

Grace and peace,

Jim

On 10/30/14, 11:35 AM, John Caron wrote:
My preference is that one explicitly puts in the units. For dimensionless, "1" 
or "" is ok for udunits. If the units attribute isn't there, I assume that the 
user forgot to specify it, so the units are unknown.

I'm not sure what CF actually says, but it would be good to clarify.

John

On Thu, Oct 30, 2014 at 2:37 AM, Hedley, Mark 
<[email protected]<mailto:[email protected]>> wrote:
Hello CF

> From: CF-metadata 
> [[email protected]<mailto:[email protected]>] 
> on behalf of Jonathan Gregory 
> [[email protected]<mailto:[email protected]>]

> Yes, there are some standard names which imply string values, as Karl says. 
> If the standard_name table says 1, that means the quantity is dimensionless, 
> so it's also fine to omit the units, as Jim says.

I would like to raise a question about this statement.  Omitting the units and 
stating that the units are '1' are two very different things;
    dimensionless != no_unit
is an important statement which should be clear to data consumers and producers.

If the standard name table defines a canonical unit for a standard_name of '1' 
then I expect this quantity to be dimensionless, with a unit of '1' or some 
multiple thereof.
If the standard name states that the canonical unit for a standard_name is '' 
then I expect that quantity to have no unit stated.
Any deviation from this behaviour is a break with the conventions.  I have code 
which explicitly checks this for data sets.
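
A check of this kind might look like the following Python sketch. It is not the code referred to above: the table entries are made-up names, the function names are hypothetical, and "some multiple of '1'" is approximated by testing whether the units string parses as a positive number, so named dimensionless units such as percent are not handled.

```python
# Hypothetical excerpt of a standard-name table: name -> canonical unit.
CANONICAL_UNITS = {
    "example_fraction": "1",
    "example_category": "",
}


def is_pure_number_unit(units):
    """True if `units` parses as a plain positive number such as '1' or '1e-6'."""
    try:
        return float(units) > 0
    except (TypeError, ValueError):
        return False


def check_units(standard_name, attributes, table=CANONICAL_UNITS):
    """Return True if the variable's units match the expectation above."""
    canonical = table[standard_name]
    units = attributes.get("units")  # None when the attribute is absent
    if canonical == "":
        # No unit expected: the attribute should be absent or empty.
        return units in (None, "")
    if canonical == "1":
        # Dimensionless expected: '1' or a numeric multiple of it.
        return is_pure_number_unit(units)
    # Other canonical units need a real unit library to compare; out of scope.
    raise NotImplementedError(canonical)
```

With this sketch, a variable named example_fraction passes with units = '1' or '1e-6' and fails when the attribute is omitted, while example_category fails if it carries units = '1'.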

Are people aware of examples of the pattern of use described by Jonathan, such 
as categorical quantities identified by a standard_name with a canonical unit 
of '1'?

thank you
mark

_______________________________________________
CF-metadata mailing list
[email protected]<mailto:[email protected]>
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata







--
[CICS-NC] <http://www.cicsnc.org/> Visit us on
Facebook <http://www.facebook.com/cicsnc>       Jim Biard
Research Scholar
Cooperative Institute for Climate and Satellites NC <http://cicsnc.org/>
North Carolina State University <http://ncsu.edu/>
NOAA's National Climatic Data Center <http://ncdc.noaa.gov/>
151 Patton Ave, Asheville, NC 28801
e: [email protected]<mailto:[email protected]>
o: +1 828 271 4900



