Re: [CF-metadata] use of integral_wrt_depth_of_sea_water_practical_salinity

2018-04-16 Thread Jonathan Gregory
Dear Sebastien et al.

It's allowed to put "depth: mean" in cell_methods even if there is no depth
coordinate variable (and no bounds). This is described in sect 7.3.4 of the
convention. It's allowed by the "first" case described there, because depth
is a standard name. We could suit your case better if we explicitly allowed
the "second" case of 7.3.4 to apply to the vertical coordinate, meaning the
range over the complete vertical extent where the quantity is defined i.e.
from the sea surface to the sea floor for an ocean quantity. Would this be a
good solution?

Since some more general issues have been raised, I'd like to comment on them.

First, there are a number of pairs of standard names, where one of the pair is
for the whole vertical extent of the atmosphere or the ocean, and the other is
for a layer within it e.g.
  atmosphere_mass_content_of_cloud_ice
  mass_content_of_cloud_ice_in_atmosphere_layer
This is my fault or choice, I believe, but from a *long* time ago - almost 20
years ago. I've often thought that maybe this was a mistake, because it is the
sort of distinction which could be made by bounds, and perhaps this present
discussion indicates that we should change it. One possibility would be to make
the in_atmosphere/ocean_layer aliases of the atmosphere/ocean_ names, and say
in the definition that if coordinate bounds are not specified it means the
entire vertical extent of the atmosphere/ocean. That is, the distinction would
rely on the presence of bounds. Would this be good?

Second, Sebastien comments that, "Many standard names make reference to time,
space, post-processing." Actually I do not think that is true. As you say,
the description of the processing belongs in the cell_methods. That is why we
don't have standard_names for daily maximum and daily mean air temperature,
for example, although they are common concepts. However, it does depend what
you regard as "post-processing". The integral_wrt_X_of_Y is regarded by the
standard name guidelines as a "transformation", which derives one quantity
from one or more other quantities, and not as post-processing. In this case in
CF terms it is clear that the new quantity is different from the old one,
because the units of the new one are the product of the units of X and Y,
whereas the units of a quantity which has been post-processed by cell_methods
can depend only on the units it originally had.
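
As an illustration of this distinction, here is a minimal Python sketch (not part of the original message). The function names and the sample profile are hypothetical, and the trapezoidal rule is only one assumed way of discretising the integral. It shows that a depth integral of Y carries the units of depth times the units of Y, whereas a depth mean, as would be described in cell_methods, keeps the units of Y.

```python
def integrate_wrt_depth(depths_m, values):
    """Trapezoidal integral of `values` over depth.

    `depths_m` are depths in metres, in increasing order (hypothetical input).
    The result has units of (units of values) * m.
    """
    if len(depths_m) != len(values) or len(depths_m) < 2:
        raise ValueError("need at least two matching depth/value samples")
    total = 0.0
    for i in range(1, len(depths_m)):
        thickness = depths_m[i] - depths_m[i - 1]
        total += 0.5 * (values[i] + values[i - 1]) * thickness
    return total


def depth_mean(depths_m, values):
    """Depth-weighted mean; the units are unchanged from those of `values`."""
    extent = depths_m[-1] - depths_m[0]
    return integrate_wrt_depth(depths_m, values) / extent


# Made-up profile of practical salinity (dimensionless) at four depths.
depths = [0.0, 10.0, 50.0, 100.0]
salinity = [35.0, 35.2, 34.8, 34.6]

integral = integrate_wrt_depth(depths, salinity)  # units: 1 * m = m
mean = depth_mean(depths, salinity)               # units: 1 (unchanged)
```

The integral is a different quantity with different units, which is why it is treated as a transformation in the standard name, while the mean is the same quantity processed over the depth axis.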

Third, there have been many discussions about whether to allow lots more names
of a certain kind (as we did in the case of the isotopes recently, and as for
chemical species) or whether instead to factorise a new distinction into a
coordinate variable (as Roy is proposing for the biological taxa, and as for
area types and region names). We always consider this choice carefully! I think
there are good arguments for having most of the non-numeric metadata in the
standard name - see www.met.reading.ac.uk/~jonathan/CF_metadata/14.1/#direction
for my reasons.

Best wishes

Jonathan

- Forwarded message from Sebastien Villaume -

> Date: Mon, 16 Apr 2018 16:05:16 + (GMT-00:00)
> From: Sebastien Villaume 
> To: Martin Juckes - UKRI STFC 
> Cc: Karl Taylor , cf-metadata@cgd.ucar.edu, Jonathan
>   Gregory 
> Subject: Re: use of integral_wrt_depth_of_sea_water_practical_salinity
> 
> Hi Martin,
> 
> This is interesting because it makes me realize that I am not the only one 
> facing these issues with "special" bounds that are functions of other 
> variables...
> 
> I like the idea of "pseudo-controlled" cell_method construction but in my 
> case I would require something like:
> 
> cell_method = "depth: integral from X to Y (where Z)" 
> 
> with X being "surface" and Y being "sea_floor", possibly with a "where" 
> clause with Z being "sea".
> 
> 
> I think that this kind of issue should not be solved on a case-by-case basis 
> but addressed in a general context because the case-by-case approach always 
> leads to specific solutions...
> 
> 
> /Sébastien
> 
> - Original Message -
> > From: "Martin Juckes - UKRI STFC" 
> > To: "Karl Taylor" , "Sebastien Villaume" 
> > 
> > Cc: cf-metadata@cgd.ucar.edu, "Jonathan Gregory" 
> > Sent: Monday, 16 April, 2018 10:02:28
> > Subject: Re: use of integral_wrt_depth_of_sea_water_practical_salinity
> 
> > Hello Karl, Sebastien,
> > 
> > 
> > I'm not sure that I've understood the whole thread, but to me it looks as
> > though the coordinate bounds would be the natural place to deal with this,
> > though it would require a modification to the convention.
> > 
> > 
> > There was a related, inconclusive, discussion in 2016 on the encoding of
> > histogram bin ranges in the case 

[CF-metadata] Standard Names to support Trac ticket 99

2018-04-16 Thread Jonathan Gregory
Dear Roy

Thanks for this. It looks sensible and well-constructed to me. I have two
comments.

* In response to your question, I think biological_taxon_lsid is better, since
you propose that's what we use. The more generic version would be suitable if
we offered a choice about which sort of ID to use, but it would present a
difficulty if you wanted to provide more than one kind of ID; this would need
more than one coord var, and it would be helpful to give them different
standard names.

* In the concentration names, I think "biological taxon" means "organisms
of biological taxon", doesn't it? I suggest it would be better to spell this
out in some way in the standard name. For example,
  number_concentration_of_biological_taxon_in_sea_water
might (surprisingly) be interpreted as meaning how many species there are
per unit volume.

Best wishes

Jonathan


- Forwarded message from "Lowry, Roy K."  -

> Date: Fri, 13 Apr 2018 14:02:59 +
> From: "Lowry, Roy K." 
> To: "cf-metadata@cgd.ucar.edu" 
> Subject: [CF-metadata] Standard Names to support Trac ticket 99
> 
> Dear All,
> 
> 
> Here is an initial batch of 8 Standard Names to support the CF taxon 
> dimension. Two are dimension labels whilst the other six are measurements to 
> which the taxon is a co-ordinate. Five of these are to cover Daniel's 
> proposal that prompted the resurrection of Ticket 99.
> 
> 
> I've presented a summary list followed by a full list with units and 
> definitions.  I have one uncertainty in my mind (biological_taxon_label 
> versus biological_taxon_lsid) where I would really appreciate input.
> 
> 
> Cheers, Roy.
> 
> biological_taxon_name
> biological_taxon_identifier or biological_taxon_lsid – any preferences
> number_concentration_of_biological_taxon_in_sea_water
> mass_concentration_of_biological_taxon_expressed_as_carbon_in_sea_water
> mass_concentration_of_biological_taxon_expressed_as_chlorophyll_in_sea_water
> mass_concentration_of_biological_taxon_expressed_as_nitrogen_in_sea_water
> mole_concentration_of_biological_taxon_expressed_as_carbon_in_sea_water
> mole_concentration_of_biological_taxon_expressed_as_nitrogen_in_sea_water
> 
> 
> biological_taxon_name
> 
> A plaintext human-readable label, usually a Latin binomial such as Calanus 
> finmarchicus, applied to a biological taxon. Biological taxon is a name or 
> other label identifying an organism or a group of organisms as belonging to a 
> unit of classification in a hierarchical taxonomy.
> 
> dimensionless
> 
> biological_taxon_identifier
> 
> An opaque label, most usefully a URI that resolves to an authoritative 
> information source, applied to a biological taxon. Biological taxon is a name 
> or other label identifying an organism or a group of organisms as belonging 
> to a unit of classification in a hierarchical taxonomy. The identifier 
> adopted for CF is the Life Science Identifier (LSID), a URN with the syntax 
> ‘urn:lsid:<Authority>:<Namespace>:<ObjectID>[:<Version>]’. For example, the 
> copepod Calocalanus pavo may be represented by LSIDs 
> ‘urn:lsid:marinespecies.org:taxname:104669’ (based on WoRMS) and 
> ‘urn:lsid:itis.gov:itis_tsn:85335’ (based on ITIS). These URNs may be 
> converted to URLs delivering RDF by prefixing with 'http://lsid.tdwg.org/'.
> 
> dimensionless
> 
> OR
> 
> biological_taxon_lsid
> 
> The Life Science Identifier (LSID) is a standard URI for a biological taxon. 
> Biological taxon is a name or other label identifying an organism or a group 
> of organisms as belonging to a unit of classification in a hierarchical 
> taxonomy. The LSID is a URN with the syntax 
> ‘urn:lsid:<Authority>:<Namespace>:<ObjectID>[:<Version>]’. For example, the 
> copepod Calocalanus pavo may be represented by LSIDs 
> ‘urn:lsid:marinespecies.org:taxname:104669’ (based on WoRMS) and 
> ‘urn:lsid:itis.gov:itis_tsn:85335’ (based on ITIS). These URNs may be 
> converted to URLs delivering RDF by prefixing with 'http://lsid.tdwg.org/'.
> 
> dimensionless
> 
> number_concentration_of_biological_taxon_in_sea_water
> 
> Number concentration means the count of an entity per unit volume and is used 
> in the construction ‘number_concentration_of_X_in_Y’, where X is a material 
> constituent of Y. Biological taxon is a name or other label identifying an 
> organism or a group of organisms as belonging to a unit of classification in 
> a hierarchical taxonomy. Number concentration of biota is also referred to as 
> abundance.
> 
> m-3
> 
> mass_concentration_of_biological_taxon_expressed_as_carbon_in_sea_water
> 
> Mass concentration means mass per unit volume and is used in the construction 
> ‘mass_concentration_of_X_in_Y’, where X is a material constituent of Y. A 
> chemical species denoted by X may be described by a single term such as 
> 'nitrogen' or a phrase such as
> 'nox_expressed_as_nitrogen'. The phrase 'expressed_as' is used in the 
> construction ‘A_expressed_as_B’, where B is a chemical constituent of A. It 
> means that the quantity indicated by 

Re: [CF-metadata] use of integral_wrt_depth_of_sea_water_practical_salinity

2018-04-16 Thread Sebastien Villaume
Hi Martin,

This is interesting because it makes me realize that I am not the only one 
facing these issues with "special" bounds that are functions of other 
variables...

I like the idea of "pseudo-controlled" cell_method construction but in my case 
I would require something like:

cell_method = "depth: integral from X to Y (where Z)" 

with X being "surface" and Y being "sea_floor", possibly with a "where" clause 
with Z being "sea".


I think that this kind of issue should not be solved on a case-by-case basis 
but addressed in a general context because the case-by-case approach always 
leads to specific solutions...


/Sébastien

- Original Message -
> From: "Martin Juckes - UKRI STFC" 
> To: "Karl Taylor" , "Sebastien Villaume" 
> 
> Cc: cf-metadata@cgd.ucar.edu, "Jonathan Gregory" 
> Sent: Monday, 16 April, 2018 10:02:28
> Subject: Re: use of integral_wrt_depth_of_sea_water_practical_salinity

> Hello Karl, Sebastien,
> 
> 
> I'm not sure that I've understood the whole thread, but to me it looks as
> though the coordinate bounds would be the natural place to deal with this,
> though it would require a modification to the convention.
> 
> 
> There was a related, inconclusive, discussion in 2016 on the encoding of
> histogram bin ranges in the case where some bins are not defined by the
> numerical ranges that the current convention permits for coordinate bounds
> (http://mailman.cgd.ucar.edu/pipermail/cf-metadata/2016/059037.html ). The
> idea of using flag_values and flag_meanings came up. For the current example you
> could set the lower value of the depth coordinate bounds of the vertical
> integral to -5 [m] and then have flag_values=-5,
> flag_meanings="ocean_floor".
> 
> 
> Alternatively, there appears to be agreement in
> https://cf-trac.llnl.gov/trac/ticket/152 that the cell_methods construction
> "mean where X" does not need to be restricted to horizontal spatial means.
> That ticket discusses using it for temporal means, but it could also be used
> for depth means, as in:
> 
> "cell_methods = mean: depth where sea".  The idea that the CF area type "sea"
> can be depth-dependent was accepted in a discussion of usage in CMIP6, where
> we have many variables which require the surface sea extent, and others which
> require the total sea extent, including the small but significant portion
> extending under floating ice shelves. This would make the flag_values/meanings
> construction redundant.
> 
> 
> Incidentally, the cell_methods string can be parsed by David Hassell's cf
> python library (https://pypi.python.org/pypi/cf-python ). This doesn't
> entirely solve the problem because of the variable quality of the information that has
> been encoded in the cell_methods string in the past ... but it does give us a
> tool to use in our efforts to improve the situation.
> 
> 
> regards,
> 
> Martin
> 
> 
> 
> 
> From: CF-metadata  on behalf of Sebastien
> Villaume 
> Sent: 13 April 2018 17:30
> To: Karl Taylor
> Cc: cf-metadata@cgd.ucar.edu; Jonathan Gregory
> Subject: Re: [CF-metadata] use of
> integral_wrt_depth_of_sea_water_practical_salinity
> 
> Hi Karl,
> 
> I tend to agree that this solution is far from ideal.
> 
> The core issue is that there is no clear separation between a parameter
> (diagnostic quantities, observables, coordinates etc.) and what you do with it
> in CF: everything is squeezed into the standard name and the cell_method (in
> an inconsistent way).
> 
> In an ideal world, the standard names should only describe bare parameters and
> everything related to processing should go into something else. But many
> standard names make reference to time, space, post-processing, extra useful
> information, etc.
> The cell_method attribute is in principle there to represent any
> (post-)processing, but that is not always the case: sometimes the information
> is in the standard name directly, and sometimes the cell_method is too limited
> to describe what needs to be described, as in my case here...
> To maintain a strict separation, "integral_wrt_X_of_Y" should have been one of
> the cell_methods from the beginning. I also never understood why "difference"
> is not a valid method in table E.1 of appendix E, since "sum" is there.
> 
> I noticed a few months ago a thread discussing ontologies in connection with the
> proposal of standard names for isotopes. Hundreds of new standard names were
> added. To me this was all wrong: only a few standard names should have been
> added: mass_concentration, density, optical_depth, whatever physical property
> you like. Each variable holding one of these standard names should point to a
> scalar through a controlled attribute. The scalar should name the isotope or
> the type of particle or the chemical 

Re: [CF-metadata] use of integral_wrt_depth_of_sea_water_practical_salinity

2018-04-16 Thread Martin Juckes - UKRI STFC
Hi Karl,


yes, thanks for correcting that,


Martin


From: Karl Taylor 
Sent: 16 April 2018 16:43
To: Juckes, Martin (STFC,RAL,RALSP); Sebastien Villaume
Cc: cf-metadata@cgd.ucar.edu; Jonathan Gregory
Subject: Re: use of integral_wrt_depth_of_sea_water_practical_salinity

Hi Martin,

To be sure, did you reverse the words in your example of cell_methods?
Should it read:

"cell_methods = depth: mean where sea" ?

best regards,
Karl



On 4/16/18 2:02 AM, Martin Juckes - UKRI STFC wrote:
> Hello Karl, Sebastien,
>
>
> I'm not sure that I've understood the whole thread, but to me it looks as 
> though the coordinate bounds would be the natural place to deal with this, 
> though it would require a modification to the convention.
>
>
> There was a related, inconclusive, discussion in 2016 on the encoding of 
> histogram bin ranges in the case where some bins are not defined by the 
> numerical ranges that the current convention permits for coordinate bounds 
> (http://mailman.cgd.ucar.edu/pipermail/cf-metadata/2016/059037.html ). The 
> idea of using flag_values and flag_meanings came up. For the current example 
> you could set the lower value of the depth coordinate bounds of the vertical 
> integral to -5 [m] and then have flag_values=-5, 
> flag_meanings="ocean_floor".
>
>
> Alternatively, there appears to be agreement in 
> https://cf-trac.llnl.gov/trac/ticket/152 that the cell_methods construction 
> "mean where X" does not need to be restricted to horizontal spatial means. 
> That ticket discusses using it for temporal means, but it could also be used 
> for depth means, as in:
>
> "cell_methods = mean: depth where sea".  The idea that the CF area type "sea" 
> can be depth-dependent was accepted in a discussion of usage in CMIP6, where 
> we have many variables which require the surface sea extent, and others which 
> require the total sea extent, including the small but significant portion 
> extending under floating ice shelves. This would make the 
> flag_values/meanings construction redundant.
>
>
> Incidentally, the cell_methods string can be parsed by David Hassell's  cf 
> python library (https://pypi.python.org/pypi/cf-python ). This doesn't 
> entirely solve the problem because of the variable quality of the information 
> that has been encoded in the cell_methods string in the past ... but it does 
> give us a tool to use in our efforts to improve the situation.
>
>
> regards,
>
> Martin
>
>
>
> 
> From: CF-metadata  on behalf of Sebastien 
> Villaume 
> Sent: 13 April 2018 17:30
> To: Karl Taylor
> Cc: cf-metadata@cgd.ucar.edu; Jonathan Gregory
> Subject: Re: [CF-metadata] use of 
> integral_wrt_depth_of_sea_water_practical_salinity
>
> Hi Karl,
>
> I tend to agree that this solution is far from ideal.
>
> The core issue is that there is no clear separation between a parameter 
> (diagnostic quantities, observables, coordinates etc.) and what you do with 
> it in CF: everything is squeezed into the standard name and the cell_method 
> (in an inconsistent way).
>
> In an ideal world, the standard names should only describe bare parameters 
> and everything related to processing should go into something else. But many 
> standard names make reference to time, space, post-processing, extra useful 
> information, etc.
> The cell_method attribute is in principle there to represent any 
> (post-)processing, but that is not always the case: sometimes the information 
> is in the standard name directly, and sometimes the cell_method is too limited 
> to describe what needs to be described, as in my case here...
> To maintain a strict separation, "integral_wrt_X_of_Y" should have been one of 
> the cell_methods from the beginning. I also never understood why 
> "difference" is not a valid method in table E.1 of appendix E, since "sum" 
> is there.
>
> I noticed a few months ago a thread discussing ontologies in connection with 
> the proposal of standard names for isotopes. Hundreds of new standard names 
> were added. To me this was all wrong: only a few standard names should have 
> been added: mass_concentration, density, optical_depth, whatever physical 
> property you like. Each variable holding one of these standard names should 
> point to a scalar through a controlled attribute. The scalar should name the 
> isotope or the type of particle or the chemical constituent, etc.
> I can already see hundreds of new standard names coming each time a new 
> useful property for isotopes or molecules is required.
>
> You will not prevent an explosion of standard names if you don't limit them to 
> the "what". The "when" should go in the time variable(s), the "where" in the 
> spatial variables, and finally the "how" either in the cell_method with a clear 
> controlled vocabulary or in a new controlled mechanism yet to be defined.
>
> /Sébastien
>
> - 

Re: [CF-metadata] use of integral_wrt_depth_of_sea_water_practical_salinity

2018-04-16 Thread Karl Taylor

Hi Martin,

To be sure, did you reverse the words in your example of cell_methods?  
Should it read:


"cell_methods = depth: mean where sea" ?

best regards,
Karl
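
To make the word order in this correction concrete, here is a minimal Python sketch of a check on a single cell_methods entry of the form "name: method [where type]". It is not the cf-python parser mentioned below; the function name, the regular expression, and the partial list of method names are assumptions made for illustration, and real cell_methods strings can carry more clauses than this handles.

```python
import re

# Partial, assumed list of method names; see Appendix E of the convention
# for the authoritative table.
KNOWN_METHODS = {
    "point", "sum", "maximum", "median", "mid_range",
    "minimum", "mean", "mode", "standard_deviation", "variance",
}

# One entry only: "<name>: <method>" with an optional "where <type>" clause.
_ENTRY = re.compile(
    r"^\s*(?P<name>\w+)\s*:\s*(?P<method>\w+)"
    r"(?:\s+where\s+(?P<where>\w+))?\s*$"
)


def parse_cell_method(text):
    """Split a single cell_methods entry into name, method and where-type."""
    match = _ENTRY.match(text)
    if match is None:
        raise ValueError(f"unrecognised cell_methods entry: {text!r}")
    parts = match.groupdict()
    if parts["method"] not in KNOWN_METHODS:
        raise ValueError(f"{parts['method']!r} is not a known method")
    return parts


corrected = parse_cell_method("depth: mean where sea")

try:
    parse_cell_method("mean: depth where sea")  # reversed word order
    reversed_is_rejected = False
except ValueError:
    reversed_is_rejected = True
```

With the corrected order the entry splits into the name "depth", the method "mean" and the type "sea"; the reversed form is rejected because "depth" is not a method.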



On 4/16/18 2:02 AM, Martin Juckes - UKRI STFC wrote:

Hello Karl, Sebastien,


I'm not sure that I've understood the whole thread, but to me it looks as 
though the coordinate bounds would be the natural place to deal with this, 
though it would require a modification to the convention.


There was a related, inconclusive, discussion in 2016 on the encoding of histogram bin 
ranges in the case where some bins are not defined by the numerical ranges that the 
current convention permits for coordinate bounds 
(http://mailman.cgd.ucar.edu/pipermail/cf-metadata/2016/059037.html ). The idea of using 
flag_values and flag_meanings came up. For the current example you could set the lower 
value of the depth coordinate bounds of the vertical integral to -5 [m] and then have 
flag_values=-5, flag_meanings="ocean_floor".
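
A minimal Python sketch of how a reader of such a file might interpret the suggested construction is given here. Nothing in it is defined by the convention: the function name, the dictionary layout and the sea-floor depth are hypothetical, and the sketch only assumes that a flag value appearing in the bounds stands for the named surface rather than a numerical depth.

```python
# Assumed mapping from flag_values to flag_meanings, following the example
# of -5 standing for "ocean_floor".
FLAGS = {-5.0: "ocean_floor"}


def resolve_depth_bounds(bounds, sea_floor_depth_m, flags=FLAGS):
    """Replace flagged bound values with the local sea-floor depth.

    `bounds` is a (upper, lower) pair of depths in metres; any value found
    in `flags` is treated as a symbolic bound, not a numerical depth.
    """
    resolved = []
    for value in bounds:
        meaning = flags.get(value)
        if meaning is None:
            resolved.append(value)              # ordinary numerical bound
        elif meaning == "ocean_floor":
            resolved.append(sea_floor_depth_m)  # symbolic bound
        else:
            raise ValueError(f"unsupported flag meaning: {meaning!r}")
    return tuple(resolved)


# Hypothetical column where the sea floor lies at 3200 m.
column_bounds = resolve_depth_bounds((0.0, -5.0), 3200.0)
```

Because the sea-floor depth varies from column to column, the resolved bounds would differ at each horizontal location even though the stored bounds are identical.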


Alternatively, there appears to be agreement in https://cf-trac.llnl.gov/trac/ticket/152 
that the cell_methods construction "mean where X" does not need to be 
restricted to horizontal spatial means. That ticket discusses using it for temporal 
means, but it could also be used for depth means, as in:

"cell_methods = mean: depth where sea".  The idea that the CF area type "sea" 
can be depth-dependent was accepted in a discussion of usage in CMIP6, where we have many variables 
which require the surface sea extent, and others which require the total sea extent, including the 
small but significant portion extending under floating ice shelves. This would make the 
flag_values/meanings construction redundant.


Incidentally, the cell_methods string can be parsed by David Hassell's  cf 
python library (https://pypi.python.org/pypi/cf-python ). This doesn't entirely 
solve the problem because of the variable quality of the information that has 
been encoded in the cell_methods string in the past ... but it does give us a 
tool to use in our efforts to improve the situation.


regards,

Martin




From: CF-metadata  on behalf of Sebastien Villaume 

Sent: 13 April 2018 17:30
To: Karl Taylor
Cc: cf-metadata@cgd.ucar.edu; Jonathan Gregory
Subject: Re: [CF-metadata] use of 
integral_wrt_depth_of_sea_water_practical_salinity

Hi Karl,

I tend to agree that this solution is far from ideal.

The core issue is that there is no clear separation between a parameter 
(diagnostic quantities, observables, coordinates etc.) and what you do with it 
in CF: everything is squeezed into the standard name and the cell_method (in an 
inconsistent way).

In an ideal world, the standard names should only describe bare parameters and 
everything related to processing should go into something else. But many 
standard names make reference to time, space, post-processing, extra useful 
information, etc.
The cell_method attribute is in principle there to represent any 
(post-)processing, but that is not always the case: sometimes the information is 
in the standard name directly, and sometimes the cell_method is too limited to 
describe what needs to be described, as in my case here...
To maintain a strict separation, "integral_wrt_X_of_Y" should have been one of the 
cell_methods from the beginning. I also never understood why "difference" is not a 
valid method in table E.1 of appendix E, since "sum" is there.

I noticed a few months ago a thread discussing ontologies in connection with the 
proposal of standard names for isotopes. Hundreds of new standard names were 
added. To me this was all wrong: only a few standard names should have been 
added: mass_concentration, density, optical_depth, whatever physical property 
you like. Each variable holding one of these standard names should point to a 
scalar through a controlled attribute. The scalar should name the isotope or 
the type of particle or the chemical constituent, etc.
I can already see hundreds of new standard names coming each time a new useful 
property for isotopes or molecules is required.

You will not prevent an explosion of standard names if you don't limit them to the 
"what". The "when" should go in the time variable(s), the "where" in the spatial 
variables, and finally the "how" either in the cell_method with a clear controlled 
vocabulary or in a new controlled mechanism yet to be defined.

/Sébastien

- Original Message -

From: "Karl Taylor" 
To: "Sebastien Villaume" , "Lowry, Roy K." 
, "Jonathan Gregory"

Cc: cf-metadata@cgd.ucar.edu
Sent: Friday, 13 April, 2018 16:32:39
Subject: Re: [CF-metadata] use of 
integral_wrt_depth_of_sea_water_practical_salinity
Dear all,

I am wary of a "slippery slope" if every calculation performed on a
quantity results in a new standard name for that 

[CF-metadata] Fw: Standard Names to support Trac ticket 99

2018-04-16 Thread Lowry, Roy K.
Forgot to do reply all


Please note that I partially retired on 01/11/2015. I am now only working 7.5 
hours a week and can only guarantee e-mail response on Wednesdays, my day in 
the office. All vocabulary queries should be sent to enquir...@bodc.ac.uk. 
Please also use this e-mail if your requirement is urgent.



From: Lowry, Roy K.
Sent: 16 April 2018 11:09
To: Daniel Neumann
Subject: Re: [CF-metadata] Standard Names to support Trac ticket 99


Thanks Daniel,


To clarify, LSID isn't a database; it's an identifier for an organism that 
neatly brings together multiple taxonomies under the single umbrella of the 
Catalogue of Life project. It also resolves, actually in multiple ways, into a 
URL that then provides access into a database providing information on that 
organism.


We came to the conclusion that we should use LSIDs in CF in the first round of 
discussions on Trac 99. My quandary is not whether we should use them, but 
whether the Standard Name should specify 'lsid' or just 'identifier'.  
'Identifier' is what we discussed, but 'lsid' opens the door for future 
Standard Names based on other governances should there be a need to deal with 
entities not covered by lsids. I'm aware of one possible issue related to 
coccoliths plus the possibility of dealing with organism parts (e.g. cod 
livers).


Cheers, Roy.





From: Daniel Neumann 
Sent: 16 April 2018 10:44
To: Lowry, Roy K.
Subject: Re: [CF-metadata] Standard Names to support Trac ticket 99

Dear Roy,

Thank you for bringing this topic forward!

I contacted the responsible person for our institute's data publishing and 
metadata policy and will talk to her about the choice of the LSID database. She 
is more into that topic than I am. It may take some days.

Cheers,
Daniel


On 13.04.2018 16:02, Lowry, Roy K. wrote:

Dear All,


Here is an initial batch of 8 Standard Names to support the CF taxon dimension. 
Two are dimension labels whilst the other six are measurements to which the 
taxon is a co-ordinate. Five of these are to cover Daniel's proposal that 
prompted the resurrection of Ticket 99.


I've presented a summary list followed by a full list with units and 
definitions.  I have one uncertainty in my mind (biological_taxon_label versus 
biological_taxon_lsid) where I would really appreciate input.


Cheers, Roy.


biological_taxon_name

biological_taxon_identifier or biological_taxon_lsid – any preferences

number_concentration_of_biological_taxon_in_sea_water

mass_concentration_of_biological_taxon_expressed_as_carbon_in_sea_water

mass_concentration_of_biological_taxon_expressed_as_chlorophyll_in_sea_water

mass_concentration_of_biological_taxon_expressed_as_nitrogen_in_sea_water
mole_concentration_of_biological_taxon_expressed_as_carbon_in_sea_water
mole_concentration_of_biological_taxon_expressed_as_nitrogen_in_sea_water




biological_taxon_name


A plaintext human-readable label, usually a Latin binomial such as Calanus 
finmarchicus, applied to a biological taxon. Biological taxon is a name or 
other label identifying an organism or a group of organisms as belonging to a 
unit of classification in a hierarchical taxonomy.


dimensionless


biological_taxon_identifier


An opaque label, most usefully a URI that resolves to an authoritative 
information source, applied to a biological taxon. Biological taxon is a name 
or other label identifying an organism or a group of organisms as belonging to 
a unit of classification in a hierarchical taxonomy. The identifier adopted for 
CF is the Life Science Identifier (LSID), a URN with the syntax 
‘urn:lsid:<Authority>:<Namespace>:<ObjectID>[:<Version>]’. For example, the 
copepod Calocalanus pavo may be represented by LSIDs 
‘urn:lsid:marinespecies.org:taxname:104669’ (based on WoRMS) and 
‘urn:lsid:itis.gov:itis_tsn:85335’ (based on ITIS). These URNs may be converted 
to URLs delivering RDF by prefixing with 'http://lsid.tdwg.org/'.


dimensionless


OR


biological_taxon_lsid


The Life Science Identifier (LSID) is a standard URI for a biological taxon. 
Biological taxon is a name or other label identifying an organism or a group of 
organisms as belonging to a unit of classification in a hierarchical taxonomy. 
The LSID is a URN with the syntax 
‘urn:lsid:<Authority>:<Namespace>:<ObjectID>[:<Version>]’. For example, the 
copepod Calocalanus pavo may be represented by LSIDs 
‘urn:lsid:marinespecies.org:taxname:104669’ (based on WoRMS) and 
‘urn:lsid:itis.gov:itis_tsn:85335’ (based on ITIS). These URNs may be converted 
to URLs delivering RDF by prefixing with 'http://lsid.tdwg.org/'.


dimensionless
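
As a rough illustration of the LSID handling described in this definition, here is a minimal Python sketch. The function and field names (authority, namespace, object_id, version) are assumptions chosen for readability, the validation is deliberately loose, and the default prefix is simply the one quoted in the definition, which may or may not still resolve.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    """Components of an LSID URN; the field names are illustrative."""
    authority: str
    namespace: str
    object_id: str
    version: Optional[str] = None


def parse_lsid(urn):
    """Split 'urn:lsid:' followed by three or four colon-separated fields."""
    parts = urn.split(":")
    if len(parts) not in (5, 6):
        raise ValueError(f"not an LSID: {urn!r}")
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {urn!r}")
    if any(not part for part in parts[2:]):
        raise ValueError(f"empty component in LSID: {urn!r}")
    return LSID(*parts[2:])


def lsid_to_url(urn, resolver="http://lsid.tdwg.org/"):
    """Build a URL by prefixing the URN, after checking that it parses."""
    parse_lsid(urn)
    return resolver + urn


worms = parse_lsid("urn:lsid:marinespecies.org:taxname:104669")
url = lsid_to_url("urn:lsid:itis.gov:itis_tsn:85335")
```

Both example identifiers for Calocalanus pavo parse into three fields with no version; a taxon coordinate variable would store the full URN string rather than the separate parts.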


number_concentration_of_biological_taxon_in_sea_water


Number 

Re: [CF-metadata] New standard names for C4MIP - part 2

2018-04-16 Thread Alison Pamment - UKRI STFC
Dear Martin and Chris,

Thank you for getting back to me about these names and thank you, Martin, for 
the improved definitions of plants and plant respiration. The raOther name will 
now be:
surface_upward_mass_flux_of_carbon_dioxide_expressed_as_carbon_due_to_plant_respiration_in_miscellaneous_living_matter
 (kg m-2 s-1)
'The surface called "surface" means the lower boundary of the atmosphere. 
"Upward" indicates a vector component which is positive when directed upward 
(negative downward). In accordance with common usage in geophysical 
disciplines, "flux" implies per unit area, called "flux density" in physics. 
The chemical formula for carbon dioxide is CO2. The phrase "expressed_as" is 
used in the construction A_expressed_as_B, where B is a chemical constituent of 
A. It means that the quantity indicated by the standard name is calculated 
solely with respect to the B contained in A, neglecting all other chemical 
constituents of A. The specification of a physical process by the phrase 
"due_to_" process means that the quantity named is a single term in a sum of 
terms which together compose the general quantity named by omitting the phrase. 
"Miscellaneous living matter" means all those parts of plants that are not 
leaf, wood, root or other separately named components. Plant respiration is the 
sum of respiration by parts of plants both above and below the soil. It is assumed that 
all the respired carbon dioxide is emitted to the atmosphere. Plants refers to 
the kingdom of plants in the modern classification which excludes fungi.  
Plants are autotrophs i.e. "producers" of the biomass using carbon obtained 
from carbon dioxide.'

This name is accepted for publication in the standard name table and will be 
included in today's update.

I will add the additional definition text to all the plant respiration names 
and create the following aliases:
heterotrophic_respiration_carbon_flux -> 
surface_upward_mass_flux_of_carbon_dioxide_expressed_as_carbon_due_to_heterotrophic_respiration
soil_respiration_carbon_flux -> 
surface_upward_mass_flux_of_carbon_dioxide_expressed_as_carbon_due_to_respiration_in_soil
surface_upward_carbon_mass_flux_due_to_plant_respiration_for_biomass_growth -> 
surface_upward_mass_flux_of_carbon_dioxide_expressed_as_carbon_due_to_plant_respiration_for_biomass_growth
surface_upward_carbon_mass_flux_due_to_plant_respiration_for_biomass_maintenance -> 
surface_upward_mass_flux_of_carbon_dioxide_expressed_as_carbon_due_to_plant_respiration_for_biomass_maintenance.

These changes will also be added in today's update.

Thank you very much for the discussion of these and all the other C4MIP names - 
I think the C4MIP names are now complete!

Best wishes,
Alison

--
Alison Pamment Tel: +44 1235 778065
NCAS/Centre for Environmental Data Archival Email: alison.pamm...@stfc.ac.uk
STFC Rutherford Appleton Laboratory 
R25, 2.22
Harwell Oxford, Didcot, OX11 0QX, U.K.

-Original Message-
From: Jones, Chris D [mailto:chris.d.jo...@metoffice.gov.uk] 
Sent: 12 April 2018 17:17
To: Juckes, Martin (STFC,RAL,RALSP) ; Pamment, Alison 
(STFC,RAL,RALSP) ; cf-metadata@cgd.ucar.edu
Subject: RE: New standard names for C4MIP - part 2

Yes, I agree with the need for raOther and the suggested names - thanks Chris

--
Dr Chris Jones
Head, Earth System and Mitigation Science Team Met Office Hadley Centre, 
FitzRoy Road, Exeter, EX1 3PB, U.K. 
Tel: +44 (0)1392 884514  Fax: +44 (0)1392 885681
E-mail: chris.d.jo...@metoffice.gov.uk  http://www.metoffice.gov.uk 


-Original Message-
From: Martin Juckes - UKRI STFC [mailto:martin.juc...@stfc.ac.uk]
Sent: 12 April 2018 15:53
To: Alison Pamment - UKRI STFC ; Jones, Chris D 
; cf-metadata@cgd.ucar.edu
Subject: Re: New standard names for C4MIP - part 2

Dear Chris, Alison,


We do have a requirement for "raOther" in CMIP6, so please go ahead. But, for 
consistency with the others, I think it should be "_due_to_plant_respiration_" 
rather than just "_due_to_respiration_", and include a phrase on plant 
respiration in the help text. I've checked some background, to fill in gaps in 
my education, and learned that fungi are no longer plants ... at least not in 
the strict sense of the accepted scientific classification system. In order for 
these standard names to be correct for the requested variables, which are for 
autotrophic fluxes, I think we should make clear that we are using "plant" in 
this scientific sense, rather than in the broader sense following the pre-1960 
classification. With this meaning, I think we can strengthen the statement 
about autotrophs since, as far as I can tell, all plants are autotrophs. The 
current help text for "plant_respiration_carbon_flux" implies that plants 
respire biomass, which doesn't look right to me.


The current text used in the description of 

Re: [CF-metadata] New standard names for C4MIP - part 2

2018-04-16 Thread Alison Pamment - UKRI STFC
Dear Evan and Martin,

Thanks for clearing up this point. I will change the coordinate variable to be 
soil_pool and amend the definition of soil_pool_carbon_decay_rate accordingly.

Both names will be included in today's standard name table update.

Best wishes,
Alison

--
Alison Pamment Tel: +44 1235 778065
NCAS/Centre for Environmental Data Archival Email: alison.pamm...@stfc.ac.uk
STFC Rutherford Appleton Laboratory 
R25, 2.22
Harwell Oxford, Didcot, OX11 0QX, U.K.

From: Juckes, Martin (STFC,RAL,RALSP) 
Sent: 12 April 2018 14:33
To: Manning, Evan M (398B) ; Pamment, Alison 
(STFC,RAL,RALSP) ; chris.d.jo...@metoffice.gov.uk
Subject: Re: New standard names for C4MIP - part 2

Dear Evan,

I agree, it should be "soil_pool" (I had copied "soilpool" across from the MIP 
variable name, where underscores are not used, but in the CF standard name it 
should be added for consistency),

regards,
Martin


From: Manning, Evan M (398B) 
Sent: 11 April 2018 14:07
To: Pamment, Alison (STFC,RAL,RALSP); mailto:chris.d.jo...@metoffice.gov.uk; 
Juckes, Martin (STFC,RAL,RALSP)
Subject: New standard names for C4MIP - part 2 
Quick check before it is final. Do we really want the underscore between "soil" 
and "pool" in "soil_pool_carbon_decay_rate" but not in "soilpool"?

-- Evan

On 4/11/18, 5:49 AM, "Alison Pamment - UKRI STFC" wrote:

Dear Chris and Martin,

Thanks very much for the discussion of proposals 27 and 28.

Chris wrote:
> For 28 - yes, I agree this is OK, so that's done too.
> 
> For 27 - thanks for the info Martin - I agree this makes sense as a way to go 
> (having a string valued coordinate, not standardised). The name itself looks 
> OK to me too.

27. soil_pool_carbon_decay_rate (s-1)
' "Soil carbon" is the organic matter present in soil quantified by the mass of 
carbon it contains. Soil carbon is returned to the atmosphere as the organic 
matter decays. Each modelled soil carbon pool has a characteristic turnover 
time, which is modified by environmental conditions such as temperature and 
moisture so that the turnover time varies in space and time. The quantity with 
standard name soil_pool_carbon_decay_rate is defined as 1/(turnover time). The 
data variable should be accompanied by a string valued coordinate variable or 
scalar coordinate variable with standard name soilpool.'

soilpool (no units because this quantity is string valued)
'A variable with the standard name of soilpool contains strings which indicate 
the character of the soil pool classified according to the decay rate of the 
organic carbon material it contains. These strings have not yet been 
standardised.'
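
As a rough illustration of the definition above (decay rate = 1/(turnover time)), here is a minimal Python sketch. The pool labels, the turnover times and the 365-day year are assumptions made up for the example, and the coordinate uses the spelling soil_pool that the thread later settles on. The dictionaries only mimic variable attributes; they are not the API of any netCDF library.

```python
# Assumption for illustration: a 365-day year expressed in seconds.
SECONDS_PER_YEAR = 365.0 * 86400.0


def decay_rate(turnover_time_s):
    """Return the decay rate in s-1, defined as 1/(turnover time in seconds)."""
    if turnover_time_s <= 0:
        raise ValueError("turnover time must be positive")
    return 1.0 / turnover_time_s


# Hypothetical pool labels and turnover times in years; the strings are
# not standardised, as the definition of the coordinate says.
turnover_years = {"fast": 1.0, "medium": 10.0, "slow": 100.0}

rates = {
    pool: decay_rate(years * SECONDS_PER_YEAR)
    for pool, years in turnover_years.items()
}

# Attribute-like description of the data variable and of its
# string-valued coordinate variable.
data_variable = {
    "standard_name": "soil_pool_carbon_decay_rate",
    "units": "s-1",
    "coordinates": "pool",  # hypothetical name of the coordinate variable
}
pool_coordinate = {
    "standard_name": "soil_pool",
    "values": list(turnover_years),
}
```

A pool with a longer turnover time gets a proportionally smaller decay rate, so the "slow" pool here decays a hundred times more slowly than the "fast" one.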

I think we're now agreed on both these standard names. They are accepted and 
will be included in the April 16th standard names update.

28. mass_fraction_of_carbon_dioxide_tracer_in_air (Units: 1)
'The chemical formula for carbon dioxide is CO2. Mass fraction is used in the 
construction "mass_fraction_of_X_in_Y", where X is a material constituent of Y. 
It means the ratio of the mass of X to the mass of Y (including X). A chemical 
species denoted by X may be described by a single term such as "nitrogen" or a 
phrase such as "nox_expressed_as_nitrogen". A "tracer" is a quantity advected 
by a model to facilitate analysis of flow patterns.'

This name is accepted and will be included in the April 16th update.

Best wishes,
Alison

--
Alison Pamment Tel: +44 1235 778065
NCAS/Centre for Environmental Data Archival Email: 
mailto:alison.pamm...@stfc.ac.uk
STFC Rutherford Appleton Laboratory 
R25, 2.22
Harwell Oxford, Didcot, OX11 0QX, U.K.

-Original Message-
From: Jones, Chris D [mailto:chris.d.jo...@metoffice.gov.uk] 
Sent: 05 April 2018 12:43
To: Juckes, Martin (STFC,RAL,RALSP) ; Pamment, Alison (STFC,RAL,RALSP) ; 
mailto:cf-metadata@cgd.ucar.edu
Subject: RE: New standard names for C4MIP - part 2

Great - thanks both,

Looks like we can tick off 15, 16, 24.

For 28 - yes, I agree this is OK, so that's done too.

For 27 - thanks for the info Martin - I agree this makes sense as a way to go 
(having a string valued coordinate, not standardised). The name itself looks OK 
to me too.

Chris
--
Dr Chris Jones
Head, Earth System and Mitigation Science Team Met Office Hadley Centre, 
FitzRoy Road, Exeter, EX1 3PB, U.K. 
Tel: +44 (0)1392 884514 Fax: +44 (0)1392 885681
E-mail: mailto:chris.d.jo...@metoffice.gov.uk http://www.metoffice.gov.uk 

-Original Message-
From: Martin Juckes - UKRI STFC [mailto:martin.juc...@stfc.ac.uk]
Sent: 04 April 2018 20:37
To: Alison Pamment - UKRI STFC ; Jones, Chris D ; 
mailto:cf-metadata@cgd.ucar.edu
Subject: Re: New standard names for C4MIP - part 2

Dear Alison, Chris,


I've added some comments on item 27 below (in blue if your client shows colour),


regards,

Martin



From: CF-metadata on behalf of Alison 

Re: [CF-metadata] use of integral_wrt_depth_of_sea_water_practical_salinity

2018-04-16 Thread Martin Juckes - UKRI STFC
Hello Karl, Sebastien,


I'm not sure that I've understood the whole thread, but to me it looks as 
though the coordinate bounds would be the natural place to deal with this, 
though it would require a modification to the convention.


There was a related, inconclusive, discussion in 2016 on the encoding of 
histogram bin ranges in the case where some bins are not defined by the 
numerical ranges that the current convention permits for coordinate bounds 
(http://mailman.cgd.ucar.edu/pipermail/cf-metadata/2016/059037.html ). The idea 
of using flag_values and flag_meanings came up. For the current example you 
could set the lower value of the depth coordinate bounds of the vertical 
integral to -5 [m] and then have flag_values=-5, 
flag_meanings="ocean_floor".
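
To make the suggestion concrete, here is a minimal Python sketch of how a reader might decode such bounds. It assumes the sentinel value -5 and the meaning "ocean_floor" from the paragraph above; using flag_values and flag_meanings on coordinate bounds is only the idea under discussion, not something the current convention provides, and the function name is hypothetical.

```python
# Attributes a bounds variable might carry under the proposal.
bounds_attrs = {
    "units": "m",
    "flag_values": [-5.0],
    "flag_meanings": "ocean_floor",
}

# One cell: from the sea surface (0 m) down to the sentinel value.
depth_bounds = [(0.0, -5.0)]


def describe_bounds(bounds, attrs):
    """Replace sentinel values in the bounds by their flag meanings."""
    meanings = dict(zip(attrs["flag_values"], attrs["flag_meanings"].split()))
    return [tuple(meanings.get(value, value) for value in cell) for cell in bounds]


decoded = describe_bounds(depth_bounds, bounds_attrs)
```

Ordinary numerical bounds pass through unchanged, so only the cell edge marked with the sentinel is reported as the ocean floor.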


Alternatively, there appears to be agreement in 
https://cf-trac.llnl.gov/trac/ticket/152 that the cell_methods construction 
"mean where X" does not need to be restricted to horizontal spatial means. That 
ticket discusses using it for temporal means, but it could also be used for 
depth means, as in:

"cell_methods = depth: mean where sea".  The idea that the CF area type "sea" 
can be depth dependent was accepted in a discussion of usage in CMIP6, where we 
have many variables which require the surface sea extent, and others which 
require the total sea extent, including the small but significant portion 
extending under floating ice shelves. This would make the flag_values/meanings 
construction redundant.


Incidentally, the cell_methods string can be parsed by David Hassell's cf 
python library (https://pypi.python.org/pypi/cf-python ). This doesn't entirely 
solve the problem because of the variable quality of the information that has 
been encoded in the cell_methods string in the past ... but it does give us a 
tool to use in our efforts to improve the situation.
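
To show what parsing a cell_methods string involves, here is a minimal hand-rolled Python sketch. It is not the cf-python API, and the function name is hypothetical. It assumes entries written in the CF order "name: method", optionally followed by "where <type>" and "over <type>", and it ignores "within" clauses and parenthesised comments.

```python
import re

# One cell_methods entry: "name: [name: ...] method [where type] [over type]".
_ENTRY = re.compile(
    r"((?:\w+:\s+)+)"          # one or more "name: " prefixes
    r"(\w+)"                   # the method, e.g. mean
    r"(?:\s+where\s+(\w+))?"   # optional "where <area type>"
    r"(?:\s+over\s+(\w+))?"    # optional "over <area type>"
)


def parse_cell_methods(text):
    """Split a cell_methods string into a list of entry dictionaries."""
    entries = []
    for match in _ENTRY.finditer(text):
        names = [name.rstrip(":") for name in match.group(1).split()]
        entries.append(
            {
                "names": names,
                "method": match.group(2),
                "where": match.group(3),
                "over": match.group(4),
            }
        )
    return entries


parsed = parse_cell_methods("time: mean depth: mean where sea")
```

The second entry carries the area type "sea" in its "where" field, which is the information a depth mean restricted to sea would need.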


regards,

Martin




From: CF-metadata  on behalf of Sebastien 
Villaume 
Sent: 13 April 2018 17:30
To: Karl Taylor
Cc: cf-metadata@cgd.ucar.edu; Jonathan Gregory
Subject: Re: [CF-metadata] use of 
integral_wrt_depth_of_sea_water_practical_salinity

Hi Karl,

I tend to agree that this solution is far from ideal.

The core issue is that there is no clear separation between a parameter 
(diagnostic quantities, observables, coordinates etc.) and what you do with it 
in CF: everything is squeezed into the standard name and the cell_methods (in an 
inconsistent way).

In an ideal world, the standard names should only describe bare parameters and 
everything related to processing should go into something else. But many 
standard names make reference to time, space, post-processing, extra useful 
information, etc.
The cell_methods attribute is in principle there to represent any 
(post-)processing, but that is not always the case: sometimes the information is 
in the standard name directly, and sometimes cell_methods is too limited to 
describe what needs to be described, as in my case here...
To maintain a strict separation, "integral_wrt_X_of_Y" should have been one of the 
cell methods from the beginning. I also never understood why "difference" is 
not a valid method in table E.1 of appendix E, since "sum" is there.

A few months ago I noticed a thread discussing ontologies in connection with the 
proposal of standard names for isotopes. Hundreds of new standard names were 
added. To me this was all wrong: only a few standard names should have been 
added: mass_concentration, density, optical_depth, whatever physical property 
you like. Each variable holding one of these standard names should point to a 
scalar through a controlled attribute. The scalar should name the isotope or 
the type of particle or the chemical constituent, etc.
I can already see hundreds of new standard names coming each time a new useful 
property for isotopes or molecules is required.

You will not prevent an explosion of standard names if you don't limit them to the 
"what". The "when" should go in the time variable(s), the "where" in the 
spatial variables, and finally the "how" either in the cell_methods with a clear 
controlled vocabulary or using a new controlled mechanism yet to be defined.

/Sébastien

- Original Message -
> From: "Karl Taylor" 
> To: "Sebastien Villaume" , "Lowry, Roy K." 
> , "Jonathan Gregory"
> 
> Cc: cf-metadata@cgd.ucar.edu
> Sent: Friday, 13 April, 2018 16:32:39
> Subject: Re: [CF-metadata] use of 
> integral_wrt_depth_of_sea_water_practical_salinity

> Dear all,
>
> I am wary of a "slippery slope" if every calculation performed on a
> quantity results in a new standard name for that quantity.  We have
> tried to avoid that in most cases by use of the cell methods, bounds,
> and climatology attributes.  Isn't there some way to accommodate this in
> a more general way?  I agree that