Hi, all!

The CF metadata standard lacks specifications for dealing with probabilities. 
To amend this, we would like to propose the following additions to the CF 
metadata standard. This proposal is largely based on a discussion on this 
list from 2011. The first message of that discussion is in the archive here: 
http://mailman.cgd.ucar.edu/pipermail/cf-metadata/2011/049412.html

This proposal consists of three separate parts: a new standard_name modifier, 
a new standard_name, and conventions for handling more complex cases.


1) A new standard name modifier: confidence

The simplest cases may be expressed using a new standard name modifier, 
confidence. It changes the variable's meaning to express our confidence in 
the correctness of another variable. Appending the modifier to a standard 
name always changes the units to 1 or something equivalent. A value of 1 
means high confidence, and 0 means no confidence.

The modifier may be used like this:

float wind_speed(time, latitude, longitude) ;
  wind_speed:units = "m/s" ;
  wind_speed:standard_name = "wind_speed" ;
  wind_speed:ancillary_variables = "wind_speed_confidence" ;

float wind_speed_confidence(time, latitude, longitude) ;
  wind_speed_confidence:units = "1" ;
  wind_speed_confidence:standard_name = "wind_speed confidence" ;

In the second variable, the units have changed from "m/s" to "1", and the 
standard_name contains the new modifier. This means that the data expresses 
confidence that wind_speed is correct for each time and grid point. Note that 
in this case, we do not explicitly specify what we mean by "correct".
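
To make the rules for the modifier concrete, here is a minimal Python sketch 
of how a reader of such a file might check a confidence variable. The helper 
names split_standard_name and check_confidence_variable are made up for this 
illustration and are not part of CF or of any library. The sketch also 
assumes, for simplicity, that the units are written exactly as "1", although 
the proposal allows equivalent units.

```python
# Illustrative sketch only: both helpers are hypothetical, not CF tooling.

def split_standard_name(value):
    """Split 'wind_speed confidence' into ('wind_speed', 'confidence')."""
    parts = value.split()
    if len(parts) == 1:
        return parts[0], None
    if len(parts) == 2:
        return parts[0], parts[1]
    raise ValueError("unexpected standard_name: %r" % value)


def check_confidence_variable(attrs, values):
    """Return a list of problems found in a proposed confidence variable."""
    problems = []
    _, modifier = split_standard_name(attrs["standard_name"])
    if modifier != "confidence":
        problems.append("standard_name lacks the confidence modifier")
    # Simplification: only the literal "1" is accepted here.
    if attrs.get("units") != "1":
        problems.append("units should be 1")
    # 0 means no confidence and 1 means high confidence.
    if any(v < 0.0 or v > 1.0 for v in values):
        problems.append("values outside [0, 1]")
    return problems


attrs = {"units": "1", "standard_name": "wind_speed confidence"}
print(check_confidence_variable(attrs, [0.0, 0.35, 1.0]))  # prints []
```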


2) A new standard_name: cumulative_distribution_function_over_realization

This is intended for expressing a range of scenarios as quantiles of the 
data. A variable with this standard name holds a list of quantiles (such as 
percentiles), and may be used as a dimension for other variables. It may for 
example look like this:

float percentile(percentile) ;
  percentile:units = "%" ;
  percentile:standard_name = "cumulative_distribution_function_over_realization" ;

float air_temperature_percentiles(time, percentile, latitude, longitude) ;
  air_temperature_percentiles:units = "K" ;
  air_temperature_percentiles:standard_name = "air_temperature" ;

The percentile dimension may for example contain these values: 10, 25, 50, 75, 
90. air_temperature_percentiles would then contain data for five different 
cases.
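
As an illustration of how the five cases could be filled in, the Python 
sketch below computes the percentiles for a single time and grid point from 
a set of ensemble members. The member values are invented, and the linear 
interpolation rule is an assumption for this example only; the proposal does 
not say how the quantiles should be estimated.

```python
# Illustrative sketch: member values and the interpolation rule are assumptions.

def percentile(sorted_values, p):
    """Percentile p (0 to 100) by linear interpolation between ranks."""
    position = (p / 100.0) * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    low_value = sorted_values[lower]
    return low_value + fraction * (sorted_values[upper] - low_value)


# Air temperature (K) from five hypothetical realizations at one grid point.
members = sorted([272.0, 270.0, 274.0, 271.0, 273.0])

# Values of the percentile coordinate variable from the example above.
levels = [10, 25, 50, 75, 90]

# One value per percentile, i.e. one slice along the percentile dimension.
air_temperature_percentiles = [percentile(members, p) for p in levels]
print(air_temperature_percentiles)
```

With these members the result is approximately 270.4, 271.0, 272.0, 273.0 
and 273.6 K.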


3) Conventions for intervals

In some cases, it may be necessary to be more specific when stating 
confidence in a given value. What do we mean when we say that we are 79% 
certain that air temperature will be 16 degrees? We may for instance want to 
say that we are 79% certain that the temperature will be within +/- 1 degree 
of 16 degrees, and 93% certain that it will be within +/- 2 degrees. For this 
purpose we introduce a convention: intervals.

Using intervals does not require any new standard names or modifiers. Instead, 
we use bounds to specify the ranges for data. Here is an example, using 
air_temperature, where we give confidence that a temperature forecast is within 
+/- 1.5 or +/- 2.5 degrees:

float temperature_bounds(interval_of_air_temperature, confidence_bounds) ;
    // values: [-1.5, 1.5], [-2.5, 2.5]
  temperature_bounds:long_name = "bounds of temperature - for confidence variables" ;

float interval_of_air_temperature(interval_of_air_temperature) ;
    // values: [1.5, 2.5]
  interval_of_air_temperature:bounds = "temperature_bounds" ;
  interval_of_air_temperature:units = "K" ;
  interval_of_air_temperature:long_name = "air_temperature offset from a given value (in either direction)" ;

float air_temperature_confidence(time, interval_of_air_temperature, latitude, longitude) ;
  air_temperature_confidence:units = "1" ;
  air_temperature_confidence:standard_name = "air_temperature confidence" ;
  air_temperature_confidence:long_name = "probability of air_temperature within +/- interval_of_air_temperature" ;

float air_temperature(time, latitude, longitude) ;
  air_temperature:units = "K" ;
  air_temperature:standard_name = "air_temperature" ;
  air_temperature:ancillary_variables = "air_temperature_confidence" ;

There are a few things to note here.

  * air_temperature_confidence uses the new standard name modifier, confidence.
  * air_temperature_confidence has an extra dimension, 
    interval_of_air_temperature. This specifies the range of air temperature, 
    relative to the forecast, that the confidence applies to.
  * interval_of_air_temperature, in turn, specifies a bounds variable, which 
    gives the exact temperature offset in each direction.
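
To show how the bounds and the confidence values relate, here is a small 
Python sketch for a single time and grid point. It assumes the confidence is 
estimated as the fraction of ensemble members that fall inside each interval 
around the forecast. That estimator, the function name interval_confidence 
and the member values are all assumptions made for this illustration; the 
proposal does not prescribe how the confidence is derived.

```python
# Illustrative sketch: the estimator and all values are assumptions.

def interval_confidence(members, forecast, bounds):
    """Fraction of members inside forecast + [low, high], per bounds pair."""
    result = []
    for low, high in bounds:
        inside = sum(1 for m in members
                     if forecast + low <= m <= forecast + high)
        result.append(inside / len(members))
    return result


# Offsets from the example: +/- 1.5 K and +/- 2.5 K around the forecast.
temperature_bounds = [(-1.5, 1.5), (-2.5, 2.5)]

# Hypothetical ensemble members (K) and a forecast of 16 degrees Celsius.
members = [287.2, 288.4, 289.0, 289.6, 290.3, 291.0, 291.8, 289.3]
forecast = 289.15

air_temperature_confidence = interval_confidence(
    members, forecast, temperature_bounds)
print(air_temperature_confidence)  # prints [0.625, 0.875]
```

The result has one value per entry of interval_of_air_temperature, which 
matches the extra dimension on air_temperature_confidence.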


Specifying lower limits of precipitation

We may also want to express the probability that some value will be above or 
below a certain threshold. A similar construct may be used for this. Here is 
an example, where we express confidence that the amount of rain for a period 
will be above a certain threshold:

float precipitation_bounds(lower_limit_of_precipitation, confidence_bounds) ;
    // values: [0.1, inf], [0.2, inf], [0.5, inf], [1, inf], [2, inf], [5, inf]
  precipitation_bounds:long_name = "bounds of precipitation - for confidence variables" ;

float lower_limit_of_precipitation(lower_limit_of_precipitation) ;
    // values: [0.1, 0.2, 0.5, 1, 2, 5]
  lower_limit_of_precipitation:bounds = "precipitation_bounds" ;
  lower_limit_of_precipitation:units = "kg/m2" ;
  lower_limit_of_precipitation:long_name = "lower limit of precipitation" ;

float precipitation_limit_confidence(time, lower_limit_of_precipitation, latitude, longitude) ;
  precipitation_limit_confidence:units = "1" ;
  precipitation_limit_confidence:standard_name = "precipitation_amount confidence" ;
  precipitation_limit_confidence:long_name = "probability of precipitation_amount above lower_limit_of_precipitation" ;
  precipitation_limit_confidence:cell_methods = "time: sum" ;

In this case, we use infinity as the upper bound for precipitation. In cases 
where the bounds type does not have a special infinity value, such as int, 
that type's maximum value should be used.
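
The open-ended bounds can be sketched in Python as follows, again for a 
single time and grid point. As an assumption for this example, the 
confidence is estimated as the fraction of ensemble members at or above each 
lower limit; whether the limit itself counts as "above" is a choice made 
here and is not settled by the proposal. The function name 
threshold_confidence and the member values are likewise invented.

```python
# Illustrative sketch: the estimator and all values are assumptions.
import math


def threshold_confidence(members, bounds):
    """Fraction of members with low <= value <= high, per bounds pair."""
    return [sum(1 for m in members if low <= m <= high) / len(members)
            for low, high in bounds]


# Lower limits from the example (kg/m2), each paired with an infinite upper bound.
limits = [0.1, 0.2, 0.5, 1, 2, 5]
precipitation_bounds = [(limit, math.inf) for limit in limits]

# Hypothetical accumulated precipitation (kg/m2) from ten realizations.
members = [0.0, 0.0, 0.3, 0.6, 1.2, 2.5, 4.0, 7.5, 0.15, 0.05]

precipitation_limit_confidence = threshold_confidence(
    members, precipitation_bounds)
print(precipitation_limit_confidence)  # prints [0.7, 0.6, 0.5, 0.4, 0.3, 0.1]

# For an integer bounds variable there is no infinity, so the largest value
# of the type stands in for it (a 32-bit netCDF int is assumed here).
INT32_MAX = 2**31 - 1
int_bounds = [(limit, INT32_MAX) for limit in (1, 2, 5)]
print(threshold_confidence(members, int_bounds))  # prints [0.4, 0.3, 0.1]
```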

Any comments?


-- Vegard