Hello All,

There are a number of variables in the CMIP6 data request which are requested 
as weighted means, such as the age of snow as a time mean weighted by mass of 
snow. We also have area-weighted means, as in the monthly mean temperature of 
sea ice weighted by sea-ice area. The latter can be handled well enough with 
the existing cell_methods syntax: "time: mean where sea_ice". For the former, 
the CMIP5 approach was to have a comment indicating that weighting should be 
used. As this is a reasonably common operation, it would be nice to have 
something more explicit in the cell_methods attribute.
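
To make the distinction concrete, here is a minimal Python sketch of a simple 
time mean next to a mass-weighted time mean for one grid cell. The function 
names and the sample numbers are invented for illustration; they are not taken 
from the data request.

```python
def time_mean(values):
    """Simple (unweighted) mean over the time steps."""
    return sum(values) / len(values)


def weighted_time_mean(values, weights):
    """Mean over the time steps, weighted by e.g. snow mass at each step."""
    total = sum(weights)
    if total == 0:
        # No snow at any time step: the weighted mean is undefined.
        return None
    return sum(v * w for v, w in zip(values, weights)) / total


# Hypothetical daily values for a single grid cell.
snow_age = [10.0, 2.0, 4.0]   # days
snow_mass = [1.0, 5.0, 4.0]   # kg m-2

print(time_mean(snow_age))
print(weighted_time_mean(snow_age, snow_mass))
```

The two results differ (about 5.3 against 3.6 days here), which is why a bare 
"time: mean" leaves the user unsure which quantity is in the file.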

I can think of several possibilities:
(1) add a "weighted-mean" method, and leave it up to the data provider to give 
additional information. This would at least alert the user that they need to 
look for that information, which would already be an improvement. In the 
present convention "time: mean" can mean either a simple mean or a weighted 
mean. By adding the "weighted-mean" option we would be able to stipulate that 
"time: mean" only be used for non-weighted means, which would remove an 
existing ambiguity.

(2) add a "weighted-by: <variable name>" option in the cell_methods comment, 
similar to the "interval: ..." clause, e.g. "time: mean 
(weighted-by: snw)". This would give more information, but if the comment is 
considered optional it does not remove the ambiguity that "time: mean" can 
apply to either a weighted or an unweighted mean. Making the comment 
obligatory for weighted means would blur the status of the comment. There is 
also the problem that, since the variable "snw" is not going to be in the 
file, the information remains incomplete.

(3) add a weighted clause, e.g. "time: mean [where ...] weighted snw". The 
main problem here is that parsing cell_methods is already complicated, and this 
would add to that difficulty, though only in a small incremental way.

(4) as (1), but with an additional requirement that the dimension over which 
the weighted mean is being taken carry information about the weighting. The 
information could be attached as a specified attribute "weighting". Because 
the weighting variable will generally be at a higher frequency than the 
weighted mean we are trying to describe, it will not be sensible to include it 
in the file, so this attribute will at most provide a clue about the 
provenance. For example, it might be of the form 
"<variable name> [(<optional comment>)]", e.g. 
'weighting: snw (daily snow mass --- archived in the "day" MIP table)'.
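
To compare how the cell_methods strings in options (1) to (3) would look to a 
parser, here is a rough Python sketch. The grammar in the regular expression 
is hypothetical: it covers only a single "dimension: method" entry plus the 
clauses proposed above, it is not the CF cell_methods grammar, and the "snw" 
weighting variable is simply the example used in this message.

```python
import re

# Hypothetical grammar for one cell_methods entry, extended with the
# proposals above. None of the extensions is part of the CF conventions.
ENTRY = re.compile(
    r"^(?P<dim>\w+):\s+(?P<method>[\w-]+)"
    r"(?:\s+where\s+(?P<where>\w+))?"
    r"(?:\s+weighted\s+(?P<weighted>\w+))?"      # option (3)
    r"(?:\s+\((?P<comment>[^)]*)\))?$"           # option (2) lives in here
)


def parse_cell_method(text):
    """Split one entry into dimension, method, where-clause and weight."""
    match = ENTRY.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised cell_methods entry: {text!r}")
    parts = match.groupdict()
    weight = parts["weighted"]
    comment = parts["comment"]
    # Option (2): the weight is carried inside the parenthesised comment.
    if weight is None and comment and comment.startswith("weighted-by:"):
        weight = comment[len("weighted-by:"):].strip()
    return {
        "dim": parts["dim"],
        "method": parts["method"],
        "where": parts["where"],
        "weight": weight,
    }


for entry in (
    "time: mean where sea_ice",            # existing syntax
    "time: weighted-mean",                 # option (1)
    "time: mean (weighted-by: snw)",       # option (2)
    "time: mean where land weighted snw",  # option (3)
):
    print(parse_cell_method(entry))
```

Under option (1) the parser learns that a weighting exists but not which one; 
under (2) and (3) it recovers the name "snw", at the cost of one more clause 
to handle.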

The last option appears the cleanest to me, as it does not change the grammar 
of the cell_methods string and adds additional information to the relevant 
dimension in a fairly self-explanatory way.
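
As a rough illustration of option (4), the Python sketch below uses plain 
dicts in place of a real netCDF file. The "weighting" attribute and the 
"weighted-mean" method are the proposals from this message, not existing CF 
features, and the variable name "agesno" and the helper find_weighting are 
only illustrative.

```python
# Hypothetical file metadata, written as plain dicts rather than read
# from a netCDF file.
variables = {
    "time": {
        "standard_name": "time",
        # Proposed attribute on the dimension being averaged over.
        "weighting": 'snw (daily snow mass --- archived in the "day" MIP table)',
    },
    "agesno": {
        # Proposed method name; the cell_methods grammar itself is unchanged.
        "cell_methods": "time: weighted-mean",
    },
}


def find_weighting(variables, name):
    """Return {dimension: weighting attribute} for each weighted-mean entry."""
    tokens = variables[name].get("cell_methods", "").split()
    found = {}
    # Walk adjacent token pairs looking for "<dim>: weighted-mean".
    for dim_token, method in zip(tokens, tokens[1:]):
        if dim_token.endswith(":") and method == "weighted-mean":
            dim = dim_token[:-1]
            # None signals that the provider left the attribute out.
            found[dim] = variables.get(dim, {}).get("weighting")
    return found


print(find_weighting(variables, "agesno"))
```

A reader of the file only has to follow the dimension name from cell_methods 
to the coordinate variable to find the provenance note.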

Perhaps this has been discussed before? Any other thoughts?

regards,
Martin

_______________________________________________
CF-metadata mailing list
[email protected]
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
