Dear Jonathan, yes, it is absolutely clear that "where" can only be used with area types. It is also clear, I thought, that some of these area types may vary with time: the area type list includes "fire" and "cloud", for example.
Section 7.3.3 gives a template for the cell_methods element:

    dim1: [dim2: [dim3: ...]] method [where type1 [over type2]] [within|over days|years] [(comment)]

and goes on to say: "The valid values for dim1 [dim2[dim3 ...] ] are the names of dimensions of the data variable, names of scalar coordinate variables of the data variable, valid standard names, or the word area." There is no stated restriction on the values of "dim" which can precede "where ...".

You appear to be taking a geographical interpretation of "where" and assuming that it can only apply to spatial information, but I have been reading it from a mathematical perspective, in which it can refer to any dimension. In mathematics, statements of the form "sum of A where condition B" carry no implication that "where" has anything to do with area. From this perspective, there is no need to introduce "when".

The use case that prompted this, from SIMIP, corresponds to your 3rd example, in which we are averaging over all points in the cell and time period covered for which the area type is valid, giving each point equal weight. This can be handled, as you and Karl have pointed out earlier, with a comment in the cell_methods string of the form "(weighted by ....)", but I feel that the use case is clear enough that there is a need for it to be treated in the conventions.

regards,
Martin

##############################################################

Dear Martin

In my reading of 7.3.3 and the conformance document, it seems clear that "where" is intended to be used with area types.

> There is an issue, it appears, about the use of the "where" modifier for
> cell_methods elements other than "area:". Jonathan believes "where" should
> only apply for area on the basis that this is where the motivation comes from in
> the first paragraph of section 7.3.3. The subsequent paragraphs in section
> 7.3.3. describe the use of "where" with a generic element "name: ....". The
> compliance document clearly states that "where" can be used with any string.

I'm sorry, I can't find that - please could you point it out? In
http://cfconventions.org/Data/cf-documents/requirements-recommendations/requirements-recommendations-1.6.html
regarding

    method [where type1 [over type2]]

it says

    The valid values for type1 are the name of a string-valued auxiliary or
    scalar coordinate variable with a standard_name of area_type, or any
    string value allowed for a variable of standard_name of area_type.

We could generalise area_types to mean "states" so they can apply in time as well as space. I think all the existing ones could be interpreted in this way i.e. with the sense of "when" rather than "where". Vegetation is sometimes present and sometimes absent at any given spot, for instance, just as it is present in some spots and not others at any given time.

Suppose you want to calculate a radiative flux for a grid-box in cloud-free air. You can do this on each instantaneous timestep for the cloud-free fraction of the grid-box, and then calculate a time-mean of these timestep values i.e. "area: mean where clear_sky time: mean". If the input data supplies a higher spatial resolution than the grid-box, so you have many timeseries, you could alternatively do it the other way round, and first calculate, for each of the points, the value of the flux for those timesteps when there is no cloud, then calculate an area-mean of these local values i.e. "time: mean where clear_sky area: mean".

These aren't the same because they imply different weights. For example, suppose you have three points within the grid-box and two times, and the data is as follows:

    a X X
    b c X

where X means cloudy, and a, b, c are clear-sky values. According to the first method, the value is a/2 + b/4 + c/4, and according to the second method it is a/4 + b/4 + c/2, if I've done my sums right.
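The arithmetic in this example can be checked with a short Python sketch. It assumes the data reads as two times (rows) by three points (columns), with a X X at the first time and b c X at the second, which is the layout consistent with the quoted results. The numeric values for a, b and c are arbitrary, and all variable and function names are hypothetical. It also assumes that a point which is cloudy at every time contributes no weight to the area mean. The pooled mean over all clear-sky samples in time and space together is included for comparison.

```python
from fractions import Fraction

# Arbitrary illustrative clear-sky values (assumption, not from the thread).
a, b, c = Fraction(6), Fraction(12), Fraction(24)
X = None  # marks a cloudy sample

# Assumed layout: rows are the two times, columns are the three points.
data = [
    [a, X, X],  # first time
    [b, c, X],  # second time
]


def clear_mean(values):
    """Mean of the non-missing values, or None if every value is missing."""
    clear = [v for v in values if v is not None]
    return sum(clear) / len(clear) if clear else None


# "area: mean where clear_sky time: mean":
# clear-sky area mean at each time, then a mean over the times.
area_then_time = clear_mean([clear_mean(row) for row in data])

# "time: mean where clear_sky area: mean":
# clear-sky time mean at each point, then a mean over the points.
# The always-cloudy third point yields None and is skipped.
points = list(zip(*data))
time_then_area = clear_mean([clear_mean(series) for series in points])

# "time: area: mean where clear_sky":
# every clear-sky sample in time and space gets equal weight.
pooled = clear_mean([v for row in data for v in row])

print(area_then_time, a / 2 + b / 4 + c / 4)
print(time_then_area, a / 4 + b / 4 + c / 2)
print(pooled, (a + b + c) / 3)
```

Exact fractions are used so that each result can be compared with the weighted sum without floating-point rounding.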
There is a third method, in which we consider both time and space together: "time: area: mean where clear_sky". In this case the value is a/3 + b/3 + c/3.

If I'm right about this, I think we could make this generalisation and it would not be problematic. However, as usual, we should only make the change if there is a use-case which demands it.

Best wishes
Jonathan
_______________________________________________
CF-metadata mailing list
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
