Thomas,

Is there a particular reason why you aren't using a single variable to indicate your different classifications? If you want to go with a variable that uses the standard name "area_type", it seems to me that it should be constructed like

   string sea_ice_type(time, xc, yc) ;
       sea_ice_type:standard_name = "area_type" ;
       sea_ice_type:long_name = "sea ice type classification" ;

The values of the elements of the variable would then all be strings from the area type table, and you would request that the names you need be added to that table. Of course, this is a terribly bulky way of doing this, so I'm not surprised that you would like to find a different way.
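To put a rough number on "bulky", here is a minimal Python sketch that compares one byte per cell against a fixed-width character array per cell. The names are the ones discussed in this thread; the 1000 x 1000 grid is a made-up size, and the fixed-width, uncompressed storage is an assumption (a real file could use variable-length strings or compression).

```python
# Names taken from this thread; whether they all end up in the area type
# table is still an open question.
area_types = [
    "ice_free_sea", "first_year_sea_ice", "multi_year_sea_ice",
    "ambiguous", "no_data", "unclassified", "land",
]

# Hypothetical grid size for one time step, for illustration only.
n_cells = 1000 * 1000

# Assumption: strings are stored as fixed-width char arrays whose width is
# the longest name, with no compression.
string_width = max(len(name) for name in area_types)
bytes_as_strings = n_cells * string_width
bytes_as_flags = n_cells  # one byte per cell

ratio = bytes_as_strings // bytes_as_flags
print(f"string variable: {bytes_as_strings} bytes")
print(f"byte variable:   {bytes_as_flags} bytes")
print(f"ratio:           {ratio}x")
```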

My first thought on how to represent the information is something like

   byte sea_ice_type(time, xc, yc) ;
       sea_ice_type:standard_name = "????" ;
       sea_ice_type:long_name = "sea ice type classification" ;
       sea_ice_type:flag_values = 1b, 2b, 3b, 4b, 100b, 101b, 102b ;
       sea_ice_type:flag_meanings = "ice_free_sea first_year_sea_ice multi_year_sea_ice ambiguous no_data unclassified land" ;
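As a rough illustration of how a reader of such a file might use these two attributes, here is a minimal Python sketch. The attribute values are copied from the variable above; the helper name build_flag_lookup and the sample row of cells are made up for the example, and a real program would read the attributes from the file with a netCDF library instead of hard-coding them.

```python
# Attribute values copied from the sketch above; in a real file they would be
# read from the variable's attributes.
flag_values = [1, 2, 3, 4, 100, 101, 102]
flag_meanings = ("ice_free_sea first_year_sea_ice multi_year_sea_ice "
                 "ambiguous no_data unclassified land")


def build_flag_lookup(values, meanings):
    """Pair each flag value with its blank-separated meaning (hypothetical helper)."""
    names = meanings.split()
    if len(names) != len(values):
        raise ValueError("flag_values and flag_meanings differ in length")
    return dict(zip(values, names))


lookup = build_flag_lookup(flag_values, flag_meanings)

# Hypothetical row of grid cells, for illustration only.
cells = [1, 2, 2, 3, 4, 102, 100]
decoded = [lookup[c] for c in cells]
print(decoded)
```

The same lookup is what a plotting tool would need to build a legend, since each value maps to exactly one name.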

I think it might be best to request a new standard name, such as "class" or "classification", that is defined in general terms to indicate any "flag" variable that has a discrete set of integer values used to categorize. Read the definition of the "soil_type" standard name for an example. Alternatively, you could request that the area_type definition be expanded to include a numeric classification scheme, as is specified for soil_type. If people are determined to have a standardized vocabulary for area type, then the definition could indicate that the names used in the flag_meanings attribute would come from the area type table. Then, at that point, you request that the needed names be added to the area type table.

Grace and peace,

Jim

On 9/22/2011 3:52 AM, Thomas Lavergne wrote:
Dear all,

This message is to revive (part of) a short-lived thread in the last days of May 2011. Jonathan was kind enough to answer some of my questions then, but it never ended in a definite solution to my problem, thus the need (for me) to revive the subject.

I have a (satellite product) data field which is a land-surface classification. That means that for each cell in my grid, I have a single "area type": "open water", "first year ice", "multi year ice", "land", "unclassified", and "no data". For those interested, a picture of an example product is available here (with a color scale if you zoom in):
http://osisaf.met.no/p/ice/nh/type/imgs/OSI_HL_SAF_201109211200_pal.jpg

What is the "CF" way of storing such a dataset, and what is the associated standard_name? In the latest version of the CF document, I found a quite in-depth description of how to specify statistics per area type inside a grid cell, and I suspect I must use the "area_type" mechanism. But when I assign an area_type to my cell, it means that "the cell is mostly covered with area_type" (and I am not estimating the area fraction).

1) When it comes to the definition of my "area_type", do I have to use the CF standard ones? I understand it is best, but on the other hand, the strings entering "flag_meanings" are not standardized in any way, yet are useful information that a human can take advantage of, and that a machine can easily use to create a colored map and the associated legend.

2) Still, I am open to defining them:
"open_water" will probably be changed to "ice_free_sea" (already in the standard table). "first_year_sea_ice" and "multi_year_sea_ice" might be a bit tricky to define (is it "sea ice that survived a summer melting" or "sea ice whose age is larger than 1 year"?). "land" is already in the standard table. What happens then to "ambiguous" (we tried to estimate the type of sea ice, but failed: we cannot decide between first_year and multi_year)? This might also be interpreted as "even mixture of several (but not all) area types". Finally, "unclassified" and "no data" are where we do not have sufficient data or confidence to even start the classification: we know up front that the result will be too uncertain.


3) Would the following 2 datasets be accepted (omitting the dimension and grid_mapping definitions)?
byte sea_ice_type(time, xc, yc) ;
    sea_ice_type:standard_name = "area_type" ;
    sea_ice_type:long_name = "sea ice type classification" ;
    sea_ice_type:_FillValue = -1b ;
    sea_ice_type:valid_min = 1b ;
    sea_ice_type:valid_max = 4b ;
    sea_ice_type:area_type_values = 1b, 2b, 3b, 4b ;
    sea_ice_type:area_type_meanings = "ice_free_sea first_year_sea_ice multi_year_sea_ice ambiguous" ;
byte sea_ice_type_qflags(time, xc, yc) ;
    sea_ice_type_qflags:standard_name = "sea_ice_type status_flag" ;
    sea_ice_type_qflags:_FillValue    = -1b ;
    sea_ice_type_qflags:valid_min     = 0b ;
    sea_ice_type_qflags:valid_max     = 102b ;
    sea_ice_type_qflags:flag_values   = 0b, 100b, 101b, 102b ;
    sea_ice_type_qflags:flag_meanings = "nominal_quality no_data unclassified land" ;

The first dataset is inspired by CF "flags" (with <x>_values and <x>_meanings, <x> being "area_type"). It documents where a sea_ice_type is actually defined (that is, where we have data and where we are not over, or close to, land). The second one is a regular CF "status_flag" describing the quality of the classification in each cell (and explaining why there is no valid classification in some cells).
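For comparison with the single-variable layout suggested in the reply (values 1 to 4 for the types, 100 to 102 for the status cases), here is a minimal Python sketch of how the two variables above could be folded into one flag value per cell. The function name merge_cell and the sample cells are made up, and the rule that a cell holds a valid type exactly when its quality flag is 0 is an assumption that is not stated in the thread.

```python
FILL = -1  # _FillValue used by both variables in the proposal


def merge_cell(ice_type, qflag):
    """Combine one (sea_ice_type, qflag) pair into a single flag value.

    Assumption: a cell carries a valid type exactly when its quality
    flag is 0 (nominal_quality). Hypothetical helper, for illustration.
    """
    if qflag == FILL:
        return FILL  # no quality information at all
    if qflag == 0:
        if ice_type == FILL:
            raise ValueError("nominal quality but no sea ice type")
        return ice_type  # 1..4, as in area_type_values
    return qflag  # 100 no_data, 101 unclassified, 102 land


# Hypothetical cells, for illustration only.
types = [1, 2, FILL, FILL, 4]
qflags = [0, 0, 102, 100, 0]
merged = [merge_cell(t, q) for t, q in zip(types, qflags)]
print(merged)
```

Because the two value ranges do not overlap, the merge loses no information, and the split into two variables can be recovered from the merged values.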

4) If the former is not accepted, does CF have a standard way of storing these classifications? Please direct me to the appropriate section in the doc, and sorry if I missed it.

5) If my proposal is ok, then should we define the two attributes "area_type_values" and "area_type_meanings"? Maybe they could be a generalization, <x>_values / <x>_meanings, alongside "flag_values" / "flag_meanings"?

6) Is "area_type" truly the standard_name I am looking for for my first dataset? Maybe something like "sea_ice area_type" ("area_type" as a standard_name modifier) is what I want?

Thank you for reading this far into my question! I hope you can help me define what my file should look like before it is "released" (and becomes more difficult to amend). Hopefully, this will also turn into a CF-standard way of handling surface classification (if it is currently missing, that is).

All the best,
Thomas
_______________________________________________
CF-metadata mailing list
[email protected]
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata

--
Jim Biard

Government Contractor, STG Inc.
Remote Sensing and Applications Division (RSAD)
National Climatic Data Center
151 Patton Ave.
Asheville, NC 28801-5001

[email protected]
828-271-4900
