Background: As you all know, the vast majority of standard_names are for numeric variables and have associated "Canonical Units". However, there are some existing standard_names for string variables (e.g., area_type, institution, land_cover, land_cover_lccs, platform_id, platform_name, region, sensor_band_identifier, source, and surface_cover). They do not have associated Canonical Units.
If the "data_type=string" and "charset" attributes are accepted by CF and thus we can clearly identify String variables and know their character encoding, I would like to propose that we add several additional standard_names that identify/describe the String variables in the same way that other standard_names describe numeric variables. I want to get your comments and suggestions before I formally propose them.

Here are the possible additional String standard_names, with definitions and [comments in brackets]. I have current needs/uses for almost all of these (most exceptions are noted below), e.g., I have a tabular dataset where each row has information about a different project at NOAA.

doi
Each String specifies a single Digital Object Identifier.

email_address
Each String specifies a single email address.

phone_number
Each String specifies a single, voice (thus not including fax numbers), international (thus starting with +countryCode) phone number. The E.164 format is required:
+countryCode subscriberNumberIncludingAreaCode
e.g., "+1 202 456 1111" (The White House!)
https://en.wikipedia.org/wiki/E.164
Spaces between the country code, the area code, the prefix, and the number are strongly encouraged but not required. Parentheses and dashes are discouraged.

uri
Each String specifies a single URI.

url
Each String specifies a complete, single URL. It must start with a "scheme" (http:// , https:// , ftp:// , etc.). [It would be possible in the future to add related standard_names by appending a specific subtype, e.g., url_project_webpage, url_iso19115_2, url_image, if there is a need and if people think it's a good idea.]

html_document
Each String specifies a complete HTML document. [I am not sure about this one. I admit I don't have a current use case, but I think it is important to distinguish a complete HTML document from a snippet.]
html_description
Each String is a snippet of text using HTML markup which describes something [e.g., a project, a buoy, the condition of a beached whale, ...]

html_snippet
Each String is a snippet of text using HTML markup tags that isn't a complete HTML document. This is to be used for HTML snippets whenever there isn't a suitable, more specific variant, e.g., html_description. [I'm open to words other than "snippet".]

json
Each String is JSON-text: a JSON object, array, number, string, or one of the following three literal names: false, null, true. See http://www.rfc-editor.org/rfc/rfc7159.txt

json_geojson
Each String is GeoJSON, as specified by https://tools.ietf.org/html/rfc7946

wkt_geometry
Each String specifies a complete WKT geometry as specified in the ISO/IEC 13249-3:2016 standard, "Information technology – Database languages – SQL multimedia and application packages – Part 3: Spatial" (SQL/MM). [If additional variants need to be specified in the future, we can append _*subtype*, e.g., wkt_geometry_iso13249_3_2016. NOTE that the use of wkt_geometry with a String variable (a multidimensional char with a charset attribute) doesn't preclude other methods of storing geometries.]

wkt_crs
Each String specifies a WKT CRS as specified by ISO 19162:2015, "Geographic information – Well-known text representation of coordinate reference systems".

xml_document
Each String specifies a complete XML document. Use this only if there isn't a suitable, more specific variant, e.g., xml_iso19115_2. [I am not sure about this one. I admit I don't have a current use case, but I think it is important to distinguish a complete XML document from a snippet.]

xml_iso19115_2
Each String specifies a complete ISO 19115-2 / ISO 19139 XML document. [Ted Habermann: does this make your day? :-) ]

xml_iso19115_1
Each String specifies a complete ISO 19115-1 XML document. [My need for this is not immediate, but I know it is coming.]
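To make a few of the definitions above concrete, here is a minimal Python sketch of how a data provider might check values for phone_number, url, and json before writing them. The function names are hypothetical and are not part of CF or of any CF tool. The checks are also assumptions about how to read the definitions: the phone check accepts spaces but rejects parentheses and dashes (which are only "discouraged" above), it applies the E.164 limit of 15 digits, and the url check requires a host after the scheme.

```python
import json
import re
from urllib.parse import urlsplit

# Hypothetical helpers, for illustration only.

# "+", a country code that does not start with 0, then digit groups
# optionally separated by single spaces. No parentheses or dashes.
_PHONE_RE = re.compile(r"\+[1-9][0-9]*( [0-9]+)*")


def is_phone_number(value: str) -> bool:
    """Check the '+countryCode subscriberNumber' shape, at most 15 digits."""
    if _PHONE_RE.fullmatch(value) is None:
        return False
    return sum(ch.isdigit() for ch in value) <= 15


def is_url(value: str) -> bool:
    """Check that the string starts with a scheme and names a host."""
    try:
        parts = urlsplit(value)
    except ValueError:
        return False
    return bool(parts.scheme) and bool(parts.netloc)


def _reject_constant(name: str):
    # Python's parser accepts NaN and Infinity, which are not valid JSON.
    raise ValueError(f"not valid JSON: {name}")


def is_json_text(value: str) -> bool:
    """Check that the string parses as a single JSON value."""
    try:
        json.loads(value, parse_constant=_reject_constant)
    except ValueError:
        return False
    return True
```

With this sketch, "+1 202 456 1111" passes the phone check, while "(831)333-9878" does not because it lacks the leading +countryCode and uses parentheses and a dash.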
Additional Comments

These are somewhat different than the current standard_names. Here is the reasoning behind them:

As with existing standard_names, the goal was short, human-readable names which follow the CF naming convention.

syntax_meaning - Although MIME types are too general for our purposes and only apply to entire documents, I like their use of type/subtype (although I used '_' as the separator instead of '/') and I like that the "type" prefix can serve a software-related function (e.g., all standard_names above that start with "xml" indicate that the content can be parsed with an XML parser). So when relevant, the proposed standard_names specify syntax and meaning, using the format *syntax_meaning*, e.g., xml_iso19115_2. Interestingly, I think the actual ISO 19115-2 document just specifies the meaning/content, while ISO 19139 specifies the XML representation of that content, so it is a good example of the need for *syntax_meaning* notation.

text_plain - I didn't include anything like "text_plain" because that is, in a practical sense, the default for Strings, and because it is implied by more specific standard_names like the existing platform_name, region, and source.

single vs. plural - For many standard_names, I specified that each String specifies a single item. I'm open to allowing multiple values if the separator is specified in the standard_name's definition.

Thank you for considering these names.

--
Sincerely,

Bob Simons
IT Specialist
Environmental Research Division
NOAA Southwest Fisheries Science Center
99 Pacific St., Suite 255A (New!)
Monterey, CA 93940 (New!)
Phone: (831)333-9878 (New!)
Fax: (831)648-8440
Email: bob.sim...@noaa.gov

The contents of this message are mine personally and do not necessarily reflect any position of the Government or the National Oceanic and Atmospheric Administration.
<>< <>< <>< <>< <>< <>< <>< <>< <><
_______________________________________________
CF-metadata mailing list
CF-metadata@cgd.ucar.edu
http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata