On Wed, Feb 22, 2017 at 10:38 AM, Bob Simons - NOAA Federal < [email protected]> wrote:
> As for needing a different subject for the email: I'm lumping together 2
> new related attribute names: "charset=..." and "data_type=string|char" so
> that the information stored in char variables in netcdf-3 files can be
> easily and unambiguously interpreted.

somehow it got smashed in with the thread about geometries... maybe that was
my email client. But anyway, away we go!

> You are correct. My proposal is for netcdf-3 files since they only support
> chars, not true strings.

so maybe make it clear that for netcdf4, one should use strings? I'm not
sure if there is anything in CF now that is 3 vs 4 specific...

> As for "encoding" vs "charset", I'm open to different names. I chose
> "charset" because that is the name used in HTML and is widely used in other
> places. Yes, XML uses "encoding". To me, the word "charset" seems
> preferable because it is more specific than "encoding" (which also has a
> more general purpose meaning).

not a biggie -- +0 for encoding from me.

> As for full Unicode support via UTF-8 vs UTF-16:

well, UTF-16 is the worst option -- let's never use that! UCS-4 is the way
to go if you want full Unicode support and constant bytes per character,
though it "wastes" space.

> Since netcdf-3 only supports 8bit chars, the 16bit UTF-16 is not an option.

well, sure, but at the binary level a CHAR is simply an unsigned 8-bit
integer -- so you could stuff any encoding into an array of CHAR.

> But UTF-8 is the only way I know of to support full Unicode using only
> 8bit chars for the underlying storage.

see above, but:

> It is very widely used. Every modern piece of software that can read or
> write text files supports it. It is the default for both XML and HTML 5.

yeah, it really is the best compromise -- and becoming the universal form
for data interchange.
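To make the byte-level point concrete, here is a minimal Python sketch using only the standard library. The sample string and variable names are made up for illustration, and nothing here is part of CF or the netCDF API; it only shows how many bytes each encoding needs and that UTF-8 text fits in an array of unsigned 8-bit values, which is what a netcdf-3 char array holds at the binary level.

```python
# Illustration only: hypothetical sample text with three non-ASCII characters.
# Escapes are used so the code points are unambiguous:
# U+00CE (I circumflex), U+00E9 (e acute), U+00B0 (degree sign).
text = "\u00cele de R\u00e9 20\u00b0C"   # 14 characters

utf8 = text.encode("utf-8")          # 1 to 4 bytes per character
utf16 = text.encode("utf-16-le")     # 2 or 4 bytes per character, no BOM
utf32 = text.encode("utf-32-le")     # fixed 4 bytes per character (UCS-4), no BOM
latin1 = text.encode("iso-8859-1")   # 1 byte per character, limited repertoire

sizes = {
    "utf-8": len(utf8),
    "utf-16": len(utf16),
    "utf-32": len(utf32),
    "iso-8859-1": len(latin1),
}

# Any of these byte strings is just a sequence of values in 0..255,
# so it can be stored element by element in an 8-bit char array.
char_array = list(utf8)

# A reader that knows the encoding (e.g. from a charset/encoding attribute)
# can recover the original text from the stored bytes.
roundtrip = bytes(char_array).decode("utf-8")

print(sizes)
print(roundtrip == text)
```

Without the encoding name recorded somewhere, a reader cannot tell whether the same 8-bit values are UTF-8 or ISO-8859-1, which is the ambiguity the proposed attribute is meant to remove.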
> If the file writer doesn't need full Unicode, they can use "ISO-8859-1"
> (which is compatible with 7bit ASCII)

I'd vote for ASCII and ISO-8859-1 as the only options (Or the HIGHLY
RECOMMENDED options, at least).

-CHB

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

[email protected]
