> Also be aware that LATIN-1 is not compatible with UTF-8 with code points 
> above 127

Indeed. Which is why it should be clear that you should NOT put UTF-8 in a CHAR 
array :-) We could say ASCII only for CHAR, but I'm not sure there is a good 
reason to be that restrictive.
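
To make the incompatibility concrete, here is a minimal sketch using only the standard Python codecs (the variable names are just for illustration). ASCII text is byte-identical in both encodings, but a Latin-1 byte above 127 is not valid UTF-8 on its own:

```python
# One non-ASCII character, encoded two ways.
latin1_bytes = "é".encode("latin-1")  # a single byte, 0xE9
utf8_bytes = "é".encode("utf-8")      # two bytes, 0xC3 0xA9

# ASCII text is byte-identical in both encodings.
assert "abc".encode("latin-1") == "abc".encode("utf-8")

# A lone 0xE9 byte is not valid UTF-8, so a UTF-8 reader rejects Latin-1 data.
try:
    latin1_bytes.decode("utf-8")
    utf8_rejected = False
except UnicodeDecodeError:
    utf8_rejected = True
```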

It may be an implementation detail of the Python encodings, but at least there, 
latin-1 can decode ANY string of bytes (other than the null byte) without error, 
and write it out again with no changes. So if consuming code uses the latin-1 
encoding for all CHAR arrays, it may get garbage for the non-ASCII bytes, but 
it won't raise an error or mangle the data if it is written back out.
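
A quick sketch of that round trip, assuming CPython's built-in `latin-1` codec (the variable names are made up for the example). As far as the codec itself is concerned, the null byte round-trips as well, so the sketch covers all 256 byte values:

```python
# Every possible byte value, 0x00 through 0xFF.
raw = bytes(range(256))

# Latin-1 maps each byte to the code point with the same number,
# so decoding never raises.
text = raw.decode("latin-1")

# Encoding the decoded text gives back the original bytes unchanged.
assert text.encode("latin-1") == raw

# UTF-8 data read as Latin-1 comes out as garbage, but nothing is lost:
# writing it back out reproduces the original bytes.
utf8_bytes = "é".encode("utf-8")
garbled = utf8_bytes.decode("latin-1")
assert garbled != "é"
assert garbled.encode("latin-1") == utf8_bytes
```

So a reader that always uses latin-1 may display the wrong characters, but it can pass the bytes through untouched.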

> the netcdf python library will force the use of strings for netcdf4 files if 
> it sees unicode points outside of ASCII.

Which is the right thing to do, and compatible with this proposal, I think 
(hmm, unless latin-1 is allowed). But you could probably send a latin-1 encoded 
bytes object in, yes?

Anyway, if we codify this, and the netCDF4 lib (or any other) can't support it, 
it can be fixed. And yes, I am volunteering to do a PR for a fix to 
netCDF4-python.


-- 
You are receiving this because you commented.
Reply to this email directly or view it on GitHub:
https://github.com/cf-convention/cf-conventions/issues/141#issuecomment-599681959
