> Also be aware that LATIN-1 is not compatible with UTF-8 with code points above 127

Indeed. Which is why it should be clear that you should NOT put UTF-8 in a CHAR array :-)

We could say ASCII only for CHAR, but I'm not sure there is a good reason to be that restrictive. It may be an implementation detail of the Python encodings, but at least there, latin-1 can decode ANY string of bytes (other than the null byte) without error, and write it out again with no changes.

So if consuming code uses the latin-1 encoding for all CHAR arrays, it may get garbage for the non-ASCII bytes, but it won't raise an error, or mangle the data if it is written back out.

> the netcdf python library will force the use of strings for netcdf4 files if it sees unicode points outside of ASCII.

Which is the right thing to do, and compatible with this proposal, I think (hmm, unless latin-1 is allowed). But you could probably send a latin-1 encoded bytes object in, yes?

Anyway, if we codify this, and the netCDF4 lib (or any other) can't support it, it can be fixed. And yes, I am volunteering to do a PR for a fix to netCDF4-python.

--
View it on GitHub: https://github.com/cf-convention/cf-conventions/issues/141#issuecomment-599681959
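
The latin-1 round-trip behaviour described above can be checked with a small Python sketch. It uses only the built-in `bytes.decode` and `str.encode` methods, not any netCDF library. The sample string `"café"` and the variable names are illustrative assumptions, and the null byte is left out only to mirror the caveat in the text.

```python
# Every byte value except the null byte (excluded to mirror the caveat in the text).
raw = bytes(range(1, 256))

# latin-1 maps each byte to the code point with the same number,
# so decoding cannot fail and re-encoding restores the original bytes.
text = raw.decode("latin-1")
assert text.encode("latin-1") == raw

# UTF-8 bytes read as latin-1 come out as garbage for the non-ASCII part,
# but writing them back out as latin-1 leaves the data unchanged.
utf8_bytes = "café".encode("utf-8")      # illustrative sample, not from the thread
garbled = utf8_bytes.decode("latin-1")   # "cafÃ©"
round_tripped = garbled.encode("latin-1")
assert round_tripped == utf8_bytes

# By contrast, decoding arbitrary bytes as UTF-8 can raise an error.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False
```

Here `utf8_ok` ends up `False`, which is the asymmetry the argument relies on: a reader that assumes latin-1 never fails on a CHAR array, while a reader that assumes UTF-8 might.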
