One more note: in #141, someone wrote:

"In the HDF5 case, string encoding is an intrinsic part of the HDF5 string 
datatype and can only be ASCII or UTF-8. "

So that settles it -- "strings will be UTF-8" -- and we're done with that part :-).

Dealing with Unicode via bare CHAR arrays is a bad idea -- sure, you can do it, 
since a CHAR array can hold anything -- but we also wouldn't recommend using 
CHAR arrays to hold, e.g., floating-point data with an attribute saying it's an 
IEEE 754 32-bit float. We _could_ do that, but it would be a really bad idea.
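The float-in-CHAR analogy can be made concrete with a small sketch (plain Python `struct`, purely illustrative -- no netCDF API involved): a byte array really can hold an IEEE 754 float, but nothing in the data itself says so, and only a reader who already knows the convention can recover the value.

```python
import struct

# Illustration only: a CHAR (byte) array can hold any bytes, so you
# *could* stuff an IEEE 754 32-bit float into one -- but the 4 bytes
# carry no type information a generic reader could use.
value = 1.5
char_array = struct.pack("<f", value)   # 4 raw bytes: little-endian float32
assert len(char_array) == 4

# A reader only recovers the number if it already knows the convention:
recovered = struct.unpack("<f", char_array)[0]
assert recovered == 1.5
```

The same applies to encoded text in a CHAR array: the bytes are fine, but the meaning lives entirely in out-of-band conventions.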

The one use case that makes at least a little sense is using CHAR arrays to 
pass encoded data around, if and only if the library doing the passing is not 
expected to interpret the data as text in any way. This is how *nix systems 
have gotten away with poorly specified filename encodings for so long. But do 
CF data-handling libraries ever do that?
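The "opaque bytes" case above can be sketched in plain Python (again illustrative, not a netCDF call): UTF-8 text stored as raw bytes survives a round trip only while nothing interprets it, and the moment a library assumes a one-byte-per-character encoding, multi-byte characters break.

```python
# UTF-8 text stored in a CHAR-like byte array: fine as an opaque blob,
# broken the moment it is interpreted under the wrong encoding.
text = "Zürich"                      # 'ü' is two bytes in UTF-8
char_array = text.encode("utf-8")    # what would land in a CHAR variable
assert len(char_array) == 7          # 7 bytes for 6 characters

# Opaque pass-through: the original text comes back intact.
assert char_array.decode("utf-8") == "Zürich"

# Interpreting the same bytes as Latin-1 (one char per byte) mangles it.
assert char_array.decode("latin-1") != "Zürich"
```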

In short: does anyone need Unicode in a place where they can't use a string type?

If modern Fortran netCDF libraries can't handle strings, I can't imagine they 
do the right thing with Unicode anyway.

View it on GitHub:
https://github.com/cf-convention/cf-conventions/issues/139#issuecomment-433473555