Adam M. Costello writes:
> If gethostbyname() receives 8-bit text, should it assume that it's UTF-8,
The current gethostbyname() release (if you put no-check-names into
/etc/resolv.conf) simply copies the 8-bit bytes to the DNS packet. It
doesn't care about the interpretation of the bytes as characters.
If we adopt the obvious specification, that 8-bit bytes in DNS are UTF-8, the
current gethostbyname() semantics will be consistent with two coherent
programming models:
(1) The local character encoding might not be UTF-8. Higher-level
routines are responsible for converting from the local character
encoding to UTF-8 before calling gethostbyname(). It would be
convenient to have a central routine that does this.
(2) The local character encoding is UTF-8. No conversion is required
in this case. This is simpler than #1, and it's wildly popular
among programmers---dealing with multiple character encodings is
a royal pain. UNIX is rapidly moving from #1 to #2.
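To illustrate model #1, here is a rough sketch of what such a central conversion routine might look like. It is written in Python only for brevity. The name hostname_to_utf8 and its parameters are made up for this example and are not part of any resolver library. It assumes the caller knows (or can ask the locale for) the local character encoding, and it rejects bytes that are not valid in that encoding rather than guessing.

```python
import locale
from typing import Optional


def hostname_to_utf8(raw: bytes, local_encoding: Optional[str] = None) -> bytes:
    """Re-encode a host name from the local character encoding to UTF-8.

    Hypothetical helper: the returned bytes are what a higher-level
    routine would hand to gethostbyname() under model #1.
    """
    if local_encoding is None:
        # Assumption: the locale's preferred encoding is the "local
        # character encoding" the caller's text is in.
        local_encoding = locale.getpreferredencoding(False)
    # Strict decoding: raises UnicodeDecodeError on bytes that are
    # invalid in the local encoding instead of silently passing them on.
    text = raw.decode(local_encoding)
    return text.encode("utf-8")


# Model #1: local encoding is Latin-1, so the bytes change.
latin1_name = hostname_to_utf8(b"caf\xe9.example", "latin-1")

# Model #2: local encoding is already UTF-8, so the routine is the identity.
utf8_name = hostname_to_utf8(b"caf\xc3\xa9.example", "utf-8")
```

Under model #2 the call is a no-op on valid input, which is the sense in which #2 needs no conversion step at all.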
There are other programming models that aren't consistent with the
current gethostbyname() semantics. This might be a sufficient reason to
break compatibility if those models were as nice as #2, but they aren't.
---Dan