On Fri, Jul 06, 2001 at 04:36:25AM +0100, David Starner wrote:
> > Do you have any idea whether the problems identified at
> > http://support.microsoft.com/support/kb/articles/Q170/5/59.ASP
> > have been resolved?
>
> Are they a problem for us? Windows Code Page 932 may or may not correspond
> to anything that we care about. (At a glance, at least one of each pair that
> both correspond to the same Unicode character is not in the real JIS X
> 0218.)

If it's indeed the case that this is a CP 932 problem and not a Shift JIS
problem, and if it's indeed the case that we don't support CP 932, then
I'll agree that this isn't a problem.

> > Prior to Unicode 3.1 the code space was 16 bits.
>
> NO. Since Unicode 2.0, the code space has been 21 bits. The ONLY thing
> that Unicode 3.1 did, is put characters above U+FFFF. It did not
> change the fundamental structure of Unicode in the least.

I stand corrected.

> > Once unicode can act as a super set for every character set we currently
> > support, we can use it as such. Until then, we can't.
>
> If Unicode were a super set for every character set that anyone needs to
> support, it would be worthless and completely unusable.

I didn't say for any character set that anyone needs to support. I said
for every character set we currently support. I hope you see the
difference.

[And, as an aside, I should have said "for each character set that we
currently support" -- I understand that Unicode doesn't need to support
mixed character set usage before we migrate.]

> However, if we currently support any character set well, it is through
> a Unicode based glibc - I don't believe libc accepts the existence of
> any character set that can't be mapped to Unicode. So arguably, yes,
> Unicode is a super set for every character set we currently support
> well.

Assuming we're using glibc support (e.g. toupper()) for all those
character sets, I'll agree that you have a good point.

On 20010705T133736-0400, Raul Miller wrote:
> > in HTML the language can only be identified in the mime header.

On Fri, Jul 06, 2001 at 11:23:42AM +0300, Antti-Juhani Kaijanaho wrote:
> There is no such thing as a MIME header in HTML.
>
> Besides, HTML does include the lang attribute for most elements. I
> wonder what it's for if not for indicating the language.

I stand corrected.

Thanks,

-- 
Raul

