They sometimes rely on subsetting, i.e. sometimes their GB2312 articles contain characters outside the GB2312 repertoire, but since browsers tend to treat GB2312 as if it were GBK ... it works for them.
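To make the subsetting problem concrete, here is a minimal Python sketch of what a feed converter runs into. It assumes Python's standard "gb2312" and "gbk" codecs; the function name to_utf8 and the sample string are hypothetical, chosen only for illustration, and this is not a description of how any particular browser or the BBC's own tooling works.

```python
def to_utf8(raw: bytes, declared: str) -> bytes:
    """Convert raw bytes to UTF-8, treating a 'gb2312' label as GBK.

    This mirrors the lenient behaviour described above: content labelled
    gb2312 is decoded with the larger GBK repertoire.
    """
    label = declared.lower().replace("-", "").replace("_", "")
    codec = "gbk" if label == "gb2312" else declared
    return raw.decode(codec).encode("utf-8")


# Hypothetical sample: the middle character is in GBK but not in GB2312.
sample = "朱镕基".encode("gbk")

# A strict GB2312 decode fails on the out-of-repertoire character ...
try:
    sample.decode("gb2312")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# ... while the lenient conversion succeeds.
converted = to_utf8(sample, "gb-2312")
```

A converter that trusts the declared label and decodes strictly stops at the first such character; one that maps the label to GBK, as sketched here, gets through.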
I always used to hit this barrier when converting their RSS feeds to UTF-8.

There are some anomalies with their Unicode support. Take Burmese, for instance: they chose to use the Myanmar1 font, which predates and is probably not compatible with UTN 11. In Unicode 5.1 there were some fundamental changes to the Myanmar block to allow other languages written in the Myanmar script to be included in Unicode. The BBC's recently launched Burmese pages are not compatible with current and future Unicode implementations for Burmese. No idea why they chose that path.

On Mon, April 13, 2009 12:31 pm, Asmus Freytag wrote:
>
> One would think, as long as the text repertoire is limited to what the
> early OSs could handle, it wouldn't matter whether you choose utf-8 or
> the native code page. The support for recognizing and converting from
> utf-8 came pretty early for browsers, I think. For example, I think
> Unicode support goes back as far as IE 3.0 - but I'm not (no longer)
> familiar with early versions of the other browsers.
>

Yes, from memory IE3 introduced Unicode support. The key development for early browsers was the release of HTML 4, which included the concept of a document character set for HTML, defining entities and character references in terms of Unicode. By then OSes had already shifted to using Unicode fonts for rendering, if my memory can stretch back far enough.

> Is there anyone (still) familiar with older browsers and OSs who can
> contribute which combination of OS/browser could not handle a Windows
> 1252 repertoire say, that was utf-8 encoded. It would be interesting to
> find out whether there's any OS/Browser combination for which there's a
> reasonable expected remaining deployment level and that couldn't handle
> that level of utf-8. OSs by themselves, there still many out there where
> the native code page is it, but the browsers that run on them don't have
> that limitation.
>

I think you'd be looking at IE 3.0 and Netscape 4 on the Windows platform.
> I suspect that this choice by the BBC represents a historical
> development where no-one's seen the need to change anything as time
> marched on, except where languages essentially require utf-8 support
> because the vendors stopped supporting dedicated character sets.

I'd agree: the decision seems to be to use what they've always been using, then migrate to Unicode when a new language requires it, although that wouldn't explain Hausa. I'd say the cases of both Hausa and Burmese indicate some odd decisions currently being made at the BBC.

--
Andrew Cunningham
Research and Development Coordinator
Vicnet
State Library of Victoria
Australia

andr...@vicnet.net.au

_______________________________________________
A12n-policy mailing list
A12n-policy@bisharat.net
http://lists.bisharat.net/mailman/listinfo/a12n-policy