On Thu, 12 Apr 2012 11:09:37 +0200, Yuan Chao <[email protected]> wrote:
2012/4/12 Philip Jägenstedt <[email protected]>:
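Regarding the overlap between Big5-HKSCS and Big5-UAO, we used data from dotnetdotcom.org to find web pages that might be affected: <http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-April/035370.html>, <http://lists.w3.org/Archives/Public/public-html-ig-zh/2012Apr/0006.html>. Unifying the encodings would affect some pages, but it appears that Firefox's approach affects more pages. If our analysis is inaccurate or there is a better data source, we should investigate further...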
Yes, I've read the discussions listed above. The main problem is that
all of this web content is HK-centric. Surely if you set the
browser encoding to Big5-HKSCS, you see everything displayed correctly in
Firefox, Chrome and Opera. For Japanese kana, you see that both -hk and
non -hk are correct. That is because they are part of Big5-ETEN (ETEN is
one of the founders of Big5).
I've added the original list of Big5 URLs from <dotnetdotcom.org> to
<https://gitorious.org/whatwg/big5/blobs/master/big5-urls.txt> and it
contains 140 .hk URLs and 133 .tw URLs. In other words, the sample does
not appear to be unfairly biased toward HK; it was just that the pages using
the conflicting byte sequences were mostly HK-related.
A bigger and more random dataset would be awesome, if someone could produce
one.
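For anyone who wants to check the .hk/.tw split on that list or on a larger dataset, a minimal Python sketch follows. It assumes the input is one URL per line; the function name count_by_tld is made up for illustration and is not part of the analysis above.

```python
from collections import Counter
from urllib.parse import urlsplit


def count_by_tld(urls):
    """Count URLs by the last label of their hostname (e.g. 'hk', 'tw')."""
    counts = Counter()
    for url in urls:
        url = url.strip()
        if not url:
            continue  # skip blank lines
        # urlsplit only fills in the hostname when a scheme is present,
        # so assume http:// for bare hostnames.
        if "://" not in url:
            url = "http://" + url
        host = urlsplit(url).hostname or ""
        tld = host.rsplit(".", 1)[-1] if "." in host else ""
        counts[tld] += 1
    return counts
```

Calling count_by_tld(open("big5-urls.txt")) on a local copy of the list should then give the per-TLD totals, provided the file really is one URL per line.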
Please refer to the link that Tim provides:
http://moztw.org/docs/big5/
It lists and explains most of the Big5 variants known.
The mapping difference you see between Firefox Big5 and Big5-HKSCS is
exactly the difference between Big5-UAO and Big5-HKSCS. There is nothing
wrong with that. However, the statement "Firefox's approach affects more
pages" is simply not true:
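If Big5-HKSCS is forcibly merged into Big5, even more Big5-UAO pages will be affected.
After all, the number of Big5-UAO users and the amount of Big5-UAO content should both be far greater than for Big5-HKSCS, shouldn't they?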
Can any browser other than Firefox display Big5-UAO content? Does it require installing special fonts?
Sorry, but honestly I don't get the motivation for merging Big5
and Big5-HKSCS. To me, when browsing HK content, one should give the
Big5-HKSCS encoding higher priority. If one uses IE under
Windows, he/she should install the Big5-HKSCS package, and it is
self-consistent. For browsing content from Taiwan, the current Firefox
"Big5" should give the best result. So if Big5 and Big5-HKSCS are to be
merged, one will need at least another encoding like "Big5-UAO".
Merging Big5 and Big5-HKSCS is not a goal in itself, but we must decide
what mapping <meta charset="big5"> should use. Is there any mapping that
would fix more pages than the one I've proposed?
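To make "conflicting byte sequences" concrete, here is a rough Python sketch that lists the two-byte sequences which two decoders map differently. It uses Python's built-in "big5" and "big5hkscs" codecs purely as stand-ins: Python ships no Big5-UAO codec, and these tables are not necessarily what any browser uses, so the output illustrates the method only. The helper names are made up.

```python
def decode_or_none(data, codec):
    """Decode data with codec, returning None if the bytes are unmapped."""
    try:
        return data.decode(codec)
    except UnicodeDecodeError:
        return None


def differing_sequences(codec_a="big5", codec_b="big5hkscs"):
    """Return (bytes, result_a, result_b) for every two-byte sequence
    in the Big5 lead/trail ranges that the two codecs decode differently."""
    diffs = []
    # Big5 trail bytes: 0x40-0x7E and 0xA1-0xFE
    trails = list(range(0x40, 0x7F)) + list(range(0xA1, 0xFF))
    for lead in range(0x81, 0xFF):  # lead bytes 0x81-0xFE
        for trail in trails:
            data = bytes([lead, trail])
            a = decode_or_none(data, codec_a)
            b = decode_or_none(data, codec_b)
            if a != b:
                diffs.append((data, a, b))
    return diffs
```

Given real mapping tables for the candidate encodings, the same loop could be pointed at the bytes of each page in the URL list to estimate how many pages a given choice for <meta charset="big5"> would change.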
Also, this standard mainly targets browsers, so it will not directly affect uses of Big5 outside the Web.
All our discussions are surely for web pages and browsers.
Right, I assumed that "Telnet BBS" was not exposed to browsers, but
perhaps that was in reference to ptt.cc?
--
Philip Jägenstedt
Core Developer
Opera Software