This might be of some interest for this group:

Path: pepsi.tninet.se!newsfeed1.telenordia.se!algonet!isdnet!news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!newsfeed.media.kyoto-u.ac.jp!newsfeed.mesh.ad.jp!sjc1.nntp.concentric.net!newsfeed.concentric.net!global-news-master
From: "Eric A. Hall" <[EMAIL PROTECTED]>
Newsgroups: comp.std.internat,comp.mail.mime,comp.mail.headers
Subject: unscientific charset survey
Date: 06 Mar 2001 17:46:21 GMT
Organization: EHS Company
Lines: 26
Message-ID: <[EMAIL PROTECTED]>
NNTP-Posting-Host: 209.31.7.42
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Mailer: Mozilla 4.75 [en] (WinNT; U)
X-Accept-Language: en

I needed some charset distribution numbers and couldn't find any, so I
pointed a perl script at my ISP's news server.

4,024,487 messages were processed.
3,389,401 (84%) had no charset defined.
  632,680 (16%) had legal charsets or aliases defined.
    2,406 (.05%) had illegal charsets defined.

The following had more than 1,000 matches:

  ASCII          400,291
  ISO-8859-1     177,786
  ISO-8859-2      25,704
  KOI8-R          10,228
  ISO-2022-JP      7,677
  Windows-1252     4,718
  BIG5             2,502
  UTF-8            1,616
  ISO-8859-15      1,064

Raw data and charts at http://www.ehsco.com/opinion/20010305.html

--
Eric A. Hall                                      http://www.ehsco.com/
Internet Core Protocols        http://www.oreilly.com/catalog/coreprot/

--
Erland Sommarskog, Stockholm, [EMAIL PROTECTED]
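
For anyone who wants to repeat this kind of tally on their own spool, here is a minimal Python sketch. It is not the perl script from the survey (that script is not shown in the post); the function names `tally_charsets` and `report` are made up for illustration, and it assumes the messages are already available as raw text strings. It does not fold aliases together or check whether a charset name is a legal one, both of which the survey apparently did.

```python
from collections import Counter
from email import message_from_string


def tally_charsets(raw_messages):
    """Count declared charsets across an iterable of raw message strings."""
    counts = Counter()
    for raw in raw_messages:
        msg = message_from_string(raw)
        # get_content_charset() returns the charset parameter of the
        # Content-Type header in lower case, or None if none is declared.
        charset = msg.get_content_charset()
        counts[charset.upper() if charset else "(none)"] += 1
    return counts


def report(counts):
    """Return one formatted line per charset, most frequent first."""
    total = sum(counts.values())
    return [
        f"{name:<14}{n:>10,}  {100 * n / total:5.1f}%"
        for name, n in counts.most_common()
    ]
```

Feeding `tally_charsets` the articles fetched from a news server or read from a local spool, and printing the lines returned by `report`, gives a table of the same general shape as the one in the post.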
