This might be of some interest to this group:


   Path: pepsi.tninet.se!newsfeed1.telenordia.se!algonet!isdnet!news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!newsfeed.media.kyoto-u.ac.jp!newsfeed.mesh.ad.jp!sjc1.nntp.concentric.net!newsfeed.concentric.net!global-news-master
   From: "Eric A. Hall" <[EMAIL PROTECTED]>
   Newsgroups: comp.std.internat,comp.mail.mime,comp.mail.headers
   Subject: unscientific charset survey
   Date: 06 Mar 2001 17:46:21 GMT
   Organization: EHS Company
   Lines: 26
   Message-ID: <[EMAIL PROTECTED]>
   NNTP-Posting-Host: 209.31.7.42
   Mime-Version: 1.0
   Content-Type: text/plain; charset=us-ascii
   Content-Transfer-Encoding: 7bit
   X-Mailer: Mozilla 4.75 [en] (WinNT; U)
   X-Accept-Language: en


   I needed some charset distribution numbers and couldn't find any, so I
   pointed a perl script at my ISP's news server.
   
      4,024,487 messages were processed.
      3,389,401 (84%) had no charset defined.
        632,680 (16%) had legal charsets or aliases defined.
          2,406 (.05%) had illegal charsets defined.
   
   The following had more than 1,000 matches:
   
      ASCII              400,291
      ISO-8859-1         177,786
      ISO-8859-2          25,704
      KOI8-R              10,228
      ISO-2022-JP          7,677
      Windows-1252         4,718
      BIG5                 2,502
      UTF-8                1,616
      ISO-8859-15          1,064
   
   Raw data and charts at http://www.ehsco.com/opinion/20010305.html
   
   --
   Eric A. Hall                                        http://www.ehsco.com/
   Internet Core Protocols          http://www.oreilly.com/catalog/coreprot/
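
For anyone who wants to repeat this kind of count on their own spool, here is a rough sketch in Python. It is not Eric's perl script, which was not posted. It assumes the messages are already available as raw text strings; fetching them from a news server is left out. The names tally_charsets, shares and NO_CHARSET are made up for this illustration.

```python
from collections import Counter
from email import message_from_string

# Hypothetical label used for messages that declare no charset at all.
NO_CHARSET = "(none)"


def tally_charsets(raw_messages):
    """Count the charset label declared in each message's Content-Type."""
    counts = Counter()
    for raw in raw_messages:
        msg = message_from_string(raw)
        # get_content_charset() returns the lowercased charset parameter,
        # or None when the header or the parameter is missing.
        counts[msg.get_content_charset() or NO_CHARSET] += 1
    return counts


def shares(counts):
    """Return each label's share of all counted messages, in percent."""
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


if __name__ == "__main__":
    # Tiny made-up sample, only to show the calling convention.
    sample = [
        "Subject: a\nContent-Type: text/plain; charset=us-ascii\n\nhi\n",
        "Subject: b\nContent-Type: text/plain; charset=ISO-8859-1\n\nhi\n",
        "Subject: c\n\nhi\n",
    ]
    counts = tally_charsets(sample)
    pct = shares(counts)
    for label, n in counts.most_common():
        print(f"{label:<12} {n:>6} {pct[label]:5.1f}%")
```

Note that this only counts the labels as written (lowercased). It does not fold aliases together and does not check the labels against any registry, so it would not reproduce the legal/illegal split in the figures above.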


--
Erland Sommarskog, Stockholm, [EMAIL PROTECTED]




