Representing a huge palette of code pages - I'd recommend our docs folk consider this and comment or commit.
Bill

At 10:59 AM 3/19/2004, Zvi Har'El wrote:
>Dear Apache developers,
>
>I sent the following three months ago, but since I got no response, and now
>2.0.49 has been rolled without the patch, I resubmit it for your attention:
>
>
>The default httpd.conf includes the lines
>
>AddCharset ISO-8859-1 .iso8859-1 .latin1
>AddCharset ISO-8859-2 .iso8859-2 .latin2 .cen
>AddCharset ISO-8859-3 .iso8859-3 .latin3
>AddCharset ISO-8859-4 .iso8859-4 .latin4
>AddCharset ISO-8859-5 .iso8859-5 .latin5 .cyr .iso-ru
>AddCharset ISO-8859-6 .iso8859-6 .latin6 .arb
>AddCharset ISO-8859-7 .iso8859-7 .latin7 .grk
>AddCharset ISO-8859-8 .iso8859-8 .latin8 .heb
>AddCharset ISO-8859-9 .iso8859-9 .latin9 .trk
>
>However, a quick look at http://www.iana.org/assignments/character-sets shows
>that calling the non-Latin charsets ISO-8859-N by the name latinN is wrong.
>For example, latin8 is ISO-8859-14, or iso-celtic, and certainly not
>ISO-8859-8, which is just Hebrew! Similarly, latin6 is ISO-8859-10, and not
>ISO-8859-6, which is Arabic! Finally, latin5 is ISO-8859-9, Turkish, and not
>ISO-8859-5, which is Cyrillic. latin1-4 are ok, and I didn't find latin7 in
>this reference at all. I suggest httpd.conf should be fixed accordingly.
>
>To make my point clearer, here is the patch:
>
>
>--- httpd-2.0.48/docs/conf/httpd-std.conf.in.~20031011014743~ 2003-10-11 03:47:43.000000000 +0200
>+++ httpd-2.0.48/docs/conf/httpd-std.conf.in 2003-12-15 18:47:07.000000000 +0200
>@@ -797,11 +797,15 @@
> AddCharset ISO-8859-2 .iso8859-2 .latin2 .cen
> AddCharset ISO-8859-3 .iso8859-3 .latin3
> AddCharset ISO-8859-4 .iso8859-4 .latin4
>-AddCharset ISO-8859-5 .iso8859-5 .latin5 .cyr .iso-ru
>-AddCharset ISO-8859-6 .iso8859-6 .latin6 .arb
>-AddCharset ISO-8859-7 .iso8859-7 .latin7 .grk
>-AddCharset ISO-8859-8 .iso8859-8 .latin8 .heb
>-AddCharset ISO-8859-9 .iso8859-9 .latin9 .trk
>+AddCharset ISO-8859-5 .iso8859-5 .cyr .iso-ru
>+AddCharset ISO-8859-6 .iso8859-6 .arb
>+AddCharset ISO-8859-7 .iso8859-7 .grk
>+AddCharset ISO-8859-8 .iso8859-8 .heb
>+AddCharset ISO-8859-9 .iso8859-9 .latin5 .trk
>+AddCharset ISO-8859-10 .iso8859-10 .latin6
>+AddCharset ISO-8859-13 .iso8859-13 .latin7
>+AddCharset ISO-8859-14 .iso8859-14 .latin8
>+AddCharset ISO-8859-15 .iso8859-15 .latin9
> AddCharset ISO-2022-JP .iso2022-jp .jis
> AddCharset ISO-2022-KR .iso2022-kr .kis
> AddCharset ISO-2022-CN .iso2022-cn .cis
>
>
>
>
>I have also included latin7 and latin9, which are for some reason absent from
>IANA, but appear as standard in the FSF's "free recode". BTW, instead of
>inventing new charset abbreviations like .cyr, .arb, .grk, .heb, I would
>personally prefer using the IANA (RFC 1345) aliases: .cyrillic, .arabic,
>.greek, .hebrew, in the same way we use .latin1, .latin2, etc., but this is a
>matter of opinion, not bug fix patching.
>
>Best,
>
>Zvi.
>
>--
>Dr. Zvi Har'El     mailto:[EMAIL PROTECTED]             Department of Mathematics
>tel:+972-54-227607 icq:179294841            Technion - Israel Institute of Technology
>fax:+972-4-8293388 http://www.math.technion.ac.il/~rl/       Haifa 32000, ISRAEL
>"If you can't say somethin' nice, don't say nothin' at all." -- Thumper (1942)
>                                  Friday, 27 Adar 5764, 19 March 2004, 6:53PM
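
For anyone reviewing the patch, the mismatch described above can be checked mechanically. The following is only a rough sketch in Python, not part of the patch or of httpd: the names `LATIN_TO_ISO_PART` and `find_mislabeled_latin` are made up for illustration, and the alias table simply restates the mapping given in the message (latin1-6 and latin8 per the IANA registry as cited there, latin7 and latin9 per recode as reported there).

```python
import re

# latinN alias -> ISO-8859 part number, as stated in the message above.
# Assumption: latin7 and latin9 follow GNU recode, not the IANA registry.
LATIN_TO_ISO_PART = {1: 1, 2: 2, 3: 3, 4: 4, 5: 9, 6: 10, 7: 13, 8: 14, 9: 15}

ADD_CHARSET = re.compile(r"^\s*AddCharset\s+(\S+)\s+(.*)$", re.IGNORECASE)
LATIN_EXT = re.compile(r"\.?latin(\d+)", re.IGNORECASE)

# The nine lines quoted from the default httpd.conf.
ORIGINAL = """\
AddCharset ISO-8859-1 .iso8859-1 .latin1
AddCharset ISO-8859-2 .iso8859-2 .latin2 .cen
AddCharset ISO-8859-3 .iso8859-3 .latin3
AddCharset ISO-8859-4 .iso8859-4 .latin4
AddCharset ISO-8859-5 .iso8859-5 .latin5 .cyr .iso-ru
AddCharset ISO-8859-6 .iso8859-6 .latin6 .arb
AddCharset ISO-8859-7 .iso8859-7 .latin7 .grk
AddCharset ISO-8859-8 .iso8859-8 .latin8 .heb
AddCharset ISO-8859-9 .iso8859-9 .latin9 .trk
"""

# The same block after applying the proposed patch.
PATCHED = """\
AddCharset ISO-8859-1 .iso8859-1 .latin1
AddCharset ISO-8859-2 .iso8859-2 .latin2 .cen
AddCharset ISO-8859-3 .iso8859-3 .latin3
AddCharset ISO-8859-4 .iso8859-4 .latin4
AddCharset ISO-8859-5 .iso8859-5 .cyr .iso-ru
AddCharset ISO-8859-6 .iso8859-6 .arb
AddCharset ISO-8859-7 .iso8859-7 .grk
AddCharset ISO-8859-8 .iso8859-8 .heb
AddCharset ISO-8859-9 .iso8859-9 .latin5 .trk
AddCharset ISO-8859-10 .iso8859-10 .latin6
AddCharset ISO-8859-13 .iso8859-13 .latin7
AddCharset ISO-8859-14 .iso8859-14 .latin8
AddCharset ISO-8859-15 .iso8859-15 .latin9
"""


def find_mislabeled_latin(config_text):
    """Return (charset, extension, expected_charset) for each wrong .latinN."""
    problems = []
    for line in config_text.splitlines():
        match = ADD_CHARSET.match(line)
        if not match:
            continue  # not an AddCharset directive
        charset, extensions = match.group(1), match.group(2).split()
        for ext in extensions:
            alias = LATIN_EXT.fullmatch(ext)
            if not alias:
                continue  # extensions such as .cyr or .iso8859-5 are ignored
            part = LATIN_TO_ISO_PART.get(int(alias.group(1)))
            if part is None:
                continue  # alias outside the table above
            expected = f"ISO-8859-{part}"
            if charset.upper() != expected:
                problems.append((charset, ext, expected))
    return problems


if __name__ == "__main__":
    for charset, ext, expected in find_mislabeled_latin(ORIGINAL):
        print(f"{ext} is attached to {charset}, expected {expected}")
```

Run against the original block, this flags .latin5 through .latin9; run against the patched block, it reports nothing. It only looks at .latinN extensions and leaves the .cyr/.arb/.grk/.heb question aside, since that part is described as a matter of opinion.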