Hello,

I've found several problems with the charsets and languages set up in
the Apache default config. Attached is a patch that attempts to fix
them. Since I'm new to apache-dev, I may have changed something that
was put there on purpose, so please review the changes. I await your
comments.

The same changes are made for 1.3 in a second patch, which is also attached.

I'll comment on some of the issues briefly here:

diff -u -r1.13 httpd-std.conf.in

[..] the list is altered to reflect current order (almost)

-LanguagePriority en da nl et fr de el it ja ko no pl pt pt-br ltz ca es sv tw
+LanguagePriority en da nl et fr de el it ja ko no pl pt pt-br ltz ca es sv tw uk uk-UA

[..] I'm not sure who put KOI8-RU there, but I'm certain KOI8-U is widely
known, so I've moved .ua to refer to it. This may break some compatibility,
though KOI8-RU should be compatible with KOI8-U.

-AddCharset KOI8-ru     .koi8-uk .ua
[..]
+AddCharset KOI8-ru     .koi8-uk  .koi8-ru
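
As a rough illustration of what this change means in practice, here is a
minimal sketch. The file names are hypothetical and not part of the patch,
and it assumes mod_mime is loaded and .html is mapped to text/html:

```apache
# Mappings as they stand after the 2.0 patch
AddLanguage uk       .uk
AddCharset  KOI8-U   .koi8-u  .ua
AddCharset  KOI8-ru  .koi8-uk .koi8-ru

# Hypothetical files and the headers they would be served with:
#   index.html.uk.koi8-u -> Content-Language: uk, charset=KOI8-U
#   index.html.ua        -> charset=KOI8-U   (before the patch: KOI8-ru)
#   index.html.koi8-ru   -> charset=KOI8-ru  (before the patch: KOI8-r)
```

Sites that relied on .ua or .koi8-ru resolving to the old charsets would see
a different charset parameter after the change, which is the compatibility
risk mentioned above.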

[..] the URL has changed
-# See ftp://ftp.isi.edu/in-notes/iana/assignments/character-sets
+# See http://www.iana.org/assignments/character-sets


[..] As http://www.iana.org/assignments/character-sets notes, the UTF-* names
are all uppercase, so I changed that. There were also duplicate charset
entries for utf-7 and utf-8.

+AddCharset UTF-7        .utf7
[..]
-AddCharset utf-7       .utf7


If you have any comments or suggestions, please contact me.

-- 
"Nuclear war would really set back cable."
- Ted Turner
Index: apache-1.3//conf/httpd.conf-dist
===================================================================
RCS file: /home/cvspublic/apache-1.3/conf/httpd.conf-dist,v
retrieving revision 1.80
diff -u -r1.80 httpd.conf-dist
--- apache-1.3//conf/httpd.conf-dist    5 Mar 2002 16:19:12 -0000       1.80
+++ apache-1.3//conf/httpd.conf-dist    19 Aug 2002 16:06:05 -0000
@@ -700,7 +700,9 @@
     # Note 2: The example entries below illustrate that in quite
     # some cases the two character 'Language' abbreviation is not
     # identical to the two character 'Country' code for its country,
-    # E.g. 'Danmark/dk' versus 'Danish/da'.
+    # E.g. 'Danmark/dk' versus 'Danish/da' or 'Ukraine/ua' versus
+    # 'Ukrainian/uk' (the latter is, sometimes, a source of confusion).
+
     #
     # Note 3: In the case of 'ltz' we violate the RFC by using a three char 
     # specifier. But there is 'work in progress' to fix this and get 
@@ -712,7 +714,9 @@
     # Portugese (pt) - Luxembourgeois* (ltz)
     # Spanish (es) - Swedish (sv) - Catalan (ca) - Czech(cz)
     # Polish (pl) - Brazilian Portuguese (pt-br) - Japanese (ja)
-    # Russian (ru)
+    # Russian (ru) - Ukrainian (uk) Ukrainian in Ukraine (uk-UA) [mozilla is
+    # set up to use uk-UA as default in uk localisation packs, hopefully it'll
+    # be fixed (19.08.2002)]
     #
     AddLanguage da .dk
     AddLanguage nl .nl
@@ -742,11 +746,15 @@
     AddLanguage ru .ru
     AddLanguage zh-tw .tw
     AddLanguage tw .tw
+    AddLanguage uk .uk
+    AddLanguage uk-UA .uk-UA
+
     AddCharset Big5         .Big5    .big5
     AddCharset WINDOWS-1251 .cp-1251
     AddCharset CP866        .cp866
     AddCharset ISO-8859-5   .iso-ru
     AddCharset KOI8-R       .koi8-r
+    AddCharset KOI8-U       .koi8-u
     AddCharset UCS-2        .ucs2
     AddCharset UCS-4        .ucs4
     AddCharset UTF-8        .utf8
@@ -758,7 +766,7 @@
     # more or less alphabetized them here. You probably want to change this.
     #
     <IfModule mod_negotiation.c>
-        LanguagePriority en da nl et fr de el it ja kr no pl pt pt-br ru ltz ca es sv tw
+        LanguagePriority en da nl et fr de el it ja kr no pl pt pt-br ru ltz ca es sv tw uk uk-UA
     </IfModule>
 
     #
Index: httpd-2.0//docs/conf/httpd-std.conf.in
===================================================================
RCS file: /home/cvspublic/httpd-2.0/docs/conf/httpd-std.conf.in,v
retrieving revision 1.13
diff -u -r1.13 httpd-std.conf.in
--- httpd-2.0//docs/conf/httpd-std.conf.in      15 Jul 2002 20:17:25 -0000      1.13
+++ httpd-2.0//docs/conf/httpd-std.conf.in      19 Aug 2002 16:07:22 -0000
@@ -692,7 +692,8 @@
 # Note 2: The example entries below illustrate that in some cases 
 # the two character 'Language' abbreviation is not identical to 
 # the two character 'Country' code for its country,
-# E.g. 'Danmark/dk' versus 'Danish/da'.
+# E.g. 'Danmark/dk' versus 'Danish/da' or 'Ukraine/ua' versus
+# 'Ukrainian/uk' (the latter is, sometimes, a source of confusion).
 #
 # Note 3: In the case of 'ltz' we violate the RFC by using a three char
 # specifier. There is 'work in progress' to fix this and get
@@ -704,7 +705,9 @@
 # Portugese (pt) - Luxembourgeois* (ltz)
 # Spanish (es) - Swedish (sv) - Catalan (ca) - Czech(cz)
 # Polish (pl) - Brazilian Portuguese (pt-br) - Japanese (ja)
-# Russian (ru) - Croatian (hr)
+# Russian (ru) - Croatian (hr) - Ukrainian (uk) Ukrainian in Ukraine (uk-UA)
+# [mozilla is set up to use uk-UA as default in uk localisation packs, hopefully
+# it'll be fixed (19.08.2002)]
 #
 AddLanguage da .dk
 AddLanguage nl .nl
@@ -731,6 +734,8 @@
 AddLanguage tw .tw
 AddLanguage zh-tw .tw
 AddLanguage hr .hr
+AddLanguage uk .uk
+AddLanguage uk-UA .uk-UA
 
 #
 # LanguagePriority allows you to give precedence to some languages
@@ -739,7 +744,7 @@
 # Just list the languages in decreasing order of preference. We have
 # more or less alphabetized them here. You probably want to change this.
 #
-LanguagePriority en da nl et fr de el it ja ko no pl pt pt-br ltz ca es sv tw
+LanguagePriority en da nl et fr de el it ja ko no pl pt pt-br ltz ca es sv tw uk uk-UA
 
 #
 # ForceLanguagePriority allows you to serve a result page rather than
@@ -764,7 +769,7 @@
 # Commonly used filename extensions to character sets. You probably
 # want to avoid clashes with the language extensions, unless you
 # are good at carefully testing your setup after each change.
-# See ftp://ftp.isi.edu/in-notes/iana/assignments/character-sets for
+# See http://www.iana.org/assignments/character-sets for
 # the official list of charset names and their respective RFCs
 #
 AddCharset ISO-8859-1  .iso8859-1  .latin1
@@ -780,26 +785,31 @@
 AddCharset ISO-2022-KR .iso2022-kr .kis
 AddCharset ISO-2022-CN .iso2022-cn .cis
 AddCharset Big5        .Big5       .big5
-# For russian, more than one charset is used (depends on client, mostly):
+# For Russian, more than one charset is used (depends on client, mostly):
 AddCharset WINDOWS-1251 .cp-1251   .win-1251
 AddCharset CP866       .cp866
-AddCharset KOI8-r      .koi8-r .koi8-ru
-AddCharset KOI8-ru     .koi8-uk .ua
+AddCharset KOI8-r      .koi8-r
+# both Russian and Ukrainian (probably other cyrillic-based languages)
+AddCharset KOI8-ru     .koi8-uk  .koi8-ru
+# widely-used Ukrainian encoding, RFC2319
+AddCharset KOI8-U      .koi8-u .ua
+
+# Unicode
 AddCharset ISO-10646-UCS-2 .ucs2
 AddCharset ISO-10646-UCS-4 .ucs4
-AddCharset UTF-8       .utf8
+AddCharset UTF-7        .utf7
+AddCharset UTF-8        .utf8
+AddCharset UTF-16       .utf16
 
 # The set below does not map to a specific (iso) standard
 # but works on a fairly wide range of browsers. Note that
 # capitalization actually matters (it should not, but it
 # does for some browsers).
 #
-# See ftp://ftp.isi.edu/in-notes/iana/assignments/character-sets
+# See http://www.iana.org/assignments/character-sets
 # for a list of sorts. But browsers support few.
 #
 AddCharset GB2312      .gb2312 .gb 
-AddCharset utf-7       .utf7
-AddCharset utf-8       .utf8
 AddCharset big5        .big5 .b5
 AddCharset EUC-TW      .euc-tw
 AddCharset EUC-JP      .euc-jp
