Thanks, Pablo! You have a good point that perhaps Apache should have no encoding set by default, thus forcing everyone to read the documentation and make a decision.
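
To make the consequence concrete: if neither the server nor the page announces a charset, the client has to guess. Here is a minimal Python sketch of the kind of guess Pablo suggests in his message below (try UTF-8 first, fall back to cp1252). The function names are hypothetical, made up for illustration only; real browsers and mail clients would need something more careful than this.

```python
def guess_charset(data: bytes) -> str:
    """Guess between the two candidates named in the thread.

    Hypothetical helper: returns "utf-8" when the bytes are valid
    UTF-8 (pure ASCII counts as valid), otherwise "cp1252".
    """
    try:
        data.decode("utf-8")  # strict decoding raises on invalid sequences
    except UnicodeDecodeError:
        return "cp1252"
    return "utf-8"


def decode_unannounced(data: bytes) -> str:
    """Decode a page or message that does not announce its encoding."""
    charset = guess_charset(data)
    # cp1252 leaves a few byte values undefined, so replace them
    # instead of failing on the fallback path.
    return data.decode(charset, errors="replace")


if __name__ == "__main__":
    print(guess_charset("café".encode("utf-8")))
    print(guess_charset("café".encode("cp1252")))
    print(decode_unannounced(b"\x93quoted\x94"))
```

The guess works because text containing non-ASCII characters in cp1252 is rarely also valid UTF-8, so a strict UTF-8 decode is a reasonable first test.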
Does anyone else have an opinion on this?

On Monday 2004.10.11 18:16:09 +0200, Pablo Saratxaga wrote:
> Kaixo!
>
> On Mon, Oct 11, 2004 at 11:56:09AM -0400, Edward H. Trager wrote:
>
> > MANDRAKE:
> > =========
> > And does anyone know the official story about Mandrake?
> > I installed Mandrake 10.0 (from a magazine disc) and got
> > an ISO-8859-1 locale instead of a UTF-8 locale.
>
> It depends on your installation choices.
> In Mandrakelinux there are some locales that are in UTF-8 by default
> (those languages that can be supported only in UTF-8, or that don't
> have any large legacy corpus in non-UTF-8);
> and other locales that do have a large legacy in non-UTF-8 that are,
> currently (it will hopefully change sometime in the future), in legacy
> encoding by default.
> But there is, under the "advanced" tab, a "use UTF-8" by default
> checkbox, so you can force UTF-8 in any case.
> UTF-8 is also used if you choose several languages and UTF-8 is the
> only shared encoding (e.g., if you choose support for French and German, both
> with legacy iso-8859-15 by default, UTF-8 won't be enabled (unless you
> check the UTF-8 checkbox), but if you choose French and Greek for
> example, as iso-8859-15 and iso-8859-7 are different, then UTF-8 is
> used).
>
> > The Mandrake
> > locale-setting GUI continued to provide only legacy ISO options,
> > as far as I could tell.
>
> The choice has to be made at install time (as use of UTF-8 or not
> has consequences on how data is stored on hard disks on native Linux
> partitions; changing it afterward cannot be fully automated).
>
> > In the end I manually set the .i18n
> > file to en_US.UTF-8 and everything seems to work to the extent that
> > I have tested it. So why is UTF-8 not the default? Does anyone know?
>
> Because people complained.
> UTF-8 support a year ago was not as good as now, and a lot of people
> (in particular those using "en_US" locale :) ) would complain about ugly
> fonts and other problems if UTF-8 was the default.
>
> The situation improved a lot, and nowadays there are very few problems
> left, probably UTF-8 could be made the default soon; and maybe it could
> have been made the default if there weren't other more important issues
> to spend our time on.
>
> > APACHE:
>
> One of the remaining problems is the problem of web pages in cp1252 with
> unannounced encoding; when using UTF-8 by default some browsers display
> them wrong (browsers should do some automatic charset encoding detection
> to see if the page is in UTF-8 or in cp1252, the only two valid choices
> for unannounced encoding pages, imho).
> Same for email programs too (since I switched to UTF-8 I got a lot of
> messages that display wrongly as they are encoded in cp1252 but don't
> announce it properly, in particular in the subject/from headers, but
> also in the body); here too, some automatic encoding detection could
> help a lot.
>
> > The last time I installed Apache 2.0.x, it too defaults to the
> > legacy ISO-8859-1 configuration. One has to manually change the configuration
> > file in order to get HTML pages served with the correct headers
> > indicating UTF-8.
>
> No, it is up to the individual files to announce their encoding, not to
> the web server.
> I don't have any problem using Apache with HTML files correctly
> announcing their encoding; I use a mix of iso-8859-1/iso-8859-15/utf-8,
> with some occasional iso-2022-jp pages too.
>
> > Does anyone know if this is still the case? When is this going to change?
> > Apache 2.0.x should really default to UTF-8. Do people agree with me here?
>
> I disagree :)
> The default therefore must not be UTF-8 but simply nothing;
> forcing a single encoding for all the pages of a whole server is
> something that can only be done by the manager of the server, after
> carefully thinking about it; not something that should be blindly
> enforced by default.
>
> I however fully agree with you that forcing iso-8859-1 by default is
> very wrong; but I think that forcing any encoding by default is wrong.
>
> --
> Ki ça vos våye bén,
> Pablo Saratxaga
>
> http://chanae.walon.org/pablo/ PGP Key available, key ID: 0xD9B85466
> [you can write me in Walloon, Spanish, French, English, Catalan or Esperanto]
> [min povas skribi en valona, esperanta, angla aux latinidaj lingvoj]

--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/
