On Tue, Jan 07, 2003 at 10:23:14AM -0500, Colin Walters wrote:
[...]
> It looks to me like at this point almost everyone agrees with the
> content of my proposal in #99933, and we are discussing implementation
> details. Agreed?

No. We agree that UTF-8 support must be dramatically improved, but
legacy encodings must be supported too.

[...]
> You mean like changelog.txt.UTF-8 or changelog.UTF-8.txt ? I am pretty
> much opposed to any sort of proposal of this form. The reason is that
> changing programs to recognize our arbitrary scheme for file encodings
> will not only be a lot of work,

I was unclear: I was only speaking about files shipped by Debian
packages which contain non-ASCII characters without specifying their
encoding. Users can do whatever they want with their own data.
I mostly have txt, man and info pages in mind. IIRC the *BSDs put man
pages under .../man/<language>.<encoding>/, don't they? Info pages
are never translated. The only text files with non-ASCII letters I
encounter are documentation, and those can safely be renamed, but maybe
there are others.

> but instead we could add support to programs to autodetect the charset
> semi-intelligently from file content, which is what programs like Emacs
> in the real world do today.

Then why do you patch dpkg to support UTF-8 input if it can guess
the encoding?

Denis
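
To make the autodetection question concrete, here is a minimal sketch of content-based guessing, not taken from dpkg, Emacs, or any Debian tool. The function name guess_encoding and the legacy_fallback parameter are hypothetical. It assumes only that strict UTF-8 validation is a fairly reliable test, whereas telling one 8-bit legacy encoding from another cannot be done from the bytes alone.

```python
def guess_encoding(data: bytes, legacy_fallback: str = "iso-8859-1") -> str:
    """Guess 'ascii' or 'utf-8' from content; otherwise return the fallback.

    The fallback is an assumption supplied by the caller: the bytes
    themselves do not say which legacy 8-bit encoding was used.
    """
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        # Strict decoding rejects malformed sequences, so legacy 8-bit
        # text with accented letters rarely passes this check.
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return legacy_fallback


# The same byte is valid in several legacy encodings with different meanings.
ambiguous = b"\xb1"
as_latin1 = ambiguous.decode("iso-8859-1")  # plus-minus sign
as_latin2 = ambiguous.decode("iso-8859-2")  # a with ogonek
```

The last lines show why guessing alone does not settle the matter for legacy files: a single byte decodes without error under both ISO-8859-1 and ISO-8859-2, so some external declaration (a directory name, a file suffix, a header) is still needed to choose between them.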

