On Wed, Oct 1, 2008 at 10:36 AM, Antoine Pitrou <[EMAIL PROTECTED]> wrote:
> The average user does not even /know/ what a charset is.

Because for the average user, there is no need. Part of the HTML5
standard is how to guess at charsets, and when to automatically use a
superset instead of the declared encoding. For most of the US and
Europe, the guesses are good enough. For the languages and countries
where multiple charsets are in common use, and the guesses are often
wrong, browser vendors say that the change-charset commands are
well-known and frequently used.

> If a filename can't be exactly represented with a valid Unicode
> sequence, all applications wanting to access that file are impacted
> in the same way,

Not really. Some utilities never really need to display the filename;
they just need to be able to manage the file. Many applications need
to display a file chooser, but may never need to actually open
problematic files, and may not need an accurate or complete
representation. (Consider "Progra~1" on Windows.)

> This sounds very much like a Python-level (or at least stdlib-level)
> problem to me.

The stdlib should provide a way of dealing with raw bytes. Beyond
that, the needs get too specialized. (And that way of dealing with raw
bytes *might* just be documenting the Latin-1 hack.)

> Are you suggesting that the solution to the filename problem is to
> prompt the user and ask them for a different encoding?

For some applications, yes.

-jJ
_______________________________________________
Python-3000 mailing list
Python-3000@python.org
http://mail.python.org/mailman/listinfo/python-3000
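
The message does not spell out "the Latin-1 hack". One common reading, assumed here, is to decode undecodable filename bytes as Latin-1 so they survive a round trip through a text string. A minimal sketch under that assumption follows; the filename `raw_name` is hypothetical.

```python
# Hypothetical raw filename bytes that are not valid UTF-8.
# 0xE9 is "é" in Latin-1, but here it starts an incomplete UTF-8 sequence.
raw_name = b"caf\xe9.txt"

# Decoding as UTF-8 fails for this name.
try:
    raw_name.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Latin-1 maps each byte 0x00-0xFF to the code point with the same value,
# so decoding never fails and encoding restores the original bytes.
as_text = raw_name.decode("latin-1")
round_tripped = as_text.encode("latin-1")

# The round trip is lossless for every possible byte value.
all_bytes = bytes(range(256))
assert all_bytes.decode("latin-1").encode("latin-1") == all_bytes
```

The resulting string is only a byte-preserving container. If the name was really written in some other charset, it will display incorrectly, which is acceptable for utilities that only manage the file and never show the name.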