Karen Tracey wrote:
> On Fri, Jan 22, 2010 at 7:38 AM, Michael Foord
> <fuzzy...@voidspace.org.uk> wrote:
>
>> On 21/01/2010 21:21, "Martin v. Löwis" wrote:
>>
>>> Where the default *file system encoding* is used (i.e. text files are
>>>> written or read without specifying an encoding)
>>>>
>>>>
>>> I think you misunderstand the notion of the *file system encoding*.
>>> It is *not* a "file encoding", but the file *system* encoding, i.e.
>>> the encoding for file *names*, not for file *content*.
>>>
>>> It was used on Windows for Windows 95; it is not used anymore on Windows
>>> (although it's still used on Unix).
>>>
>>>
>>
>> Ok, I'm just using the wrong terminology. I'm aware that mbcs is used for
>> filename encoding on Windows (right?). The encoding I'm talking about is the
>> encoding that Python uses to decode a file (or encode a string) when you do
>> the following in Python 3:
>>
>>     text = open(filename).read()
>>     open(filename, 'w').write(some_string)
>>
>> It isn't the default encoding (always utf-8 by default in Python 3
>> apparently), it isn't the file system encoding which is the system encoding
>> used for file names. What is the correct terminology for this platform
>> dependent encoding that Python uses here?
>>
>>
> The doc here:
> http://docs.python.org/3.1/library/functions.html?highlight=open#open just
> calls it default encoding and clarifies that is "whatever
> locale.getpreferredencoding() returns".

... which is a pretty poor guess, since the locale setting is usually set
to what the user wants to see in user interfaces, not what the application
wants to use as file content.

As an example, take XML files: these will almost always use UTF-8 as
their encoding. If you use the above approach to write them and happen to
work on a system that is set to use Latin-1 or CP1252 as its preferred
encoding, you'll get garbage in your XML file.

> The important point is that it is platform dependent - so if you ship and
>> use text files with your Python application and don't specify an encoding
>> then it will work fine on some platforms and blow up or use the wrong
>> encoding on other platforms.
>>
>>
> Yes. If you ship text files with your Python application, then you'd best
> take care to know the encoding when you create them and specify it as the
> encoding to use when you open the file for reading by your application.

Right. Applications should always provide a well-defined encoding for
use with files - they know best what encoding to use, and they also know
best when to apply guessing and which set of encodings to use as the basis
for such guessing.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jan 22 2010)
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
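
To make the XML point from the reply concrete, here is a minimal sketch, assuming Python 3. The helper name roundtrip and the sample XML string are made up for illustration. Latin-1 is passed explicitly to stand in for a machine whose locale default is not UTF-8, so the result does not depend on where you run it:

```python
import locale
import os
import tempfile

# Hypothetical sample document; \u00f6 is the "o with diaeresis" in "Löwis".
text = "<?xml version='1.0' encoding='UTF-8'?><name>L\u00f6wis</name>"


def roundtrip(write_encoding, read_encoding="utf-8"):
    """Write `text` with one encoding and read it back with another."""
    fd, path = tempfile.mkstemp(suffix=".xml")
    os.close(fd)
    try:
        with open(path, "w", encoding=write_encoding) as f:
            f.write(text)
        # errors="replace" turns undecodable bytes into U+FFFD
        # instead of raising UnicodeDecodeError.
        with open(path, "r", encoding=read_encoding, errors="replace") as f:
            return f.read()
    finally:
        os.remove(path)


# What open() would fall back to on this machine if no encoding is given.
print("locale default:", locale.getpreferredencoding(False))

ok = roundtrip("utf-8")     # explicit, well-defined encoding
bad = roundtrip("latin-1")  # stand-in for a Latin-1 locale default

print(ok == text)   # the UTF-8 file reads back unchanged
print(bad == text)  # the Latin-1 file does not survive a UTF-8 read
```

Passing encoding="utf-8" on both the writing and the reading side is what makes the file portable; leaving it out ties the result to whatever locale.getpreferredencoding() reports on the machine at hand.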