Fredrik Lundh wrote: > M.-A. Lemburg wrote: > > >>I don't follow you here. The source code encoding >>is only applied to Unicode literals (you are using string >>literals in your example). String literals are passed >>through as-is. > > > however, for Python 3000, it would be nice if the source-code encoding applied > to the *entire* file (XML-style), rather than just unicode string literals > and (hope- > fully) comments and docstrings.
Actually, the encoding is applied to the complete source file: the file is transcoded into UTF-8 and then parsed by the Python parser. Unicode literals are then decoded from the UTF-8 into Unicode. String literals are transcoded back into the source code encoding, thus making the (rather long due to technical constraints) round-trip source code encoding -> Unicode -> UTF-8 -> Unicode -> source code encoding. Python 3k should have a fully Unicode based parser to reduce this additional transcoding overhead. Since Py3k will only have Unicode literals, the problems with string literals will go away all by themselves :-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Oct 25 2005) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! :::: _______________________________________________ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com