Re: [Python-Dev] Changing string constants to byte arrays ([Python-checkins] r55119 - in python/branches/py3k-struni/Lib: codecs.py test/test_codecs.py)
Hi Walter,

if the bytes type does turn out to be a mutable type as suggested in PEP 358, then please make sure that no code (C code in particular) relies on the constantness of these byte objects.

This is especially important when it comes to codecs, since the error callback logic would allow the callback to manipulate the byte object contents and length without the codec taking note of this change. I expect there to be other places in the interpreter which would break as well.

Otherwise, you end up opening the door for segfaults and easy DoS attacks on Python3.

Regards,
--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, May 04 2007)
Python/Zope Consulting and Support ... http://www.egenix.com/
mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/

Try mxODBC.Zope.DA for Windows,Linux,Solaris,MacOSX for free !

eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611

On 2007-05-04 15:05, walter.doerwald wrote:
> Author: walter.doerwald
> Date: Fri May 4 15:05:09 2007
> New Revision: 55119
>
> Modified:
>    python/branches/py3k-struni/Lib/codecs.py
>    python/branches/py3k-struni/Lib/test/test_codecs.py
> Log:
> Make the BOM constants in codecs.py bytes.
> Make the buffered input for decoders a bytes object.

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 0365: Adding the pkg_resources module
On 2007-05-01 02:29, Phillip J. Eby wrote:
> I wanted to get this in before the Py3K PEP deadline, since this is a
> Python 2.6 PEP that would presumably impact 3.x as well. Feedback welcome.

Could you add a section that explains the side effects of importing pkg_resources ?

The documentation of the module doesn't mention any, but the code suggests that you are installing (some form of) import hooks.

Some other comments:

* Wouldn't it be better to factor out all the meta-data access code that's not related to eggs into pkgutil ?!

* How about then renaming the remaining module to egglib ?!

* The module needs some reorganization: imports, globals and constants at the top, maybe a few comments delimiting the various sections,

* The get_*_platform() should probably use the platform module which is a lot more flexible than distutils' get_platform() (which should probably use the platform module as well in the long run)

Thanks,
--
Marc-Andre Lemburg
eGenix.com

> PEP: 365
> Title: Adding the pkg_resources module
> Version: $Revision: 55032 $
> Last-Modified: $Date: 2007-04-30 20:24:48 -0400 (Mon, 30 Apr 2007) $
> Author: Phillip J. Eby [EMAIL PROTECTED]
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 30-Apr-2007
> Post-History: 30-Apr-2007
>
> Abstract
> ========
>
> This PEP proposes adding an enhanced version of the ``pkg_resources`` module to the standard library.
>
> ``pkg_resources`` is a module used to find and manage Python package/version dependencies and access bundled files and resources, including those inside of zipped ``.egg`` files. Currently, ``pkg_resources`` is only available through installing the entire ``setuptools`` distribution, but it does not depend on any other part of setuptools; in effect, it comprises the entire runtime support library for Python Eggs, and is independently useful.
>
> In addition, with one feature addition, this module could support easy bootstrap installation of several Python package management tools, including ``setuptools``, ``workingenv``, and ``zc.buildout``.
>
> Proposal
> ========
>
> Rather than proposing to include ``setuptools`` in the standard library, this PEP proposes only that ``pkg_resources`` be added to the standard library for Python 2.6 and 3.0. ``pkg_resources`` is considerably more stable than the rest of setuptools, with virtually no new features being added in the last 12 months.
>
> However, this PEP also proposes that a new feature be added to ``pkg_resources``, before being added to the stdlib. Specifically, it should be possible to do something like::
>
>     python -m pkg_resources SomePackage==1.2
>
> to request downloading and installation of ``SomePackage`` from PyPI. This feature would *not* be a replacement for ``easy_install``; instead, it would rely on ``SomePackage`` having pure-Python ``.egg`` files listed for download via the PyPI XML-RPC API, and the eggs would be placed in the ``$PYTHONEGGS`` cache, where they would **not** be importable by default. (And no scripts would be installed.) However, if the download egg contains installation bootstrap code, it will be given a chance to run.
>
> These restrictions would allow the code to be extremely simple, yet still powerful enough to support users downloading package management tools such as ``setuptools``, ``workingenv`` and ``zc.buildout``, simply by supplying the tool's name on the command line.
>
> Rationale
> =========
>
> Many users have requested that ``setuptools`` be included in the standard library, to save users needing to go through the awkward process of bootstrapping it. However, most of the bootstrapping complexity comes from the fact that setuptools-installed code cannot use the ``pkg_resources`` runtime module unless setuptools is already installed. Thus, installing setuptools requires (in a sense) that setuptools already be installed.
>
> Other Python package management tools, such as ``workingenv`` and ``zc.buildout``, have similar bootstrapping issues, since they both make use of setuptools, but also want to provide users with something approaching a one-step install.
>
> The complexity of creating bootstrap utilities for these and any other such tools that arise in future, is greatly reduced if ``pkg_resources`` is already present, and is also able to download pre-packaged
Re: [Python-Dev] Changing string constants to byte arrays ([Python-checkins] r55119 - in python/branches/py3k-struni/Lib: codecs.py test/test_codecs.py)
M.-A. Lemburg wrote:
> Hi Walter,
>
> if the bytes type does turn out to be a mutable type as suggested
> in PEP 358,

it is.

> then please make sure that no code (C code in particular),
> relies on the constantness of these byte objects.
>
> This is especially important when it comes to codecs, since the error callback logic would allow the callback to manipulate the byte object contents and length without the codec taking note of this change.

Encoding is not a problem because the error callback never sees or returns a byte object. However, decoding is a problem. After the callback returns, the codec has to recalculate its variables.

> I expect there to be other places in the interpreter which would break as well.
>
> Otherwise, you end up opening the door for segfaults and easy DOS attacks on Python3.

True, registering an evil callback could crash the interpreter. Seems we have to update all decoding functions.

Servus,
   Walter
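For readers unfamiliar with the error callback mechanism under discussion, here is a minimal sketch of a decoding error handler. It uses the `codecs` API as it exists in released Python 3, not the 2007 py3k-struni branch the thread is about; the handler name `sketch_marker` and the function `replace_with_marker` are made up for illustration. The callback receives the exception, whose `object` attribute is the input being decoded, and returns a replacement string plus the position at which decoding resumes.

```python
import codecs


def replace_with_marker(exc):
    """Hypothetical handler: substitute '?' for each undecodable run."""
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    # exc.object is the input buffer. In released Python 3 it is an
    # immutable bytes object, so the callback cannot resize it under
    # the codec, which is the hazard raised in this thread.
    assert isinstance(exc.object, bytes)
    # Return the replacement text and the index where decoding resumes.
    return ("?", exc.end)


# "sketch_marker" is an illustrative name, not a standard handler.
codecs.register_error("sketch_marker", replace_with_marker)

decoded = b"abc\xffdef".decode("ascii", errors="sketch_marker")
print(decoded)
```

The codec keeps its own pointers into the input while the callback runs, which is why a callback able to shrink a mutable input would leave those pointers dangling.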
Re: [Python-Dev] Changing string constants to byte arrays ([Python-checkins] r55119 - in python/branches/py3k-struni/Lib: codecs.py test/test_codecs.py)
M.-A. Lemburg wrote:
> Hi Walter,
>
> if the bytes type does turn out to be a mutable type as suggested in PEP 358, then please make sure that no code (C code in particular), relies on the constantness of these byte objects.
>
> This is especially important when it comes to codecs, since the error callback logic would allow the callback to manipulate the byte object contents and length without the codec taking note of this change.
>
> I expect there to be other places in the interpreter which would break as well.
>
> Otherwise, you end up opening the door for segfaults and easy DOS attacks on Python3.

If the user does not need to change these bytes objects and this is needed in more places, adding an immutable flag for internal bytes objects only settable from C, or even an immutable byte base class might be an idea.

Georg

--
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.
Re: [Python-Dev] Changing string constants to byte arrays ([Python-checkins] r55119 - in python/branches/py3k-struni/Lib: codecs.py test/test_codecs.py)
On 2007-05-04 18:53, Georg Brandl wrote:
> M.-A. Lemburg wrote:
>> Hi Walter,
>>
>> if the bytes type does turn out to be a mutable type as suggested in PEP 358, then please make sure that no code (C code in particular), relies on the constantness of these byte objects.
>>
>> This is especially important when it comes to codecs, since the error callback logic would allow the callback to manipulate the byte object contents and length without the codec taking note of this change.
>>
>> I expect there to be other places in the interpreter which would break as well.
>>
>> Otherwise, you end up opening the door for segfaults and easy DOS attacks on Python3.
>
> If the user does not need to change these bytes objects and this is needed in more places, adding an immutable flag for internal bytes objects only settable from C, or even an immutable byte base class might be an idea.

+1

I also suggest making all bytes literals immutable to avoid running into any issues like the above.

--
Marc-Andre Lemburg
eGenix.com
Re: [Python-Dev] Changing string constants to byte arrays ([Python-checkins] r55119 - in python/branches/py3k-struni/Lib: codecs.py test/test_codecs.py)
On Friday 04 May 2007, M.-A. Lemburg wrote:
> I also suggest making all bytes literals immutable to avoid running into any issues like the above.

+1 from me.

  -Fred

--
Fred L. Drake, Jr. fdrake at acm.org
Re: [Python-Dev] [Python-checkins] Changing string constants to byte arrays ( r55119 - in python/branches/py3k-struni/Lib: codecs.py test/test_codecs.py )
[-python-dev]

On 5/4/07, Fred L. Drake, Jr. [EMAIL PROTECTED] wrote:
> On Friday 04 May 2007, M.-A. Lemburg wrote:
>> I also suggest making all bytes literals immutable to avoid running into any issues like the above.
>
> +1 from me.

Rather than adding immutability to bytes objects (which has big implementation and type checking implications), consider using buffer(b"123") as an immutable bytes literal. You can freely concatenate and compare buffer objects with bytes objects.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
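The mutable/immutable split debated above can be illustrated with the types that released Python 3 ended up providing. This is a sketch of current behaviour, not of the 2007 branch: there is no `buffer` type today, `bytes` literals are immutable, and `bytearray` is the mutable counterpart. The variable names are illustrative only.

```python
literal = b"\xef\xbb\xbf"      # a bytes literal (the UTF-8 BOM); immutable
scratch = bytearray(literal)   # a mutable copy of the same bytes

scratch[0] = 0x00              # mutating the copy leaves the literal alone

try:
    literal[0] = 0x00          # item assignment on bytes is rejected
    mutation_rejected = False
except TypeError:
    mutation_rejected = True

# A memoryview over bytes is read-only, which is close in spirit to the
# immutable-wrapper idea suggested in the message above.
view = memoryview(literal)

# The two types still concatenate and compare freely with each other.
combined = literal + bytearray(b"x")
same = bytearray(b"abc") == b"abc"
```

Because the literal itself can never change, C code holding a pointer into it does not need to re-validate after calling back into Python.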
Re: [Python-Dev] [Python-3000] Pre-pre PEP for 'super' keyword
Tristan Seligmann wrote:
> * Guido van Rossum [EMAIL PROTECTED] [2007-04-29 18:19:20 -0700]:
>
>>> In my mind, 'if' and 'or' are syntax, whereas things like 'None' or 'True' are values; even if None becomes an actual keyword, rather than a builtin.
>>
>> I'm sorry, but that is such an incredibly subjective difference that I can't do anything with it. String literals and numeric literals are syntax too, even though they are values. A keyword, or reserved word, is simply something that looks like an identifier but is converted into a different token (by the lexer or by something sitting between the lexer and the parser) before the parser sees it.
>
> Let me try a less subjective description. Things like None, 2.3, 'foo', True are values or expressions; I'm not certain exactly what the term for these is in Python's grammar, but I basically mean something that can be on the RHS of an assignment. However, something like 'for' or 'if' is part of some other grammatical construct, generally a statement or operator of some kind, so I tend to think of those differently.

How about "a keyword is an identifier that appears as a literal in the grammar"?

regards
 Steve
--
Steve Holden        +1 571 484 6266   +1 800 494 3119
Holden Web LLC/Ltd           http://www.holdenweb.com
Skype: holdenweb      http://del.icio.us/steve.holden
-- Asciimercial -
Get on the web: Blog, lens and tag your way to fame!!
holdenweb.blogspot.com        squidoo.com/pythonology
tagged items:         del.icio.us/steve.holden/python
All these services currently offer free registration!
-- Thank You for Reading
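The definition proposed above can be checked mechanically: the standard `keyword` module lists the reserved words of the running interpreter. The sketch below assumes a released Python 3, where `None` and `True` did become keywords while `super` and `print` remained ordinary names; the sample list of words is arbitrary.

```python
import keyword

# Arbitrary sample of names from this thread.
samples = ("if", "or", "for", "None", "True", "super", "print")

# Map each name to whether the running interpreter reserves it.
is_reserved = {name: keyword.iskeyword(name) for name in samples}

for name, reserved in is_reserved.items():
    print(f"{name!r:8} keyword: {reserved}")
```

Under this test both the "syntax" words (`if`, `for`) and the "value" words (`None`, `True`) are keywords, which matches the lexer-level definition rather than the syntax/value distinction.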
Re: [Python-Dev] PEP 30XZ: Simplified Parsing
Michael Foord wrote:
> Jim Jewett wrote:
>> PEP: 30xz
>> Title: Simplified Parsing
>> Version: $Revision$
>> Last-Modified: $Date$
>> Author: Jim J. Jewett [EMAIL PROTECTED]
>> Status: Draft
>> Type: Standards Track
>> Content-Type: text/plain
>> Created: 29-Apr-2007
>> Post-History: 29-Apr-2007
>>
>> Abstract
>>
>> Python initially inherited its parsing from C. While this has been generally useful, there are some remnants which have been less useful for python, and should be eliminated.
>>
>>     + Implicit String concatenation
>>
>>     + Line continuation with \
>>
>>     + 034 as an octal number (== decimal 28). Note that this is listed only for completeness; the decision to raise an Exception for leading zeros has already been made in the context of PEP XXX, about adding a binary literal.
>>
>> Rationale for Removing Implicit String Concatenation
>>
>> Implicit String concatenation can lead to confusing, or even silent, errors. [1]
>>
>>     def f(arg1, arg2=None): pass
>>
>>     f("abc" "def")  # forgot the comma, no warning ...
>>                     # silently becomes f("abcdef", None)
>
> Implicit string concatenation is massively useful for creating long strings in a readable way though:
>
>     call_something("first part\n"
>                    "second line\n"
>                    "third line\n")
>
> I find it an elegant way of building strings and would be sad to see it go. Adding trailing '+' signs is ugly.

Currently at least possible, though doubtless some people won't like the left-hand alignment, is

call_something("""\
first part
second part
third part
""")

Alas if the proposal to remove the continuation backslash goes through this may not remain available to us.

I realise that the arrival of Py3 means all these are up for grabs, but don't think any of them are really warty enough to require removal.

I take the point that octal constants are counter-intuitive and wouldn't be too disappointed by their removal. I still think Icon had the right answer there in allowing an explicit decimal radix in constants, so 16 as a binary constant would be 1r2, or 10r16. IIRC it still allowed 0x10 as well (though Tim may shoot me down there).
regards
 Steve
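The missing-comma hazard quoted from the PEP draft can be reproduced directly. This is a small sketch in which `f` returns its arguments so the effect is visible; the return value is an addition for illustration, since the PEP's version simply uses `pass`.

```python
def f(arg1, arg2=None):
    # Return the arguments so the difference can be inspected.
    return (arg1, arg2)


# Adjacent string literals are joined at compile time, so a forgotten
# comma silently turns two arguments into one.
missing_comma = f("abc" "def")
with_comma = f("abc", "def")

print(missing_comma)
print(with_comma)
```

The same compile-time joining is what makes the multi-line `call_something(...)` idiom work, which is why both sides of the thread are arguing about a single language feature.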
Re: [Python-Dev] Pre-pre PEP for 'super' keyword
Tim Delaney wrote:
> From: Calvin Spealman [EMAIL PROTECTED]
>
>> I believe the direction my PEP took with all this is a good bit primitive compared to this approach, although I still find value in it because at least a prototype came out of it that can be used to test the waters, regardless of if a more direct-in-the-language approach would be superior.
>
> I've been working on improved super syntax for quite a while now - my original approach was 'self.super' which used _getframe() and mro crawling too. I hit on using bytecode hacking to instantiate a super object at the start of the method to gain performance, which required storing the class in co_consts, etc. It turns out that using a metaclass then makes this a lot cleaner.
>
>> However, I seem to think that if the __this_class__ PEP goes through, your version can be simplified as well. No tricky stuffy things in cells would be needed, but we can just expand the super 'keyword' to __super__(__this_class__, self), which has been suggested at least once. It seems this would be much simpler to implement, and it also brings up a second point.
>>
>> Also, I like that the super object is created at the beginning of the function, which my proposal couldn't even do. It is more efficient if you have multiple super calls, and gets around a problem I completely missed: what happens if the instance name were rebound before the implicit lookup of the instance object at the time of the super call?
>
> You could expand it inline, but I think your second point is a strong argument against it. Also, sticking the super instance into a cell means that inner classes get access to it for free. Otherwise each inner class would *also* need to instantiate a super instance, and __this_class__ (or whatever it's called) would need to be in a cell for them to get access to it instead.
>
> BTW, one of my test cases involves multiple super calls in the same method - there is a *very* large performance improvement by instantiating it once.
And how does speed deteriorate for methods with no uses of super at all (which will, I suspect, be in the majority)?

regards
 Steve
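For comparison, here is a sketch of how the zero-argument `super()` that eventually shipped in Python 3 behaves. It relies on a CPython implementation detail, namely that the compiler adds a `__class__` cell only to methods that mention `super` or `__class__`, so the check on `co_freevars` should be read as an observation about CPython rather than a language guarantee. The class and method names are illustrative.

```python
class Base:
    def greet(self):
        return "base"


class Child(Base):
    def greet(self):
        # Zero-argument super() finds the class through the implicit
        # __class__ cell and the instance through the first argument.
        return "child+" + super().greet()

    def plain(self):
        # No super() here, so no __class__ cell is created (CPython).
        return "no super"


greeting = Child().greet()
uses_cell = "__class__" in Child.greet.__code__.co_freevars
plain_has_cell = "__class__" in Child.plain.__code__.co_freevars
```

In this design, methods that never use `super` carry no extra cell, which speaks to the performance question asked above.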
Re: [Python-Dev] PEP 30XZ: Simplified Parsing
[EMAIL PROTECTED] wrote:
>     Trent> But if you don't want the EOLs? Example from some code of mine:
>
>     Trent>     raise MakeError("extracting '%s' in '%s' did not create the "
>     Trent>                     "directory that the Python build will expect: "
>     Trent>                     "'%s'" % (src_pkg, dst_dir, dst))
>
>     Trent> I use this kind of thing frequently. Don't know if others
>     Trent> consider it bad style.
>
> I use it all the time. For example, to build up (what I consider to be) readable SQL queries:
>
>     rows = self.executesql("select cities.city, state, country"
>                            " from cities, venues, events, addresses"
>                            " where cities.city like %s"
>                            " and events.active = 1"
>                            " and venues.address = addresses.id"
>                            " and addresses.city = cities.id"
>                            " and events.venue = venues.id",
>                            (city,))
>
> I would be disappointed if string literal concatenation went away.

Triple-quoted strings are much easier here, and SQL is insensitive to the newlines and additional spaces. Why not just use

    rows = self.executesql("""select cities.city, state, country
                              from cities, venues, events, addresses
                              where cities.city like %s
                              and events.active = 1
                              and venues.address = addresses.id
                              and addresses.city = cities.id
                              and events.venue = venues.id""",
                           (city,))

It also gives you better error messages from most database back-ends. I realise it makes the constants slightly longer, but if that's an issue I'd have thought people would want to indent code with tabs and not spaces.

regards
 Steve
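The claim that SQL does not care about the extra newlines and spaces can be tried out with the standard-library `sqlite3` module. This is only a sketch: the one-table schema and its row are invented for illustration, and `sqlite3` uses `?` placeholders rather than the `%s` style shown in the messages above.

```python
import sqlite3

# Implicit concatenation: one logical line, with explicit spaces.
concatenated = ("select city, state "
                "from cities "
                "where city like ?")

# Triple-quoted: newlines and indentation become part of the string.
triple_quoted = """select city, state
                   from cities
                   where city like ?"""

# Hypothetical in-memory table, just enough to run both queries.
conn = sqlite3.connect(":memory:")
conn.execute("create table cities (city text, state text)")
conn.execute("insert into cities values ('Chicago', 'IL')")

rows_concat = conn.execute(concatenated, ("Chi%",)).fetchall()
rows_triple = conn.execute(triple_quoted, ("Chi%",)).fetchall()
conn.close()

# The strings differ only in whitespace.
normalised = " ".join(triple_quoted.split())
```

The two strings are not equal as Python objects, so the triple-quoted form suits SQL well but is not a drop-in replacement where whitespace matters, as in the error message example quoted from Trent.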
[Python-Dev] updated PEP3125, Remove Backslash Continuation
Major rewrite.

The inside-a-string continuation is separated from the general continuation.

The alternatives section is expanded to also list Andrew Koenig's improved inside-expressions variant, since that is a real contender.

If anyone feels I haven't acknowledged their concerns, please tell me.

--

PEP: 3125
Title: Remove Backslash Continuation
Version: $Revision$
Last-Modified: $Date$
Author: Jim J. Jewett [EMAIL PROTECTED]
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 29-Apr-2007
Post-History: 29-Apr-2007, 30-Apr-2007, 04-May-2007

Abstract
========

Python initially inherited its parsing from C. While this has been generally useful, there are some remnants which have been less useful for Python, and should be eliminated.

This PEP proposes elimination of terminal ``\`` as a marker for line continuation.

Motivation
==========

One goal for Python 3000 should be to simplify the language by removing unnecessary or duplicated features. There are currently several ways to indicate that a logical line is continued on the following physical line.

The other continuation methods are easily explained as a logical consequence of the semantics they provide; ``\`` is simply an escape character that needs to be memorized.

Existing Line Continuation Methods
==================================

Parenthetical Expression - ([{}])
---------------------------------

Open a parenthetical expression. It doesn't matter whether people view the line as continuing; they do immediately recognize that the expression needs to be closed before the statement can end.

An example using each of (), [], and {}::

    def fn(long_argname1,
           long_argname2):
        settings = {"background": "random noise",
                    "volume": "barely audible"}
        restrictions = ["Warrantee void if used",
                        "Notice must be received by yesterday",
                        "Not responsible for sales pitch"]

Note that it is always possible to parenthesize an expression, but it can seem odd to parenthesize an expression that needs the parentheses only for the line break::

    assert val>4, (
        "val is too small")

Triple-Quoted Strings
---------------------

Open a triple-quoted string; again, people recognize that the string needs to finish before the next statement starts. ::

    banner_message = """
        Satisfaction Guaranteed,
        or DOUBLE YOUR MONEY BACK!!!

            some minor restrictions apply"""

Terminal ``\`` in the general case
----------------------------------

A terminal ``\`` indicates that the logical line is continued on the following physical line (after whitespace). There are no particular semantics associated with this. This form is never required, although it may look better (particularly for people with a C language background) in some cases::

    assert val>4, \
        "val is too small"

Also note that the ``\`` must be the final character in the line. If your editor navigation can add whitespace to the end of a line, that invisible change will alter the semantics of the program. Fortunately, the typical result is only a syntax error, rather than a runtime bug::

    assert val>4, \ 
        "val is too small"

    SyntaxError: unexpected character after line continuation character

This PEP proposes to eliminate this redundant and potentially confusing alternative.

Terminal ``\`` within a string
------------------------------

A terminal ``\`` within a single-quoted string, at the end of the line. This is arguably a special case of the terminal ``\``, but it is a special case that may be worth keeping.

    >>> "abd\
     def"
    'abd def'

+ Many of the objections to removing ``\`` termination were really just objections to removing it within literal strings; several people clarified that they want to keep this literal-string usage, but don't mind losing the general case.

+ The use of ``\`` for an escape character within strings is well known.

- But note that this particular usage is odd, because the escaped character (the newline) is invisible, and the special treatment is to delete the character. That said, the ``\`` of ``\(newline)`` is still an escape which changes the meaning of the following character.

Alternate Proposals
===================

Several people have suggested alternative ways of marking the line end. Most of these were rejected for not actually simplifying things.

The one exception was to let any unfinished expression signify a line continuation, possibly in conjunction with increased indentation.

This is attractive because it is a generalization of the rule for parentheses.

The
Re: [Python-Dev] PEP 30XZ: Simplified Parsing
Steven Bethard wrote:
> On 5/2/07, Michael Foord [EMAIL PROTECTED] wrote:
>> Implicit string concatenation is massively useful for creating long strings in a readable way though:
>>
>>     call_something("first part\n"
>>                    "second line\n"
>>                    "third line\n")
>>
>> I find it an elegant way of building strings and would be sad to see it go. Adding trailing '+' signs is ugly.
>
> You'll still have textwrap.dedent::
>
>     call_something(dedent('''\
>         first part
>         second line
>         third line
>         '''))
>
> And using textwrap.dedent, you don't have to remember to add the \n at the end of every line.
>
> STeVe

maybe we could have a dedent literal that would remove the first newline and all indentation so that you can just write:

call_something( d'''
                first part
                second line
                third line
                ''' )

Cheers
Baptiste
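As a runnable illustration of the `textwrap.dedent` suggestion quoted above, the sketch below defines a stand-in `call_something` that just returns its argument (the real function is never shown in the thread) and compares the dedented result with the implicit-concatenation spelling.

```python
from textwrap import dedent


def call_something(text):
    # Stand-in for the function in the thread; returns its argument.
    return text


# The backslash after the opening quotes suppresses the first newline;
# dedent() then strips the common leading indentation.
via_dedent = call_something(dedent('''\
    first part
    second line
    third line
    '''))

via_concatenation = call_something("first part\n"
                                   "second line\n"
                                   "third line\n")
```

Note that the `dedent` spelling itself relies on a backslash-newline inside a string literal, the special case the PEP 3125 draft suggests may be worth keeping.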
Re: [Python-Dev] PEP 30XZ: Simplified Parsing
On 5/4/07, Baptiste Carvello [EMAIL PROTECTED] wrote:
> maybe we could have a dedent literal that would remove the first newline and all indentation so that you can just write:
>
> call_something( d'''
>                 first part
>                 second line
>                 third line
>                 ''' )

Surely

from textwrap import dedent as d

is close enough?

-Mike