Re: [Python-Dev] [Python-checkins] r84858 - in python/branches: py3k/Doc/library/logging.rst release27-maint/Doc/library/logging.rst
On 21.09.2010 01:42, Éric Araujo wrote:
> Hello
>
>> + NOTE: If you are thinking of defining your own levels, please see the section
>> + on :ref:`custom-levels`.
>
> I think those instances of upper-case-as-markup should either be real reST note/warning/etc. directives or plain English (that is, integrating “NOTE:” into the text flow, for example “Note that...”/“Pay attention to...”).

I've been told that "Note that" should be removed altogether, as it's quite redundant :)

Georg

--
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out.

___
Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [Python-checkins] r84906 - python/branches/py3k/Doc/library/ssl.rst
On Tue, 21 Sep 2010 02:34:46 +0200 Éric Araujo mer...@netwok.org wrote:
>> Log:
>> Remove references to read() and write() methods, which are useless synonyms of recv() and send()
>
> Unless I’m mistaken, ssl.SSLSocket.write is still useful for use with print, pprint and maybe other functions,

Hmm, sorry? I don't get what you're talking about. You don't use print() with plain sockets, so you shouldn't use it with SSL sockets either.

Regards

Antoine.
Re: [Python-Dev] r84931 - in python/branches/py3k: Include/symtable.h Misc/NEWS Python/ast.c Python/compile.c Python/future.c Python/symtable.c
On 21.09.2010 01:02, benjamin.peterson wrote:
> Author: benjamin.peterson
> Date: Tue Sep 21 01:02:10 2010
> New Revision: 84931
>
> Log:
> add column offset to all syntax errors
>
> Modified: python/branches/py3k/Misc/NEWS
> ==
> --- python/branches/py3k/Misc/NEWS (original)
> +++ python/branches/py3k/Misc/NEWS Tue Sep 21 01:02:10 2010
> @@ -10,9 +10,8 @@
>  Core and Builtins
>  -
>
> -- Issue #9901: Destroying the GIL in Py_Finalize() can fail if some other
> -  threads are still running. Instead, reinitialize the GIL on a second
> -  call to Py_Initialize().
> +- All SyntaxErrors now have a column offset and therefore a caret when the error
> +  is printed.
>
>  - Issue #9252: PyImport_Import no longer uses a fromlist hack to return the
>    module that was imported, but instead gets the module from sys.modules.
> @@ -59,10 +58,6 @@
>  Library
>  ---
>
> -- Issue #9877: Expose sysconfig.get_makefile_filename()
> -
> -- logging: Added hasHandlers() method to Logger and LoggerAdapter.
> -
>  - Issue #1686: Fix string.Template when overriding the pattern attribute.
>
>  - Issue #9854: SocketIO objects now observe the RawIOBase interface in

That change looks like an accident.

BTW, PyErr_SyntaxLocationEx needs a versionadded.

Georg

--
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out.
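The NEWS entry quoted in this message says that every SyntaxError now carries a column offset. A minimal sketch of how that offset can be read from Python code, assuming a reasonably recent Python 3; the helper name `syntax_error_location`, the sample source string and the `<demo>` filename are made up for illustration:

```python
def syntax_error_location(source):
    """Compile source and return (lineno, offset) of its SyntaxError, or None."""
    try:
        compile(source, "<demo>", "exec")
    except SyntaxError as exc:
        # offset is the column that the caret points at when the error is printed
        return exc.lineno, exc.offset
    return None


location = syntax_error_location("x = (1 +\n")
print(location)
```

The exact numbers depend on the interpreter version, so the sketch only prints them rather than promising particular values.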
Re: [Python-Dev] Supporting raw bytes data in urllib.parse.* (was Re: Polymorphic best practices)
On Tue, Sep 21, 2010 at 3:03 PM, Stephen J. Turnbull step...@xemacs.org wrote:
> On the other hand, it is dangerous to provide a polymorphic API which does that more extensive parsing, because a less than paranoid programmer will have very likely allowed the parsed components to escape from the context where their encodings can be reliably determined. Remember, *it is unlikely that they will ever be punished for their own lack of caution.* The person who is doomed is somebody who tries to take that code and reuse it in a different context.

Yeah, that's the original reasoning that had me leaning towards the parallel API approach. If I seem to be changing my mind a lot in this thread it's because I'm genuinely torn between the desire to make it easier to port existing 2.x code to 3.x by making the current API polymorphic and the fear that doing so will reintroduce some of the exact same bytes/text confusion that the bytes/str split is trying to get rid of. There's no real way for 2to3 to help with the porting issue either, since it has no way to determine the original intent of the 2.x code.

I *think* avoiding the quote/unquote precedent and applying the rule "bytes in -> bytes out" will help with avoiding the worst of any potential encoding confusion problems though. At some point the programmer is going to have to invoke decode() if they want a string to pass to display functions and the like (or vice versa with encode()) so there are still limits to how far any poorly handled code will get before blowing up.

(Basically, while the issue of programmers assuming 'latin-1' or 'utf-8' or similar ASCII friendly encodings when they shouldn't is real, I don't believe a polymorphic API here will make things any *worse* than what would happen with a parallel API)

And if this turns out to be a disaster in practice:
a) on my head be it; and
b) we still have the option of the DeprecationWarning dance for bytes inputs to the existing functions and moving to a parallel API

Still-trying-to-figure-out-what-moment-of-insanity-prompted-me-to-volunteer-to-tackle-this'ly,
Nick.
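For readers who want to see the "bytes in, bytes out" rule in concrete terms, here is a small sketch. It assumes the behaviour of `urllib.parse` in Python 3.2 and later, where the parsing functions accept ASCII-encoded bytes; the example URL is invented:

```python
from urllib.parse import urlsplit

# The same call works on text and on bytes; the result type follows the input.
text_parts = urlsplit("http://example.com/path?q=1")
byte_parts = urlsplit(b"http://example.com/path?q=1")

print(type(text_parts.path), type(byte_parts.path))

# Going from bytes to text stays an explicit step for the caller.
path_text = byte_parts.path.decode("ascii")
print(path_text)
```

The explicit `decode()` at the end is the point Nick makes above: the programmer still has to choose an encoding before the data can be treated as text.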
Re: [Python-Dev] [Python-checkins] r84858 - in python/branches: py3k/Doc/library/logging.rst release27-maint/Doc/library/logging.rst
On Tue, Sep 21, 2010 at 5:25 PM, Georg Brandl g.bra...@gmx.net wrote:
> I've been told that "Note that" should be removed altogether, as it's quite redundant :)

I still find starting a paragraph with "Note that" to be useful as a mild attention getter that isn't as shouty as an actual ReST note.

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Python-Dev] [Python-checkins] r84858 - in python/branches: py3k/Doc/library/logging.rst release27-maint/Doc/library/logging.rst
On 21/09/2010 14:42, Nick Coghlan wrote:
> On Tue, Sep 21, 2010 at 5:25 PM, Georg Brandl g.bra...@gmx.net wrote:
>> I've been told that "Note that" should be removed altogether, as it's quite redundant :)
>
> I still find starting a paragraph with "Note that" to be useful as a mild attention getter that isn't as shouty as an actual ReST note.

I agree. Don't feel strongly about it though. (I'm sure Strunk and White would disapprove.)

Michael

> Cheers,
> Nick.

--
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog

READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.
Re: [Python-Dev] [Python-checkins] r84858 - in python/branches: py3k/Doc/library/logging.rst release27-maint/Doc/library/logging.rst
2010/9/21 Nick Coghlan ncogh...@gmail.com:
> On Tue, Sep 21, 2010 at 5:25 PM, Georg Brandl g.bra...@gmx.net wrote:
>> I've been told that "Note that" should be removed altogether, as it's quite redundant :)
>
> I still find starting a paragraph with "Note that" to be useful as a mild attention getter that isn't as shouty as an actual ReST note.

I think the preference should be to avoid phrases like "Note that". For example, the sentence in consideration could be phrased nicely as "See :ref:`custom-levels` for documentation on defining your levels."

--
Regards,
Benjamin
Re: [Python-Dev] [Python-checkins] r84858 - in python/branches: py3k/Doc/library/logging.rst release27-maint/Doc/library/logging.rst
On Tue, Sep 21, 2010 at 3:46 PM, Michael Foord fuzzy...@voidspace.org.uk wrote:
> I agree. Don't feel strongly about it though. (I'm sure Strunk and White would disapprove.)

No doubt. http://chronicle.com/article/50-Years-of-Stupid-Grammar/25497 ;-)

Steve

--
Where did you get that preposterous hypothesis? Did Steve tell you that?
--- The Hiphopopotamus
Re: [Python-Dev] Supporting raw bytes data in urllib.parse.* (was Re: Polymorphic best practices)
On 21 September 2010 14:38, Nick Coghlan ncogh...@gmail.com wrote:
> On Tue, Sep 21, 2010 at 3:03 PM, Stephen J. Turnbull step...@xemacs.org wrote:
>> On the other hand, it is dangerous to provide a polymorphic API which [...]

Sorry if this is off-topic, but I don't believe I ever saw Stephen's email. I have a feeling that's happened a couple of times recently. Before I go off trying to work out why gmail is dumping list mails on me, did anyone else see Stephen's mail via the list (as opposed to being a direct recipient)?

Thanks, and sorry for the interruption.

Paul.
Re: [Python-Dev] Supporting raw bytes data in urllib.parse.* (was Re: Polymorphic best practices)
Nick Coghlan writes:
> (Basically, while the issue of programmers assuming 'latin-1' or 'utf-8' or similar ASCII friendly encodings when they shouldn't is real, I don't believe a polymorphic API here will make things any *worse* than what would happen with a parallel API)

That depends on how far the polymorphic API goes. As long as the polymorphic API *never ever* does anything that involves decoding wire format (and I include URL-quoting here), the programmer will have to explicitly do some decoding to get into much trouble, and at that point it's really their problem; you can't stop them.

But I don't know whether the web apps programmers will be satisfied with such a minimal API. If not, you're going to have to make some delicate judgments about what to provide and what not, and whether/how to provide a safety net of some kind. I don't envy you that task.

> And if this turns out to be a disaster in practice:

I would say be conservative about which APIs you make polymorphic. And there are a lot of APIs that probably should be considered candidates for polymorphic versions (regexp matching and searching, for example). So any experience or generic functionality you develop here is likely to benefit somebody down the road.
Re: [Python-Dev] Supporting raw bytes data in urllib.parse.* (was Re: Polymorphic best practices)
On Sep 21, 2010, at 04:01 PM, Paul Moore wrote:
> Sorry if this is off-topic, but I don't believe I ever saw Stephen's email. I have a feeling that's happened a couple of times recently. Before I go off trying to work out why gmail is dumping list mails on me, did anyone else see Stephen's mail via the list (as opposed to being a direct recipient)?

I remember seeing a message from Stephen on this topic, but I didn't read it though and it's deleted now ;). I don't use gmail regularly.

See also: http://wiki.list.org/x/2IA9

-Barry
Re: [Python-Dev] Supporting raw bytes data in urllib.parse.* (was Re: Polymorphic best practices)
On Wed, 22 Sep 2010 00:10:01 +0900 Stephen J. Turnbull step...@xemacs.org wrote:
> But I don't know whether the web apps programmers will be satisfied with such a minimal API.

Web app programmers will generally go through a framework, which handles encoding/decoding for them (already so in 2.x).

> And there are a lot of APIs that probably should be considered candidates for polymorphic versions (regexp matching and searching, for example).

As a matter of fact, the re module APIs are already polymorphic, all the while disallowing any mixing of bytes and unicode.

Regards

Antoine.
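Antoine's remark about the re module can be checked with a few lines. This is only an illustrative sketch for Python 3; the pattern and the sample strings are invented:

```python
import re

# A str pattern works on str, a bytes pattern works on bytes.
text_match = re.search(r"h(\w+)", "hello")
byte_match = re.search(br"h(\w+)", b"hello")
print(text_match.group(1), byte_match.group(1))

# Mixing the two is refused outright instead of guessing an encoding.
mixed_error = None
try:
    re.search(br"h(\w+)", "hello")
except TypeError as exc:
    mixed_error = exc
print(type(mixed_error).__name__)
```

The same function serves both types, and the mixed call fails immediately, which is the model being proposed for urllib.parse in this thread.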
Re: [Python-Dev] Supporting raw bytes data in urllib.parse.* (was Re: Polymorphic best practices)
On Mon, Sep 20, 2010 at 6:19 PM, Nick Coghlan ncogh...@gmail.com wrote:
>> What are the cases you believe will cause new mojibake?
>
> Calling operations like urlsplit on byte sequences in non-ASCII compatible encodings and operations like urljoin on byte sequences that are encoded with different encodings. These errors differ from the URL escaping errors you cite, since they can produce true mojibake (i.e. a byte sequence without a single consistent encoding), rather than merely non-compliant URLs. However, if someone has let their encodings get that badly out of whack in URL manipulation they're probably doomed anyway...

FWIW, while I understand the problems non-ASCII-compatible encodings can create, I've never encountered them, perhaps because ASCII-compatible encodings are so dominant.

There are ways you can get a URL (HTTP specifically) where there is no notion of Unicode. I think the use case everyone has in mind here is where you get a URL from one of these sources, and you want to handle it. I have a hard time imagining the sequence of events that would lead to mojibake. Naive parsing of a document in bytes couldn't do it, because if you have a non-ASCII-compatible document your ASCII-based parsing will also fail (e.g., looking for b'href=(.*?)'). I suppose if you did urlparse.urlsplit(user_input.encode(sys.getdefaultencoding())) you could end up with the problem.

All this is unrelated to the question, though -- a separate byte-oriented function won't help any case I can think of. If the programmer is implementing something like urlparse.urlsplit(user_input.encode(sys.getdefaultencoding())), it's because they *want* to get bytes out. So if it's named urlparse.urlsplit_bytes() they'll just use that, with the same corruption. Since bytes and text don't interact well, the choice of bytes in and bytes out will be a deliberate one.

*Or*, bytes will unintentionally come through, but that will just delay the error a while when the bytes out don't work (e.g., urlparse.urljoin(text_url, urlparse.urlsplit(byte_url).path)). Delaying the error is a little annoying, but a delayed error doesn't lead to mojibake. Mojibake is caused by allowing bytes and text to intermix, and the polymorphic functions as proposed don't add new dangers in that regard.

--
Ian Bicking | http://blog.ianbicking.org
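Ian's "delayed error" case can be reproduced with a short sketch. It assumes `urllib.parse` as shipped in Python 3.2 and later (the thread uses the Python 2 module name `urlparse`); both URLs are invented:

```python
from urllib.parse import urljoin, urlsplit

text_url = "http://example.com/base/"
byte_url = b"http://example.org/other/page"

# Bytes went in, so the path component comes back as bytes.
byte_path = urlsplit(byte_url).path

# The mistake surfaces later, at the point where text and bytes would be mixed.
error = None
try:
    urljoin(text_url, byte_path)
except TypeError as exc:
    error = exc
print(type(error).__name__)
```

The failure is a TypeError rather than a silently corrupted URL, which matches the argument that a delayed error is annoying but does not produce mojibake.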
Re: [Python-Dev] Supporting raw bytes data in urllib.parse.* (was Re: Polymorphic best practices)
On 21 September 2010 16:23, Barry Warsaw ba...@python.org wrote:
> On Sep 21, 2010, at 04:01 PM, Paul Moore wrote:
>> Sorry if this is off-topic, but I don't believe I ever saw Stephen's email. I have a feeling that's happened a couple of times recently. Before I go off trying to work out why gmail is dumping list mails on me, did anyone else see Stephen's mail via the list (as opposed to being a direct recipient)?
>
> I remember seeing a message from Stephen on this topic, but I didn't read it though and it's deleted now ;). I don't use gmail regularly.
>
> See also: http://wiki.list.org/x/2IA9

Ta. I'm seeing some other messages now, it may be just that Stephen's were getting delayed.

Paul
[Python-Dev] Backup plan: WSGI 1 Addenda and wsgiref update for Py3
While the Web-SIG is trying to hash out PEP 444, I thought it would be a good idea to have a backup plan that would allow the Python 3 stdlib to move forward, without needing a major new spec to settle out implementation questions.

After all, even if PEP 333 is ultimately replaced by PEP 444, it's probably a good idea to have *some* sort of WSGI 1-ish thing available on Python 3, with bytes/unicode and other matters settled.

In the past, I was waiting for some consensuses (consensi?) on Web-SIG about different approaches to Python 3, looking for some sort of definite, "yes, we all like this" response. However, I can see now that this just means it's my fault we don't have a spec yet. :-(

So, unless any last-minute showstopper rebuttals show up this week, I've decided to go ahead and officially bless nearly all of what Graham Dumpleton (who's not only the mod_wsgi author, but has put huge amounts of work into shepherding WSGI-on-Python3 proposals, WSGI amendments, etc.) has proposed, with a few minor exceptions.

In other words: almost none of the following is my own original work; it's like 90% Graham's. Any praise for this belongs to him; the only thing that belongs to me is the blame for not doing this sooner! (Sorry Graham. You asked me to do this ages ago, and you were right.)

Anyway, I'm posting this for comment to both Python-Dev and the Web-SIG. If you are commenting on the technical details of the amendments, please reply to the Web-SIG only. If you are commenting on the development agenda for wsgiref or other Python 3 library issues, please reply to Python-Dev only. That way, neither list will see off-topic discussions. Thanks!

The Plan

I plan to update the proposal below per comments and feedback during this week, then update PEP 333 itself over the weekend or early next week, followed by a code review of Python 3's wsgiref, and implementation of needed changes (such as recoding os.environ to latin1-captured bytes in the CGI handler).

To complete the changes, it is possible that I may need assistance from one or more developers who have more Python 3 experience. If after reading the proposed changes to the spec, you would like to volunteer to help with updating wsgiref to match, please let me know!

The Proposal

Overview

1. The primary purpose of this update is to provide a uniform porting pattern for moving Python 2 WSGI code to Python 3, meaning a pattern of changes that can be mechanically applied to as little code as practical, while still keeping the WSGI spec easy to programmatically validate (e.g. via ``wsgiref.validate``).

   The Python 3 specific changes are to use:

   * ``bytes`` for I/O streams in both directions
   * ``str`` for environ keys and values
   * ``bytes`` for arguments to start_response() and write()
   * text stream for wsgi.errors

   In other words, strings in, bytes out for headers, bytes for bodies.

   In general, only changes that don't break Python 2 WSGI implementations are allowed. The changes should also not break mod_wsgi on Python 3, but may make some Python 3 wsgi applications non-compliant, despite continuing to function on mod_wsgi.

   This is because mod_wsgi allows applications to output string headers and bodies, but I am ruling that option out because it forces every piece of middleware to have to be tested with arbitrary combinations of strings and bytes in order to test compliance. If you want your application to output strings rather than bytes, you can always use a decorator to do that. (And a sample one could be provided in wsgiref.)

2. The secondary purpose of the update is to address some long-standing open issues documented here:

   http://www.wsgi.org/wsgi/Amendments_1.0

   As with the Python 3 changes, only changes that don't retroactively invalidate existing implementations are allowed.

3. There is no tertiary purpose. ;-) (By which I mean, all other kinds of changes are out-of-scope for this update.)

4. The section below labeled "A Note On String Types" is proposed for verbatim addition to the "Specification Overview" section in the PEP; the other sections below describe changes to be made inline at the appropriate part of the spec, and changes that were proposed but are rejected for inclusion in this amendment.

A Note On String Types
----------------------

In general, HTTP deals with bytes, which means that this specification is mostly about handling bytes.

However, the content of those bytes often has some kind of textual interpretation, and in Python, strings are the most convenient way to handle text.

But in many Python versions and implementations, strings are Unicode, rather than bytes. This requires a careful balance between a usable API and correct translations between bytes and text in the context of HTTP... especially to support porting code between Python implementations with different ``str`` types.

WSGI therefore
Re: [Python-Dev] Supporting raw bytes data in urllib.parse.* (was Re: Polymorphic best practices)
On Tue, 2010-09-21 at 23:38 +1000, Nick Coghlan wrote:
> And if this turns out to be a disaster in practice:
> a) on my head be it; and
> b) we still have the option of the DeprecationWarning dance for bytes inputs to the existing functions and moving to a parallel API

In the case of urllib.parse, it's entirely safe. If someone beats you up over it later, you can tell them to bother straddlers in Web-SIG, as we're the folks who most want the polymorphism in that particular API.

- C
Re: [Python-Dev] [Web-SIG] Backup plan: WSGI 1 Addenda and wsgiref update for Py3
On Tue, 2010-09-21 at 12:09 -0400, P.J. Eby wrote:
> While the Web-SIG is trying to hash out PEP 444, I thought it would be a good idea to have a backup plan that would allow the Python 3 stdlib to move forward, without needing a major new spec to settle out implementation questions.

If a WSGI-1-compatible protocol seems more sensible to folks, I'm personally happy to defer discussion on PEP 444 or any other backwards-incompatible proposal.

- C
Re: [Python-Dev] Backup plan: WSGI 1 Addenda and wsgiref update for Py3
On Tue, 21 Sep 2010 12:09:44 -0400 P.J. Eby p...@telecommunity.com wrote:
> While the Web-SIG is trying to hash out PEP 444, I thought it would be a good idea to have a backup plan that would allow the Python 3 stdlib to move forward, without needing a major new spec to settle out implementation questions.

If this allows the Web situation in Python 3 to be improved faster and with less hassle then all the better.

There's something strange in your proposal: it mentions WSGI 2 at several places while there's no guarantee about what WSGI 2 will be (is there?).

Regards

Antoine.
Re: [Python-Dev] [Web-SIG] Backup plan: WSGI 1 Addenda and wsgiref update for Py3
On Tue, Sep 21, 2010 at 12:47 PM, Chris McDonough chr...@plope.com wrote:
> On Tue, 2010-09-21 at 12:09 -0400, P.J. Eby wrote:
>> While the Web-SIG is trying to hash out PEP 444, I thought it would be a good idea to have a backup plan that would allow the Python 3 stdlib to move forward, without needing a major new spec to settle out implementation questions.
>
> If a WSGI-1-compatible protocol seems more sensible to folks, I'm personally happy to defer discussion on PEP 444 or any other backwards-incompatible proposal.

I think both make sense, making WSGI 1 sensible for Python 3 (as well as other small errata like the size hint) doesn't detract from PEP 444 at all, IMHO.

--
Ian Bicking | http://blog.ianbicking.org
Re: [Python-Dev] [Web-SIG] Backup plan: WSGI 1 Addenda and wsgiref update for Py3
On Tue, Sep 21, 2010 at 12:09 PM, P.J. Eby p...@telecommunity.com wrote:
> The Python 3 specific changes are to use:
>
> * ``bytes`` for I/O streams in both directions
> * ``str`` for environ keys and values
> * ``bytes`` for arguments to start_response() and write()

This is the only thing that seems odd to me -- it seems like the response should be symmetric with the request, and the request in this case uses str for headers (status being header-like), and bytes for the body.

Otherwise this seems good to me, the only other major errata I can think of are all listed in the links you included.

> * text stream for wsgi.errors
>
> In other words, strings in, bytes out for headers, bytes for bodies.
>
> In general, only changes that don't break Python 2 WSGI implementations are allowed. The changes should also not break mod_wsgi on Python 3, but may make some Python 3 wsgi applications non-compliant, despite continuing to function on mod_wsgi.
>
> This is because mod_wsgi allows applications to output string headers and bodies, but I am ruling that option out because it forces every piece of middleware to have to be tested with arbitrary combinations of strings and bytes in order to test compliance. If you want your application to output strings rather than bytes, you can always use a decorator to do that. (And a sample one could be provided in wsgiref.)

I agree allowing both is not ideal.

--
Ian Bicking | http://blog.ianbicking.org
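The decorator mentioned in the quoted proposal might look roughly like the sketch below. It follows the rules as posted in this thread (bytes for the arguments to start_response() and write()), which is not necessarily what any final spec requires. The names `text_output` and `hello_app`, the latin-1 choice for headers and the UTF-8 default for bodies are all assumptions made for illustration; nothing like this is claimed to exist in wsgiref.

```python
def text_output(app, body_encoding="utf-8"):
    """Wrap a str-emitting WSGI app so the server only ever sees bytes.

    Hypothetical helper: headers are encoded as latin-1, body chunks with
    body_encoding. It does not forward close() on the wrapped iterable.
    """
    def wrapper(environ, start_response):
        def bytes_start_response(status, headers, exc_info=None):
            encoded_headers = [
                (name.encode("latin-1"), value.encode("latin-1"))
                for name, value in headers
            ]
            write = start_response(status.encode("latin-1"), encoded_headers, exc_info)
            # The legacy write() callable gets the same treatment as the body.
            return lambda text: write(text.encode(body_encoding))

        for chunk in app(environ, bytes_start_response):
            yield chunk.encode(body_encoding)

    return wrapper


def hello_app(environ, start_response):
    # An application written entirely in terms of str.
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return ["Hello, ", "wörld"]


bytes_app = text_output(hello_app)
```

A server or test harness would call `bytes_app(environ, start_response)` as usual and receive only bytes, so middleware never has to handle a mix of the two types.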
Re: [Python-Dev] [Web-SIG] Backup plan: WSGI 1 Addenda and wsgiref update for Py3
At 12:55 PM 9/21/2010 -0400, Ian Bicking wrote:
> On Tue, Sep 21, 2010 at 12:47 PM, Chris McDonough chr...@plope.com wrote:
>> On Tue, 2010-09-21 at 12:09 -0400, P.J. Eby wrote:
>>> While the Web-SIG is trying to hash out PEP 444, I thought it would be a good idea to have a backup plan that would allow the Python 3 stdlib to move forward, without needing a major new spec to settle out implementation questions.
>>
>> If a WSGI-1-compatible protocol seems more sensible to folks, I'm personally happy to defer discussion on PEP 444 or any other backwards-incompatible proposal.
>
> I think both make sense, making WSGI 1 sensible for Python 3 (as well as other small errata like the size hint) doesn't detract from PEP 444 at all, IMHO.

Yep. I agree. I do, however, want to get these amendments settled and make sure they get carried over to whatever spec is the successor to PEP 333.

I've had a lot of trouble following exactly what was changed in 444, and I'm a tad worried that several new ambiguities may be being introduced. So, solidifying 333 a bit might be helpful if it gives a good baseline against which to diff 444 (or whatever).
Re: [Python-Dev] Backup plan: WSGI 1 Addenda and wsgiref update for Py3
On Tue, Sep 21, 2010 at 11:09 AM, P.J. Eby p...@telecommunity.com wrote:
> After all, even if PEP 333 is ultimately replaced by PEP 444, it's probably a good idea to have *some* sort of WSGI 1-ish thing available on Python 3, with bytes/unicode and other matters settled.

Indeed. Though I generally like the direction that PEP 444 is going in, I know that writing specs is *HARD*. I think having something that works on Python 3 in time for the 3.2 release is a much bigger deal than having a WSGI2 (or whatever) done.

This is classic "the perfect is the enemy of the good" territory. Let's get the good done and *then* spend time working on the perfect.

Jacob
Re: [Python-Dev] Backup plan: WSGI 1 Addenda and wsgiref update for Py3
At 06:52 PM 9/21/2010 +0200, Antoine Pitrou wrote:
> On Tue, 21 Sep 2010 12:09:44 -0400 P.J. Eby p...@telecommunity.com wrote:
>> While the Web-SIG is trying to hash out PEP 444, I thought it would be a good idea to have a backup plan that would allow the Python 3 stdlib to move forward, without needing a major new spec to settle out implementation questions.
>
> If this allows the Web situation in Python 3 to be improved faster and with less hassle then all the better.
>
> There's something strange in your proposal: it mentions WSGI 2 at several places while there's no guarantee about what WSGI 2 will be (is there?).

Sorry - "WSGI 2" should be read as shorthand for "whatever new spec succeeds PEP 333", whether that's PEP 444 or something else.

It just means that any new spec that doesn't have to be backward-compatible can (and should) more thoroughly address the issue in question.
Re: [Python-Dev] Supporting raw bytes data in urllib.parse.* (was Re: Polymorphic best practices)
On Wed, Sep 22, 2010 at 1:10 AM, Stephen J. Turnbull step...@xemacs.org wrote:
> Nick Coghlan writes:
>> (Basically, while the issue of programmers assuming 'latin-1' or 'utf-8' or similar ASCII friendly encodings when they shouldn't is real, I don't believe a polymorphic API here will make things any *worse* than what would happen with a parallel API)
>
> That depends on how far the polymorphic API goes. As long as the polymorphic API *never ever* does anything that involves decoding wire format (and I include URL-quoting here), the programmer will have to explicitly do some decoding to get into much trouble, and at that point it's really their problem; you can't stop them.
>
> But I don't know whether the web apps programmers will be satisfied with such a minimal API. If not, you're going to have to make some delicate judgments about what to provide and what not, and whether/how to provide a safety net of some kind. I don't envy you that task.

As Chris pointed out, Issue 3300 means that particular boat has already sailed where quote/unquote are concerned. Those are the only APIs which ever need to do any encoding or decoding, as they deal with percent-encoding of Unicode characters.

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
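To make the point about quote/unquote concrete, here is a brief sketch of how those functions treat text and bytes in Python 3's `urllib.parse`; the sample path is invented:

```python
from urllib.parse import quote, unquote, unquote_to_bytes

# str in: encoded (UTF-8 by default), then percent-escaped; the result is str.
quoted_text = quote("/caf\u00e9")

# bytes in: the bytes are escaped as given; the result is still str.
quoted_bytes = quote(b"/caf\xe9")

# Unquoting to str decodes the escapes (UTF-8 by default) ...
unquoted_text = unquote("/caf%C3%A9")

# ... while unquote_to_bytes returns the raw bytes with no decoding at all.
unquoted_raw = unquote_to_bytes("/caf%E9")

print(quoted_text, quoted_bytes, unquoted_text, unquoted_raw)
```

Unlike the "bytes in, bytes out" rule discussed for the parsing functions, `quote()` returns str for either input type, so an encoding decision is built into these calls.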
Re: [Python-Dev] Supporting raw bytes data in urllib.parse.* (was Re: Polymorphic best practices)
On Wed, Sep 22, 2010 at 1:57 AM, Ian Bicking i...@colorstudy.com wrote: All this is unrelated to the question, though -- a separate byte-oriented function won't help any case I can think of. If the programmer is implementing something like urlparse.urlsplit(user_input.encode(sys.getdefaultencoding())), it's because they *want* to get bytes out. So if it's named urlparse.urlsplit_bytes() they'll just use that, with the same corruption. Since bytes and text don't interact well, the choice of bytes in and bytes out will be a deliberate one. *Or*, bytes will unintentionally come through, but that will just delay the error a while when the bytes out don't work (e.g., urlparse.urljoin(text_url, urlparse.urlsplit(byte_url).path)). Delaying the error is a little annoying, but a delayed error doesn't lead to mojibake. Indeed, this line of thinking is what brought me back around to the polymorphic point of view. Cheers, Nick. -- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
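The "delayed error" case can be sketched as follows. This assumes the polymorphic `urllib.parse` that later shipped in Python 3.2 and newer (the Python 3 home of the `urlparse` functions named in the message), not anything that had been settled at the time of this thread. Both URLs are hypothetical.

```python
from urllib.parse import urljoin, urlsplit

text_url = "http://example.com/dir/index.html"  # hypothetical str URL
byte_url = b"http://example.org/other/page"     # hypothetical bytes URL

# Bytes in, bytes out: the parsed component keeps the input type.
byte_path = urlsplit(byte_url).path

# Mixing the bytes component with a str URL fails loudly instead of
# silently producing mojibake.
try:
    urljoin(text_url, byte_path)
    outcome = "joined"
except TypeError:
    outcome = "TypeError"

# Decoding explicitly first keeps everything in text and works as expected.
text_path = urlsplit(byte_url.decode("ascii")).path
joined = urljoin(text_url, text_path)
```

The error arrives at the join rather than at the split, which is the delay being described, but no corrupted text is ever produced.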
Re: [Python-Dev] Supporting raw bytes data in urllib.parse.* (was Re: Polymorphic best practices)
Ian Bicking: I think the use case everyone has in mind here is where you get a URL from one of these sources, and you want to handle it. I have a hard time imagining the sequence of events that would lead to mojibake. Naive parsing of a document in bytes couldn't do it, because if you have a non-ASCII-compatible document your ASCII-based parsing will also fail (e.g., looking for b'href=(.*?)'). It depends on what the particular ASCII-based parsing is doing. For example, the set of trail bytes in Shift-JIS includes the same bytes as some of the punctuation characters in ASCII as well as all the letters. A search or split on '@' or '|' may find the trail byte in a two-byte character rather than a true occurrence of that character so the operation 'succeeds' but produces an incorrect result. Over time, the set of trail bytes used has expanded - in GB18030 digits are possible although many of the most important characters for parsing such as ''' #%.?/''' are still safe as they may not be trail bytes in the common double-byte character sets. Neil
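A small sketch of the Shift-JIS hazard described above, using Python's built-in `shift_jis` codec. The sample string is an assumption picked because its first katakana character encodes to a two-byte sequence whose trail byte is 0x7C, the same byte as ASCII '|'.

```python
# Illustrative sample: three katakana characters, a real '|' separator, then "ok".
text = "ポスト|ok"
raw = text.encode("shift_jis")

# Splitting the encoded bytes on b"|" also cuts through the first character,
# because its trail byte is 0x7C.
naive_fields = raw.split(b"|")

# Splitting the decoded text only sees the one real separator.
safe_fields = text.split("|")

# How many 0x7C bytes the encoded form contains (one real, one trail byte).
pipe_bytes = raw.count(b"|")
```

The byte-level split does not raise anything; it simply returns one field too many, with a dangling lead byte as the first piece.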
Re: [Python-Dev] Supporting raw bytes data in urllib.parse.* (was Re: Polymorphic best practices)
On the other hand, it is dangerous to provide a polymorphic API which does that more extensive parsing, because a less than paranoid programmer will have very likely allowed the parsed components to escape from the context where their encodings can be reliably determined. Remember, *it is unlikely that they will ever be punished for their own lack of caution.* The person who is doomed is somebody who tries to take that code and reuse it in a different context. Yeah, that's the original reasoning that had me leaning towards the parallel API approach. If I seem to be changing my mind a lot in this thread it's because I'm genuinely torn between the desire to make it easier to port existing 2.x code to 3.x by making the current API polymorphic and the fear that doing so will reintroduce some of the exact same bytes/text confusion that the bytes/str split is trying to get rid of. I don't think polymorphic APIs do anyone any favours in the long run. My experience of the Py2 email API was that it would give the developer false comfort, only to blow up when the app was in the hands of users, and it didn't seem to matter how careful I was. Py3 has gone the pure/strict route in the core, and I think libs should be consistent with that choice. Developers will have to work a little harder, but there will be fewer surprises. -- Andrew McNamara, Senior Developer, Object Craft http://www.object-craft.com.au/
[Python-Dev] Goodbye
I'm rather sad to have been sacked, but such is life. I won't be doing any more work on the bug tracker for obvious reasons, but hope that you who have managed to keep your voluntary jobs manage to keep Python going. Kindest regards. Mark Lawrence.
Re: [Python-Dev] Goodbye
On Tue, Sep 21, 2010 at 7:58 PM, Mark Lawrence breamore...@yahoo.co.uk wrote: I'm rather sad to have been sacked, but such is life. I won't be doing any more work on the bug tracker for obvious reasons, but hope that you who have managed to keep your voluntary jobs manage to keep Python going. Umm, what? You mean http://bugs.python.org/issue2180 ? Mark, please stop closing these based on age. There needs to be a determination whether this is a valid bug. If so, then a patch is needed. If not, it can be closed. Am I missing something? -Jack
Re: [Python-Dev] Supporting raw bytes data in urllib.parse.* (was Re: Polymorphic best practices)
Neil Hodgson writes: Over time, the set of trail bytes used has expanded - in GB18030 digits are possible although many of the most important characters for parsing such as ''' #%.?/''' are still safe as they may not be trail bytes in the common double-byte character sets. That's just not true. Many double-byte character sets in use are based on ISO-2022, which allows the whole GL repertoire to be used. Perhaps you're thinking about variable-width encodings like Shift JIS and Big5, where I believe that restriction on trailing bytes for double-byte characters holds. However, 7-bit encodings with control sequences remain common in several contexts, at least in Japan and Korea. In particular, I can't say how frequent it is, especially nowadays, but I have seen ISO-2022-JP in URLs on the wire. What really saves the day here is not that common encodings just don't do that. It's that even in the case where only syntactically significant bytes in the representation are URL-encoded, they *are* URL-encoded. As long as the parsing library restricts itself to treating only wire-format input, you're OK.[1] But once you start doing things that involve decoding URL-encoding, you can run into trouble. Footnotes: [1] With conforming input. I assume that the libraries know how to defend themselves from non-conforming input, which could be any kind of bug or attack, not just mojibake.
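To make the ISO-2022-JP point concrete, here is a rough sketch using Python's built-in `iso2022_jp` codec and the Python 3 `urllib.parse` functions. The two hiragana characters and the example.com URL are assumptions chosen for illustration: their 7-bit encoded forms happen to contain the bytes for '/' and '?'.

```python
from urllib.parse import quote, unquote_to_bytes, urlsplit

text = "くた"  # illustrative sample: two hiragana characters
raw = text.encode("iso2022_jp")

# In the raw 7-bit form, URL delimiters show up inside the encoded characters.
has_slash = b"/" in raw
has_question_mark = b"?" in raw

# On the wire those bytes are percent-encoded, so they cannot be mistaken
# for syntax. safe="" is needed because quote() leaves "/" alone by default.
quoted = quote(raw, safe="")
url = "http://example.com/" + quoted + "?q=1"  # hypothetical URL

# Parsing the wire format splits only on the real delimiters.
parts = urlsplit(url)

# Trouble can only start here, once the URL-encoding is undone and the
# result has to be decoded with the right codec.
recovered = unquote_to_bytes(parts.path[1:]).decode("iso2022_jp")
```

Swapping the last two steps, that is, unquoting before splitting, would reintroduce the stray '/' and '?' bytes into the string being parsed.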