Re: [Python-Dev] [Python-checkins] cpython: Close #17828: better handling of codec errors

2013-11-14 Thread Stefan Behnel
Nick Coghlan, 13.11.2013 17:25: > Note that the specific problem with just annotating the exception > rather than a specific frame is that you lose the stack context for > where the annotation occurred. The current chaining workaround doesn't > just change the exception message, it also breaks the

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-14 Thread Nick Coghlan
On 15 November 2013 11:10, Terry Reedy wrote: > On 11/14/2013 5:32 PM, Victor Stinner wrote: > >> I don't like the functions codecs.encode() and codecs.decode() because >> the type of the result depends on the encoding (second parameter). We >> try to avoid this in Python. > > > Such dependence is

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-14 Thread Walter Dörwald
On 15.11.2013 at 00:42, Serhiy Storchaka wrote: > > 15.11.13 00:32, Victor Stinner wrote: >> And add transform() and untransform() methods to bytes and str types. >> In practice, it might be the same codecs registry for all codecs just with >> a new attribute. > > If the transform() method wi

Re: [Python-Dev] unicode Exception messages in py2.7

2013-11-14 Thread Steven D'Aprano
On Fri, Nov 15, 2013 at 02:28:48PM +1100, Cameron Simpson wrote: > > Non-ascii Unicode strings are just a special case of the more general > > problem of what to do if printing the exception raises. If > > str(exception.message) raises, suppressing the message seems like a > > perfectly reasona

Re: [Python-Dev] unicode Exception messages in py2.7

2013-11-14 Thread Cameron Simpson
On 15Nov2013 14:08, Steven D'Aprano wrote: > On Thu, Nov 14, 2013 at 04:02:17PM -0800, Chris Barker wrote: > > right -- any bugfix changes behaviour > > It isn't clear that this is a bug at all. > > Non-ascii Unicode strings are just a special case of the more general > problem of what to do if

Re: [Python-Dev] unicode Exception messages in py2.7

2013-11-14 Thread Cameron Simpson
On 14Nov2013 15:57, Chris Barker - NOAA Federal wrote: > (amazing to me how many people are still using <=2.7, actually, even > for new projects .. thank you Red Hat "Enterprise" Linux ;-) ) Well, one of the things RHEL gets you is platform stability (they backport fixes; primarily security in th

Re: [Python-Dev] unicode Exception messages in py2.7

2013-11-14 Thread Terry Reedy
On 11/14/2013 7:41 PM, Chris Barker wrote: On Thu, Nov 14, 2013 at 3:58 PM, Steven D'Aprano wrote: It's not a given that the current behaviour *is* a bug. I'll concede that it's not a bug unless someone said somewhere that unicode messages should work In particular, what does the reference

Re: [Python-Dev] unicode Exception messages in py2.7

2013-11-14 Thread Steven D'Aprano
On Thu, Nov 14, 2013 at 04:02:17PM -0800, Chris Barker wrote: > On Thu, Nov 14, 2013 at 1:55 PM, Tres Seaver wrote: > > > Fixing any bug is "changing behavior"; 2.7 is not frozen for bugfixes. > > Thank you. > > > The real question is whether third-party code will break when the > > now-empty

Re: [Python-Dev] unicode Exception messages in py2.7

2013-11-14 Thread Steven D'Aprano
On Thu, Nov 14, 2013 at 09:09:06PM -0500, Terry Reedy wrote: > 1.5 was around for a long time; not sure if it is completely gone yet. It's not. I forget the details, but after the last American PyCon, somebody posted a message about a fellow they met who was still using 1.5 in production.

Re: [Python-Dev] unicode Exception messages in py2.7

2013-11-14 Thread Terry Reedy
On 11/14/2013 6:57 PM, Chris Barker wrote: On Thu, Nov 14, 2013 at 1:20 PM, Victor Stinner wrote: Seriously, *all* these tricky bugs are fixed in Python 3. So don't lose time on trying to work around them, but invest in the future: upgrade to Python 3! Maybe so -- but we are either maintaining 2.7

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-14 Thread Terry Reedy
On 11/14/2013 6:03 PM, Nick Coghlan wrote: You have to get it out of your head that codecs are just about text and binary data. 99+% of the current codec module doc leads one to that impression. The fact that codecs are expected to have a file reader and writer and that the default 'stri

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-14 Thread Terry Reedy
On 11/14/2013 5:32 PM, Victor Stinner wrote: I don't like the functions codecs.encode() and codecs.decode() because the type of the result depends on the encoding (second parameter). We try to avoid this in Python. Such dependence is common with arithmetic. >>> 1 + 2 3 >>> 1 + 2.0 3.0 >>> 1 +

Re: [Python-Dev] unicode Exception messages in py2.7

2013-11-14 Thread Chris Barker
On Thu, Nov 14, 2013 at 3:58 PM, Steven D'Aprano wrote: > It's not a given that the current behaviour *is* a bug. I'll concede that it's not a bug unless someone said somewhere that unicode messages should work .. but that's kind of a semantic argument. I have to say it's a very odd choice to m

Re: [Python-Dev] unicode Exception messages in py2.7

2013-11-14 Thread Chris Barker
On Thu, Nov 14, 2013 at 1:20 PM, Victor Stinner wrote: >> If you create an Exception with a unicode object for the message, (...) > > In Python 2, there are too many similar corner cases. It is impossible > to fix these bugs without taking the risk of introducing a regression. Yes, there are -- the auto

Re: [Python-Dev] unicode Exception messages in py2.7

2013-11-14 Thread Chris Barker
On Thu, Nov 14, 2013 at 1:55 PM, Tres Seaver wrote: > Fixing any bug is "changing behavior"; 2.7 is not frozen for bugfixes. Thank you. > The real question is whether third-party code will break when the > now-empty error messages appear with '?' littered through them? right -- any bugfix cha

Re: [Python-Dev] unicode Exception messages in py2.7

2013-11-14 Thread Steven D'Aprano
On Thu, Nov 14, 2013 at 04:55:19PM -0500, Tres Seaver wrote: > Fixing any bug is "changing behavior"; 2.7 is not frozen for bugfixes. It's not a given that the current behaviour *is* a bug. Exception messages in 2 are byte-strings, not Unicode. Trying to use Unicode instead is not, as far as I

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-14 Thread Serhiy Storchaka
15.11.13 00:32, Victor Stinner wrote: And add transform() and untransform() methods to bytes and str types. In practice, it might be the same codecs registry for all codecs just with a new attribute. If the transform() method is added, I prefer to have only one transformation method and

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-14 Thread Nick Coghlan
On 15 Nov 2013 09:11, "Nick Coghlan" wrote: > > > On 15 Nov 2013 08:42, "Victor Stinner" wrote: > > > > Oh, I forgot to mention that I sent this email in reaction to this issue: > > > > http://bugs.python.org/issue19585 > > > > Modifying the critical PyFrameObject because the codecs API raises >

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-14 Thread Serhiy Storchaka
15.11.13 01:03, Nick Coghlan wrote: We already do this check in the existing convenience methods - it raises TypeError. The problem with this check is that it happens *after* encoding/decoding. This opens the door to DoS (see my last message).

Re: [Python-Dev] unicode Exception messages in py2.7

2013-11-14 Thread Ethan Furman
On 11/14/2013 02:59 PM, Terry Reedy wrote: On 11/14/2013 4:55 PM, Tres Seaver wrote: About the only things I can think of which might break would be doctests, but people *expect* those to break across third-dot releases of Python (one reason why I hate them). My impression is that we avoid en

[Python-Dev] "*zip-bomb" via codecs

2013-11-14 Thread Serhiy Storchaka
It is possible to mount a DDoS using the fact that the codecs registry provides access to the gzip and bzip2 decompressors. Someone can send an HTTP request or email message with "gzip_codec" or "bzip2_codec" specified as the content encoding and a large, well-compressed gzip or bzip2 file as the content. Naive server
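The risk described in this message comes from decompressing untrusted input with no ceiling on the output size. A minimal sketch of one mitigation follows, assuming the standard library's zlib module stands in for the gzip and bzip2 codecs named in the message. The helper name bounded_decompress and the 1 MiB cap are hypothetical and are not taken from the thread.

```python
import zlib

# Hypothetical cap on decompressed size; pick a value that suits the application.
MAX_OUTPUT = 1024 * 1024


def bounded_decompress(data, limit=MAX_OUTPUT):
    """Decompress zlib data, refusing to produce more than `limit` bytes."""
    decompressor = zlib.decompressobj()
    # Ask for at most limit + 1 bytes so that an oversized payload is
    # detected without ever materialising the full output in memory.
    output = decompressor.decompress(data, limit + 1)
    if len(output) > limit or decompressor.unconsumed_tail:
        raise ValueError("decompressed data exceeds %d bytes" % limit)
    return output


# A small payload passes through unchanged.
small = zlib.compress(b"hello")
print(bounded_decompress(small))

# Ten megabytes of zeros compress to a few kilobytes, but are rejected here.
bomb = zlib.compress(b"\0" * (10 * 1024 * 1024))
try:
    bounded_decompress(bomb, limit=1024)
except ValueError as exc:
    print("rejected:", exc)
```

The point of the sketch is that the size check happens during decompression rather than after it, so the oversized result is never built.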

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-14 Thread Nick Coghlan
On 15 Nov 2013 08:42, "Victor Stinner" wrote: > > Oh, I forgot to mention that I sent this email in reaction to this issue: > > http://bugs.python.org/issue19585 > > Modifying the critical PyFrameObject because the codecs API raises > surprising errors doesn't sound correct. I prefer to fix how co

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-14 Thread Nick Coghlan
On 15 Nov 2013 08:34, "Victor Stinner" wrote: > > Hi, > > I saw that Nick Coghlan documented codecs.encode() and > codecs.decode(), and changed the exception raised when codecs like > rot_13 are used on bytes.decode() and str.encode(). > > I don't like the functions codecs.encode() and codecs.deco

Re: [Python-Dev] [Python-checkins] cpython: Close #17828: better handling of codec errors

2013-11-14 Thread Greg Ewing
Walter Dörwald wrote: Unfortunately the frame from the decorator shows up in the traceback. Maybe the decorator could remove its own frame from the traceback? -- Greg

Re: [Python-Dev] unicode Exception messages in py2.7

2013-11-14 Thread Terry Reedy
On 11/14/2013 4:55 PM, Tres Seaver wrote: About the only things I can think of which might break would be doctests, but people *expect* those to break across third-dot releases of Python (one reason why I hate them). My impression is that we avoid enhancing correct exception messages in bugfi

Re: [Python-Dev] peps: PEP 456: add some of the new implementation details to the PEP's text

2013-11-14 Thread Terry Reedy
On 11/14/2013 4:00 AM, Antoine Pitrou wrote: On Wed, 13 Nov 2013 23:33:02 +0100 (CET) christian.heimes wrote: +Small string optimization += + +Hash functions like SipHash24 have a costly initialization and finalization +code that can dominate speed of the algorithm for

Re: [Python-Dev] Add transform() and untranform() methods

2013-11-14 Thread Victor Stinner
Oh, I forgot to mention that I sent this email in reaction to this issue: http://bugs.python.org/issue19585 Modifying the critical PyFrameObject because the codecs API raises surprising errors doesn't sound correct. I prefer to fix how codecs are used rather than modify the PyFrameObject. For more

[Python-Dev] Add transform() and untranform() methods

2013-11-14 Thread Victor Stinner
Hi, I saw that Nick Coghlan documented codecs.encode() and codecs.decode(), and changed the exception raised when codecs like rot_13 are used on bytes.decode() and str.encode(). I don't like the functions codecs.encode() and codecs.decode() because the type of the result depends on the encoding (
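The objection in this message is that the return type of codecs.encode() is determined by the codec name rather than by the function itself. A small sketch of that behaviour follows, assuming Python 3.4 or later, where the rot_13 and hex_codec transforms are available through the codecs module and str.encode() rejects codecs that are not text encodings. The variable names are illustrative only.

```python
import codecs

# A text encoding: str in, bytes out.
utf8_result = codecs.encode("abc", "utf-8")

# A text-to-text transform: str in, str out.
rot13_result = codecs.encode("abc", "rot_13")

# A binary-to-binary transform: bytes in, bytes out.
hex_result = codecs.encode(b"abc", "hex_codec")

for label, value in (
    ("utf-8", utf8_result),
    ("rot_13", rot13_result),
    ("hex_codec", hex_result),
):
    print(label, type(value).__name__, value)

# The str.encode() convenience method only accepts text encodings, so the
# same codec name is refused there (assumption: Python 3.4+ behaviour).
try:
    "abc".encode("rot_13")
    str_encode_rejected = False
except LookupError:
    str_encode_rejected = True
print("str.encode('rot_13') rejected:", str_encode_rejected)
```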

Re: [Python-Dev] unicode Exception messages in py2.7

2013-11-14 Thread Tres Seaver
On 11/14/2013 04:02 PM, Benjamin Peterson wrote: > 2013/11/14 Chris Barker : >> So a proposal: >> >> Use 'replace' mode for the encoding to the default, and at least >> the user would see SOMETHING of the message. In a common case, it >> would be a lo

Re: [Python-Dev] The pysandbox project is broken

2013-11-14 Thread Armin Rigo
Hi Victor, On Wed, Nov 13, 2013 at 12:58 AM, Victor Stinner wrote: > I now gave up on sandboxing Python. I just would like to warn other > core developers that trying to put a sandbox in Python is not a good > idea :-) I cannot thank you enough for writing this mail :-) It is a great place to p

Re: [Python-Dev] unicode Exception messages in py2.7

2013-11-14 Thread Victor Stinner
2013/11/14 Chris Barker : > (note this is about 2.7 -- sorry, but a lot of us still use that! I > can only assume that in 3.* this is a non-issue) > > I just discovered an issue that's been around a long time: > > If you create an Exception with a unicode object for the message, (...) In Python 2,

Re: [Python-Dev] unicode Exception messages in py2.7

2013-11-14 Thread Benjamin Peterson
2013/11/14 Chris Barker : > So a proposal: > > Use 'replace' mode for the encoding to the default, and at least the > user would see SOMETHING of the message. In a common case, it would be > a lot of ascii, and in the worst case it would be a lot of question > marks -- still better than a totally b
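The 'replace' error handler in the proposal can be illustrated in isolation. The sketch below is written for Python 3 and shows only the encoding step, not the Python 2.7 exception machinery under discussion. The sample message is made up for the example.

```python
# Sample message containing two non-ASCII characters (i-diaeresis, e-acute).
message = "na\u00efve caf\u00e9"

# Strict encoding, the default error mode, fails on the first non-ASCII character.
try:
    message.encode("ascii")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# 'replace' mode substitutes a question mark for each unencodable character,
# so the ASCII portion of the message survives.
replaced = message.encode("ascii", "replace")

print(strict_failed, replaced)
```

With 'replace', a reader still sees most of a mostly-ASCII message, which is the trade-off the proposal argues for.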

Re: [Python-Dev] The pysandbox project is broken

2013-11-14 Thread Eli Bendersky
On Wed, Nov 13, 2013 at 10:27 AM, Brett Cannon wrote: > > > > On Wed, Nov 13, 2013 at 1:05 PM, Eli Bendersky wrote: > >> >> >> >> On Wed, Nov 13, 2013 at 6:58 AM, Brett Cannon wrote: >> >>> >>> >>> >>> On Wed, Nov 13, 2013 at 6:30 AM, Facundo Batista < >>> facundobati...@gmail.com> wrote: >>> >

[Python-Dev] unicode Exception messages in py2.7

2013-11-14 Thread Chris Barker
Folks, (note this is about 2.7 -- sorry, but a lot of us still use that! I can only assume that in 3.* this is a non-issue) I just discovered an issue that's been around a long time: If you create an Exception with a unicode object for the message, the message can be silently ignored if it can n

Re: [Python-Dev] [Python-checkins] cpython: Close #17828: better handling of codec errors

2013-11-14 Thread Walter Dörwald
On 14.11.13 14:22, Walter Dörwald wrote: On 13.11.13 17:25, Nick Coghlan wrote: >> [...] A more elegant (and comprehensive) solution as a PEP for 3.5 would certainly be a nice thing to have, but I think this is still much better than the 3.3 status quo. Thinking further about this, I like y

Re: [Python-Dev] [Python-checkins] Daily reference leaks (784a02ec2a26): sum=522

2013-11-14 Thread Benjamin Peterson
2013/11/14 Antoine Pitrou : > On Thu, 14 Nov 2013 22:01:32 +1000 > Nick Coghlan wrote: >> On 14 Nov 2013 21:58, "Nick Coghlan" wrote: >> > >> > >> > On 14 Nov 2013 13:52, wrote: >> > > >> > > results for 784a02ec2a26 on branch "default" >> > > >> > >

Re: [Python-Dev] [Python-checkins] cpython: Close #17828: better handling of codec errors

2013-11-14 Thread Walter Dörwald
On 13.11.13 17:25, Nick Coghlan wrote: On 14 November 2013 02:12, Nick Coghlan wrote: On 14 November 2013 00:30, Walter Dörwald wrote: On 13.11.13 14:51, nick.coghlan wrote: http://hg.python.org/cpython/rev/854a2cea31b9 changeset: 87084:854a2cea31b9 user:Nick Coghlan date:

Re: [Python-Dev] [Python-checkins] Daily reference leaks (784a02ec2a26): sum=522

2013-11-14 Thread Nick Coghlan
On 14 Nov 2013 21:58, "Nick Coghlan" wrote: > > > On 14 Nov 2013 13:52, wrote: > > > > results for 784a02ec2a26 on branch "default" > > > > > > test_codeccallbacks leaked [40, 40, 40] references, sum=120 > > test_codeccallbacks leaked [40, 40, 40] memo

Re: [Python-Dev] [Python-checkins] Daily reference leaks (784a02ec2a26): sum=522

2013-11-14 Thread Nick Coghlan
On 14 Nov 2013 13:52, wrote: > > results for 784a02ec2a26 on branch "default" > > > test_codeccallbacks leaked [40, 40, 40] references, sum=120 > test_codeccallbacks leaked [40, 40, 40] memory blocks, sum=120 > test_codecs leaked [38, 38, 38] references

Re: [Python-Dev] peps: PEP 456: add some of the new implementation details to the PEP's text

2013-11-14 Thread Antoine Pitrou
On Wed, 13 Nov 2013 23:33:02 +0100 (CET) christian.heimes wrote: > > > +Small string optimization > += > + > +Hash functions like SipHash24 have a costly initialization and finalization > +code that can dominate speed of the algorithm for very short strings. On the > +o