Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Antoine Pitrou
On Fri, 10 Jan 2014 11:32:05 +1000 Nick Coghlan ncogh...@gmail.com wrote: It's consistent with bytearray.join's behaviour: >>> x = bytearray() >>> x.join([b"abc"]) bytearray(b'abc') >>> x bytearray(b'') Yeah, I guess I'm OK with us being consistent on that one. It's still weird, but also

Re: [Python-Dev] Python3 complexity

2014-01-10 Thread M.-A. Lemburg
On 09.01.2014 22:45, Antoine Pitrou wrote: On Thu, 9 Jan 2014 13:36:05 -0800 Chris Barker chris.bar...@noaa.gov wrote: Some folks have suggested using latin-1 (or other 8-bit encoding) -- is that guaranteed to work with any binary data, and round-trip accurately? Yes, it is. Just a word
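
The round-trip claim in the snippet above can be checked directly. This is a minimal sketch, assuming Python 3; the variable names are illustrative and not taken from the thread.

```python
# latin-1 maps every byte value 0..255 to the code point with the same number,
# so decoding can never fail and encoding recovers the original bytes.
all_bytes = bytes(range(256))

as_text = all_bytes.decode("latin-1")        # str of 256 code points
round_tripped = as_text.encode("latin-1")    # back to the original bytes

print(round_tripped == all_bytes)
```

That the round trip is lossless says nothing about whether the resulting str is meaningful text, which is the caveat raised in the replies.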

Re: [Python-Dev] Python3 complexity

2014-01-10 Thread Paul Moore
On 10 January 2014 12:19, M.-A. Lemburg m...@egenix.com wrote: Just a word of caution: Using the 'latin-1' to mean unknown encoding can easily result in Mojibake (unreadable text) entering your application with dangerous effects on your other text data. Agreed. The latin-1 suggestion is
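
A small illustration of the Mojibake warning quoted above, assuming Python 3 and assuming the incoming data is really UTF-8. The sample strings are made up for the example.

```python
# Bytes that are really UTF-8, decoded as latin-1 because the encoding is "unknown".
utf8_bytes = "caf\u00e9".encode("utf-8")     # b'caf\xc3\xa9'
misread = utf8_bytes.decode("latin-1")       # 'cafÃ©': no error, wrong text

# The damage only shows up later, for example when comparing with correct text.
correct = "caf\u00e9"
print(misread == correct)                    # False

# The original bytes are still recoverable as long as nothing re-encodes them.
recovered = misread.encode("latin-1").decode("utf-8")
```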

Re: [Python-Dev] Python3 complexity

2014-01-10 Thread Matěj Cepl
On 2014-01-10, 12:19 GMT, you wrote: Using the 'latin-1' to mean unknown encoding can easily result in Mojibake (unreadable text) entering your application with dangerous effects on your other text data. E.g. Marc-André read using 'latin-1' if

Re: [Python-Dev] [Python-checkins] peps: PEP 460: add .format_map()

2014-01-10 Thread Nick Coghlan
On 10 January 2014 07:41, Eric V. Smith e...@trueblade.com wrote: I'm not sure how format_map helps in porting from 2 to 3, since it doesn't exist in any version of 2. Although that said, it's no doubt a useful feature, just not useful in code that supports both 2 and 3 with a single code

Re: [Python-Dev] Python3 complexity

2014-01-10 Thread Nick Coghlan
On 10 January 2014 13:32, Lennart Regebro rege...@gmail.com wrote: On Thu, Jan 9, 2014 at 10:06 AM, Kristján Valur Jónsson krist...@ccpgames.com wrote: Do I speak Chinese to my grocer because China is a growing force in the world? Or start every discussion with my children with a negotiation

Re: [Python-Dev] [Python-checkins] peps: PEP 460: add .format_map()

2014-01-10 Thread Eric V. Smith
On 1/10/2014 10:20 AM, Nick Coghlan wrote: On 10 January 2014 07:41, Eric V. Smith e...@trueblade.com wrote: I'm not sure how format_map helps in porting from 2 to 3, since it doesn't exist in any version of 2. Although that said, it's no doubt a useful feature, just not useful in code that

Re: [Python-Dev] Python3 complexity

2014-01-10 Thread Stefan Krah
Nick Coghlan ncogh...@gmail.com wrote: One idea we're considering for Python 3.5 is to have a report of ascii on a POSIX OS imply the surrogateescape error handler (at least for the standard streams, and perhaps in other contexts), since the OS reporting the POSIX/C locale almost certainly
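
For readers unfamiliar with the surrogateescape error handler mentioned above, here is a minimal sketch of what it does, assuming Python 3. The sample bytes are hypothetical and stand in for something like a filename read under the POSIX/C locale.

```python
# A byte string that is not valid ASCII.
raw = b"caf\xe9.txt"

# With surrogateescape, each undecodable byte 0x80..0xFF becomes a lone
# surrogate U+DC80..U+DCFF instead of raising UnicodeDecodeError.
text = raw.decode("ascii", errors="surrogateescape")

# Encoding with the same handler turns the surrogates back into the original bytes.
back = text.encode("ascii", errors="surrogateescape")

print(back == raw)
```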

Re: [Python-Dev] Python3 complexity

2014-01-10 Thread INADA Naoki
I now feel it is a bad thing to encourage using unicode for binary data with the latin-1 encoding or the surrogateescape error handler. Handling binary data in the str type using latin-1 is just a hack. Surrogateescape is just a workaround to keep undecodable bytes in text. Encouraging binary data in str type

Re: [Python-Dev] Python3 complexity

2014-01-10 Thread Baptiste Carvello
On 10/01/2014 16:35, Nick Coghlan wrote: One idea we're considering for Python 3.5 is to have a report of ascii on a POSIX OS imply the surrogateescape error handler (at least for the standard streams, and perhaps in other contexts), since the OS reporting the POSIX/C locale almost

[Python-Dev] Summary of Python tracker Issues

2014-01-10 Thread Python tracker
ACTIVITY SUMMARY (2014-01-03 - 2014-01-10) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. Issues counts and deltas: open 4409 (+61) closed 27580 (+42) total 31989 (+103) Open issues

[Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Juraj Sukop
(Sorry if this messes-up the thread order, it is meant as a reply to the original RFC.) Dear list, newbie here. After much hesitation I decided to put forward a use case which bothers me about the current proposal. Disclaimer: I happen to write a library which is directly influenced by this. As

Re: [Python-Dev] Python3 complexity

2014-01-10 Thread Stefan Ring
On Fri, Jan 10, 2014 at 4:35 PM, Nick Coghlan ncogh...@gmail.com wrote: On 10 January 2014 13:32, Lennart Regebro rege...@gmail.com wrote: No, because your environment has a default language. And Python has a default encoding. You only get problems when some file doesn't use the default

Re: [Python-Dev] Python3 complexity

2014-01-10 Thread Serhiy Storchaka
On 10.01.14 14:19, M.-A. Lemburg wrote: BTW: Perhaps it would be a good idea to backport the surrogateescape error handler to Python 2.7 to simplify writing code which works in both Python 2 and 3. You also should change the UTF-8 codec so that it will reject surrogates (i.e.
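
The snippet above is cut off, so the following is only a sketch of the Python 3 behaviour that the remark about rejecting surrogates appears to refer to: the strict UTF-8 codec refuses lone surrogates, which is what keeps escaped bytes from leaking out as invalid UTF-8. It assumes Python 3; the names are illustrative.

```python
# An undecodable byte smuggled into a str as a lone surrogate.
escaped = b"\xff".decode("utf-8", errors="surrogateescape")

# The strict UTF-8 codec refuses to encode a lone surrogate.
try:
    escaped.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# Only the matching error handler restores the original byte.
restored = escaped.encode("utf-8", errors="surrogateescape")

print(strict_ok, restored)
```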

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Eric V. Smith
On 1/10/2014 12:17 PM, Juraj Sukop wrote: (Sorry if this messes-up the thread order, it is meant as a reply to the original RFC.) Dear list, newbie here. After much hesitation I decided to put forward a use case which bothers me about the current proposal. Disclaimer: I happen to write a

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Mark Lawrence
On 06/01/2014 13:24, Victor Stinner wrote: Hi, bytes % args and bytes.format(args) are requested by Mercurial and Twisted projects. The issue #3982 was stuck because nobody proposed a complete definition of the new features. Here is a try as a PEP. Apologies if this has already been said,

Re: [Python-Dev] Python3 complexity

2014-01-10 Thread Philip Jenvey
On Jan 10, 2014, at 7:35 AM, Nick Coghlan wrote: Putting this here because I found out today it's not in any of the PEPs and folks have to go digging in mailing list archives to find it. I'll add it to my Python 3 QA at some point. The reason Python 3 currently tries to rely on the POSIX

Re: [Python-Dev] Python3 complexity

2014-01-10 Thread Greg Ewing
INADA Naoki wrote: latin1 is OK but is it Pythonic? Latin is most certainly a Pythonic subject: http://www.youtube.com/watch?v=IIAdHEwiAy8 -- Greg

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Georg Brandl
On 10.01.2014 18:56, Eric V. Smith wrote: On 1/10/2014 12:17 PM, Juraj Sukop wrote: (Sorry if this messes-up the thread order, it is meant as a reply to the original RFC.) Dear list, newbie here. After much hesitation I decided to put forward a use case which bothers me about the

[Python-Dev] Python3 complexity - 2 use cases

2014-01-10 Thread Jim J. Jewett
Steven D'Aprano wrote: I think that heuristics to guess the encoding have their role to play, if the caller understands the risks. Ben Finney wrote: In my opinion, content-type guessing heuristics certainly don't belong in the standard library. It would be great if there were never any

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Chris Barker
On Fri, Jan 10, 2014 at 9:17 AM, Juraj Sukop juraj.su...@gmail.com wrote: As you may know, PDF operates over bytes and an integer or floating-point number is written down as-is, for example 100 or 1.23. Just to be clear here -- is PDF specifically bytes+ascii? Or could there be

Re: [Python-Dev] Python3 complexity

2014-01-10 Thread Serhiy Storchaka
On 10.01.14 18:27, Baptiste Carvello wrote: would it make sense to be more general, and allow a lenient mode, where all files implicitly opened with the default encoding would also use the surrogateescape error handler? The surrogateescape error handler is compatible only with

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Victor Stinner
2014/1/10 Juraj Sukop juraj.su...@gmail.com: In the case of PDF, the embedding of an image into PDF looks like: 10 0 obj << /Type /XObject /Width 100 /Height 100 /Alternates 15 0 R /Length 2167 >> stream ...binary image data...

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Eric V. Smith
On 1/10/2014 5:12 PM, Victor Stinner wrote: 2014/1/10 Juraj Sukop juraj.su...@gmail.com: In the case of PDF, the embedding of an image into PDF looks like: 10 0 obj << /Type /XObject /Width 100 /Height 100 /Alternates 15 0 R /Length 2167

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Antoine Pitrou
On Fri, 10 Jan 2014 12:56:19 -0500 Eric V. Smith e...@trueblade.com wrote: I agree. I don't see any reason to exclude int and float. See Guido's messages http://bugs.python.org/issue3982#msg180423 and http://bugs.python.org/issue3982#msg180430 for some justification and discussion. If you

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Eric V. Smith
On 1/10/2014 5:29 PM, Antoine Pitrou wrote: On Fri, 10 Jan 2014 12:56:19 -0500 Eric V. Smith e...@trueblade.com wrote: I agree. I don't see any reason to exclude int and float. See Guido's messages http://bugs.python.org/issue3982#msg180423 and http://bugs.python.org/issue3982#msg180430 for

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Antoine Pitrou
On Fri, 10 Jan 2014 17:20:32 -0500 Eric V. Smith e...@trueblade.com wrote: Isn't the point of the PEP to make it easier to port 2.x code to 3.5? Is there really existing code like this in 2.x? No, but so what? The point of the PEP is not to allow arbitrary Python 2 code to run without

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Antoine Pitrou
On Fri, 10 Jan 2014 17:33:57 -0500 Eric V. Smith e...@trueblade.com wrote: On 1/10/2014 5:29 PM, Antoine Pitrou wrote: On Fri, 10 Jan 2014 12:56:19 -0500 Eric V. Smith e...@trueblade.com wrote: I agree. I don't see any reason to exclude int and float. See Guido's messages

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Ethan Furman
On 01/10/2014 02:42 PM, Antoine Pitrou wrote: On Fri, 10 Jan 2014 17:33:57 -0500 Eric V. Smith e...@trueblade.com wrote: On 1/10/2014 5:29 PM, Antoine Pitrou wrote: On Fri, 10 Jan 2014 12:56:19 -0500 Eric V. Smith e...@trueblade.com wrote: I agree. I don't see any reason to exclude int and

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Antoine Pitrou
On Fri, 10 Jan 2014 14:58:15 -0800 Ethan Furman et...@stoneleaf.us wrote: On 01/10/2014 02:42 PM, Antoine Pitrou wrote: On Fri, 10 Jan 2014 17:33:57 -0500 Eric V. Smith e...@trueblade.com wrote: On 1/10/2014 5:29 PM, Antoine Pitrou wrote: On Fri, 10 Jan 2014 12:56:19 -0500 Eric V. Smith

Re: [Python-Dev] Python3 complexity

2014-01-10 Thread Chris Barker
On Fri, Jan 10, 2014 at 6:05 AM, Paul Moore p.f.mo...@gmail.com wrote: Using the 'latin-1' to mean unknown encoding can easily result in Mojibake (unreadable text) entering your application with dangerous effects on your other text data. Agreed. The latin-1 suggestion is purely for people

Re: [Python-Dev] Python3 complexity

2014-01-10 Thread Mark Lawrence
On 10/01/2014 22:06, Chris Barker wrote: On Fri, Jan 10, 2014 at 6:05 AM, Paul Moore p.f.mo...@gmail.com wrote: Using the 'latin-1' to mean unknown encoding can easily result in Mojibake (unreadable text) entering your application with dangerous

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Antoine Pitrou
On Fri, 10 Jan 2014 18:14:45 -0500 Eric V. Smith e...@trueblade.com wrote: Because embedding the ASCII equivalent of ints and floats in byte streams is a common operation? Again, if you're representing ASCII, you're representing text and should use a str object. Yes, but is there

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Juraj Sukop
On Fri, Jan 10, 2014 at 10:52 PM, Chris Barker chris.bar...@noaa.gov wrote: On Fri, Jan 10, 2014 at 9:17 AM, Juraj Sukop juraj.su...@gmail.com wrote: As you may know, PDF operates over bytes and an integer or floating-point number is written down as-is, for example 100 or 1.23. Just to be

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Juraj Sukop
On Fri, Jan 10, 2014 at 11:12 PM, Victor Stinner victor.stin...@gmail.com wrote: Why not build "10 0 obj ... stream" and "endstream endobj" in Unicode and then encode to ASCII? Example: data = b''.join(( ("%d %d obj ... stream" % (10, 0)).encode('ascii'), binary_image_data, ("endstream
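
The suggestion quoted above (format the ASCII parts as str, encode them, and join them with the binary payload) can be sketched as follows. This is an illustration under stated assumptions, not the code from the thread: it assumes Python 3, the dictionary contents are reduced to a single /Length entry, and binary_image_data is a made-up four-byte placeholder.

```python
# Placeholder payload; a real PDF stream would hold the encoded image.
binary_image_data = bytes([0x00, 0xFF, 0x10, 0x80])

# Build the textual part with ordinary str formatting...
header = "%d %d obj\n<< /Length %d >>\nstream\n" % (10, 0, len(binary_image_data))

# ...then encode it to ASCII and splice in the raw bytes untouched.
data = b"".join((
    header.encode("ascii"),
    binary_image_data,
    b"\nendstream\nendobj\n",
))

print(data)
```

The objection raised in the reply is about ergonomics rather than correctness: every formatted number needs its own encode step before it can be mixed with the binary parts.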

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Antoine Pitrou
On Sat, 11 Jan 2014 00:43:39 +0100 Juraj Sukop juraj.su...@gmail.com wrote: Basically, to .encode('ascii') every possible number is not exactly simple or pretty. Well it strikes me that the PDF format itself is not exactly simple or pretty. It might be convenient that Python 2 allows you, in

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Chris Barker
On Fri, Jan 10, 2014 at 3:40 PM, Juraj Sukop juraj.su...@gmail.com wrote: What this all means is that the PDF objects are expressed in ASCII, stream objects like images and fonts may have a binary part and I never saw those UTF-16 strings. hmm -- I wonder if they are out there in the wild,

Re: [Python-Dev] Python3 complexity

2014-01-10 Thread Chris Barker
On Fri, Jan 10, 2014 at 3:22 PM, Mark Lawrence breamore...@yahoo.co.uk wrote: The correct way is to read the interface specification which tells you what should be in the data. Or do people not use interface specifications these days, preferring to guess what they've got instead? No one is

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Juraj Sukop
On Sat, Jan 11, 2014 at 12:49 AM, Antoine Pitrou solip...@pitrou.net wrote: Also, when you say you've never encountered UTF-16 text in PDFs, it sounds like those people who've never encountered any non-ASCII data in their programs. Let me clarify: one does not think in writing text in

Re: [Python-Dev] Python3 complexity

2014-01-10 Thread Ethan Furman
On 01/10/2014 03:22 PM, Mark Lawrence wrote: On 10/01/2014 22:06, Chris Barker wrote: I'm not so sure -- it could be used (abused?) for that, but I'm suggesting it be used for mixed ascii-binary data. I don't know that there IS a right way to do that -- at least not an efficient or easy to

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Ethan Furman
On 01/08/2014 02:42 PM, Antoine Pitrou wrote: With Victor's consent, I overhauled PEP 460 and made the feature set more restricted and consistent with the bytes/str separation. From the PEP: Python 3 generally mandates that text be stored and manipulated as unicode (i.e. str

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Antoine Pitrou
On Fri, 10 Jan 2014 16:23:53 -0800 Ethan Furman et...@stoneleaf.us wrote: On 01/08/2014 02:42 PM, Antoine Pitrou wrote: With Victor's consent, I overhauled PEP 460 and made the feature set more restricted and consistent with the bytes/str separation. From the PEP:

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Eric V. Smith
On 1/10/2014 8:12 PM, Antoine Pitrou wrote: On Fri, 10 Jan 2014 16:23:53 -0800 Ethan Furman et...@stoneleaf.us wrote: On 01/08/2014 02:42 PM, Antoine Pitrou wrote: With Victor's consent, I overhauled PEP 460 and made the feature set more restricted and consistent with the bytes/str

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Antoine Pitrou
On Fri, 10 Jan 2014 20:53:09 -0500 Eric V. Smith e...@trueblade.com wrote: So, I'm -1 on the PEP. It doesn't address the cases laid out in issue 3982. See for example http://bugs.python.org/issue3982#msg180432 . Then we might as well not do anything, since any attempt to advance things is met

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Ethan Furman
On 01/10/2014 06:04 PM, Antoine Pitrou wrote: On Fri, 10 Jan 2014 20:53:09 -0500 Eric V. Smith e...@trueblade.com wrote: So, I'm -1 on the PEP. It doesn't address the cases laid out in issue 3982. See for example http://bugs.python.org/issue3982#msg180432 . Then we might as well not do

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Antoine Pitrou
On Fri, 10 Jan 2014 18:28:41 -0800 Ethan Furman et...@stoneleaf.us wrote: Is it safe to assume you don't use Python for the use-cases under discussion? You know, I've done quite a bit of network programming. I've also done an experimental port of Twisted to Python 3. I know what a network

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread INADA Naoki
To avoid implicit conversion between str and bytes, I propose adding only a limited %-format, not .format() or .format_map(). Limited %-format means: %c accepts an integer or a bytes object of length one. %r is not supported. %s accepts only bytes. %a is the only format that accepts an arbitrary object. And other
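
For comparison with the rules listed above, this is how the bytes %-formatting that eventually shipped in Python 3.5 (via PEP 461) behaves for the same format codes. It is a sketch of current interpreter behaviour, assuming Python 3.5 or later, and is not an implementation of the proposal as written; for instance, the shipped version also accepts %r as an alias of %a.

```python
# %c takes an integer in range(256) or a single byte.
two_chars = b"%c%c" % (65, b"B")

# %s takes bytes-like objects and never encodes str implicitly.
payload = b"%s" % b"abc"
try:
    b"%s" % "text"
    str_accepted = True
except TypeError:
    str_accepted = False

# %a accepts any object and inserts its ascii() representation.
described = b"%a" % "\u00e9"

print(two_chars, payload, str_accepted, described)
```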

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Ethan Furman
On 01/10/2014 06:39 PM, Antoine Pitrou wrote: On Fri, 10 Jan 2014 18:28:41 -0800 Ethan Furman wrote: Is it safe to assume you don't use Python for the use-cases under discussion? You know, I've done quite a bit of network programming. No, I didn't, that's why I asked. I've also done an

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Ethan Furman
On 01/10/2014 06:39 PM, Antoine Pitrou wrote: I know what a network protocol with ill-defined encodings looks like. For the record, I've been (and I suspect Eric and some others have also been) talking about well-defined encodings. For the DBF files that I work with, there is binary,

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread INADA Naoki
To avoid implicit conversion between str and bytes, I propose adding only a limited %-format, not .format() or .format_map(). Limited %-format means: %c accepts an integer or a bytes object of length one. %r is not supported. %s accepts only bytes. %a is the only format that accepts an arbitrary object. And other

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Cameron Simpson
On 11Jan2014 00:43, Juraj Sukop juraj.su...@gmail.com wrote: On Fri, Jan 10, 2014 at 11:12 PM, Victor Stinner victor.stin...@gmail.com wrote: Why not build "10 0 obj ... stream" and "endstream endobj" in Unicode and then encode to ASCII? Example: data = b''.join(( ("%d %d obj ...

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Steven D'Aprano
On Fri, Jan 10, 2014 at 06:17:02PM +0100, Juraj Sukop wrote: As you may know, PDF operates over bytes and an integer or floating-point number is written down as-is, for example 100 or 1.23. I'm sorry, I don't understand what you mean here. I'm honestly not trying to be difficult, but you

Re: [Python-Dev] Python3 complexity - 2 use cases

2014-01-10 Thread Ben Finney
Jim J. Jewett jimjjew...@gmail.com writes: Steven D'Aprano wrote: I think that heuristics to guess the encoding have their role to play, if the caller understands the risks. Ben Finney wrote: In my opinion, content-type guessing heuristics certainly don't belong in the standard

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Georg Brandl
On 11.01.2014 03:04, Antoine Pitrou wrote: On Fri, 10 Jan 2014 20:53:09 -0500 Eric V. Smith e...@trueblade.com wrote: So, I'm -1 on the PEP. It doesn't address the cases laid out in issue 3982. See for example http://bugs.python.org/issue3982#msg180432 . I agree. Then we might as well