Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Georg Brandl
On 11.01.2014 03:04, Antoine Pitrou wrote: > On Fri, 10 Jan 2014 20:53:09 -0500 > "Eric V. Smith" wrote: >> >> So, I'm -1 on the PEP. It doesn't address the cases laid out in issue >> 3982. See for example http://bugs.python.org/issue3982#msg180432 . I agree. > Then we might as well not do an

Re: [Python-Dev] Python3 "complexity" - 2 use cases

2014-01-10 Thread Ben Finney
"Jim J. Jewett" writes: > > > Steven D'Aprano wrote: > >> I think that heuristics to guess the encoding have their role to play, > >> if the caller understands the risks. > > Ben Finney wrote: > > In my opinion, content-type guessing heuristics certainly don't belong > > in the standard library

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Steven D'Aprano
On Fri, Jan 10, 2014 at 06:17:02PM +0100, Juraj Sukop wrote: > As you may know, PDF operates over bytes and an integer or floating-point > number is written down as-is, for example "100" or "1.23". I'm sorry, I don't understand what you mean here. I'm honestly not trying to be difficult, but you

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Cameron Simpson
On 11Jan2014 00:43, Juraj Sukop wrote: > On Fri, Jan 10, 2014 at 11:12 PM, Victor Stinner > wrote: > > What not building "10 0 obj ... stream" and "endstream endobj" in > > Unicode and then encode to ASCII? Example: > > > > data = b''.join(( > > ("%d %d obj ... stream" % (10, 0)).encode('ascii')

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread INADA Naoki
To avoid implicit conversion between str and bytes, I propose adding only a limited %-format, not .format() or .format_map(). "limited %-format" means: %c accepts an integer or a bytes object of length one. %r is not supported. %s accepts only bytes. %a is the only format that accepts an arbitrary object. And other fo
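For readers comparing this proposal with what later shipped: the sketch below exercises bytes %-formatting as implemented in Python 3.5 and later (PEP 461). It is an illustration only, not INADA's proposal itself, and the shipped behaviour may differ from the proposal in details. The function name `format_examples` and the sample values are hypothetical.

```python
# Illustration only: bytes %-formatting as available in Python 3.5+ (PEP 461).
# The helper name and sample values are made up for this sketch.
def format_examples():
    results = {}
    results["c_int"] = b"%c" % 65              # %c with an integer code
    results["c_byte"] = b"%c" % b"A"           # %c with a length-one bytes object
    results["s_bytes"] = b"%s" % b"abc"        # %s with bytes
    results["a_obj"] = b"%a" % "caf\xe9"       # %a applies ascii() to any object
    results["d_int"] = b"%d %d obj" % (10, 0)  # numeric codes produce ASCII digits
    try:
        b"%s" % "text"                         # str is rejected: no implicit encoding
    except TypeError:
        results["s_str_rejected"] = True
    else:
        results["s_str_rejected"] = False
    return results


if __name__ == "__main__":
    for name, value in format_examples().items():
        print(name, value)
```

The point of interest is the `TypeError` branch: a str argument to `%s` is refused rather than silently encoded, which is the str/bytes separation this message argues for.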

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Ethan Furman
On 01/10/2014 06:39 PM, Antoine Pitrou wrote: I know what a network protocol with ill-defined encodings looks like. For the record, I've been (and I suspect Eric and some others have also been) talking about well-defined encodings. For the DBF files that I work with, there is binary, ASCII,

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Ethan Furman
On 01/10/2014 06:39 PM, Antoine Pitrou wrote: On Fri, 10 Jan 2014 18:28:41 -0800 Ethan Furman wrote: Is it safe to assume you don't use Python for the use-cases under discussion? You know, I've done quite a bit of network programming. No, I didn't, that's why I asked. I've also done an ex

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Antoine Pitrou
On Fri, 10 Jan 2014 18:28:41 -0800 Ethan Furman wrote: > > Is it safe to assume you don't use Python for the use-cases under discussion? You know, I've done quite a bit of network programming. I've also done an experimental port of Twisted to Python 3. I know what a network protocol with ill-def

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Ethan Furman
On 01/10/2014 06:04 PM, Antoine Pitrou wrote: On Fri, 10 Jan 2014 20:53:09 -0500 "Eric V. Smith" wrote: So, I'm -1 on the PEP. It doesn't address the cases laid out in issue 3982. See for example http://bugs.python.org/issue3982#msg180432 . Then we might as well not do anything, since any at

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Antoine Pitrou
On Fri, 10 Jan 2014 20:53:09 -0500 "Eric V. Smith" wrote: > > So, I'm -1 on the PEP. It doesn't address the cases laid out in issue > 3982. See for example http://bugs.python.org/issue3982#msg180432 . Then we might as well not do anything, since any attempt to advance things is met by stubborn o

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Eric V. Smith
On 1/10/2014 8:12 PM, Antoine Pitrou wrote: > On Fri, 10 Jan 2014 16:23:53 -0800 > Ethan Furman wrote: >> On 01/08/2014 02:42 PM, Antoine Pitrou wrote: >>> >>> With Victor's consent, I overhauled PEP 460 and made the feature set >>> more restricted and consistent with the bytes/str separation. >>

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Antoine Pitrou
On Fri, 10 Jan 2014 16:23:53 -0800 Ethan Furman wrote: > On 01/08/2014 02:42 PM, Antoine Pitrou wrote: > > > > With Victor's consent, I overhauled PEP 460 and made the feature set > > more restricted and consistent with the bytes/str separation. > > From the PEP: > = > > Python 3 gen

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Ethan Furman
On 01/08/2014 02:42 PM, Antoine Pitrou wrote: With Victor's consent, I overhauled PEP 460 and made the feature set more restricted and consistent with the bytes/str separation. From the PEP: = Python 3 generally mandates that text be stored and manipulated as unicode (i.e. str ob

Re: [Python-Dev] Python3 "complexity"

2014-01-10 Thread Ethan Furman
On 01/10/2014 03:22 PM, Mark Lawrence wrote: On 10/01/2014 22:06, Chris Barker wrote: I'm not so sure -- it could be used (abused?) for that, but I'm suggesting it be used for mixed ascii-binary data. I don't know that there IS a "right" way to do that -- at least not an efficient or easy to re

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Juraj Sukop
On Sat, Jan 11, 2014 at 12:49 AM, Antoine Pitrou wrote: > Also, when you say you've never encountered UTF-16 text in PDFs, it > sounds like those people who've never encountered any non-ASCII data in > their programs. Let me clarify: one does not think in "writing text in Unicode"-terms in PDF.

Re: [Python-Dev] Python3 "complexity"

2014-01-10 Thread Chris Barker
On Fri, Jan 10, 2014 at 3:22 PM, Mark Lawrence wrote: > The correct way is to read the interface specification which tells you > what should be in the data. Or do people not use interface specifications > these days, preferring to guess what they've got instead? > No one is suggesting guessing (

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Chris Barker
On Fri, Jan 10, 2014 at 3:40 PM, Juraj Sukop wrote: > What this all means is that the PDF objects are expressed in ASCII, > "stream" objects like images and fonts may have a binary part and I never > saw those UTF-16 strings. > hmm -- I wonder if they are out there in the wild, though > u

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Antoine Pitrou
On Sat, 11 Jan 2014 00:43:39 +0100 Juraj Sukop wrote: > Basically, to ".encode('ascii')" every possible > number is not exactly simple or pretty. Well it strikes me that the PDF format itself is not exactly simple or pretty. It might be convenient that Python 2 allows you, in certain cases, to "i

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Juraj Sukop
On Fri, Jan 10, 2014 at 11:12 PM, Victor Stinner wrote: > > What not building "10 0 obj ... stream" and "endstream endobj" in > Unicode and then encode to ASCII? Example: > > data = b''.join(( > ("%d %d obj ... stream" % (10, 0)).encode('ascii'), > binary_image_data, > ("endstream endobj").e
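The approach Victor suggests in the quoted text (format the ASCII parts as str, encode them, then join them with the binary payload) can be sketched as a complete snippet. This is a minimal sketch: `build_stream_object` and the placeholder payload are hypothetical names, and the literal "..." is kept exactly as it appears in the quoted example, where it stands in for the elided object dictionary.

```python
# Sketch of the "format as str, then encode to ASCII" approach from the quote.
# build_stream_object and the payload below are hypothetical, for illustration.
binary_image_data = b"\x00\x01\x02\xff"  # placeholder for real image bytes


def build_stream_object(obj_num, gen_num, payload):
    # Text parts are formatted as str and explicitly encoded to ASCII.
    header = ("%d %d obj ... stream" % (obj_num, gen_num)).encode("ascii")
    footer = "endstream endobj".encode("ascii")
    # The binary payload is never decoded; it is joined in as-is.
    return b"".join((header, payload, footer))


data = build_stream_object(10, 0, binary_image_data)

if __name__ == "__main__":
    print(data)
```

Every textual fragment needs its own `.encode('ascii')` call, and that repetition is what the surrounding discussion is about.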

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Juraj Sukop
On Fri, Jan 10, 2014 at 10:52 PM, Chris Barker wrote: > On Fri, Jan 10, 2014 at 9:17 AM, Juraj Sukop wrote: > >> As you may know, PDF operates over bytes and an integer or floating-point >> number is written down as-is, for example "100" or "1.23". >> > > Just to be clear here -- is PDF specifical

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Antoine Pitrou
On Fri, 10 Jan 2014 18:14:45 -0500 "Eric V. Smith" wrote: > > >> Because embedding the ASCII equivalent of ints and floats in byte streams > >> is a common operation? > > > > Again, if you're representing "ASCII", you're representing text and > > should use a str object. > > Yes, but is there e

Re: [Python-Dev] Python3 "complexity"

2014-01-10 Thread Mark Lawrence
On 10/01/2014 22:06, Chris Barker wrote: On Fri, Jan 10, 2014 at 6:05 AM, Paul Moore wrote: > Using the 'latin-1' to mean unknown encoding can easily result > in Mojibake (unreadable text) entering your application with > dangerous effects on your othe

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Eric V. Smith
On 1/10/2014 6:02 PM, Antoine Pitrou wrote: > On Fri, 10 Jan 2014 14:58:15 -0800 > Ethan Furman wrote: >> On 01/10/2014 02:42 PM, Antoine Pitrou wrote: >>> On Fri, 10 Jan 2014 17:33:57 -0500 >>> "Eric V. Smith" wrote: On 1/10/2014 5:29 PM, Antoine Pitrou wrote: > On Fri, 10 Jan 2014 12:5

Re: [Python-Dev] Python3 "complexity"

2014-01-10 Thread Chris Barker
On Fri, Jan 10, 2014 at 6:05 AM, Paul Moore wrote: > > Using the 'latin-1' to mean unknown encoding can easily result > > in Mojibake (unreadable text) entering your application with > > dangerous effects on your other text data. > > Agreed. The latin-1 suggestion is purely for people who object

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Antoine Pitrou
On Fri, 10 Jan 2014 14:58:15 -0800 Ethan Furman wrote: > On 01/10/2014 02:42 PM, Antoine Pitrou wrote: > > On Fri, 10 Jan 2014 17:33:57 -0500 > > "Eric V. Smith" wrote: > >> On 1/10/2014 5:29 PM, Antoine Pitrou wrote: > >>> On Fri, 10 Jan 2014 12:56:19 -0500 > >>> "Eric V. Smith" wrote: > >

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Ethan Furman
On 01/10/2014 02:42 PM, Antoine Pitrou wrote: On Fri, 10 Jan 2014 17:33:57 -0500 "Eric V. Smith" wrote: On 1/10/2014 5:29 PM, Antoine Pitrou wrote: On Fri, 10 Jan 2014 12:56:19 -0500 "Eric V. Smith" wrote: I agree. I don't see any reason to exclude int and float. See Guido's messages http:/

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Antoine Pitrou
On Fri, 10 Jan 2014 17:33:57 -0500 "Eric V. Smith" wrote: > On 1/10/2014 5:29 PM, Antoine Pitrou wrote: > > On Fri, 10 Jan 2014 12:56:19 -0500 > > "Eric V. Smith" wrote: > >> > >> I agree. I don't see any reason to exclude int and float. See Guido's > >> messages http://bugs.python.org/issue3982#

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Antoine Pitrou
On Fri, 10 Jan 2014 17:20:32 -0500 "Eric V. Smith" wrote: > > Isn't the point of the PEP to make it easier to port 2.x code to 3.5? > Is > there really existing code like this in 2.x? No, but so what? The point of the PEP is not to allow arbitrary Python 2 code to run without modification under

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Eric V. Smith
On 1/10/2014 5:29 PM, Antoine Pitrou wrote: > On Fri, 10 Jan 2014 12:56:19 -0500 > "Eric V. Smith" wrote: >> >> I agree. I don't see any reason to exclude int and float. See Guido's >> messages http://bugs.python.org/issue3982#msg180423 and >> http://bugs.python.org/issue3982#msg180430 for some ju

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Antoine Pitrou
On Fri, 10 Jan 2014 12:56:19 -0500 "Eric V. Smith" wrote: > > I agree. I don't see any reason to exclude int and float. See Guido's > messages http://bugs.python.org/issue3982#msg180423 and > http://bugs.python.org/issue3982#msg180430 for some justification and > discussion. If you are represent

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Eric V. Smith
On 1/10/2014 5:12 PM, Victor Stinner wrote: > 2014/1/10 Juraj Sukop : >> In the case of PDF, the embedding of an image into PDF looks like: >> >> 10 0 obj >> << /Type /XObject >> /Width 100 >> /Height 100 >> /Alternates 15 0 R >> /Length 2167 >> >

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Victor Stinner
2014/1/10 Juraj Sukop : > In the case of PDF, the embedding of an image into PDF looks like: > > 10 0 obj > << /Type /XObject > /Width 100 > /Height 100 > /Alternates 15 0 R > /Length 2167 > >> > stream > ...binary image data... > ends

Re: [Python-Dev] Python3 "complexity"

2014-01-10 Thread Serhiy Storchaka
On 10.01.14 18:27, Baptiste Carvello wrote: would it make sense to be more general, and allow a "lenient mode", where all files implicitly opened with the default encoding would also use the surrogateescape error handler ? The surrogateescape error handler is compatible only with ASCII-comp

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Chris Barker
On Fri, Jan 10, 2014 at 9:17 AM, Juraj Sukop wrote: > As you may know, PDF operates over bytes and an integer or floating-point > number is written down as-is, for example "100" or "1.23". > Just to be clear here -- is PDF specifically bytes+ascii? Or could there be some-other-encoding unicode

[Python-Dev] Python3 "complexity" - 2 use cases

2014-01-10 Thread Jim J. Jewett
> Steven D'Aprano wrote: >> I think that heuristics to guess the encoding have their role to play, >> if the caller understands the risks. Ben Finney wrote: > In my opinion, content-type guessing heuristics certainly don't belong > in the standard library. It would be great if there were never

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Georg Brandl
On 10.01.2014 18:56, Eric V. Smith wrote: > On 1/10/2014 12:17 PM, Juraj Sukop wrote: >> (Sorry if this messes-up the thread order, it is meant as a reply to the >> original RFC.) >> >> Dear list, >> >> newbie here. After much hesitation I decided to put forward a use case >> which bothers me a

Re: [Python-Dev] Python3 "complexity"

2014-01-10 Thread Greg Ewing
INADA Naoki wrote: latin1 is OK but is it Pythonic? Latin is most certainly a Pythonic subject: http://www.youtube.com/watch?v=IIAdHEwiAy8 -- Greg ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev

Re: [Python-Dev] Python3 "complexity"

2014-01-10 Thread Philip Jenvey
On Jan 10, 2014, at 7:35 AM, Nick Coghlan wrote: > Putting this here because I found out today it's not in any of the > PEPs and folks have to go digging in mailing list archives to find it. > I'll add it to my Python 3 Q&A at some point. > > The reason Python 3 currently tries to rely on the PO

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Mark Lawrence
On 06/01/2014 13:24, Victor Stinner wrote: Hi, bytes % args and bytes.format(args) are requested by Mercurial and Twisted projects. The issue #3982 was stuck because nobody proposed a complete definition of the "new" features. Here is a try as a PEP. Apologies if this has already been said, b

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Eric V. Smith
On 1/10/2014 12:17 PM, Juraj Sukop wrote: > (Sorry if this messes-up the thread order, it is meant as a reply to the > original RFC.) > > Dear list, > > newbie here. After much hesitation I decided to put forward a use case > which bothers me about the current proposal. Disclaimer: I happen to >

Re: [Python-Dev] Python3 "complexity"

2014-01-10 Thread Serhiy Storchaka
On 10.01.14 14:19, M.-A. Lemburg wrote: BTW: Perhaps it would be a good idea to backport the surrogateescape error handler to Python 2.7 to simplify writing code which works in both Python 2 and 3. You also should change the UTF-8 codec so that it will reject surrogates (i.e. u'\ud880'.enco
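As background for the two points raised here, the sketch below shows how the surrogateescape error handler and the strict UTF-8 codec behave in Python 3. It does not model the proposed Python 2.7 backport; the sample bytes and the helper name `strict_utf8_rejects_surrogate` are hypothetical.

```python
# Illustration of Python 3 behaviour; sample data is made up for this sketch.
raw = b"abc\xff\xfe"  # not valid UTF-8

# Undecodable bytes are smuggled through as lone surrogates U+DC80..U+DCFF.
text = raw.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
restored = text.encode("utf-8", errors="surrogateescape")


def strict_utf8_rejects_surrogate():
    """Return True if the strict UTF-8 codec refuses a lone surrogate."""
    try:
        "\ud880".encode("utf-8")
    except UnicodeEncodeError:
        return True
    return False


if __name__ == "__main__":
    print(ascii(text), restored == raw, strict_utf8_rejects_surrogate())
```

The round trip is only lossless because the strict codec rejects lone surrogates: the escaped bytes can never be confused with text that was legitimately encoded.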

Re: [Python-Dev] Python3 "complexity"

2014-01-10 Thread Stefan Ring
On Fri, Jan 10, 2014 at 4:35 PM, Nick Coghlan wrote: > On 10 January 2014 13:32, Lennart Regebro wrote: >> No, because your environment have a default language. And Python has a >> default encoding. You only get problems when some file doesn't use the >> default encoding. > > The reason Python 3

[Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Juraj Sukop
(Sorry if this messes-up the thread order, it is meant as a reply to the original RFC.) Dear list, newbie here. After much hesitation I decided to put forward a use case which bothers me about the current proposal. Disclaimer: I happen to write a library which is directly influenced by this. As

[Python-Dev] Summary of Python tracker Issues

2014-01-10 Thread Python tracker
ACTIVITY SUMMARY (2014-01-03 - 2014-01-10) Python tracker at http://bugs.python.org/ To view or respond to any of the issues listed below, click on the issue. Do NOT respond to this message. Issues counts and deltas: open 4409 (+61) closed 27580 (+42) total 31989 (+103) Open issues wi

Re: [Python-Dev] Python3 "complexity"

2014-01-10 Thread Baptiste Carvello
On 10/01/2014 16:35, Nick Coghlan wrote: > One idea we're considering for Python 3.5 is to have a report of > "ascii" on a POSIX OS imply the surrogateescape error handler (at > least for the standard streams, and perhaps in other contexts), since > the OS reporting the POSIX/C locale almost ce

Re: [Python-Dev] Python3 "complexity"

2014-01-10 Thread INADA Naoki
Now I feel it is a bad thing to encourage using unicode for binary data with the latin-1 encoding or the surrogateescape error handler. Handling binary data in str type using latin-1 is just a hack. Surrogateescape is just a workaround to keep undecodable bytes in text. Encouraging binary data in str type wi

Re: [Python-Dev] Python3 "complexity"

2014-01-10 Thread Stefan Krah
Nick Coghlan wrote: > One idea we're considering for Python 3.5 is to have a report of > "ascii" on a POSIX OS imply the surrogateescape error handler (at > least for the standard streams, and perhaps in other contexts), since > the OS reporting the POSIX/C locale almost certainly indicates a > co

Re: [Python-Dev] [Python-checkins] peps: PEP 460: add .format_map()

2014-01-10 Thread Eric V. Smith
On 1/10/2014 10:20 AM, Nick Coghlan wrote: > On 10 January 2014 07:41, Eric V. Smith wrote: >> I'm not sure how format_map helps in porting from 2 to 3, since it >> doesn't exist in any version of 2. >> >> Although that said, it's no doubt a useful feature, just not useful in >> code that supports

Re: [Python-Dev] Python3 "complexity"

2014-01-10 Thread Nick Coghlan
On 10 January 2014 13:32, Lennart Regebro wrote: > On Thu, Jan 9, 2014 at 10:06 AM, Kristján Valur Jónsson > wrote: >> Do I speak Chinese to my grocer because china is a growing force in the >> world? Or start every discussion with my children with a negotiation on >> what language to use? > >

Re: [Python-Dev] [Python-checkins] peps: PEP 460: add .format_map()

2014-01-10 Thread Nick Coghlan
On 10 January 2014 07:41, Eric V. Smith wrote: > I'm not sure how format_map helps in porting from 2 to 3, since it > doesn't exist in any version of 2. > > Although that said, it's no doubt a useful feature, just not useful in > code that supports both 2 and 3 with a single code base or when port

Re: [Python-Dev] Python3 "complexity"

2014-01-10 Thread Matěj Cepl
On 2014-01-10, 12:19 GMT, you wrote: > Using the 'latin-1' to mean unknown encoding can easily result > in Mojibake (unreadable text) entering your application with > dangerous effects on your other text data. > > E.g. "Marc-André" read using 'latin-1'

Re: [Python-Dev] Python3 "complexity"

2014-01-10 Thread Paul Moore
On 10 January 2014 12:19, M.-A. Lemburg wrote: > Just a word of caution: > > Using the 'latin-1' to mean unknown encoding can easily result > in Mojibake (unreadable text) entering your application with > dangerous effects on your other text data. Agreed. The latin-1 suggestion is purely for peop

Re: [Python-Dev] Python3 "complexity"

2014-01-10 Thread M.-A. Lemburg
On 09.01.2014 22:45, Antoine Pitrou wrote: > On Thu, 9 Jan 2014 13:36:05 -0800 > Chris Barker wrote: >> >> Some folks have suggested using latin-1 (or other 8-bit encoding) -- is >> that guaranteed to work with any binary data, and round-trip accurately? > > Yes, it is. Just a word of caution:
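Both halves of this exchange can be checked in a few lines: latin-1 round-trips every byte value, yet decoding data that was really written in another encoding yields mojibake. The sketch below assumes, for illustration, that the name "Marc-André" quoted elsewhere in the thread was stored as UTF-8; the variable names are hypothetical.

```python
# Round trip: every byte value 0..255 maps to exactly one latin-1 code point.
all_bytes = bytes(range(256))
round_tripped = all_bytes.decode("latin-1").encode("latin-1")

# Mojibake: assume (for this sketch) the text was actually stored as UTF-8.
stored = "Marc-Andr\u00e9".encode("utf-8")  # "Marc-André" as UTF-8 bytes
mojibake = stored.decode("latin-1")         # wrong codec, but no error is raised

if __name__ == "__main__":
    print(round_tripped == all_bytes)
    print(mojibake)
```

Decoding never fails, so the wrong result enters the program silently, which is the risk the word of caution describes.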

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Antoine Pitrou
On Fri, 10 Jan 2014 11:32:05 +1000 Nick Coghlan wrote: > > > > It's consistent with bytearray.join's behaviour: > > > > >>> x = bytearray() > > >>> x.join([b"abc"]) > > bytearray(b'abc') > > >>> x > > bytearray(b'') > > Yeah, I guess I'm OK with us being consistent on that one. It's still > weird