Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-12 Thread R. David Murray
On Sun, 12 Jan 2014 17:51:41 +1000, Nick Coghlan ncogh...@gmail.com wrote: On 12 January 2014 04:38, R. David Murray rdmur...@bitdance.com wrote: But! Our goal should be to help people convert to Python3. So how can we find out what the specific problems are that real-world programs are

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-12 Thread Juraj Sukop
On Sun, Jan 12, 2014 at 2:35 AM, Steven D'Aprano st...@pearwood.info wrote: On Sat, Jan 11, 2014 at 08:13:39PM -0200, Mariano Reingart wrote: AFAIK (and just for the record), there could be both Latin1 text and UTF-16 in a PDF (and other encodings too), depending on the font used: [...]

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-12 Thread Nick Coghlan
On 12 Jan 2014 21:53, Juraj Sukop juraj.su...@gmail.com wrote: On Sun, Jan 12, 2014 at 2:35 AM, Steven D'Aprano st...@pearwood.info wrote: On Sat, Jan 11, 2014 at 08:13:39PM -0200, Mariano Reingart wrote: AFAIK (and just for the record), there could be both Latin1 text and UTF-16 in a

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-12 Thread Juraj Sukop
On Sun, Jan 12, 2014 at 2:16 PM, Nick Coghlan ncogh...@gmail.com wrote: Why are you proposing to do the *join* in text space? Encode all the parts separately, concatenate them with b'\n'.join() (or whatever separator is appropriate). It's only the *text formatting operation* that needs to be
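The suggestion quoted above (encode each part on its own, then join in bytes) might look roughly like the following. This is only a sketch: `number`, `binary_image_data` and `title` are hypothetical sample values, not taken from the thread.

```python
# Hypothetical sample values, for illustration only.
number = 3.14159
binary_image_data = bytes(range(256))   # arbitrary binary payload
title = "caf\u00e9"                     # text that the format stores as UTF-16BE

parts = [
    b"header",
    ("part 2 %.3f" % number).encode("ascii"),  # format in text space, then encode
    binary_image_data,                         # already bytes, left untouched
    title.encode("utf-16-be"),                 # encoded with the codec the format requires
    b"trailer",
]

# The join itself happens in bytes space, so no part is ever decoded.
content = b"\n".join(parts)
```

Only the formatting step passes through str here; the binary payload never does, so no codec can alter it.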

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-12 Thread Steven D'Aprano
On Sun, Jan 12, 2014 at 12:52:18PM +0100, Juraj Sukop wrote: On Sun, Jan 12, 2014 at 2:35 AM, Steven D'Aprano st...@pearwood.info wrote: On Sat, Jan 11, 2014 at 08:13:39PM -0200, Mariano Reingart wrote: AFAIK (and just for the record), there could be both Latin1 text and UTF-16 in a

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-12 Thread Steven D'Aprano
On Sun, Jan 12, 2014 at 11:16:37PM +1000, Nick Coghlan wrote:
    content = '\n'.join([
        'header',
        'part 2 %.3f' % number,
        binary_image_data.decode('latin-1'),
        utf16_string.encode('utf-16be').decode('latin-1'),
        'trailer']).encode('latin-1')

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-12 Thread Juraj Sukop
Wait a second, this is how I understood it but what Nick said made me think otherwise... On Sun, Jan 12, 2014 at 6:22 PM, Steven D'Aprano st...@pearwood.info wrote: On Sun, Jan 12, 2014 at 12:52:18PM +0100, Juraj Sukop wrote: On Sun, Jan 12, 2014 at 2:35 AM, Steven D'Aprano st...@pearwood.info

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-12 Thread Stephen J. Turnbull
Daniel Holth writes: -1 on adding more surrogateescapes by default. It's a pain to track down where the encoding errors came from. What do you mean by default? It was quite explicit in the code I posted, and it's the only reasonable thing to do with text data without known (but ASCII
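For readers unfamiliar with the error handler under discussion, here is a small sketch of what `surrogateescape` does. The header line is a made-up example; the code Stephen refers to is not visible in this snippet.

```python
# Hypothetical header line: encoding unknown, but ASCII-compatible.
raw = b"Subject: caf\xe9\r\n"

# Non-ASCII bytes become lone surrogates (U+DC80..U+DCFF) instead of raising.
text = raw.decode("ascii", errors="surrogateescape")

# The ASCII part can now be handled as ordinary text.
field, _, value = text.partition(": ")

# Encoding with the same handler restores the original bytes exactly.
restored = text.encode("ascii", errors="surrogateescape")

# Encoding without the handler fails, possibly far from where the data
# was decoded, which is the tracking problem Daniel mentions.
try:
    text.encode("utf-8")
except UnicodeEncodeError as exc:
    failure = exc
```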

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-12 Thread Ethan Furman
On 01/12/2014 12:39 PM, Stephen J. Turnbull wrote: Daniel Holth writes: -1 on adding more surrogateescapes by default. It's a pain to track down where the encoding errors came from. What do you mean by default? It was quite explicit in the code I posted, and it's the only reasonable

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-12 Thread Mark Shannon
Why not just use six.byte_format(fmt, *args)? It works on both Python2 and Python3 and accepts the numerical format specifiers, plus '%b' for inserting bytes and '%a' for converting text to ascii. Admittedly it doesn't exist yet, but it could and it would save a lot of arguing :) (Apologies

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-12 Thread Ethan Furman
On 01/12/2014 01:59 PM, Mark Shannon wrote: Why not just use six.byte_format(fmt, *args)? It works on both Python2 and Python3 and accepts the numerical format specifiers, plus '%b' for inserting bytes and '%a' for converting text to ascii. Sounds like the second best option! Admittedly

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-12 Thread Chris Angelico
On Mon, Jan 13, 2014 at 4:57 AM, Juraj Sukop juraj.su...@gmail.com wrote: On Sun, Jan 12, 2014 at 6:22 PM, Steven D'Aprano st...@pearwood.info wrote: First, utf16_string confuses me. What is it? If it is a Unicode string, i.e.: It is a Unicode string which happens to contain code points

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-12 Thread Stephen J. Turnbull
Steven D'Aprano writes: then the name is horribly misleading, and it is best handled like this: content = '\n'.join([ 'header', 'part 2 %.3f' % number, binary_image_data.decode('latin-1'), utf16_string, # Misleading name, actually Unicode string

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-12 Thread Ethan Furman
On 01/12/2014 02:31 PM, Stephen J. Turnbull wrote: This corrupts binary_image_data. Each byte > 127 will be replaced by two bytes. In the second case, you can use latin1 to encode, if it gives you what you want. This kind of subtlety is precisely why MAL warned about use of latin1 to smuggle
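The corruption is easy to reproduce. The sketch below uses a hypothetical four-byte payload and assumes the final encode uses UTF-8, the case in which bytes above 127 are doubled.

```python
binary_image_data = bytes([0x41, 0x7F, 0x80, 0xFF])  # hypothetical payload

# Latin-1 maps every byte to the code point with the same number.
as_text = binary_image_data.decode("latin-1")

# Encoding back with Latin-1 is the exact inverse.
round_trip = as_text.encode("latin-1")

# Encoding with UTF-8 instead turns each code point above 127 into two bytes.
corrupted = as_text.encode("utf-8")
```

The smuggling trick is only safe if every later encode also uses Latin-1. A single different codec anywhere downstream silently changes the payload.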

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-12 Thread Steven D'Aprano
On Mon, Jan 13, 2014 at 07:31:16AM +0900, Stephen J. Turnbull wrote: Steven D'Aprano writes: then the name is horribly misleading, and it is best handled like this: content = '\n'.join([ 'header', 'part 2 %.3f' % number,

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-12 Thread Stephen J. Turnbull
Ethan Furman writes: This kind of subtlety is precisely why MAL warned about use of latin1 to smuggle bytes. And why I've been fighting Steven D'Aprano on it. No, I think you haven't been fighting Steven d'A on it. You're talking about parsing and generating structured binary files,

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-12 Thread Ethan Furman
On 01/12/2014 04:02 PM, Stephen J. Turnbull wrote: So when you talk about we, I suspect you are not the we everybody else is arguing with. In particular, AIUI your use case is not included in the use cases most of us -- including Steven -- are thinking about. Ah, so even in the minority I'm

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-12 Thread Stephen J. Turnbull
Steven D'Aprano writes: Of course you're right, but I have understood the above as being a sketch and not real code. (E.g. does header really mean the literal string header, or does it stand in for something which is a header?) In real code, one would need to have some way of telling

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread Nick Coghlan
On 11 January 2014 08:58, Ethan Furman et...@stoneleaf.us wrote: On 01/10/2014 02:42 PM, Antoine Pitrou wrote: On Fri, 10 Jan 2014 17:33:57 -0500 Eric V. Smith e...@trueblade.com wrote: On 1/10/2014 5:29 PM, Antoine Pitrou wrote: On Fri, 10 Jan 2014 12:56:19 -0500 Eric V. Smith

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread Nick Coghlan
On 11 January 2014 12:28, Ethan Furman et...@stoneleaf.us wrote: On 01/10/2014 06:04 PM, Antoine Pitrou wrote: On Fri, 10 Jan 2014 20:53:09 -0500 Eric V. Smith e...@trueblade.com wrote: So, I'm -1 on the PEP. It doesn't address the cases laid out in issue 3982. See for example

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread Stephen Hansen
For not caring much, your own stubbornness is quite notable throughout this discussion. Stones and glass houses. :) That said: Twisted and Mercurial aren't the only ones who are hurt by this, at all. I'm aware of at least two other projects who are actively hindered in their support or migration

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread Glenn Linderman
On 1/11/2014 1:44 AM, Stephen Hansen wrote: There's been a number of examples given: PDF, HTTP, network streams that switch inline from text-ish to binary and back-again.. But, we can focus that down to a very narrow and not at all uncommon situation in the latter. PDF has been mentioned a

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread Kristján Valur Jónsson
On Behalf Of Nick Coghlan, sent 11 January 2014 08:43, to Ethan Furman, cc python-dev@python.org: No, it's the POSIX text model that is completely broken and we're not letting

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread Juraj Sukop
On Sat, Jan 11, 2014 at 5:14 AM, Cameron Simpson c...@zip.com.au wrote: Hi Juraj, Hello Cameron. data = b' '.join( bytify( [ 10, 0, obj, binary_image_data, ... ] ) ) Thanks for the suggestion! The problem with bytify is that some items might require different formatting than other

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread Juraj Sukop
On Sat, Jan 11, 2014 at 6:36 AM, Steven D'Aprano st...@pearwood.info wrote: I'm sorry, I don't understand what you mean here. I'm honestly not trying to be difficult, but you sound confident that you understand what you are doing, but your description doesn't make sense to me. To me, it looks

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread Georg Brandl
Am 11.01.2014 09:43, schrieb Nick Coghlan: On 11 January 2014 12:28, Ethan Furman et...@stoneleaf.us wrote: On 01/10/2014 06:04 PM, Antoine Pitrou wrote: On Fri, 10 Jan 2014 20:53:09 -0500 Eric V. Smith e...@trueblade.com wrote: So, I'm -1 on the PEP. It doesn't address the cases laid out

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread Georg Brandl
Am 11.01.2014 10:44, schrieb Stephen Hansen: I mean, it's not like the bytes type lacks knowledge of the subset of bytes that happen to be 7-bit ascii-compatible and can't perform text-ish operations on them-- Python 3.3.3 (v3.3.3:c3896275c0f6, Nov 18 2013, 21:18:40) [MSC v.1600 32 bit

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread Georg Brandl
Am 11.01.2014 14:49, schrieb Georg Brandl: Am 11.01.2014 10:44, schrieb Stephen Hansen: I mean, it's not like the bytes type lacks knowledge of the subset of bytes that happen to be 7-bit ascii-compatible and can't perform text-ish operations on them-- Python 3.3.3

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread M.-A. Lemburg
On 11.01.2014 14:54, Georg Brandl wrote: Am 11.01.2014 14:49, schrieb Georg Brandl: Am 11.01.2014 10:44, schrieb Stephen Hansen: I mean, it's not like the bytes type lacks knowledge of the subset of bytes that happen to be 7-bit ascii-compatible and can't perform text-ish operations on

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread Antoine Pitrou
On Sat, 11 Jan 2014 08:26:57 +0100 Georg Brandl g.bra...@gmx.net wrote: Am 11.01.2014 03:04, schrieb Antoine Pitrou: On Fri, 10 Jan 2014 20:53:09 -0500 Eric V. Smith e...@trueblade.com wrote: So, I'm -1 on the PEP. It doesn't address the cases laid out in issue 3982. See for example

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread Nick Coghlan
On 12 January 2014 01:15, M.-A. Lemburg m...@egenix.com wrote: On 11.01.2014 14:54, Georg Brandl wrote: Am 11.01.2014 14:49, schrieb Georg Brandl: Am 11.01.2014 10:44, schrieb Stephen Hansen: I mean, it's not like the bytes type lacks knowledge of the subset of bytes that happen to be 7-bit

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread Steven D'Aprano
On Sat, Jan 11, 2014 at 01:56:56PM +0100, Juraj Sukop wrote: On Sat, Jan 11, 2014 at 6:36 AM, Steven D'Aprano st...@pearwood.info wrote: If you consider PDF as binary with occasional pieces of ASCII text, then working with bytes makes sense. But I wonder whether it might be better to

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread Antoine Pitrou
On Sun, 12 Jan 2014 01:34:26 +1000 Nick Coghlan ncogh...@gmail.com wrote: Yes, it bloody well does. The number of people who have told me that using Python 3 is what allowed them to finally understand how Unicode works vastly exceeds the number of wire protocol and file format devs that have

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread Ethan Furman
On 01/11/2014 07:38 AM, Steven D'Aprano wrote: The point that I am making is that many people want to add formatting operations to bytes so they can put ASCII strings inside bytes. But (as far as I can tell) they don't need to do this, because they can treat Unicode strings containing code

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread M.-A. Lemburg
On 11.01.2014 16:34, Nick Coghlan wrote: On 12 January 2014 01:15, M.-A. Lemburg m...@egenix.com wrote: On 11.01.2014 14:54, Georg Brandl wrote: Am 11.01.2014 14:49, schrieb Georg Brandl: Am 11.01.2014 10:44, schrieb Stephen Hansen: I mean, it's not like the bytes type lacks knowledge of the

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread Matěj Cepl
On 2014-01-11, 10:56 GMT, you wrote: I don't know what the fuss is about. I just cannot resist: When you are calm while everybody else is in the state of panic, you haven’t understood the problem. -- one of many collections of

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread Ethan Furman
On 01/11/2014 12:43 AM, Nick Coghlan wrote: In particular, the bytes type is, and always will be, designed for pure binary manipulation [...] I apologize for being blunt, but this is a lie. Let's take a look at the methods defined by bytes: dir(b'') ['__add__', '__class__', '__contains__',
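The method list is cut off in this snippet, but the point can be illustrated with a few methods that do exist on bytes in Python 3 and that assume ASCII semantics. The input line below is a made-up example.

```python
line = b"Content-Type: text/PLAIN\r\n"   # hypothetical wire-protocol line

# strip() removes ASCII whitespace; partition() splits on a bytes separator.
name, _, value = line.strip().partition(b": ")

normalized = value.lower()                    # only ASCII letters are case-mapped
is_token = name.replace(b"-", b"").isalpha()  # ASCII alphabetic test
padded = name.ljust(16, b".")                 # text-style padding

# Bytes outside the ASCII range pass through these methods unchanged.
untouched = b"\xc9t\xe9".upper()
```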

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread Ethan Furman
On 01/11/2014 07:34 AM, Nick Coghlan wrote: On 12 January 2014 01:15, M.-A. Lemburg wrote: We don't have to be pedantic about the bytes/text separation. It doesn't help in real life. Yes, it bloody well does. The number of people who have told me that using Python 3 is what allowed them to

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread Steven D'Aprano
On Sat, Jan 11, 2014 at 08:20:27AM -0800, Ethan Furman wrote: On 01/11/2014 07:38 AM, Steven D'Aprano wrote: The point that I am making is that many people want to add formatting operations to bytes so they can put ASCII strings inside bytes. But (as far as I can tell) they don't need to do

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread R. David Murray
tl;dr: At the end I'm volunteering to look at real code that is having porting problems. On Sat, 11 Jan 2014 17:33:17 +0100, M.-A. Lemburg m...@egenix.com wrote: asciistr is interesting in that it coerces to bytes instead of to Unicode (as is the case in Python 2). At the moment it doesn't

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread Stephen J. Turnbull
M.-A. Lemburg writes: I completely agree with Stephen, that bytes are in fact often an encoding of text. If that text is ASCII compatible, I don't see any reason why we should not continue to expose the C lib standard string APIs available for text manipulations on bytes. We already *have*

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread Steven D'Aprano
On Sat, Jan 11, 2014 at 04:15:35PM +0100, M.-A. Lemburg wrote: I think we need to step back a little from the purist view of things and give more emphasis on the practicality beats purity Zen. I completely agree with Stephen, that bytes are in fact often an encoding of text. If that text is

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread Steven D'Aprano
On Sat, Jan 11, 2014 at 05:33:17PM +0100, M.-A. Lemburg wrote: FWIW: I quite liked the Python 2 model, but perhaps that's because I already knew how Unicode works, so could use it to make my life easier ;-) /incredulous I would really love to see you justify that claim. How do you use the

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread MRAB
On 2014-01-11 05:36, Steven D'Aprano wrote: [snip] Latin-1 has the nice property that every byte decodes into the character with the same code point, and vice versa. So:
    for i in range(256):
        assert bytes([i]).decode('latin-1') == chr(i)
        assert chr(i).encode('latin-1') == bytes([i])

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread Ethan Furman
On 01/11/2014 10:36 AM, Steven D'Aprano wrote: On Sat, Jan 11, 2014 at 08:20:27AM -0800, Ethan Furman wrote: unicode to bytes bytes to unicode using latin1 unicode to bytes Where do you get this from? I don't follow your logic. Start with a text template: template =

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread Stephen J. Turnbull
MRAB writes: with open("outfile.pdf", "w", encoding="latin-1") as f: f.write(pdf) [snip] The second example won't work because you're forgetting about the handling of line endings in text mode. Not so fast! Forgot, yes (me too!), but not work? Not quite: with
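The snippet is truncated before the proposed fix, so the following is an assumption about one way to avoid the line-ending problem rather than a reconstruction of Stephen's code. Passing `newline=""` to `open()` disables newline translation on write, and binary mode avoids the issue altogether. The `pdf` string is a hypothetical stand-in for Latin-1 "smuggled" content.

```python
import os
import tempfile

pdf = "%PDF-1.4\nbinary: \r \n \xff\n"   # hypothetical content, not a real PDF
expected = pdf.encode("latin-1")

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "outfile.pdf")

    # newline="" means "\n" is written as-is on every platform.
    with open(path, "w", encoding="latin-1", newline="") as f:
        f.write(pdf)
    with open(path, "rb") as f:
        via_text_mode = f.read()

    # Writing pre-encoded bytes in binary mode sidesteps the question entirely.
    with open(path, "wb") as f:
        f.write(expected)
    with open(path, "rb") as f:
        via_binary_mode = f.read()
```

With the default `newline=None`, a text-mode write translates each "\n" to `os.linesep`, which would insert extra bytes on Windows.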

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread Ethan Furman
On 01/11/2014 11:49 AM, Stephen J. Turnbull wrote: MRAB writes: with open("outfile.pdf", "w", encoding="latin-1") as f: f.write(pdf) [snip] The second example won't work because you're forgetting about the handling of line endings in text mode. Not so fast! Forgot, yes (me

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread R. David Murray
On Sat, 11 Jan 2014 11:54:26 -0800, Ethan Furman et...@stoneleaf.us wrote: On 01/11/2014 11:49 AM, Stephen J. Turnbull wrote: MRAB writes: with open("outfile.pdf", "w", encoding="latin-1") as f: f.write(pdf) [snip] The second example won't work because you're

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread Donald Stufft
On Jan 11, 2014, at 10:34 AM, Nick Coghlan ncogh...@gmail.com wrote: Yes, it bloody well does. The number of people who have told me that using Python 3 is what allowed them to finally understand how Unicode works vastly exceeds the number of wire protocol and file format devs that have

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread Terry Reedy
On 1/11/2014 1:44 PM, Stephen J. Turnbull wrote: We already *have* a type in Python 3.3 that provides text manipulations on arrays of 8-bit objects: str (per PEP 393). BTW: I don't know why so many people keep asking for use cases. Isn't it obvious that text data without known (but ASCII

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread Daniel Holth
On Sat, Jan 11, 2014 at 4:28 PM, Terry Reedy tjre...@udel.edu wrote: On 1/11/2014 1:44 PM, Stephen J. Turnbull wrote: We already *have* a type in Python 3.3 that provides text manipulations on arrays of 8-bit objects: str (per PEP 393). BTW: I don't know why so many people keep asking for

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread Ethan Furman
On 01/11/2014 12:45 PM, Donald Stufft wrote: FWIW as one of the people who it took Python3 to finally figure out how to actually use unicode, it was the absence of encode on bytes and decode on str that actually did it. Giving bytes a format method would not have affected that either way I

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread Mariano Reingart
On Fri, Jan 10, 2014 at 9:13 PM, Juraj Sukop juraj.su...@gmail.com wrote: On Sat, Jan 11, 2014 at 12:49 AM, Antoine Pitrou solip...@pitrou.net wrote: Also, when you say you've never encountered UTF-16 text in PDFs, it sounds like those people who've never encountered any non-ASCII data in

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread Steven D'Aprano
On Sat, Jan 11, 2014 at 07:22:30PM +, MRAB wrote: with open("outfile.pdf", "w", encoding="latin-1") as f: f.write(pdf) [snip] The second example won't work because you're forgetting about the handling of line endings in text mode. So I did! Thank you for the correction. -- Steven

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread Matěj Cepl
On 2014-01-11, 18:09 GMT, you wrote: We are NOT going back to the confusing incoherent mess that is the Python 2 model of bolting Unicode onto the side of POSIX . . . We are not asking for that. Yes, you do. Maybe not you personally, but

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread Steven D'Aprano
On Sat, Jan 11, 2014 at 04:28:34PM -0500, Terry Reedy wrote: The problem with some criticisms of using 'unicode in Python 3' is that there really is no such thing. Unicode in 3.0 to 3.2 used the old internal model inherited from 2.x. Unicode in 3.3+ uses a different internal model that is

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread Steven D'Aprano
On Sat, Jan 11, 2014 at 08:13:39PM -0200, Mariano Reingart wrote: AFAIK (and just for the record), there could be both Latin1 text and UTF-16 in a PDF (and other encodings too), depending on the font used: [...] In Python2, txt is just a str, but in Python3 handling everything as latin1

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread Cameron Simpson
On 11Jan2014 13:15, Juraj Sukop juraj.su...@gmail.com wrote: On Sat, Jan 11, 2014 at 5:14 AM, Cameron Simpson c...@zip.com.au wrote: data = b' '.join( bytify( [ 10, 0, obj, binary_image_data, ... ] ) ) Thanks for the suggestion! The problem with bytify is that some items might require

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread Nick Coghlan
On 12 Jan 2014 03:29, Ethan Furman et...@stoneleaf.us wrote: On 01/11/2014 12:43 AM, Nick Coghlan wrote: In particular, the bytes type is, and always will be, designed for pure binary manipulation [...] I apologize for being blunt, but this is a lie. Let's take a look at the methods

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread Ethan Furman
On 01/11/2014 06:29 PM, Steven D'Aprano wrote: On Sat, Jan 11, 2014 at 11:05:36AM -0800, Ethan Furman wrote: On 01/11/2014 10:36 AM, Steven D'Aprano wrote: On Sat, Jan 11, 2014 at 08:20:27AM -0800, Ethan Furman wrote: unicode to bytes bytes to unicode using latin1 unicode to bytes

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread Nick Coghlan
On 12 January 2014 02:33, M.-A. Lemburg m...@egenix.com wrote: On 11.01.2014 16:34, Nick Coghlan wrote: While that was an *expedient* (and, in fact, necessary) solution at the time, the fact it is still thoroughly confusing people 13 years later shows it is not a *comprehensible* solution.

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread Nick Coghlan
On 12 January 2014 04:38, R. David Murray rdmur...@bitdance.com wrote: But! Our goal should be to help people convert to Python3. So how can we find out what the specific problems are that real-world programs are facing, look at the *actual code*, and help that project figure out the best

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Antoine Pitrou
On Fri, 10 Jan 2014 11:32:05 +1000 Nick Coghlan ncogh...@gmail.com wrote: It's consistent with bytearray.join's behaviour:
    >>> x = bytearray()
    >>> x.join([b"abc"])
    bytearray(b'abc')
    >>> x
    bytearray(b'')
Yeah, I guess I'm OK with us being consistent on that one. It's still weird, but also

[Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Juraj Sukop
(Sorry if this messes-up the thread order, it is meant as a reply to the original RFC.) Dear list, newbie here. After much hesitation I decided to put forward a use case which bothers me about the current proposal. Disclaimer: I happen to write a library which is directly influenced by this. As

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Eric V. Smith
On 1/10/2014 12:17 PM, Juraj Sukop wrote: (Sorry if this messes-up the thread order, it is meant as a reply to the original RFC.) Dear list, newbie here. After much hesitation I decided to put forward a use case which bothers me about the current proposal. Disclaimer: I happen to write a

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Mark Lawrence
On 06/01/2014 13:24, Victor Stinner wrote: Hi, bytes % args and bytes.format(args) are requested by Mercurial and Twisted projects. The issue #3982 was stuck because nobody proposed a complete definition of the new features. Here is a try as a PEP. Apologies if this has already been said,

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Georg Brandl
Am 10.01.2014 18:56, schrieb Eric V. Smith: On 1/10/2014 12:17 PM, Juraj Sukop wrote: (Sorry if this messes-up the thread order, it is meant as a reply to the original RFC.) Dear list, newbie here. After much hesitation I decided to put forward a use case which bothers me about the

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Chris Barker
On Fri, Jan 10, 2014 at 9:17 AM, Juraj Sukop juraj.su...@gmail.com wrote: As you may know, PDF operates over bytes and an integer or floating-point number is written down as-is, for example 100 or 1.23. Just to be clear here -- is PDF specifically bytes+ascii? Or could there be

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Victor Stinner
2014/1/10 Juraj Sukop juraj.su...@gmail.com: In the case of PDF, the embedding of an image into PDF looks like: 10 0 obj /Type /XObject /Width 100 /Height 100 /Alternates 15 0 R /Length 2167 stream ...binary image data...

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Eric V. Smith
On 1/10/2014 5:12 PM, Victor Stinner wrote: 2014/1/10 Juraj Sukop juraj.su...@gmail.com: In the case of PDF, the embedding of an image into PDF looks like: 10 0 obj /Type /XObject /Width 100 /Height 100 /Alternates 15 0 R /Length 2167

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Antoine Pitrou
On Fri, 10 Jan 2014 12:56:19 -0500 Eric V. Smith e...@trueblade.com wrote: I agree. I don't see any reason to exclude int and float. See Guido's messages http://bugs.python.org/issue3982#msg180423 and http://bugs.python.org/issue3982#msg180430 for some justification and discussion. If you

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Eric V. Smith
On 1/10/2014 5:29 PM, Antoine Pitrou wrote: On Fri, 10 Jan 2014 12:56:19 -0500 Eric V. Smith e...@trueblade.com wrote: I agree. I don't see any reason to exclude int and float. See Guido's messages http://bugs.python.org/issue3982#msg180423 and http://bugs.python.org/issue3982#msg180430 for

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Antoine Pitrou
On Fri, 10 Jan 2014 17:20:32 -0500 Eric V. Smith e...@trueblade.com wrote: Isn't the point of the PEP to make it easier to port 2.x code to 3.5? Is there really existing code like this in 2.x? No, but so what? The point of the PEP is not to allow arbitrary Python 2 code to run without

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Antoine Pitrou
On Fri, 10 Jan 2014 17:33:57 -0500 Eric V. Smith e...@trueblade.com wrote: On 1/10/2014 5:29 PM, Antoine Pitrou wrote: On Fri, 10 Jan 2014 12:56:19 -0500 Eric V. Smith e...@trueblade.com wrote: I agree. I don't see any reason to exclude int and float. See Guido's messages

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Ethan Furman
On 01/10/2014 02:42 PM, Antoine Pitrou wrote: On Fri, 10 Jan 2014 17:33:57 -0500 Eric V. Smith e...@trueblade.com wrote: On 1/10/2014 5:29 PM, Antoine Pitrou wrote: On Fri, 10 Jan 2014 12:56:19 -0500 Eric V. Smith e...@trueblade.com wrote: I agree. I don't see any reason to exclude int and

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Antoine Pitrou
On Fri, 10 Jan 2014 14:58:15 -0800 Ethan Furman et...@stoneleaf.us wrote: On 01/10/2014 02:42 PM, Antoine Pitrou wrote: On Fri, 10 Jan 2014 17:33:57 -0500 Eric V. Smith e...@trueblade.com wrote: On 1/10/2014 5:29 PM, Antoine Pitrou wrote: On Fri, 10 Jan 2014 12:56:19 -0500 Eric V. Smith

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Antoine Pitrou
On Fri, 10 Jan 2014 18:14:45 -0500 Eric V. Smith e...@trueblade.com wrote: Because embedding the ASCII equivalent of ints and floats in byte streams is a common operation? Again, if you're representing ASCII, you're representing text and should use a str object. Yes, but is there

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Juraj Sukop
On Fri, Jan 10, 2014 at 10:52 PM, Chris Barker chris.bar...@noaa.gov wrote: On Fri, Jan 10, 2014 at 9:17 AM, Juraj Sukop juraj.su...@gmail.com wrote: As you may know, PDF operates over bytes and an integer or floating-point number is written down as-is, for example 100 or 1.23. Just to be

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Juraj Sukop
On Fri, Jan 10, 2014 at 11:12 PM, Victor Stinner victor.stin...@gmail.com wrote: Why not build 10 0 obj ... stream and endstream endobj in Unicode and then encode to ASCII? Example: data = b''.join(( ("%d %d obj ... stream" % (10, 0)).encode('ascii'), binary_image_data, ("endstream

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Antoine Pitrou
On Sat, 11 Jan 2014 00:43:39 +0100 Juraj Sukop juraj.su...@gmail.com wrote: Basically, to .encode('ascii') every possible number is not exactly simple or pretty. Well it strikes me that the PDF format itself is not exactly simple or pretty. It might be convenient that Python 2 allows you, in

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Chris Barker
On Fri, Jan 10, 2014 at 3:40 PM, Juraj Sukop juraj.su...@gmail.com wrote: What this all means is that the PDF objects are expressed in ASCII, stream objects like images and fonts may have a binary part and I never saw those UTF-16 strings. hmm -- I wonder if they are out there in the wild,

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Juraj Sukop
On Sat, Jan 11, 2014 at 12:49 AM, Antoine Pitrou solip...@pitrou.net wrote: Also, when you say you've never encountered UTF-16 text in PDFs, it sounds like those people who've never encountered any non-ASCII data in their programs. Let me clarify: one does not think in writing text in

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Ethan Furman
On 01/08/2014 02:42 PM, Antoine Pitrou wrote: With Victor's consent, I overhauled PEP 460 and made the feature set more restricted and consistent with the bytes/str separation. From the PEP: = Python 3 generally mandates that text be stored and manipulated as unicode (i.e. str

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Antoine Pitrou
On Fri, 10 Jan 2014 16:23:53 -0800 Ethan Furman et...@stoneleaf.us wrote: On 01/08/2014 02:42 PM, Antoine Pitrou wrote: With Victor's consent, I overhauled PEP 460 and made the feature set more restricted and consistent with the bytes/str separation. From the PEP: =

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Eric V. Smith
On 1/10/2014 8:12 PM, Antoine Pitrou wrote: On Fri, 10 Jan 2014 16:23:53 -0800 Ethan Furman et...@stoneleaf.us wrote: On 01/08/2014 02:42 PM, Antoine Pitrou wrote: With Victor's consent, I overhauled PEP 460 and made the feature set more restricted and consistent with the bytes/str

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Antoine Pitrou
On Fri, 10 Jan 2014 20:53:09 -0500 Eric V. Smith e...@trueblade.com wrote: So, I'm -1 on the PEP. It doesn't address the cases laid out in issue 3982. See for example http://bugs.python.org/issue3982#msg180432 . Then we might as well not do anything, since any attempt to advance things is met

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Ethan Furman
On 01/10/2014 06:04 PM, Antoine Pitrou wrote: On Fri, 10 Jan 2014 20:53:09 -0500 Eric V. Smith e...@trueblade.com wrote: So, I'm -1 on the PEP. It doesn't address the cases laid out in issue 3982. See for example http://bugs.python.org/issue3982#msg180432 . Then we might as well not do

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Antoine Pitrou
On Fri, 10 Jan 2014 18:28:41 -0800 Ethan Furman et...@stoneleaf.us wrote: Is it safe to assume you don't use Python for the use-cases under discussion? You know, I've done quite a bit of network programming. I've also done an experimental port of Twisted to Python 3. I know what a network

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread INADA Naoki
To avoid implicit conversion between str and bytes, I propose adding only a limited %-format, not .format() or .format_map(). Limited %-format means: %c accepts an integer or a bytes object of length one. %r is not supported. %s accepts only bytes. %a is the only format that accepts an arbitrary object. And other
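
For readers who want to see rules of this kind in action, here is a small sketch. It runs against the bytes %-formatting that later shipped in Python 3.5 (PEP 461), which is close to but not identical with the proposal quoted above; in particular, PEP 461 accepts %r as an alias for %a rather than rejecting it. The variable and function names are illustrative only.

```python
# Bytes %-formatting as available in Python 3.5+ (PEP 461); this is not an
# implementation of the exact proposal quoted above.
one_char_from_int = b"%c" % 65        # %c with an integer in range(256)
one_char_from_bytes = b"%c" % b"A"    # %c with a bytes object of length one
only_bytes = b"%s" % b"abc"           # %s with bytes
ascii_repr = b"%a" % "caf\xe9"        # %a takes any object, ASCII-escapes its repr


def str_is_rejected():
    """Return True if %s refuses a str argument (no implicit encoding)."""
    try:
        b"%s" % "abc"
    except TypeError:
        return True
    return False
```

The point of the restriction is visible in the last function: a str is never silently encoded, so the caller has to choose an encoding explicitly.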

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Ethan Furman
On 01/10/2014 06:39 PM, Antoine Pitrou wrote: On Fri, 10 Jan 2014 18:28:41 -0800 Ethan Furman wrote: Is it safe to assume you don't use Python for the use-cases under discussion? You know, I've done quite a bit of network programming. No, I didn't, that's why I asked. I've also done an

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Ethan Furman
On 01/10/2014 06:39 PM, Antoine Pitrou wrote: I know what a network protocol with ill-defined encodings looks like. For the record, I've been (and I suspect Eric and some others have also been) talking about well-defined encodings. For the DBF files that I work with, there is binary,

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Cameron Simpson
On 11Jan2014 00:43, Juraj Sukop juraj.su...@gmail.com wrote: On Fri, Jan 10, 2014 at 11:12 PM, Victor Stinner victor.stin...@gmail.com wrote: What not building "10 0 obj ... stream" and "endstream endobj" in Unicode and then encode to ASCII? Example: data = b''.join(( ("%d %d obj ...

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Steven D'Aprano
On Fri, Jan 10, 2014 at 06:17:02PM +0100, Juraj Sukop wrote: As you may know, PDF operates over bytes and an integer or floating-point number is written down as-is, for example 100 or 1.23. I'm sorry, I don't understand what you mean here. I'm honestly not trying to be difficult, but you

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-10 Thread Georg Brandl
On 11.01.2014 03:04, Antoine Pitrou wrote: On Fri, 10 Jan 2014 20:53:09 -0500 Eric V. Smith e...@trueblade.com wrote: So, I'm -1 on the PEP. It doesn't address the cases laid out in issue 3982. See for example http://bugs.python.org/issue3982#msg180432 . I agree. Then we might as well

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-09 Thread Nick Coghlan
On 9 Jan 2014 11:29, INADA Naoki songofaca...@gmail.com wrote: And I think everyone was well intentioned - and python3 covers most of the bases, but working with binary data is not only a wire-protocol programmer's problem. If you're working with binary data, use the binary API offered by

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-09 Thread Antoine Pitrou
On Thu, 09 Jan 2014 03:54:13 + MRAB pyt...@mrabarnett.plus.com wrote: I'm thinking that the i format could be used for signed integers and the u for unsigned integers. The width would be the number of bytes. You would also need to have a way of specifying the endianness. For example:
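
The idea quoted above (signed or unsigned integers, a width in bytes, and an explicit endianness) is cut off before its example, so the proposed syntax is not reproduced here. As a rough point of comparison only, the same three choices can already be expressed with int.to_bytes and the struct module; the values below are arbitrary.

```python
import struct

value = 258  # 0x0102, an arbitrary example value

# int.to_bytes: width in bytes, byte order, and signedness are all explicit.
unsigned_big = value.to_bytes(4, "big")
unsigned_little = value.to_bytes(4, "little")
signed_big = (-2).to_bytes(2, "big", signed=True)

# struct.pack: ">" / "<" select endianness, "I" is unsigned 32-bit, "h" is signed 16-bit.
packed_unsigned_big = struct.pack(">I", value)
packed_signed_little = struct.pack("<h", -2)
```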

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-09 Thread Barry Warsaw
On Jan 08, 2014, at 01:51 PM, Stephen J. Turnbull wrote: Benjamin Peterson writes: I agree. This is a very important, much-requested feature for low-level networking code. I hear it's much-requested, but is there any description of typical use cases? The two unported libraries that are

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-09 Thread Nick Coghlan
On 9 Jan 2014 06:43, Antoine Pitrou solip...@pitrou.net wrote: Hi, With Victor's consent, I overhauled PEP 460 and made the feature set more restricted and consistent with the bytes/str separation. +1 I was initially dubious about the idea, but the proposed semantics look good to me. We

Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-09 Thread Antoine Pitrou
On Fri, 10 Jan 2014 05:26:04 +1000 Nick Coghlan ncogh...@gmail.com wrote: We should probably include format_map for consistency with the str API. Yes, you're right. However, I also added bytearray into the mix, as bytearray objects should generally support the same operations as bytes
