Re: [Python-Dev] Generalised String Coercion

2005-08-09 Thread Guido van Rossum
On 8/9/05, Nick Coghlan <[EMAIL PROTECTED]> wrote: > We could always give the text mode/binary mode distinction in "open" a real > meaning - text mode deals with character sequences, binary mode deals with > byte sequences. I thought that's what I proposed before. I'm still for it. -- --Guido va
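
A minimal sketch of the text-mode/binary-mode distinction proposed above, written against present-day Python 3 (an assumption; the thread predates it). The sample string and the temporary file are illustrative only.

```python
import os
import tempfile

# Hypothetical sample text; any non-ASCII string shows the difference.
text = "Löwis"

fd, path = tempfile.mkstemp()
os.close(fd)
try:
    # Text mode: the file object accepts and returns character sequences (str).
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    with open(path, "r", encoding="utf-8") as f:
        as_text = f.read()

    # Binary mode: the same file yields a byte sequence (bytes).
    with open(path, "rb") as f:
        as_bytes = f.read()
finally:
    os.remove(path)
```

Here `as_text` holds five characters while `as_bytes` holds six bytes, because "ö" occupies two bytes in UTF-8.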

Re: [Python-Dev] Generalised String Coercion

2005-08-09 Thread Nick Coghlan
James Y Knight wrote: > Hum, actually, it somewhat makes sense for the "open" builtin to > become what is now "codecs.open", for convenience's sake, although it > does blur the distinction between a byte stream and a character > stream somewhat. If that happens, I suppose it does actually mak

Re: [Python-Dev] Generalised String Coercion

2005-08-08 Thread Stephen J. Turnbull
> "Martin" == Martin v Löwis <[EMAIL PROTECTED]> writes: Martin> While this would work, it would still feel wrong: the Martin> binary data are *not* latin1 (most likely), so declaring Martin> them to be latin1 would be confusing. Perhaps a synonym Martin> '8bit' for latin1 coul

Re: [Python-Dev] Generalised String Coercion

2005-08-08 Thread James Y Knight
On Aug 8, 2005, at 12:14 PM, Guido van Rossum wrote: > Ouch. Too much discussion to respond to it all. Please remember that > in Jython and IronPython, str and unicode are already synonyms. That's > how Python 3.0 will do it, except unicode will disappear as being > redundant. I like the bytes/froz

Re: [Python-Dev] Generalised String Coercion

2005-08-08 Thread François Pinard
[Phillip J. Eby] > At 09:14 AM 8/8/2005 -0700, Guido van Rossum wrote: > > I'm not going to change my mind on text() unless someone explains > > what's so attractive about it. > 2. It's more obvious to programmers that it's a *text* string rather > than a string of bytes I've no opinion on the

Re: [Python-Dev] Generalised String Coercion

2005-08-08 Thread Neil Schemenauer
On Sat, Aug 06, 2005 at 06:56:39PM -0700, Guido van Rossum wrote: > My first response to the PEP, however, is that instead of a new > built-in function, I'd rather relax the requirement that str() return > an 8-bit string -- after all, int() is allowed to return a long, so > why couldn't str() be a

Re: [Python-Dev] Generalised String Coercion

2005-08-08 Thread Martin v. Löwis
Phillip J. Eby wrote: > Actually, thinking about it some more, it seems to me it's actually more > like this: > >sock.send( ("%d:%s," % > (len(data),data.decode('latin1'))).encode('latin1') ) While this would work, it would still feel wrong: the binary data are *not* latin1 (most likely), so
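
The reason the latin1 trick "would work" can be shown in a few lines: latin-1 maps each of the 256 byte values to the code point with the same number, so decoding and re-encoding loses nothing. This is a sketch in present-day Python 3 (an assumption), using the standard `latin-1` codec name.

```python
# All 256 possible byte values, standing in for arbitrary binary data.
data = bytes(range(256))

# Decoding never fails, because every byte value is a valid latin-1 character.
as_text = data.decode("latin-1")

# Encoding the result gives back exactly the original bytes.
round_tripped = as_text.encode("latin-1")
```

The round trip is lossless, but as the message says, the intermediate string is not meaningful text; it is only a carrier for the bytes.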

Re: [Python-Dev] Generalised String Coercion

2005-08-08 Thread M.-A. Lemburg
Guido van Rossum wrote: > Ouch. Too much discussion to respond to it all. Please remember that > in Jython and IronPython, str and unicode are already synonyms. I know, but don't understand that argument: aren't we talking about Python in general, not some particular implementation? Why should

Re: [Python-Dev] Generalised String Coercion

2005-08-08 Thread Phillip J. Eby
At 09:14 AM 8/8/2005 -0700, Guido van Rossum wrote: >I'm not going to change my mind on text() unless >someone explains what's so attractive about it. 1. It's obvious to non-programmers what it's for (str and unicode aren't) 2. It's more obvious to programmers that it's a *text* string rather tha

Re: [Python-Dev] Generalised String Coercion

2005-08-08 Thread Guido van Rossum
Ouch. Too much discussion to respond to it all. Please remember that in Jython and IronPython, str and unicode are already synonyms. That's how Python 3.0 will do it, except unicode will disappear as being redundant. I like the bytes/frozenbytes pair idea. Streams could grow a getpos()/setpos() API

Re: [Python-Dev] Generalised String Coercion

2005-08-08 Thread Aahz
On Sun, Aug 07, 2005, Neil Schemenauer wrote: > On Sat, Aug 06, 2005 at 06:56:39PM -0700, Guido van Rossum wrote: >> >> My first response to the PEP, however, is that instead of a new >> built-in function, I'd rather relax the requirement that str() return >> an 8-bit string > > Do you have any th

Re: [Python-Dev] Generalised String Coercion

2005-08-08 Thread Phillip J. Eby
At 10:07 AM 8/8/2005 +0200, Martin v. Löwis wrote: >Phillip J. Eby wrote: > >>Hm. What would be the use case for using %s with binary, non-text data? > > > > > > Well, I could see using it to write things like netstrings, > > i.e. sock.send("%d:%s," % (len(data),data)) seems like the One Obvious

Re: [Python-Dev] Generalised String Coercion

2005-08-08 Thread M.-A. Lemburg
Michael Hudson wrote: > "M.-A. Lemburg" <[EMAIL PROTECTED]> writes: > > >>Set the external encoding for stdin, stdout, stderr: >> >>(also an example for adding encoding support to an >>existing file object): >> >>def set_sys_std_encoding(encodin
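
The function quoted above is cut off in this archive, so the following is not a reconstruction of it. It is only a hedged sketch of the general idea of adding encoding support to an existing byte-oriented file object, using the standard `codecs.getwriter` call in present-day Python 3 (an assumption). The `io.BytesIO` object stands in for a real byte stream such as standard output.

```python
import codecs
import io

# Stand-in for a byte-oriented stream; a real program might wrap a
# binary stdout instead (illustrative assumption).
raw = io.BytesIO()

# codecs.getwriter returns a StreamWriter class for the named encoding;
# instantiating it wraps the byte stream.
writer = codecs.getwriter("utf-8")(raw)

# Text written to the wrapper is encoded before reaching the byte stream.
writer.write("Löwis")
```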

Re: [Python-Dev] Generalised String Coercion

2005-08-08 Thread M.-A. Lemburg
Guido van Rossum wrote: > [Guido] > >>>My first response to the PEP, however, is that instead of a new >>>built-in function, I'd rather relax the requirement that str() return >>>an 8-bit string -- after all, int() is allowed to return a long, so >>>why couldn't str() be allowed to return a Unicod

Re: [Python-Dev] Generalised String Coercion

2005-08-08 Thread Nick Coghlan
Martin v. Löwis wrote: > Guido van Rossum wrote: >>The bytes type could just be a very thin wrapper around array('b'). > > That answers an important question: so you want the bytes type to be > mutable (and, consequently, unsuitable as a dictionary key). I would suggest a bytes/frozenbytes pair,
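
The mutable/immutable split discussed here can be illustrated with the two types that present-day Python 3 provides, `bytearray` and `bytes`. Using them as stand-ins for the proposed bytes/frozenbytes pair is an assumption of this sketch, not something the thread states.

```python
# Mutable byte sequence: supports in-place modification.
mutable = bytearray(b"abc")
mutable[0] = ord("x")

# Immutable snapshot of the same data.
frozen = bytes(mutable)

# The immutable form is hashable, so it can serve as a dictionary key.
table = {frozen: "value"}

# The mutable form is not hashable, which is the consequence noted above.
try:
    hash(mutable)
    mutable_is_hashable = True
except TypeError:
    mutable_is_hashable = False
```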

Re: [Python-Dev] Generalised String Coercion

2005-08-08 Thread Michael Hudson
"M.-A. Lemburg" <[EMAIL PROTECTED]> writes: > Set the external encoding for stdin, stdout, stderr: > > (also an example for adding encoding support to an > existing file object): > > def set_sys_std_encoding(encoding): > # Load encoding supp

Re: [Python-Dev] Generalised String Coercion

2005-08-08 Thread Martin v. Löwis
Stephen J. Turnbull wrote: > If you mean the UTF-8 support in Terminal, it's no better or worse > than the EUC-JP support. The problem is that most Japanese Unix > systems continue to default to EUC-JP, and many Windows hosts > (including Samba file systems) default to Shift JIS. So people using

Re: [Python-Dev] Generalised String Coercion

2005-08-08 Thread Stephen J. Turnbull
> "Martin" == Martin v Löwis <[EMAIL PROTECTED]> writes: Martin> I think your doubts are unfounded. Many Japanese people Martin> change it to EUC-JP (I believe), as UTF-8 support doesn't Martin> work well for them (or atleast didn't use to). If you mean the UTF-8 support in Termin

Re: [Python-Dev] Generalised String Coercion

2005-08-08 Thread Martin v. Löwis
Phillip J. Eby wrote: >>Hm. What would be the use case for using %s with binary, non-text data? > > > Well, I could see using it to write things like netstrings, > i.e. sock.send("%d:%s," % (len(data),data)) seems like the One Obvious Way > to write a netstring in today's Python at least. But

Re: [Python-Dev] Generalised String Coercion

2005-08-08 Thread Martin v. Löwis
Guido van Rossum wrote: > We might be able to get there halfway in Python 2.x: we could > introduce the bytes type now, and provide separate APIs to read and > write them. (In fact, the array module and the f.readinto() method > make this possible today, but it's too klunky so nobody uses it. > Pe

Re: [Python-Dev] Generalised String Coercion

2005-08-08 Thread Martin v. Löwis
Bob Ippolito wrote: > It's UTF-8 by default, I highly doubt many people bother to change it. I think your doubts are unfounded. Many Japanese people change it to EUC-JP (I believe), as UTF-8 support doesn't work well for them (or at least didn't use to). Regards, Martin

Re: [Python-Dev] Generalised String Coercion

2005-08-07 Thread Martin v. Löwis
Guido van Rossum wrote: > I'm not sure if it works for all encodings, but if possible I'd like > to extend the seeking semantics on text files: seek positions are byte > counts, and the application should consider them as "magic cookies". If the seek position is merely a number, it won't work for
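
The "magic cookie" idea can be sketched with the text I/O layer of present-day Python 3 (an assumption; it is not the API under discussion in 2005). The value returned by `tell()` on a text stream is treated as opaque and is only passed back to `seek()`. The sample data is illustrative.

```python
import io

# A byte stream holding UTF-8 text with multi-byte characters.
raw = io.BytesIO("aé\nbç\n".encode("utf-8"))
f = io.TextIOWrapper(raw, encoding="utf-8", newline="")

first = f.readline()
cookie = f.tell()      # opaque position; no arithmetic should be done on it
second = f.readline()

f.seek(cookie)         # only values obtained from tell() are passed to seek()
again = f.readline()
```

After seeking back to the cookie, the second line is read again unchanged.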

Re: [Python-Dev] Generalised String Coercion

2005-08-07 Thread Bob Ippolito
On Aug 7, 2005, at 7:37 PM, Martin v. Löwis wrote: > Guido van Rossum wrote: > >>> If stdin, stdout and stderr go to a terminal, there already is a >>> default encoding (actually, there always is a default encoding on >>> these, as it falls back to the system encoding if it's not a >>> terminal,

Re: [Python-Dev] Generalised String Coercion

2005-08-07 Thread Martin v. Löwis
Guido van Rossum wrote: >>If stdin, stdout and stderr go to a terminal, there already is a >>default encoding (actually, there always is a default encoding on >>these, as it falls back to the system encoding if it's not a terminal, >>or if the terminal's encoding is not supported or cannot be determ

Re: [Python-Dev] Generalised String Coercion

2005-08-07 Thread Phillip J. Eby
At 05:24 PM 8/7/2005 -0700, Guido van Rossum wrote: >Hm. What would be the use case for using %s with binary, non-text data? Well, I could see using it to write things like netstrings, i.e. sock.send("%d:%s," % (len(data),data)) seems like the One Obvious Way to write a netstring in today's Pyt
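
For comparison, here is a hedged sketch of the same netstring construction on byte strings in present-day Python 3 (an assumption; `%` formatting on `bytes` requires Python 3.5 or later). The function names are hypothetical. The second function mirrors the latin1 decode/encode variant raised elsewhere in the thread.

```python
def netstring(data: bytes) -> bytes:
    """Encode data as a netstring: decimal length, colon, payload, comma."""
    # Byte-string formatting keeps the payload as bytes throughout.
    return b"%d:%s," % (len(data), data)


def latin1_netstring(data: bytes) -> bytes:
    """Same result, built by routing the payload through a latin-1 string."""
    text = "%d:%s," % (len(data), data.decode("latin-1"))
    return text.encode("latin-1")


# Arbitrary binary payload, including bytes that are not valid ASCII.
payload = bytes([0, 255, 128])
direct = netstring(payload)
via_latin1 = latin1_netstring(payload)
```

Both functions produce identical output; the first avoids declaring binary data to be text at any point.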

Re: [Python-Dev] Generalised String Coercion

2005-08-07 Thread Neil Schemenauer
On Sat, Aug 06, 2005 at 06:56:39PM -0700, Guido van Rossum wrote: > My first response to the PEP, however, is that instead of a new > built-in function, I'd rather relax the requirement that str() return > an 8-bit string Do you have any thoughts on what the C API would be? It seems to me that Py

Re: [Python-Dev] Generalised String Coercion

2005-08-07 Thread Guido van Rossum
[Guido] > > My first response to the PEP, however, is that instead of a new > > built-in function, I'd rather relax the requirement that str() return > > an 8-bit string -- after all, int() is allowed to return a long, so > > why couldn't str() be allowed to return a Unicode string? [MAL] > The pr

Re: [Python-Dev] Generalised String Coercion

2005-08-07 Thread Guido van Rossum
[Reinhold Birkenfeld] > > FWIW, I've already drafted a patch for the former. It lets you write to > > file.encoding and honors this when writing Unicode strings to it. [Martin v L] > I don't like that approach. You shouldn't be allowed to change the > encoding mid-stream (except perhaps under very

Re: [Python-Dev] Generalised String Coercion

2005-08-07 Thread Guido van Rossum
[me] > > a way to decide on a default encoding for stdin, > > stdout, stderr. [Martin] > If stdin, stdout and stderr go to a terminal, there already is a > default encoding (actually, there always is a default encoding on > these, as it falls back to the system encoding if it's not a terminal, > or

Re: [Python-Dev] Generalised String Coercion

2005-08-07 Thread M.-A. Lemburg
Guido van Rossum wrote: > My first response to the PEP, however, is that instead of a new > built-in function, I'd rather relax the requirement that str() return > an 8-bit string -- after all, int() is allowed to return a long, so > why couldn't str() be allowed to return a Unicode string? The pr

Re: [Python-Dev] Generalised String Coercion

2005-08-07 Thread Martin v. Löwis
Reinhold Birkenfeld wrote: > FWIW, I've already drafted a patch for the former. It lets you write to > file.encoding and honors this when writing Unicode strings to it. I don't like that approach. You shouldn't be allowed to change the encoding mid-stream (except perhaps under very specific circum

Re: [Python-Dev] Generalised String Coercion

2005-08-07 Thread Martin v. Löwis
Guido van Rossum wrote: > The main problem for a smooth Unicode transition remains I/O, in my > opinion; I'd like to see a PEP describing a way to attach an encoding > to text files, and a way to decide on a default encoding for stdin, > stdout, stderr. If stdin, stdout and stderr go to a terminal

Re: [Python-Dev] Generalised String Coercion

2005-08-07 Thread Reinhold Birkenfeld
Guido van Rossum wrote: > The main problem for a smooth Unicode transition remains I/O, in my > opinion; I'd like to see a PEP describing a way to attach an encoding > to text files, and a way to decide on a default encoding for stdin, > stdout, stderr. FWIW, I've already drafted a patch for the

Re: [Python-Dev] Generalised String Coercion

2005-08-06 Thread Guido van Rossum
[Removed python-list CC] On 8/6/05, Terry Reedy <[EMAIL PROTECTED]> wrote: > > PEP: 349 > > Title: Generalised String Coercion > ... > > Rationale > >Python has had a Unicode string type for some time now but use of > >it is not yet widespread. There is a large amount of Python code > >

Re: [Python-Dev] Generalised String Coercion

2005-08-06 Thread Terry Reedy
> PEP: 349 > Title: Generalised String Coercion ... > Rationale >Python has had a Unicode string type for some time now but use of >it is not yet widespread. There is a large amount of Python code >that assumes that string data is represented as str instances. >The long term plan f