Re: [Python-Dev] Python 3.0.1

2009-01-28 Thread Steve Holden
Terry Reedy wrote: > Stephen J. Turnbull wrote: >> "Martin v. Löwis" writes: >> > > It might also be a good idea to take the download link off the front >> > > page of python.org: until that happens newbies are going to keep >> coming >> > > along and downloading it "because it's the newest". >

Re: [Python-Dev] Python 3.0.1

2009-01-28 Thread Terry Reedy
Stephen J. Turnbull wrote: "Martin v. Löwis" writes: > > It might also be a good idea to take the download link off the front > > page of python.org: until that happens newbies are going to keep coming > > along and downloading it "because it's the newest". By that logic, I would suggest rem

Re: [Python-Dev] Python 3.0.1

2009-01-28 Thread Stephen J. Turnbull
"Martin v. Löwis" writes: > > It might also be a good idea to take the download link off the front > > page of python.org: until that happens newbies are going to keep coming > > along and downloading it "because it's the newest". > > It was (and probably still is) Guido's position that 3.0 *

Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-28 Thread Paul Moore
2009/1/28 Raymond Hettinger : > [Adam Olsen] >> >> It'd also help if the file repr gave the encoding: > > +1 from me too. That will be a big help. Definitely. People *are* going to get confused by encoding errors - let's give them all the help we can. Paul
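
A minimal sketch of what the request above looks like in practice, assuming a current CPython 3 interpreter (where the text-file repr does include the encoding; 3.0 itself did not). The helper name describe_text_file and the file name a1.txt are made up for illustration.

```python
import os
import tempfile


def describe_text_file(path, encoding):
    """Open `path` for writing as text and return (repr, encoding)."""
    with open(path, "w", encoding=encoding) as f:
        return repr(f), f.encoding


# Hypothetical scratch file; any writable path would do.
tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, "a1.txt")

file_repr, file_encoding = describe_text_file(sample_path, "cp1252")
print(file_repr)      # the repr names the encoding the file object will use
print(file_encoding)  # the same information as an attribute
```

Seeing the encoding in the repr makes a mismatch between the file and the console visible before any read or write fails.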

Re: [Python-Dev] Python 3.0.1

2009-01-28 Thread Martin v. Löwis
> It might also be a good idea to take the download link off the front > page of python.org: until that happens newbies are going to keep coming > along and downloading it "because it's the newest". It was (and probably still is) Guido's position that 3.0 *is* the version that newbies should be us

Re: [Python-Dev] Python 3.0.1

2009-01-28 Thread Nick Coghlan
Steve Holden wrote: > 2.6 showed it in the > inclusion (later recognizable as somewhat ill-advised so late in the > day) of multiprocessing; Given the longstanding fork() bugs that were fixed as a result of that inclusion, I think that ill-advised is too strong... could it have done with a little

Re: [Python-Dev] Python 3.0.1

2009-01-28 Thread Steve Holden
Terry Reedy wrote: > Michael Foord wrote: >> M.-A. Lemburg wrote: > >>> Why don't we just mark 3.0.x as experimental branch and keep updating/ >>> fixing things that were not sorted out for the 3.0.0 release ?! I think >>> that's a fair approach, given that the only way to get field testing >>> fo

Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-28 Thread Raymond Hettinger
[Adam Olsen] It'd also help if the file repr gave the encoding: +1 from me too. That will be a big help. Raymond ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.o

Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-28 Thread Daniel Stutzbach
On Wed, Jan 28, 2009 at 1:42 PM, Adam Olsen wrote: > It'd also help if the file repr gave the encoding: > +1 -- Daniel Stutzbach, Ph.D. President, Stutzbach Enterprises, LLC

Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-28 Thread Adam Olsen
On Wed, Jan 28, 2009 at 11:52 AM, Paul Moore wrote: > Ah, I see. That is entirely obvious. The key bit of information is > that the default io encoding is cp1252, not cp850. I know that in > theory, I see the consequences often enough (:-)), but it isn't > "instinctive" for me. And the simple "def

Re: [Python-Dev] Python 3.0.1

2009-01-28 Thread Terry Reedy
Michael Foord wrote: M.-A. Lemburg wrote: Why don't we just mark 3.0.x as experimental branch and keep updating/ fixing things that were not sorted out for the 3.0.0 release ?! I think that's a fair approach, given that the only way to get field testing for new open-source software is to relea

Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-28 Thread Jean-Paul Calderone
On Wed, 28 Jan 2009 18:52:41, Paul Moore wrote: 2009/1/28 "Martin v. Löwis": Well, first try to understand what the error *is*: py> unicodedata.name('\u0153') 'LATIN SMALL LIGATURE OE' py> unicodedata.name('£') 'POUND SIGN' py> ascii('£') "'\\xa3'" py> ascii('£'.encode('cp850').decode('

Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-28 Thread Terry Reedy
Steven Bethard wrote: On Wed, Jan 28, 2009 at 10:29 AM, "Martin v. Löwis" wrote: Notice that the determination of the specific encoding used is fairly elaborate: - if IO is to a terminal, Python tries to determine the encoding of the terminal. This is mostly relevant for Windows (which uses,

Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-28 Thread Paul Moore
2009/1/28 "Martin v. Löwis" : > Well, first try to understand what the error *is*: > > py> unicodedata.name('\u0153') > 'LATIN SMALL LIGATURE OE' > py> unicodedata.name('£') > 'POUND SIGN' > py> ascii('£') > "'\\xa3'" > py> ascii('£'.encode('cp850').decode('cp1252')) > "'\\u0153'" > > So when Pytho
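
The interpreter session quoted above can be replayed as a small script. This is only a sketch of the mismatch being described (bytes written as cp850, then read back as cp1252); the variable names and the helper can_encode are illustrative, not from the thread.

```python
import unicodedata

pound = "\u00a3"                 # POUND SIGN
raw = pound.encode("cp850")      # the byte a cp850 console or file would hold
misread = raw.decode("cp1252")   # the same byte interpreted as cp1252

print(raw, ascii(misread), unicodedata.name(misread))


def can_encode(text, encoding):
    """Return True if `text` can be encoded with `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# The misread character has no cp850 representation, so writing it to a
# cp850 console fails in the encoder, as in the traceback from the thread.
print(can_encode(misread, "cp850"))
```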

Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-28 Thread Martin v. Löwis
> This is a very helpful explanation. Is it in the docs somewhere, or if it > isn't, could it be? I actually don't know. Regards, Martin

Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-28 Thread Martin v. Löwis
Paul Moore wrote: > 2009/1/28 "Martin v. Löwis" : >> print(open("a1").read()) >>> Traceback (most recent call last): >>> File "", line 1, in >>> File "D:\Apps\Python30\lib\io.py", line 1491, in write >>> b = encoder.encode(s) >>> File "D:\Apps\Python30\lib\encodings\cp850.py", line 1

Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-28 Thread Steven Bethard
On Wed, Jan 28, 2009 at 10:29 AM, "Martin v. Löwis" wrote: > Notice that the determination of the specific encoding used is fairly > elaborate: > - if IO is to a terminal, Python tries to determine the encoding of > the terminal. This is mostly relevant for Windows (which uses, > by default, the

Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-28 Thread Martin v. Löwis
> Thanks for the explanation. It might be clearer to document this a > little more explicitly in the docs for open() (on the basis that > people using open() are the most likely to be naive about encodings). > I'll see if I can come up with an appropriate doc patch. Notice that the determination o
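
One way to sidestep the elaborate default lookup described above is to pass the encoding to open() explicitly. A minimal sketch, assuming the data really is in the named encoding; the helper name roundtrip is made up for illustration.

```python
import os
import sys
import tempfile


def roundtrip(text, encoding):
    """Write `text` and read it back, naming the encoding both times."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        with open(path, "w", encoding=encoding) as f:
            f.write(text)
        with open(path, "r", encoding=encoding) as f:
            return f.read()
    finally:
        os.remove(path)


# What the terminal-facing stream uses may differ from what files use.
print("stdout encoding:", sys.stdout.encoding)
print(roundtrip("\u00a3 100", "cp850"))
```

With the encoding spelled out on both sides, the result no longer depends on the console code page or the locale.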

Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-28 Thread Paul Moore
2009/1/28 "Martin v. Löwis" : > print(open("a1").read()) >> Traceback (most recent call last): >> File "", line 1, in >> File "D:\Apps\Python30\lib\io.py", line 1491, in write >> b = encoder.encode(s) >> File "D:\Apps\Python30\lib\encodings\cp850.py", line 19, in encode >> return

Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-28 Thread Paul Moore
2009/1/28 "Martin v. Löwis" : > Paul Moore wrote: >> Hmm, I just checked and on Windows, it >> appears that sys.getdefaultencoding() is UTF-8. That seems odd - I >> would have thought the majority of Windows systems were NOT set to use >> UTF-8 by default... > > In Python 3, sys.getdefaultencoding(

Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-28 Thread Martin v. Löwis
print(open("a1").read()) > Traceback (most recent call last): > File "", line 1, in > File "D:\Apps\Python30\lib\io.py", line 1491, in write > b = encoder.encode(s) > File "D:\Apps\Python30\lib\encodings\cp850.py", line 19, in encode > return codecs.charmap_encode(input,self.err

Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-28 Thread Martin v. Löwis
Paul Moore wrote: > Hmm, I just checked and on Windows, it > appears that sys.getdefaultencoding() is UTF-8. That seems odd - I > would have thought the majority of Windows systems were NOT set to use > UTF-8 by default... In Python 3, sys.getdefaultencoding() is "utf-8" on all platforms, just as
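
The distinction drawn here can be checked directly. A short sketch, assuming a Python 3 interpreter: sys.getdefaultencoding() governs str/bytes conversion when no codec is named, while the encoding used for files comes from elsewhere (the locale, in the setups discussed in this thread). The function name encoding_report is illustrative.

```python
import locale
import sys


def encoding_report():
    """Collect the separate 'default' encodings Python 3 exposes."""
    return {
        "str<->bytes default": sys.getdefaultencoding(),
        "filesystem": sys.getfilesystemencoding(),
        "locale preferred": locale.getpreferredencoding(False),
    }


for label, value in encoding_report().items():
    print(f"{label}: {value}")

# str.encode() with no argument uses the str<->bytes default, not the locale.
print("\u00a3".encode())
```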

Re: [Python-Dev] Python 3.0.1

2009-01-28 Thread Michael Foord
M.-A. Lemburg wrote: On 2009-01-27 22:19, Raymond Hettinger wrote: From: "Martin v. Löwis" Releasing 3.1 6 months after 3.0 sounds reasonable; I don't think it should be released earlier (else 3.0 looks fairly ridiculous). I think it should be released earlier and completely

Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-28 Thread Martin v. Löwis
> PS Can anyone comment on why Python defaults to utf-8 on Windows? Don't panic. It doesn't, and you are misinterpreting what you are seeing. Regards, Martin

Re: [Python-Dev] Python 3.0.1

2009-01-28 Thread M.-A. Lemburg
On 2009-01-27 22:19, Raymond Hettinger wrote: > From: "Martin v. Löwis" >> Releasing 3.1 6 months after 3.0 sounds reasonable; I don't think >> it should be released earlier (else 3.0 looks fairly ridiculous). > > I think it should be released earlier and completely supplant 3.0 > before more t

Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-28 Thread Antoine Pitrou
On Wednesday 28 January 2009 at 16:54, Paul Moore wrote: > I do think it's worth taking care over the default encoding, though. > Quite apart from performance, getting "correct" behaviour is > important. I can't speak for Unix, but on Windows, the following > behaviour feels like a bug to me

Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-28 Thread Paul Moore
2009/1/28 Antoine Pitrou : > If you look at how utf-8 decoding is implemented (in unicodeobject.c), it's > quite obvious why it is so :-) There is a (very) fast path for chunks of pure > ASCII data, and (fast but not blazingly fast) fallback for non ASCII data. Thanks for the explanation. > Pleas

Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-28 Thread Antoine Pitrou
Paul Moore writes: > > > > As I pointed out, utf-8, utf-16 and latin1 decoders have already been optimized > > in py3k. For *pure ASCII* input, utf-8 decoding is blazingly fast (1 GB/s here). > > The dataset for iobench isn't pure ASCII though, and that's why it's not as fast. > > Ah, t
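
The quoted claim about pure-ASCII versus mixed input can be probed with a rough micro-benchmark. This is only a sketch with made-up sample data, not the iobench dataset, and the rates it prints depend entirely on the machine and interpreter version.

```python
import timeit

# Illustrative payloads: one pure ASCII, one with non-ASCII characters.
ascii_data = b"plain ascii text\n" * 50000
mixed_data = "caf\u00e9 \u00a3 \u0153\n".encode("utf-8") * 50000


def decode_rate(data, repeat=5):
    """Return an approximate UTF-8 decode rate in MB/s (best of `repeat`)."""
    timings = timeit.repeat(lambda: data.decode("utf-8"), number=1, repeat=repeat)
    return len(data) / min(timings) / 1e6


print("pure ASCII: %.0f MB/s" % decode_rate(ascii_data))
print("mixed:      %.0f MB/s" % decode_rate(mixed_data))
```

Taking the best of several runs reduces noise from the first, cold call; the absolute numbers should not be compared with the figures quoted in the thread.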

Re: [Python-Dev] Python 3.0.1

2009-01-28 Thread Lawrence Oluyede
On Wed, Jan 28, 2009 at 4:32 AM, Steve Holden wrote: > I think that both 3.0 and 2.6 were rushed releases. 2.6 showed it in the > inclusion (later recognizable as somewhat ill-advised so late in the > day) of multiprocessing; 3.0 shows it in the very fact that this > discussion has become necessar

Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-28 Thread Victor Stinner
On Wednesday 28 January 2009 12:41:07, Antoine Pitrou wrote: > > Why not test io.open() or codecs.open(), which create unicode strings? > > There is no doubt that io.open() and codecs.open() in 2.x are much slower > than the io-c branch. However, nobody is expecting very good performan

Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-28 Thread Paul Moore
2009/1/28 Antoine Pitrou: > Paul Moore writes: >> >> It would be helpful to limit this cost as much as possible - maybe >> that's simply ensuring that the default encoding for open is (in the >> majority of cases) a highly-optimised one whose costs *don't* dominate >> in the way you de

Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-28 Thread Antoine Pitrou
Victor Stinner writes: > > On Wednesday 28 January 2009 11:55:16, Antoine Pitrou wrote: > > 2.x has no encoding costs, which explains why it's so much faster. > > Why not test io.open() or codecs.open(), which create unicode strings? The goal is to test the idiomatic

Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-28 Thread Antoine Pitrou
Paul Moore writes: > > It would be helpful to limit this cost as much as possible - maybe > that's simply ensuring that the default encoding for open is (in the > majority of cases) a highly-optimised one whose costs *don't* dominate > in the way you describe As I pointed out, utf-8,

Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-28 Thread Victor Stinner
On Wednesday 28 January 2009 11:55:16, Antoine Pitrou wrote: > 2.x has no encoding costs, which explains why it's so much faster. Why not test io.open() or codecs.open(), which create unicode strings? -- Victor Stinner aka haypo http://www.haypocalc.com/blog/

Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-28 Thread Paul Moore
2009/1/28 Antoine Pitrou : > When writing large chunks of text (4096, 1e6), bookkeeping costs become > marginal and encoding costs dominate. 2.x has no encoding costs, which > explains why it's so much faster. Interesting. However, it's still "slower" in terms of perception. In 2.x, I regularly do

Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-28 Thread Antoine Pitrou
Hello, Raymond Hettinger writes:
> >                                  MB/s     MB/s      MB/s
> >                                  in C     in py3k   in 2.7     C/3k   2.7/3k
> > ** Text append **
> > 10M write 1e6 units at a time    261.00   218.000   1540.000   1.20   7.06
> > 20K w

Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-28 Thread Raymond Hettinger
[Scott David Daniels] Comparison of three cases (including performance ratios):
                                 MB/s     MB/s      MB/s
                                 in C     in py3k   in 2.7     C/3k   2.7/3k
** Text append **
10M write 1e6 units at a time    261.00   218.000   1540.000   1