Re: [Python-Dev] FormatError() in callproc.c under win32

2009-01-27 Thread Ulrich Eckhardt
On Monday 26 January 2009, Thomas Heller wrote:
> Ulrich Eckhardt wrote:
> > In callproc.c from trunk is a function called SetException(), which calls
[...]
> > My third approach would be to filter out the special error codes first
> > and delegate all others to PyErr_SetFromWindowsErr(). The latter already
> > handles the lack of a string for the code by formatting it numerically.
> > This would also improve consistency, since the two functions use
> > different ways to format unrecognised errors numerically. This approach
> > would change where and how a completely unrecognised error code is
> > formatted, but would otherwise be pretty equivalent.
>
> The third approach is fine with me.  Sidenote: The only error codes that I
> remember having seen in practice are 'access violation reading...' and
> 'access violation writing...', although it may be that on WinCE 'datatype
> misalignment' may also be possible.

Submitted as patch for issue #5078.
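
For illustration only, here is a rough Python rendering of the third approach (the actual patch is C code in callproc.c). The numeric values are the standard Win32 structured-exception codes; the table, the function names, and the exact message texts are made up for this sketch, and fallback_message() merely stands in for what PyErr_SetFromWindowsErr() does when it has no text for a code.

```python
# Hypothetical sketch: filter the special codes first, delegate the rest.
# Only the numeric constants are real Win32 values; all names are invented.
SPECIAL_CODES = {
    0xC0000005: "access violation",
    0x80000002: "datatype misalignment",
    0xC0000094: "integer divide by zero",
    0xC00000FD: "stack overflow",
}


def fallback_message(code):
    # Stand-in for PyErr_SetFromWindowsErr(): format the code numerically
    # when no descriptive text is available.
    return "Windows Error 0x%08X" % code


def describe_exception(code):
    # Codes with a hand-written text are handled here; every other code
    # goes through the single fallback, so unknown codes are formatted
    # in exactly one place.
    try:
        return "exception: " + SPECIAL_CODES[code]
    except KeyError:
        return fallback_message(code)


print(describe_exception(0xC0000005))
print(describe_exception(0x12345678))
```

The point of the split is consistency: only one function decides how an unrecognised code is rendered.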

Note: under CE, you can actually encounter datatype misalignments, since it
runs on CPUs that don't emulate misaligned accesses. I wonder whether the
same also applies to win64.

Uli

-- 
Sator Laser GmbH, Fangdieckstraße 75a, 22547 Hamburg, Germany
Managing director: Thorsten Föcking, Amtsgericht Hamburg HR B62 932

This e-mail, including all attachments, is intended only for the addressee
and may contain confidential information. Please notify the sender
immediately if you are not the intended recipient. In that case the e-mail
must be deleted and may not be read, forwarded, published, or otherwise used.
E-mails can be read by third parties and may contain viruses and unauthorised
changes. Sator Laser GmbH is not responsible for these consequences.

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] FormatError() in callproc.c under win32

2009-01-27 Thread Ulrich Eckhardt
On Monday 26 January 2009, Martin v. Löwis wrote:
> > In callproc.c from trunk is a function called SetException(), which calls
> > FormatError() only to discard the contents. Can anyone enlighten me to
> > the reasons thereof?
>
> Interestingly enough, the code used to say
>
>     PyErr_SetString(PyExc_WindowsError, lpMsgBuf);
>
> Then it was changed to its current form, with a log message of
>
>     Changes for windows CE, contributed by Luke Dunstan.  Thanks a lot!
>
> See
>
> http://ctypes.cvs.sourceforge.net/viewvc/ctypes/ctypes/source/callproc.c?hideattic=0&r1=1.127.2.15&r2=1.127.2.16
>
> I suggest you ask Thomas Heller and Luke Dunstan (if available) what the
> rationale for this partial change was.

I can only guess:
1. Those changes seem to generate TCHAR strings. This is necessary to compile 
it on both win9x (TCHAR=char) and CE (TCHAR=wchar_t). Since win9x was dropped 
from the supported platforms, that isn't necessary any more and all the code 
could use WCHAR directly.
2. Those changes also seem to change a few byte-strings to Unicode-strings, 
see format_error(). This is a questionable step, since those are changes that 
are visible to Python code. Worse, even on the same platform it could return 
different string types when the lookup of the errorcode fails. I wonder if 
that is intentional.

In any case, CCing Luke on the issue, maybe he can clarify things.

cheers

Uli



Re: [Python-Dev] [PSF-Board] I've got a surprise for you!

2009-01-27 Thread Steve Holden
Jim Walker wrote:
[Trent's announcement]
> 
> Great stuff Trent! I was wondering how you were doing.
> 
> I really appreciate what it takes to put these open resources
> together ;) There's a lot of moving parts :)
> 
> Cheers,
> Jim
> 
> BTW.
> 
> We now have zone servers in the OpenSolaris test farm, and
> I plan to add guest os servers in the next few weeks using
> ldoms (sparc) and xvm (x64). The zone servers provide whole
> root zones, which should be a good development environment
> for most projects. Check it out:
> 
> http://test.opensolaris.org/testfarm
> http://www.opensolaris.org/os/community/testing/testfarm/zones/
> 
> Let me know if there is interest from the python community to
> manage one of the test farm servers for python development.
> Besides the general use machines, the php community is already
> managing a T2000 server.

Jim:

Thanks, this is a terrific offer. I am copying it to the Python
developers list so they can discuss it - I know that Solaris is one of
the platforms we do get quite a few build questions about.

regards
 Steve
-- 
Steve Holden        +1 571 484 6266   +1 800 494 3119
Chairman, PSF       http://www.holdenweb.com/



[Python-Dev] undesireable unpickle behavior, proposed fix

2009-01-27 Thread Jake McGuire
Instance attribute names are normally interned - this is done in  
PyObject_SetAttr (among other places).  Unpickling (in pickle and  
cPickle) directly updates __dict__ on the instance object.  This  
bypasses the interning so you end up with many copies of the strings  
representing your attribute names, which wastes a lot of space, both  
in RAM and in pickles of sequences of objects created from pickles.   
Note that the native python memcached client uses pickle to serialize  
objects.


>>> import pickle
>>> class C(object):
...     def __init__(self, x):
...         self.long_attribute_name = x
...
>>> len(pickle.dumps([pickle.loads(pickle.dumps(C(None),
...     pickle.HIGHEST_PROTOCOL)) for i in range(100)],
...     pickle.HIGHEST_PROTOCOL))
3658
>>> len(pickle.dumps([C(None) for i in range(100)],
...     pickle.HIGHEST_PROTOCOL))
1441
>>>

Interning the strings on unpickling makes the pickles smaller, and at  
least for cPickle actually makes unpickling sequences of many objects  
slightly faster.  I have included proposed patches to cPickle.c and  
pickle.py, and would appreciate any feedback.
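
To make the size effect concrete without going through classes at all: pickle's memo is keyed on object identity, so equal but distinct key strings are each written out in full, while a shared (interned) key is written once and then referenced. The following is only a sketch in Python 3 spelling, where the builtin intern() lives at sys.intern(); the variable names are invented for the example.

```python
import pickle
import sys

parts = ("long_attribute", "_name")

# Each join() call builds a new str object, so these 100 dicts carry
# 100 equal-but-distinct key objects.
distinct = [{"".join(parts): None} for _ in range(100)]

# sys.intern() maps equal strings to one shared object.
shared = [{sys.intern("".join(parts)): None} for _ in range(100)]

size_distinct = len(pickle.dumps(distinct, pickle.HIGHEST_PROTOCOL))
size_shared = len(pickle.dumps(shared, pickle.HIGHEST_PROTOCOL))

# The list with distinct key objects pickles larger than the shared one.
print(size_distinct, size_shared)
```

The exact byte counts depend on the Python version and protocol, so none are quoted here.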


dhcp-172-31-170-32:~ mcguire$ diff -u Downloads/Python-2.4.3/Modules/cPickle.c cPickle.c
--- Downloads/Python-2.4.3/Modules/cPickle.c   2004-07-26 22:22:33.0 -0700
+++ cPickle.c   2009-01-26 23:30:31.0 -0800
@@ -4258,6 +4258,8 @@
    PyObject *state, *inst, *slotstate;
    PyObject *__setstate__;
    PyObject *d_key, *d_value;
+   PyObject *name;
+   char * key_str;
    int i;
    int res = -1;

@@ -4319,8 +4321,24 @@

        i = 0;
        while (PyDict_Next(state, &i, &d_key, &d_value)) {
-           if (PyObject_SetItem(dict, d_key, d_value) < 0)
-               goto finally;
+           /* normally the keys for instance attributes are
+              interned.  we should try to do that here. */
+           if (PyString_CheckExact(d_key)) {
+               key_str = PyString_AsString(d_key);
+               name = PyString_FromString(key_str);
+               if (! name)
+                   goto finally;
+
+               PyString_InternInPlace(&name);
+               if (PyObject_SetItem(dict, name, d_value) < 0) {
+                   Py_DECREF(name);
+                   goto finally;
+               }
+               Py_DECREF(name);
+           } else {
+               if (PyObject_SetItem(dict, d_key, d_value) < 0)
+                   goto finally;
+           }
        }
        Py_DECREF(dict);
    }

dhcp-172-31-170-32:~ mcguire$ diff -u Downloads/Python-2.4.3/Lib/pickle.py pickle.py
--- Downloads/Python-2.4.3/Lib/pickle.py   2009-01-27 01:41:43.0 -0800
+++ pickle.py   2009-01-27 01:41:31.0 -0800
@@ -1241,7 +1241,15 @@
             state, slotstate = state
         if state:
             try:
-                inst.__dict__.update(state)
+                d = inst.__dict__
+                try:
+                    for k,v in state.items():
+                        d[intern(k)] = v
+                # keys in state don't have to be strings
+                # don't blow up, but don't go out of our way
+                except TypeError:
+                    d.update(state)
+
             except RuntimeError:
                 # XXX In restricted execution, the instance's __dict__
                 # is not accessible.  Use the old way of unpickling
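
Read on its own, the pickle.py change boils down to the helper below. This is a sketch under assumptions, not the patch itself: it is written for Python 3 (sys.intern() instead of the builtin intern()), the function and class names are invented, and it checks each key's type individually instead of catching TypeError and falling back to a bulk update() as the patch does.

```python
import sys


def update_with_interned_keys(target, state):
    """Copy state into target, interning keys that are plain strings."""
    for key, value in state.items():
        # sys.intern() only accepts exact str objects, which mirrors the
        # PyString_CheckExact() test in the C version of the patch.
        if type(key) is str:
            key = sys.intern(key)
        target[key] = value


class Example(object):
    pass


inst = Example()
name = "".join(("long_attribute", "_name"))  # a freshly built str object
update_with_interned_keys(inst.__dict__, {name: 1, 42: "non-string key"})
print(inst.long_attribute_name)
```

Non-string keys are stored unchanged, so a state dict with unusual keys still round-trips.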



Re: [Python-Dev] undesireable unpickle behavior, proposed fix

2009-01-27 Thread Jesse Noller
On Tue, Jan 27, 2009 at 4:49 AM, Jake McGuire wrote:
> Instance attribute names are normally interned - this is done in
> PyObject_SetAttr (among other places).  Unpickling (in pickle and cPickle)
> directly updates __dict__ on the instance object.  This bypasses the
> interning so you end up with many copies of the strings representing your
> attribute names, which wastes a lot of space, both in RAM and in pickles of
> sequences of objects created from pickles.  Note that the native python
> memcached client uses pickle to serialize objects.
>
> >>> import pickle
> >>> class C(object):
> ...     def __init__(self, x):
> ...         self.long_attribute_name = x
> ...
> >>> len(pickle.dumps([pickle.loads(pickle.dumps(C(None),
> ...     pickle.HIGHEST_PROTOCOL)) for i in range(100)],
> ...     pickle.HIGHEST_PROTOCOL))
> 3658
> >>> len(pickle.dumps([C(None) for i in range(100)],
> ...     pickle.HIGHEST_PROTOCOL))
> 1441
>
> Interning the strings on unpickling makes the pickles smaller, and at least
> for cPickle actually makes unpickling sequences of many objects slightly
> faster.  I have included proposed patches to cPickle.c and pickle.py, and
> would appreciate any feedback.
>
> [...]

Hi Jake,

You should really post this to bugs.python.org as an enhancement so we
can track the discussion there.

-jesse


[Python-Dev] V8, TraceMonkey, SquirrelFish and Python

2009-01-27 Thread Mart Sõmermaa
As most of you know there's constant struggle on the JavaScript front to get
even faster performance out of interpreters.
V8, TraceMonkey and SquirrelFish have brought novel ideas to interpreter
design, wouldn't it make sense to reap the best bits and bring them to
Python?

Has anyone delved into the designs and considered their applicability to
Python?

Hoping-to-see-some-V8-and-Python-teams-collaboration-in-Mountain-View-ly
yours,
Mart Sõmermaa


Re: [Python-Dev] V8, TraceMonkey, SquirrelFish and Python

2009-01-27 Thread Jesse Noller
On Tue, Jan 27, 2009 at 9:50 AM, Mart Sõmermaa wrote:
> As most of you know there's constant struggle on the JavaScript front to get
> even faster performance out of interpreters.
> V8, TraceMonkey and SquirrelFish have brought novel ideas to interpreter
> design, wouldn't it make sense to reap the best bits and bring them to
> Python?
>
> Has anyone delved into the designs and considered their applicability to
> Python?
>
> Hoping-to-see-some-V8-and-Python-teams-collaboration-in-Mountain-View-ly
> yours,
> Mart Sõmermaa
>

Hi Mart,

This is a better discussion for the python-ideas list. That being
said, there was a thread discussing this last year, see:

http://mail.python.org/pipermail/python-dev/2008-October/083176.html

-jesse


Re: [Python-Dev] V8, TraceMonkey, SquirrelFish and Python

2009-01-27 Thread Steve Holden
Jesse Noller wrote:
> On Tue, Jan 27, 2009 at 9:50 AM, Mart Sõmermaa  wrote:
>> As most of you know there's constant struggle on the JavaScript front to get
>> even faster performance out of interpreters.
>> V8, TraceMonkey and SquirrelFish have brought novel ideas to interpreter
>> design, wouldn't it make sense to reap the best bits and bring them to
>> Python?
>>
>> Has anyone delved into the designs and considered their applicability to
>> Python?
>>
>> Hoping-to-see-some-V8-and-Python-teams-collaboration-in-Mountain-View-ly
>> yours,
>> Mart Sõmermaa
>>
> 
> Hi Mart,
> 
> This is a better discussion for the python-ideas list. That being
> said, there was a thread discussing this last year, see:
> 
> http://mail.python.org/pipermail/python-dev/2008-October/083176.html
> 
I am sure this will be included as a part of the discussion at the VM
summit that's taking place as a part of the pre-PyCon activity.

regards
 Steve
-- 
Steve Holden        +1 571 484 6266   +1 800 494 3119
Holden Web LLC      http://www.holdenweb.com/



Re: [Python-Dev] FormatError() in callproc.c under win32

2009-01-27 Thread Luke Dunstan



> From: eckha...@satorlaser.com
> To: python-dev@python.org
> Subject: Re: [Python-Dev] FormatError() in callproc.c under win32
> Date: Tue, 27 Jan 2009 12:16:01 +0100
> CC: coder_infi...@hotmail.com
> 
> On Monday 26 January 2009, Martin v. Löwis wrote:
> > > In callproc.c from trunk is a function called SetException(), which calls
> > > FormatError() only to discard the contents. Can anyone enlighten me to
> > > the reasons thereof?

The left over call to FormatError() looks like a mistake to me.

> >
> > Interestingly enough, the code used to say
> >
> >     PyErr_SetString(PyExc_WindowsError, lpMsgBuf);
> >
> > Then it was changed to its current form, with a log message of
> >
> >     Changes for windows CE, contributed by Luke Dunstan.  Thanks a lot!
> >
> > See
> >
> > http://ctypes.cvs.sourceforge.net/viewvc/ctypes/ctypes/source/callproc.c?hideattic=0&r1=1.127.2.15&r2=1.127.2.16
> >
> > I suggest you ask Thomas Heller and Luke Dunstan (if available) what the
> > rationale for this partial change was.
> 
> I can only guess:
> 1. Those changes seem to generate TCHAR strings. This is necessary to compile 
> it on both win9x (TCHAR=char) and CE (TCHAR=wchar_t). Since win9x was dropped 
> from the supported platforms, that isn't necessary any more and all the code 
> could use WCHAR directly.

As far as I remember TCHAR was char for Windows NT/2K/XP Python builds too, at 
least at that time, but yes it would be clearer to use WCHAR instead now.

> 2. Those changes also seem to change a few byte-strings to Unicode-strings, 
> see format_error(). This is a questionable step, since those are changes that 
> are visible to Python code. Worse, even on the same platform it could return 
> different string types when the lookup of the errorcode fails. I wonder if 
> that is intentional.

Probably not intentional. Yes, it would be better if the return value was 
either always char or always WCHAR.

> 
> In any case, CCing Luke on the issue, maybe he can clarify things.
> 
> cheers
> 
> Uli

Good luck,
Luke



[Python-Dev] Sun resources [Was: I've got a surprise for you!]

2009-01-27 Thread Steve Holden
So, if anyone wants to run a Sun buildbot or whatever, Jim would be the
person to contact. Synchronize on this list to ensure Jim doesn't get
multiple approaches, please.

regards
 Steve

-------- Original Message --------
Subject: Re: [PSF-Board] I've got a surprise for you!
Date: Tue, 27 Jan 2009 10:49:29 -0700
From: Jim Walker 
Reply-To: james.wal...@sun.com
Organization: Sun Microsystems, Inc.
To: Steve Holden 
References: <20090126233246.ga37...@wind.teleri.net>
<497e9320.2030...@sun.com> <497ef94e.3050...@holdenweb.com>

Steve Holden wrote:
> 
> Thanks, this is a terrific offer. I am copying it to the Python
> developers list so they can discuss it - I know that Solaris is one of
> the platforms we do get quite a few build questions about.
> 

Sounds good. Depending on what you want to do, I can
assign a system to your group within a week or two.

Cheers,
Jim

-- 
Jim Walker, http://blogs.sun.com/jwalker
Sun Microsystems, Software, Solaris QE
x77744, 500 Eldorado Blvd, Broomfield CO 80021




-- 
Steve Holden        +1 571 484 6266   +1 800 494 3119
Holden Web LLC      http://www.holdenweb.com/



Re: [Python-Dev] undesireable unpickle behavior, proposed fix

2009-01-27 Thread Guido van Rossum
On Tue, Jan 27, 2009 at 6:23 AM, Jesse Noller wrote:
> On Tue, Jan 27, 2009 at 4:49 AM, Jake McGuire wrote:
>> Instance attribute names are normally interned - this is done in
>> PyObject_SetAttr (among other places).  Unpickling (in pickle and cPickle)
>> directly updates __dict__ on the instance object.  This bypasses the
>> interning so you end up with many copies of the strings representing your
>> attribute names, which wastes a lot of space, both in RAM and in pickles of
>> sequences of objects created from pickles.  Note that the native python
>> memcached client uses pickle to serialize objects.
>>
>> >>> import pickle
>> >>> class C(object):
>> ...     def __init__(self, x):
>> ...         self.long_attribute_name = x
>> ...
>> >>> len(pickle.dumps([pickle.loads(pickle.dumps(C(None),
>> ...     pickle.HIGHEST_PROTOCOL)) for i in range(100)],
>> ...     pickle.HIGHEST_PROTOCOL))
>> 3658
>> >>> len(pickle.dumps([C(None) for i in range(100)],
>> ...     pickle.HIGHEST_PROTOCOL))
>> 1441
>>
>> Interning the strings on unpickling makes the pickles smaller, and at least
>> for cPickle actually makes unpickling sequences of many objects slightly
>> faster.  I have included proposed patches to cPickle.c and pickle.py, and
>> would appreciate any feedback.
>>
>> [...]
>
> Hi Jake,
>
> You should really post this to bugs.python.org as an enhancement so we
> can track the discussion there.
>
> -jesse

Seconded, with eagerness -- interning attribute names when unpickling
makes a lot of sense!

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] undesireable unpickle behavior, proposed fix

2009-01-27 Thread Martin v. Löwis
> Interning the strings on unpickling makes the pickles smaller, and at
> least for cPickle actually makes unpickling sequences of many objects
> slightly faster.  I have included proposed patches to cPickle.c and
> pickle.py, and would appreciate any feedback.

Please always submit patches to the bug tracker.

On the proposed change: While it is fairly unintrusive, I would like to
propose a different approach: pickle interned strings specially. The
marshal module already uses this approach, and it should extend to
pickle (although it would probably require a new protocol).

On pickling, inspect each string and check whether it is interned. If
so, emit a different code, and record it into the object id dictionary.
On a second occurrence of the string, only pickle a backward reference.
(Alternatively, check whether pickling the same string a second time
would be more compact).

On unpickling, support the new code to intern the result strings;
subsequent references to it will go to the standard backreferencing
algorithm.
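
To make the scheme concrete, here is a toy encoder/decoder pair. It is purely illustrative and has nothing to do with the real pickle opcodes: the opcode names STR, ISTR and REF are invented, the "stream" is just a Python list of tuples, and the caller supplies the is_interned predicate because the sketch does not assume any public way of asking whether a string is interned.

```python
import sys


def encode(strings, is_interned):
    """Toy encoder: returns a list of (opcode, payload) tuples."""
    memo = {}  # id(string object) -> position of its first occurrence
    out = []
    for s in strings:
        if id(s) in memo:
            # Second occurrence of the same object: backward reference.
            out.append(("REF", memo[id(s)]))
            continue
        memo[id(s)] = len(out)
        # Interned strings get their own code so the reader can re-intern.
        out.append(("ISTR" if is_interned(s) else "STR", s))
    return out


def decode(ops):
    """Toy decoder: interns ISTR payloads and resolves REF by position."""
    result = []
    for opcode, payload in ops:
        if opcode == "REF":
            result.append(result[payload])
        elif opcode == "ISTR":
            result.append(sys.intern(payload))
        else:
            result.append(payload)
    return result


name = sys.intern("long_attribute_name")
ops = encode([name, "other", name], is_interned=lambda s: s is name)
decoded = decode(ops)
print(ops)
```

Every input string produces exactly one output tuple, which is why a position in the stream can double as the back-reference target.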

Regards,
Martin


Re: [Python-Dev] FormatError() in callproc.c under win32

2009-01-27 Thread Martin v. Löwis
> Note: under CE, you can actually encounter datatype misalignments, since it 
> runs on CPUs that don't emulate them. I wonder if the same doesn't also apply 
> to win64

I don't think you can get misalignment traps on AMD64. Not sure about
IA-64: I know that the processor will trap on misaligned accesses, but
the operating system might silently fix the access.

Regards,
Martin


Re: [Python-Dev] undesireable unpickle behavior, proposed fix

2009-01-27 Thread Guido van Rossum
On Tue, Jan 27, 2009 at 10:43 AM, "Martin v. Löwis" wrote:
>> Interning the strings on unpickling makes the pickles smaller, and at
>> least for cPickle actually makes unpickling sequences of many objects
>> slightly faster.  I have included proposed patches to cPickle.c and
>> pickle.py, and would appreciate any feedback.
>
> Please submit patches always to the bug tracker.
>
> On the proposed change: While it is fairly unintrusive, I would like to
> propose a different approach - pickle interned strings special. The
> marshal module already uses this approach, and it should extend to
> pickle (although it would probably require a new protocol).
>
> On pickling, inspect each string and check whether it is interned. If
> so, emit a different code, and record it into the object id dictionary.
> On a second occurrence of the string, only pickle a backward reference.
> (Alternatively, check whether pickling the same string a second time
> would be more compact).
>
> On unpickling, support the new code to intern the result strings;
> subsequent references to it will go to the standard backreferencing
> algorithm.

Hm. This would change the pickling format though. Wouldn't just
interning (short) strings on unpickling be simpler?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


[Python-Dev] Python 3.0.1

2009-01-27 Thread Raymond Hettinger
With the extensive changes in the works, Python 3.0.1 is shaping up to be a
complete rerelease of 3.0 with API changes and major usability fixes.  It
will fully supplant the original 3.0 release, which was hobbled by poor IO
performance.

I propose to make the new release more attractive by backporting several
module improvements already in 3.1, including two new itertools and one
collections class.  These are already fully documented, tested, and
checked in to 3.1, and it would be a shame to let them sit idle for a year
or so when the module updates are already ready to ship.



Raymond 




Re: [Python-Dev] Python 3.0.1

2009-01-27 Thread Guido van Rossum
On Tue, Jan 27, 2009 at 11:00 AM, Raymond Hettinger wrote:
> With the extensive changes in the works, Python 3.0.1 is shaping-up to be a
> complete rerelease of 3.0 with API changes and major usability fixes.  It
> will fully supplant the original 3.0 release which was hobbled by poor IO
> performance.
>
> I propose to make the new release more attractive by backporting several
> module improvements already in 3.1, including two new itertools and one
> collections class.  These are already fully documented, tested, and
> checked-in to 3.1 and it would be ashamed to let them sit idle for a year or
> so, when the module updates are already ready-to-ship.

In that case, I recommend just releasing it as 3.1. I had always
anticipated a 3.1 release much sooner than the typical release
schedule.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Python 3.0.1

2009-01-27 Thread Raymond Hettinger

From: "Guido van Rossum"
> In that case, I recommend just releasing it as 3.1. I had always
> anticipated a 3.1 release much sooner than the typical release
> schedule.

That is a great idea.  It's a strong cue that there is
a somewhat major break with 3.0 (removed functions,
API fixes, huge performance fixes, and whatnot).


Raymond


Re: [Python-Dev] Python 3.0.1

2009-01-27 Thread Barry Warsaw

On Jan 27, 2009, at 2:05 PM, Guido van Rossum wrote:

> On Tue, Jan 27, 2009 at 11:00 AM, Raymond Hettinger wrote:
>> With the extensive changes in the works, Python 3.0.1 is shaping-up to be a
>> complete rerelease of 3.0 with API changes and major usability fixes.  It
>> will fully supplant the original 3.0 release which was hobbled by poor IO
>> performance.
>>
>> I propose to make the new release more attractive by backporting several
>> module improvements already in 3.1, including two new itertools and one
>> collections class.  These are already fully documented, tested, and
>> checked-in to 3.1 and it would be ashamed to let them sit idle for a year or
>> so, when the module updates are already ready-to-ship.
>
> In that case, I recommend just releasing it as 3.1. I had always
> anticipated a 3.1 release much sooner than the typical release
> schedule.


I was going to object on principle to Raymond's suggestion to rip out  
the operator module functions in Python 3.0.1.  I have no objection to  
ripping them out for 3.1.


If you really think we need a Python 3.1 soon, then I won't worry  
about trying to get a 3.0.1 out soon.  3.1 is Benjamin's baby :).


If OTOH we do intend to get a 3.0.1 out, say by the end of February,
then please be careful to adhere to our guidelines for which version
various changes can go in.  For example, the operator methods need to
be restored to the 3.0 maintenance branch, and any other API changes
added to 3.0 need to be backed out and applied only to the python3
trunk.


Barry



Re: [Python-Dev] Python 3.0.1

2009-01-27 Thread Brett Cannon
On Tue, Jan 27, 2009 at 11:29, Barry Warsaw wrote:
> On Jan 27, 2009, at 2:05 PM, Guido van Rossum wrote:
>
>> On Tue, Jan 27, 2009 at 11:00 AM, Raymond Hettinger 
>> wrote:
>>>
>>> With the extensive changes in the works, Python 3.0.1 is shaping-up to be
>>> a complete rerelease of 3.0 with API changes and major usability fixes.  It
>>> will fully supplant the original 3.0 release which was hobbled by poor IO
>>> performance.
>>>
>>> I propose to make the new release more attractive by backporting several
>>> module improvements already in 3.1, including two new itertools and one
>>> collections class.  These are already fully documented, tested, and
>>> checked-in to 3.1 and it would be a shame to let them sit idle for a year
>>> or so, when the module updates are already ready-to-ship.
>>
>> In that case, I recommend just releasing it as 3.1. I had always
>> anticipated a 3.1 release much sooner than the typical release
>> schedule.
>

A quick 3.1 release also shows how committed we are to 3.x and that we
realize that 3.0 had some initial growing pains that needed to be
worked out.

> I was going to object on principle to Raymond's suggestion to rip out the
> operator module functions in Python 3.0.1.

I thought it was for 3.1?

>  I have no objection to ripping
> them out for 3.1.
>
> If you really think we need a Python 3.1 soon, then I won't worry about
> trying to get a 3.0.1 out soon.  3.1 is Benjamin's baby :).
>

Depending on what Benjamin wants to do we could try for something like
a release by PyCon or at PyCon during the sprints. Actually the sprint
one is a rather nice idea if Benjamin is willing to spend sprint time
on it (and he is sticking around for the sprints) as I assume you,
Barry, will be there to be able to help in person and we can squash
last minute issues really quickly.

> If OTOH we do intend to get a 3.0.1 out, say by the end of February, then
> please be careful to adhere to our guidelines for which version various
> changes can go in.  For example, the operator methods need to be restored
> to the 3.0 maintenance branch, and any other API changes added to 3.0 need
> to be backed out and applied only to the python3 trunk.

If you have the time for it, Barry, I am +1 on an end of February
3.0.1 with a March/April 3.1 if that works for Benjamin.

-Brett


Re: [Python-Dev] undesireable unpickle behavior, proposed fix

2009-01-27 Thread Martin v. Löwis
> Hm. This would change the pickling format though. Wouldn't just
> interning (short) strings on unpickling be simpler?

Sure - that's what Jake had proposed. However, it is always difficult
to select which strings to intern - his heuristics (IIUC) is to intern
all strings that appear as dictionary keys. Whether this is good enough,
I don't know. In particular, it might intern very large strings that
aren't identifiers at all.

Regards,
Martin


Re: [Python-Dev] Python 3.0.1

2009-01-27 Thread Barry Warsaw

On Jan 27, 2009, at 2:39 PM, Brett Cannon wrote:

>> I was going to object on principle to Raymond's suggestion to rip out the
>> operator module functions in Python 3.0.1.
>
> I thought it was for 3.1?

Sorry, I probably misread Raymond's suggestion.

>> I have no objection to ripping
>> them out for 3.1.
>>
>> If you really think we need a Python 3.1 soon, then I won't worry about
>> trying to get a 3.0.1 out soon.  3.1 is Benjamin's baby :).
>
> Depending on what Benjamin wants to do we could try for something like
> a release by PyCon or at PyCon during the sprints. Actually the sprint
> one is a rather nice idea if Benjamin is willing to spend sprint time
> on it (and he is sticking around for the sprints) as I assume you,
> Barry, will be there to be able to help in person and we can squash
> last minute issues really quickly.

Yep, I'm planning on sticking around, so that's a great idea.

>> If OTOH we do intend to get a 3.0.1 out, say by the end of February, then
>> please be careful to adhere to our guidelines for which version various
>> changes can go in.  For example, the operator methods need to be restored
>> to the 3.0 maintenance branch, and any other API changes added to 3.0 need
>> to be backed out and applied only to the python3 trunk.
>
> If you have the time for it, Barry, I am +1 on an end of February
> 3.0.1 with a March/April 3.1 if that works for Benjamin.


Or at least a 3.1alpha/beta/whatever during Pycon.  I'm sure I can  
find the time to do a 3.0.1 before Pycon.


Barry



Re: [Python-Dev] undesireable unpickle behavior, proposed fix

2009-01-27 Thread Guido van Rossum
On Tue, Jan 27, 2009 at 11:40 AM, "Martin v. Löwis"  wrote:
>> Hm. This would change the pickling format though. Wouldn't just
>> interning (short) strings on unpickling be simpler?
>
> Sure - that's what Jake had proposed. However, it is always difficult
> to select which strings to intern - his heuristics (IIUC) is to intern
> all strings that appear as dictionary keys. Whether this is good enough,
> I don't know. In particular, it might intern very large strings that
> aren't identifiers at all.

Just set a size limit, e.g. 30 or 100. It's just a heuristic. I
believe somewhere in Python itself I intern string literals if they
are reasonably short and fit the pattern of an identifier; I'd worry
that the pattern matching would slow down unpickling more than the
expected benefit though, so perhaps just a size test would be better.
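
For illustration, here is a minimal sketch of the size-limit heuristic described above. It is not the patch under discussion: the helper name `intern_short_keys` and the cutoff `MAX_INTERN_LEN` are hypothetical, and the only real API used is `sys.intern()` (the Python 3 spelling of `intern()`).

```python
import sys

MAX_INTERN_LEN = 30  # hypothetical cutoff; 30 and 100 were both floated above


def intern_short_keys(state, max_len=MAX_INTERN_LEN):
    """Return a copy of `state` whose short str keys are interned."""
    result = {}
    for key, value in state.items():
        # sys.intern() only accepts exact str instances, so check the type.
        # A plain length test avoids the cost of identifier pattern matching.
        if type(key) is str and len(key) <= max_len:
            key = sys.intern(key)
        result[key] = value
    return result
```

Keys longer than the cutoff, and non-string keys, are passed through untouched, so a very large string used as a dictionary key never enters the intern table.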

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] undesireable unpickle behavior, proposed fix

2009-01-27 Thread Martin v. Löwis
> Just set a size limit, e.g. 30 or 100. It's just a heuristic. I
> believe somewhere in Python itself I intern string literals if they
> are reasonably short and fit the pattern of an identifier; I'd worry
> that the pattern matching would slow down unpickling more than the
> expected benefit though, so perhaps just a size test would be better.

Ok. So, Jake, it's back to my original request - please submit this
to the tracker (preferably along with test cases).

Regards,
Martin



Re: [Python-Dev] Python 3.0.1

2009-01-27 Thread Benjamin Peterson
On Tue, Jan 27, 2009 at 1:00 PM, Raymond Hettinger  wrote:
> With the extensive changes in the works, Python 3.0.1 is shaping-up to be a
> complete rerelease of 3.0 with API changes and major usability fixes.  It
> will fully supplant the original 3.0 release which was hobbled by poor IO
> performance.
>
> I propose to make the new release more attractive by backporting several
> module improvements already in 3.1, including two new itertools and one
> collections class.  These are already fully documented, tested, and
> checked-in to 3.1 and it would be a shame to let them sit idle for a year or
> so, when the module updates are already ready-to-ship.

At the moment, there are 4 release blockers for 3.0.1. I'd like to see
3.0.1 released soon (within the next month). It would fix the biggest
mistakes in the initial release; most of those fixes have been committed
since December. I'm sure it would be attractive enough with the nasty
bugs fixed in it! Let's not completely open the flood gates.

Releasing 3.1 in March or April also sounds good. I will be at least
at the first day of sprints.


-- 
Regards,
Benjamin


Re: [Python-Dev] undesireable unpickle behavior, proposed fix

2009-01-27 Thread Jake McGuire

On Jan 27, 2009, at 11:40 AM, Martin v. Löwis wrote:

>> Hm. This would change the pickling format though. Wouldn't just
>> interning (short) strings on unpickling be simpler?
>
> Sure - that's what Jake had proposed. However, it is always difficult
> to select which strings to intern - his heuristics (IIUC) is to intern
> all strings that appear as dictionary keys. Whether this is good enough,
> I don't know. In particular, it might intern very large strings that
> aren't identifiers at all.


I may have misunderstood how unpickling works, but I believe that my  
patch only interns strings that are keys in a dictionary used to  
populate an instance.  This is very similar to how instance creation  
and modification works in Python now.  The only difference is if you  
set an attribute via "inst.__dict__['attribute_name'] = value" then  
'attribute_name' will not be automatically interned, but if you pickle  
the instance, 'attribute_name' will be interned on unpickling.


There may be cases where users specifically go through __dict__ to  
avoid interning attribute names, but I would be surprised to hear  
about it and very interested in talking to the person who did that.


Creating a new pickle protocol to handle this case seems excessive...
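
To make the behaviour described above concrete, here is a rough sketch of the same effect achieved at the class level, without touching the pickle format. It is not the patch itself: `Record` and `restore` are hypothetical names. It relies only on the documented rule that the unpickler calls `__setstate__` with the saved state when the class defines it; `restore` mimics that step directly.

```python
import sys


class Record:
    """Hypothetical class whose instances are unpickled in large numbers."""

    def __init__(self, attribute_name):
        self.attribute_name = attribute_name

    def __setstate__(self, state):
        # Intern each attribute name so that every restored instance
        # shares a single key object instead of holding its own copy.
        for key, value in state.items():
            self.__dict__[sys.intern(key)] = value


def restore(cls, state):
    """Mimic what the unpickler does for a class defining __setstate__."""
    obj = cls.__new__(cls)
    obj.__setstate__(state)
    return obj
```

With this hook, two instances restored from separately built state dictionaries end up with the very same string object as their attribute key, which is what ordinary attribute assignment gives you anyway.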

-jake 


Re: [Python-Dev] Python 3.0.1

2009-01-27 Thread Martin v. Löwis
> At the moment, there are 4 release blockers for 3.0.1. I'd like to see
> 3.0.1 released soon (within the next month.)

I agree. In December, there was a huge sense of urgency that we
absolutely must have a 3.0.1 last year - and now people talk about
giving up 3.0 entirely.

Releasing 3.1 6 months after 3.0 sounds reasonable; I don't think
it should be released earlier (else 3.0 looks fairly ridiculous).

Regards,
Martin


Re: [Python-Dev] Python 3.0.1

2009-01-27 Thread Kevin Jacobs
On Tue, Jan 27, 2009 at 3:22 PM, Benjamin Peterson wrote:
>
> At the moment, there are 4 release blockers for 3.0.1. I'd like to see
> 3.0.1 released soon (within the next month). It would fix the biggest
> mistakes in the initial release; most of those fixes have been committed
> since December. I'm sure it would be attractive enough with the nasty
> bugs fixed in it! Let's not completely open the flood gates.
>
> Releasing 3.1 in March or April also sounds good. I will be at least
> at the first day of sprints.
>


As an interested observer, but not yet user of the 3.x series, I was
wondering about progress on restoring io performance and what release those
improvements were slated for.  This is the major blocker for me to begin
porting my non-numpy/scipy dependent code.  Much of my current work is in
bioinformatics, often dealing with multi-gigabyte datasets, so fast file io
is critical.  Otherwise, I'll have to live with 2.x for the indefinite
future.

Thanks,
~Kevin


Re: [Python-Dev] undesireable unpickle behavior, proposed fix

2009-01-27 Thread Martin v. Löwis
> I may have misunderstood how unpickling works

Perhaps I have misunderstood your patch. Posting it to Rietveld might
also be useful.

Regards,
Martin


Re: [Python-Dev] Python 3.0.1

2009-01-27 Thread Guido van Rossum
On Tue, Jan 27, 2009 at 12:28 PM, "Martin v. Löwis"  wrote:
>> At the moment, there are 4 release blockers for 3.0.1. I'd like to see
>> 3.0.1 released soon (within the next month.)
>
> I agree. In December, there was a huge sense of urgency that we
> absolutely must have a 3.0.1 last year - and now people talk about
> giving up 3.0 entirely.
>
> Releasing 3.1 6 months after 3.0 sounds reasonable; I don't think
> it should be released earlier (else 3.0 looks fairly ridiculous).

It sounds like my approval of Raymond's removal of certain (admittedly
obsolete) operators from the 3.0 branch was premature. Barry at least
thinks those should be rolled back. Others?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Python 3.0.1

2009-01-27 Thread Martin v. Löwis
>> Releasing 3.1 6 months after 3.0 sounds reasonable; I don't think
>> it should be released earlier (else 3.0 looks fairly ridiculous).
> 
> It sounds like my approval of Raymond's removal of certain (admittedly
> obsolete) operators from the 3.0 branch was premature. Barry at least
> thinks those should be rolled back. Others?

I agree that not too much harm is done by removing stuff in 3.0.1 that
erroneously had been left in the 3.0 release - in particular if 3.0.1
gets released quickly (e.g. within two months of the original release).

If that is an acceptable policy, then those changes would fall under
the policy. If the policy is *not* acceptable, a lot of changes to
3.0.1 need to be rolled back (e.g. the ongoing removal of __cmp__
fragments)

Regards,
Martin



Re: [Python-Dev] Python 3.0.1

2009-01-27 Thread Raymond Hettinger

From: "Martin v. Löwis"

> Releasing 3.1 6 months after 3.0 sounds reasonable; I don't think
> it should be released earlier (else 3.0 looks fairly ridiculous).


I think it should be released earlier and completely supplant 3.0
before more third-party developers spend time migrating code.
We needed 3.0 to get released so we could get the feedback
necessary to shake it out.  Now, it is time for it to fade into history
and take advantage of the lessons learned.

The principles for the 2.x series don't really apply here.  In 2.x, there
was always a useful, stable, clean release already fielded and there
were tons of third-party apps that needed a slow rate of change.

In contrast, 3.0 has a near zero installed user base (at least in terms
of being used in production).  It has very few migrated apps.  It is
not particularly clean and some of the work for it was incomplete
when it was released.

My preference is to drop 3.0 entirely (no incompatible bugfix release)
and in early February release 3.1 as the real 3.x that migrators ought
to aim for and that won't have incompatible bugfix releases.  Then at
PyCon, we can have a real bug day and fix-up any chips in the paint.

If 3.1 goes out right away, then it doesn't matter if 3.0 looks ridiculous.
All eyes go to the latest release.  Better to get this done before more
people download 3.0 to kick the tires.


Raymond







Re: [Python-Dev] V8, TraceMonkey, SquirrelFish and Python

2009-01-27 Thread Mart Sõmermaa
On Tue, Jan 27, 2009 at 5:04 PM, Jesse Noller  wrote:

> Hi Mart,
>
> This is a better discussion for the python-ideas list. That being
> said, there was a thread discussing this last year, see:
>
> http://mail.python.org/pipermail/python-dev/2008-October/083176.html
>
> -jesse
>

Indeed, sorry. Incidentally, there is a similar discussion going on just
now.


Re: [Python-Dev] IO performance

2009-01-27 Thread Antoine Pitrou

Hello Kevin,

> As an interested observer, but not yet user of the 3.x series, I was
> wondering about progress on restoring io performance and what release
> those improvements were slated for.

There is an SVN branch with a complete rewrite (in C) of the IO stack. You can
find it in branches/io-c. Apart from a problem in _ssl.c, it should be quite
usable. Your tests and observations are welcome!

Regards

Antoine.




[Python-Dev] pprint(iterator)

2009-01-27 Thread Raymond Hettinger

It is becoming the norm in 3.x for functions to return iterators, generators,
or views wherever possible.

I had a thought that pprint() ought to be taught to print iterators:

   pprint(enumerate(seq))
   pprint(map(somefunc, somedata))
   pprint(permutations(elements))
   pprint(mydict.items())

Currently, all four of those will print something like:

   >>> pprint(d.items())
   <dict_items object at 0x...>
   >>> pprint(enumerate(d))
   <enumerate object at 0x...>

If pprint() is to give a more useful result, the question is how best to 
represent the iterators.

In the examples for itertools, I adopted the convention of displaying  results
like a collection with no commas or enclosing delimiters:

   # chain('ABC', 'DEF') --> A B C D E F

The equivalent for pprint would be the same, using spaces between items on
one row or linefeeds for output too long for one row.


Another idea is to make-up an angle-bracket style to provide a visual cue for 
iterator output:

   <'A' 'B' 'C' 'D' 'E' 'F'>

Perhaps with commas:

   <'A', 'B', 'C', 'D', 'E', 'F'>

None of those ideas can be run through eval, nor do they identify the type of 
iterator.  Perhaps these would be better:

   

or

  iter(['A', 'B', 'C', 'D', 'E', 'F'])


Do you guys have any thoughts on the subject?
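
As a rough illustration of the candidate display styles listed above, the following sketch renders one iterable three ways. The function names are hypothetical and this is not a proposed pprint implementation. Note that each helper consumes the iterable it is given.

```python
def space_style(iterable):
    """Items separated by spaces with no delimiters, as in the itertools docs."""
    return " ".join(str(item) for item in iterable)


def angle_style(iterable, sep=", "):
    """Angle brackets as a visual cue that the output is not a real list."""
    return "<" + sep.join(repr(item) for item in iterable) + ">"


def iter_style(iterable):
    """A form that can be passed back through eval()."""
    return "iter(%r)" % (list(iterable),)


print(space_style("ABCDEF"))           # A B C D E F
print(angle_style("ABCDEF", sep=" "))  # <'A' 'B' 'C' 'D' 'E' 'F'>
print(angle_style("ABCDEF"))           # <'A', 'B', 'C', 'D', 'E', 'F'>
print(iter_style("ABCDEF"))            # iter(['A', 'B', 'C', 'D', 'E', 'F'])
```

Only the last form survives a round trip through eval(), and none of them records the type of the original iterator.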


Raymond 




Re: [Python-Dev] Python 3.0.1

2009-01-27 Thread Benjamin Peterson
On Tue, Jan 27, 2009 at 3:19 PM, Raymond Hettinger  wrote:
> If 3.1 goes out right away, then it doesn't matter if 3.0 looks ridiculous.
> All eyes go to the latest release.  Better to get this done before more
> people download 3.0 to kick the tires.

It seems like we are arguing over the version number of basically the
same thing. I would like to see 3.0.1 released in early February for
nearly the reasons you name. However, it seems to me that there are
two kinds of issues: those like __cmp__ removal and some silly IO bugs
that have been fixed for a while and are waiting to be released.
There are also projects like io in c which are important, but would not
make the schedule you and I want for 3.0.1/3.1. It's for those longer
term features that I want 3.0.1 and 3.1. If we immediately released 3.1,
when would those longer term projects that are important for migration
make it to stable? 3.2 is probably a while off.



-- 
Regards,
Benjamin


Re: [Python-Dev] pprint(iterator)

2009-01-27 Thread Guido van Rossum
My only thought is that whatever you do, target Python 3.1, not 3.0.1.

On Tue, Jan 27, 2009 at 1:46 PM, Raymond Hettinger  wrote:
> It is becoming the norm in 3.x for functions to return iterators,
> generators, or views wherever possible.
>
> I had a thought that pprint() ought to be taught to print iterators:
>
>   pprint(enumerate(seq))
>   pprint(map(somefunc, somedata))
>   pprint(permutations(elements))
>   pprint(mydict.items())
>
> Currently, all four of those will print something like:
>
>   >>> pprint(d.items())
>   <dict_items object at 0x...>
>   >>> pprint(enumerate(d))
>   <enumerate object at 0x...>
>
> If pprint() is to give a more useful result, the question is how best to
> represent the iterators.
>
> In the examples for itertools, I adopted the convention of displaying
>  results
> like a collection with no commas or enclosing delimiters:
>
>   # chain('ABC', 'DEF') --> A B C D E F
>
> The equivalent for pprint would be the same for items, using space for items
> on one row or using linefeeds for output too long for one row.
>
> Another idea is to make-up an angle-bracket style to provide a visual cue
> for iterator output:
>
>   <'A' 'B' 'C' 'D' 'E' 'F'>
>
> Perhaps with commas:
>
>   <'A', 'B', 'C', 'D', 'E', 'F'>
>
> None of those ideas can be run through eval, nor do they identify the type
> of iterator.  Perhaps these would be better:
>
>   
>
> or
>
>  iter(['A', 'B', 'C', 'D', 'E', 'F'])
>
>
> Do you guys have any thoughts on the subject?
>
>
> Raymond



-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Python 3.0.1

2009-01-27 Thread Antoine Pitrou
Benjamin Peterson writes:
> 
> At the moment, there are 4 release blockers for 3.0.1. I'd like to see
> 3.0.1 released soon (within the next month). It would fix the biggest
> mistakes in the initial release; most of those fixes have been committed
> since December. I'm sure it would be attractive enough with the nasty
> bugs fixed in it! Let's not completely open the flood gates.
> 
> Releasing 3.1 in March or April also sounds good. I will be at least
> at the first day of sprints.

+1 on all Benjamin said.
The IO-in-C branch cannot be reasonably pulled in release30-maint, but it will
be ready for 3.1.
Speaking of which, testers are welcome (the branch is in branches/io-c). Also, I
need someone to update the Windows build files.

Regards

Antoine.




Re: [Python-Dev] pprint(iterator)

2009-01-27 Thread Oleg Broytmann
On Tue, Jan 27, 2009 at 01:46:35PM -0800, Raymond Hettinger wrote:
>

   I like the idea, and I prefer this formatting. Also bear in mind there
are infinite generators, and there are iterators that cannot be reset. For
infinite generators pprint() must have a parameter, say, 'max_items', and
print . The situation with
iterators that cannot be reset should be documented.
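
A minimal sketch of the 'max_items' idea, for illustration only. The `preview` name and the angle-bracket output format are assumptions (the preferred format referred to above did not survive in the archive). The sketch also shows why iterators that cannot be reset are a concern: the items shown, plus one item peeked at to detect truncation, are gone afterwards.

```python
from itertools import islice


def preview(iterable, max_items=5):
    """Format at most `max_items` items, marking any truncation with '...'."""
    iterator = iter(iterable)
    shown = [repr(item) for item in islice(iterator, max_items)]
    # Peek one item further to learn whether anything was left out.
    sentinel = object()
    if next(iterator, sentinel) is not sentinel:
        shown.append("...")
    return "<" + ", ".join(shown) + ">"
```

Calling `preview()` on an infinite generator terminates because `islice` stops after `max_items` items, but the caller's iterator has still been advanced past them.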

Oleg.
-- 
 Oleg Broytmann            http://phd.pp.ru/            p...@phd.pp.ru
   Programmers don't die, they just GOSUB without RETURN.


Re: [Python-Dev] pprint(iterator)

2009-01-27 Thread Benjamin Peterson
On Tue, Jan 27, 2009 at 3:46 PM, Raymond Hettinger  wrote:
> It is becoming the norm in 3.x for functions to return iterators,
> generators, or views wherever possible.

> Do you guys have any thoughts on the subject?

Maybe a solution like this could help with bugs like #2610?



-- 
Regards,
Benjamin


Re: [Python-Dev] pprint(iterator)

2009-01-27 Thread Guido van Rossum
On Tue, Jan 27, 2009 at 2:06 PM, Oleg Broytmann  wrote:
> On Tue, Jan 27, 2009 at 01:46:35PM -0800, Raymond Hettinger wrote:
>>
>
>   I like the idea, and I prefer this formatting. Also bear in mind there
> are infinite generators, and there are iterators that cannot be reset. For
> infinite generators pprint() must have a parameter, say, 'max_items', and
> print . The situation with
> iterators that cannot be reset should be documented.

This pretty much kills the proposal. Calling a "print" function like
pprint() should not have a side effect on the object being printed.
I'd be okay if pprint() special-cased the views returned by e.g.
dict.keys(), but if all we know is that the argument has a __next__
method, pprint() should *not* be calling that.
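
The distinction drawn above can be sketched as follows. This is one possible reading, not an agreed design: `safe_pformat` is a hypothetical name, and the `<[...]>` output style is only an assumption. Objects with a `__next__` method are left alone, while dict views, which can be iterated any number of times, are expanded.

```python
from collections.abc import Iterator, MappingView
from pprint import pformat


def safe_pformat(obj):
    """Expand re-iterable dict views; never advance a one-shot iterator."""
    if isinstance(obj, Iterator):
        # Has __next__: looking inside would consume it, so fall back to repr.
        return repr(obj)
    if isinstance(obj, MappingView):
        # dict.keys()/values()/items() can be iterated repeatedly and safely.
        return "<%s>" % pformat(list(obj))
    return pformat(obj)
```

Printing an iterator this way leaves it untouched, so the "print" call has no side effect on the object being printed.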

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] pprint(iterator)

2009-01-27 Thread Nick Coghlan
Raymond Hettinger wrote:
>

I quite like the idea of something along those lines. For example:

try:
  itr = iter(obj)
except TypeError:
  pass
else:
  return "" % (obj.__class__.__name__,
))

Doing this only in pprint also reduces the chances of accidentally
consuming an iterator (which was a reasonable objection when I suggested
changing the __str__ implementation on some of the standard iterators
some time ago).
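
The format string in the snippet above was lost in archiving, so the following is only a guess at its general shape, assuming a `<ClassName: item, item, ...>` layout. The function name `describe_iterable` is hypothetical.

```python
def describe_iterable(obj):
    """Return '<ClassName: item, item, ...>' for iterables, else repr(obj)."""
    try:
        itr = iter(obj)
    except TypeError:
        # Not iterable at all: fall back to the ordinary repr.
        return repr(obj)
    # Caution: if obj is itself an iterator, this loop exhausts it.
    items = ", ".join(repr(item) for item in itr)
    return "<%s: %s>" % (obj.__class__.__name__, items)


print(describe_iterable(enumerate("AB")))  # <enumerate: (0, 'A'), (1, 'B')>
```

Unlike the plain angle-bracket styles, this keeps the type name of the object that was passed in.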

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
---


Re: [Python-Dev] pprint(iterator)

2009-01-27 Thread Guido van Rossum
On Tue, Jan 27, 2009 at 2:12 PM, Benjamin Peterson  wrote:
> On Tue, Jan 27, 2009 at 3:46 PM, Raymond Hettinger  wrote:
>> It is becoming the norm in 3.x for functions to return iterators,
>> generators, or views wherever possible.
>
>> Do you guys have any thoughts on the subject?
>
> Maybe a solution like this could help with bugs like #2610?

It would have to special-case range() objects.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] undesireable unpickle behavior, proposed fix

2009-01-27 Thread Jake McGuire

On Jan 27, 2009, at 12:39 PM, Martin v. Löwis wrote:

>> I may have misunderstood how unpickling works
>
> Perhaps I have misunderstood your patch. Posting it to Rietveld might
> also be useful.


It is not immediately clear to me how Rietveld works.  But I have  
created an issue on the tracker:


http://bugs.python.org/issue5084

Another vaguely related change would be to store string and unicode  
objects in the pickler memo keyed as themselves rather than their  
object ids.  Depending on the data set, you can have many copies of  
the same string, e.g. "application/octet-stream".  This may marginally  
increase memory usage during pickling, depending on the data being  
pickled and the way in which the code was written.
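
The effect of keying the memo by value can be approximated from outside the pickler. The sketch below is not the proposed change, and `share_equal_strings` is a hypothetical helper. It replaces equal strings with one shared object before pickling, so the existing id-based memo sees a single object. It assumes CPython, where strings built at run time are distinct objects.

```python
import pickle


def share_equal_strings(items):
    """Replace equal strings with one shared object (a value-keyed memo)."""
    canonical = {}
    return [canonical.setdefault(item, item) if type(item) is str else item
            for item in items]


# Build 100 equal but separately allocated strings.
distinct = ["".join(["application/", "octet-stream"]) for _ in range(100)]
shared = share_equal_strings(distinct)

# The pickler writes each distinct object in full, but writes only a short
# back-reference for an object it has already memoized.
print(len(pickle.dumps(distinct)), len(pickle.dumps(shared)))
```

The dictionary of canonical strings is the extra memory referred to above: it holds one reference per distinct string value for the duration of the call.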


I'm happy to write this up if people are interested...

-jake


Re: [Python-Dev] pprint(iterator)

2009-01-27 Thread Raymond Hettinger


[Guido van Rossum]

> My only thought is that whatever you do, target Python 3.1, not 3.0.1.


Of course.  


Do you have any thoughts on the most useful display format?
What do you want to see from pprint(mydict.items())?


Raymond


Re: [Python-Dev] Python 3.0.1

2009-01-27 Thread Raymond Hettinger


[Benjamin Peterson]

> It seems like we are arguing over the version number of basically the
> same thing. I would like to see 3.0.1 released in early February for
> nearly the reasons you name. However, it seems to me that there are
> two kinds of issues: those like __cmp__ removal and some silly IO bugs
> that have been fixed for a while and are waiting to be released.
> There are also projects like io in c which are important, but would not
> make the schedule you and I want for 3.0.1/3.1.


What is involved in finishing io-in-c? 

ISTM, that is critical and that its absence is a serious barrier 
to adoption in a production environment.


How far away is it?


Raymond


Re: [Python-Dev] Python 3.0.1

2009-01-27 Thread Martin v. Löwis
> My preference is to drop 3.0 entirely (no incompatible bugfix release)
> and in early February release 3.1 as the real 3.x that migrators ought
> to aim for and that won't have incompatible bugfix releases.  Then at
> PyCon, we can have a real bug day and fix-up any chips in the paint.

I fear that 3.1 would then meet the same fate as 3.0. In May, we will
all think "what a piece of junk that 3.1 release was, let's consign it to
history", and replace it with 3.2. By then, users will wonder if there
will ever be a 3.x release that is any good.

Regards,
Martin


Re: [Python-Dev] Python 3.0.1

2009-01-27 Thread Martin v. Löwis

> The IO-in-C branch cannot be reasonably pulled in release30-maint, but it will
> be ready for 3.1.

Even if 3.1 is released in February?

Regards,
Martin


Re: [Python-Dev] Python 3.0.1

2009-01-27 Thread Brett Cannon
On Tue, Jan 27, 2009 at 14:31, "Martin v. Löwis"  wrote:
>> My preference is to drop 3.0 entirely (no incompatible bugfix release)
>> and in early February release 3.1 as the real 3.x that migrators ought
>> to aim for and that won't have incompatible bugfix releases.  Then at
>> PyCon, we can have a real bug day and fix-up any chips in the paint.
>
> I fear that 3.1 would then meet the same fate as 3.0. In May, we will
> all think "what a piece of junk that 3.1 release was, let's consign it to
> history", and replace it with 3.2. By then, users will wonder if there
> will ever be a 3.x release that is any good.

That's my fear as well. I have no problem doing a quick 3.0.1 release
any time between now and the end of February and start with the first
alpha or beta of 3.1 at PyCon.

-Brett


Re: [Python-Dev] pprint(iterator)

2009-01-27 Thread Guido van Rossum
On Tue, Jan 27, 2009 at 2:24 PM, Raymond Hettinger  wrote:
>
> [Guido van Rossum]
>>
>> My only thought is that whatever you do, target Python 3.1, not 3.0.1.
>
> Of course.
> Do you have any thoughts on the most useful display format?
> What do you want to see from pprint(mydict.items())?

Perhaps <['a', 'b', ...]> ? The list display is familiar to everyone;
the surrounding <> make it clear that it's not really a list without
adding much noise.

Another idea would be  which helpfully
includes the name of the type of the object that was passed into
pprint().

Regarding range(), I wonder if we really need to show more than
'range(0, 10)' -- anything besides that would be wasteful IMO.
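
A minimal sketch of the first display format suggested above, for illustration only. The helper name `preview` and its `limit` parameter are hypothetical and are not part of the pprint module; note that it consumes items from any iterator passed to it.

```python
from itertools import islice


def preview(iterable, limit=2):
    """Return a '<[...]>' style preview of an iterable (hypothetical helper)."""
    it = iter(iterable)
    head = list(islice(it, limit))
    # Peek at one more item to decide whether a trailing ellipsis is needed.
    sentinel = object()
    has_more = next(it, sentinel) is not sentinel
    body = ", ".join(repr(item) for item in head)
    if has_more:
        body += ", ..."
    return "<[" + body + "]>"


print(preview(iter(["a", "b", "c"])))  # <['a', 'b', ...]>
print(preview(iter(["a"])))            # <['a']>
```

The angle brackets are added outside the list display, so the familiar list syntax is kept while signalling that the object is not really a list.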

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] IO performance

2009-01-27 Thread Bill Janssen
Antoine Pitrou  wrote:

> There is an SVN branch with a complete rewrite (in C) of the IO stack. You can
> find it in branches/io-c. Apart from a problem in _ssl.c, it should be quite
> usable. Your tests and observations are welcome!

And I'll look at that _ssl.c problem.

Bill


Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-27 Thread Antoine Pitrou
Raymond Hettinger writes:
> 
> What is involved in finishing io-in-c? 

Off the top of my head:
- fix the _ssl bug which prevents some tests from passing (issue #4967)
- clean up io.py (and decide what to do with the remaining Python code:
basically, the parts of StringIO which are implemented in Python)
- of course, test in various situations, review the code, suggest possible
improvements...

Now here are some performance figures. Text I/O is done in utf-8 with universal
newlines enabled:


=== I/O in C ===

** Binary input **

[ 400KB ] read one unit at a time... 1.64 MB/s
[ 400KB ] read 20 units at a time... 27.2 MB/s
[ 400KB ] read 4096 units at a time... 845 MB/s

[  20KB ] read whole contents at once... 924 MB/s
[ 400KB ] read whole contents at once... 883 MB/s
[  10MB ] read whole contents at once... 980 MB/s

[ 400KB ] seek forward one unit at a time... 0.528 MB/s
[ 400KB ] seek forward 1000 units at a time... 516 MB/s
[ 400KB ] alternate read & seek one unit... 1.33 MB/s
[ 400KB ] alternate read & seek 1000 units... 490 MB/s

** Text input **

[ 400KB ] read one unit at a time... 2.28 MB/s
[ 400KB ] read 20 units at a time... 29.2 MB/s
[ 400KB ] read one line at a time... 71.7 MB/s
[ 400KB ] read 4096 units at a time... 97.4 MB/s

[  20KB ] read whole contents at once... 108 MB/s
[ 400KB ] read whole contents at once... 112 MB/s
[  10MB ] read whole contents at once... 89.7 MB/s

[ 400KB ] seek forward one unit at a time... 0.0904 MB/s
[ 400KB ] seek forward 1000 units at a time... 87.4 MB/s

** Binary append **

[  20KB ] write one unit at a time... 0.668 MB/s
[ 400KB ] write 20 units at a time... 12.2 MB/s
[ 400KB ] write 4096 units at a time... 722 MB/s
[  10MB ] write 1e6 units at a time... 1529 MB/s

** Text append **

[  20KB ] write one unit at a time... 0.983 MB/s
[ 400KB ] write 20 units at a time... 16 MB/s
[ 400KB ] write 4096 units at a time... 236 MB/s
[  10MB ] write 1e6 units at a time... 261 MB/s

** Binary overwrite **

[  20KB ] modify one unit at a time... 0.677 MB/s
[ 400KB ] modify 20 units at a time... 12.1 MB/s
[ 400KB ] modify 4096 units at a time... 382 MB/s

[ 400KB ] alternate write & seek one unit... 0.212 MB/s
[ 400KB ] alternate write & seek 1000 units... 173 MB/s
[ 400KB ] alternate read & write one unit... 0.827 MB/s
[ 400KB ] alternate read & write 1000 units... 276 MB/s

** Text overwrite **

[  20KB ] modify one unit at a time... 0.296 MB/s
[ 400KB ] modify 20 units at a time... 5.69 MB/s
[ 400KB ] modify 4096 units at a time... 151 MB/s


=== I/O in Python (branches/py3k) ===

** Binary input **

[ 400KB ] read one unit at a time... 0.174 MB/s
[ 400KB ] read 20 units at a time... 3.44 MB/s
[ 400KB ] read 4096 units at a time... 246 MB/s

[  20KB ] read whole contents at once... 443 MB/s
[ 400KB ] read whole contents at once... 216 MB/s
[  10MB ] read whole contents at once... 274 MB/s

[ 400KB ] seek forward one unit at a time... 0.188 MB/s
[ 400KB ] seek forward 1000 units at a time... 182 MB/s
[ 400KB ] alternate read & seek one unit... 0.0821 MB/s
[ 400KB ] alternate read & seek 1000 units... 81.2 MB/s

** Text input **

[ 400KB ] read one unit at a time... 0.218 MB/s
[ 400KB ] read 20 units at a time... 3.8 MB/s
[ 400KB ] read one line at a time... 3.69 MB/s
[ 400KB ] read 4096 units at a time... 34.9 MB/s

[  20KB ] read whole contents at once... 70.5 MB/s
[ 400KB ] read whole contents at once... 81 MB/s
[  10MB ] read whole contents at once... 68.7 MB/s

[ 400KB ] seek forward one unit at a time... 0.0709 MB/s
[ 400KB ] seek forward 1000 units at a time... 67.3 MB/s

** Binary append **

[  20KB ] write one unit at a time... 0.15 MB/s
[ 400KB ] write 20 units at a time... 2.88 MB/s
[ 400KB ] write 4096 units at a time... 346 MB/s
[  10MB ] write 1e6 units at a time... 728 MB/s

** Text append **

[  20KB ] write one unit at a time... 0.0814 MB/s
[ 400KB ] write 20 units at a time... 1.51 MB/s
[ 400KB ] write 4096 units at a time... 118 MB/s
[  10MB ] write 1e6 units at a time... 218 MB/s

** Binary overwrite **

[  20KB ] modify one unit at

Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-27 Thread Benjamin Peterson
On Tue, Jan 27, 2009 at 4:44 PM, Antoine Pitrou  wrote:
> Raymond Hettinger writes:
>>
>> What is involved in finishing io-in-c?
>
> Off the top of my head:
> - fix the _ssl bug which prevents some tests from passing (issue #4967)
> - clean up io.py (and decide what to do with the remaining Python code:
> basically, the parts of StringIO which are implemented in Python)
> - of course, test in various situations, review the code, suggest possible
> improvements...

There are also several IO bugs that should be fixed before it becomes
official like #5006.

>
> Now here are some performance figures. Text I/O is done in utf-8 with
> universal newlines enabled:




-- 
Regards,
Benjamin


Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-27 Thread Daniel Stutzbach
On Tue, Jan 27, 2009 at 4:44 PM, Antoine Pitrou  wrote:

> Now here are some performance figures. Text I/O is done in utf-8 with
> universal newlines enabled:
>

Would it be much trouble to also compare performance with Python 2.6?

--
Daniel Stutzbach, Ph.D.
President, Stutzbach Enterprises, LLC 


Re: [Python-Dev] Python 3.0.1

2009-01-27 Thread Antoine Pitrou
Martin v. Löwis writes:
> 
> > The IO-in-C branch cannot be reasonably pulled in release30-maint, but it
> > will be ready for 3.1.
> 
> Even if 3.1 is released in February?

No, unless we take some risks and rush it in.
(technically, it seems to work, but it's such a critical piece of code that it
would be nice to let it rest a little)

Regards

Antoine.




Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-27 Thread Antoine Pitrou
Daniel Stutzbach writes:
> 
> Would it be much trouble to also compare performance with Python 2.6?


Here are the results on trunk. Keep in mind that Text IO, while it's still
`open(filename, "r")`, does not mean the same thing.


=== 2.7 I/O (trunk) ===

** Binary input **

[ 400KB ] read one unit at a time... 1.48 MB/s
[ 400KB ] read 20 units at a time... 29.2 MB/s
[ 400KB ] read 4096 units at a time... 1038 MB/s

[  20KB ] read whole contents at once... 1145 MB/s
[ 400KB ] read whole contents at once... 891 MB/s
[  10MB ] read whole contents at once... 966 MB/s

[ 400KB ] seek forward one unit at a time... 0.893 MB/s
[ 400KB ] seek forward 1000 units at a time... 568 MB/s
[ 400KB ] alternate read & seek one unit... 1.11 MB/s
[ 400KB ] alternate read & seek 1000 units... 563 MB/s

** Text input **

[ 400KB ] read one unit at a time... 1.41 MB/s
[ 400KB ] read 20 units at a time... 28.4 MB/s
[ 400KB ] read one line at a time... 207 MB/s
[ 400KB ] read 4096 units at a time... 1060 MB/s

[  20KB ] read whole contents at once... 1196 MB/s
[ 400KB ] read whole contents at once... 841 MB/s
[  10MB ] read whole contents at once... 966 MB/s

[ 400KB ] seek forward one unit at a time... 0.873 MB/s
[ 400KB ] seek forward 1000 units at a time... 589 MB/s

** Binary append **

[  20KB ] write one unit at a time... 0.887 MB/s
[ 400KB ] write 20 units at a time... 15.8 MB/s
[ 400KB ] write 4096 units at a time... 1071 MB/s
[  10MB ] write 1e6 units at a time... 1523 MB/s

** Text append **

[  20KB ] write one unit at a time... 1.33 MB/s
[ 400KB ] write 20 units at a time... 22.9 MB/s
[ 400KB ] write 4096 units at a time... 1244 MB/s
[  10MB ] write 1e6 units at a time... 1540 MB/s

** Binary overwrite **

[  20KB ] modify one unit at a time... 0.867 MB/s
[ 400KB ] modify 20 units at a time... 15.3 MB/s
[ 400KB ] modify 4096 units at a time... 446 MB/s

[ 400KB ] alternate write & seek one unit... 0.237 MB/s
[ 400KB ] alternate write & seek 1000 units... 151 MB/s
[ 400KB ] alternate read & write one unit... 0.221 MB/s
[ 400KB ] alternate read & write 1000 units... 153 MB/s

** Text overwrite **

[  20KB ] modify one unit at a time... 1.32 MB/s
[ 400KB ] modify 20 units at a time... 22.5 MB/s
[ 400KB ] modify 4096 units at a time... 509 MB/s




Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-27 Thread Raymond Hettinger


[Antoine Pitrou]
> Now here are some performance figures. Text I/O is done in utf-8 with universal
> newlines enabled:


That's a substantial boost.
How does it compare to Py2.x equivalents?


Raymond



Re: [Python-Dev] Python 3.0.1

2009-01-27 Thread Barry Warsaw

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Jan 27, 2009, at 5:36 PM, Brett Cannon wrote:

> On Tue, Jan 27, 2009 at 14:31, "Martin v. Löwis" wrote:
>>> My preference is to drop 3.0 entirely (no incompatible bugfix release)
>>> and in early February release 3.1 as the real 3.x that migrators ought
>>> to aim for and that won't have incompatible bugfix releases.  Then at
>>> PyCon, we can have a real bug day and fix-up any chips in the paint.
>>
>> I would fear that then 3.1 gets the same fate as 3.0. In May, we will
>> all think "what piece of junk was that 3.1 release, let's put it to
>> history", and replace it with 3.2. By then, users will wonder if there
>> is ever a 3.x release that is any good.
>
> That's my fear as well. I have no problem doing a quick 3.0.1 release
> any time between now and the end of February and start with the first
> alpha or beta of 3.1 at PyCon.



+1
Barry

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.9 (Darwin)

iQCVAwUBSX+R6HEjvBPtnXfVAQLQBwQAuJfVHtKQRqptjl1Hlkz37RSqMnCGNE/f
Fm2JmulfWbtlZgeZ+YgBMyPw2jGpmkSp/zB0aThuBNRrtcEPOnO0nFKxWwcFwBa/
ZddlM9RJvb+GgBPNOjnSXNSJcYmNLwea7GuKPkTVmkb9nH0JLOnk2dLVTGjJ89Q4
F3qsGz5coEc=
=gUH4
-END PGP SIGNATURE-


Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-27 Thread Brett Cannon
On Tue, Jan 27, 2009 at 14:44, Antoine Pitrou  wrote:
> Raymond Hettinger writes:
>>
>> What is involved in finishing io-in-c?
>
> Off the top of my head:
> - fix the _ssl bug which prevents some tests from passing (issue #4967)
> - clean up io.py (and decide what to do with the remaining Python code:
> basically, the parts of StringIO which are implemented in Python)

The other VMs might appreciate the code being available and used if
_io is not available for import. If you need help on how to have the
tests run twice, once on the Python code and again on the C code, you
can look at test_heapq and test_warnings for approaches.
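
A minimal sketch of the "run the tests twice" idea, for illustration. It assumes a later CPython in which the pure-Python implementation is importable as `_pyio` alongside the C-backed `io`; the class and function names are hypothetical and this is not how test_heapq or test_warnings are actually written.

```python
import io
import unittest

import _pyio  # pure-Python io implementation (assumed to be available)


class BytesIOTestMixin:
    """Tests written once against whatever module `self.module` points at."""

    module = None

    def test_roundtrip(self):
        buf = self.module.BytesIO()
        buf.write(b"spam")
        self.assertEqual(buf.getvalue(), b"spam")

    def test_seek_and_read(self):
        buf = self.module.BytesIO(b"abcdef")
        buf.seek(2)
        self.assertEqual(buf.read(3), b"cde")


class CBytesIOTest(BytesIOTestMixin, unittest.TestCase):
    module = io      # C implementation


class PyBytesIOTest(BytesIOTestMixin, unittest.TestCase):
    module = _pyio   # Python implementation


def run_both():
    """Run the shared tests against both implementations."""
    loader = unittest.defaultTestLoader
    suite = unittest.TestSuite()
    for case in (CBytesIOTest, PyBytesIOTest):
        suite.addTests(loader.loadTestsFromTestCase(case))
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

The mixin deliberately does not inherit from `unittest.TestCase`, so the shared tests are only collected through the two concrete subclasses.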

> - of course, test in various situations, review the code, suggest possible
> improvements...
>
> Now here are some performance figures. Text I/O is done in utf-8 with
> universal newlines enabled:
>

That is impressive! Congrats to you and (I think) Amaury for all the
hard work you guys have put in.

-Brett


Re: [Python-Dev] Python 3.0.1

2009-01-27 Thread Barry Warsaw

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Jan 27, 2009, at 3:48 PM, Martin v. Löwis wrote:


> Releasing 3.1 6 months after 3.0 sounds reasonable; I don't think
> it should be released earlier (else 3.0 looks fairly ridiculous).
>
>> It sounds like my approval of Raymond's removal of certain (admittedly
>> obsolete) operators from the 3.0 branch was premature. Barry at least
>> thinks those should be rolled back. Others?
>
> I agree that not too much harm is done by removing stuff in 3.0.1 that
> erroneously had been left in the 3.0 release - in particular if 3.0.1
> gets released quickly (e.g. within two months of the original release).
>
> If that is an acceptable policy, then those changes would fall under
> the policy. If the policy is *not* acceptable, a lot of changes to
> 3.0.1 need to be rolled back (e.g. the ongoing removal of __cmp__
> fragments)


I have no problem with removing things that were advertised and/or  
documented to be removed in 3.0 but accidentally were not.  That seems  
like a reasonable policy to me.  However, if we did not tell people  
that something was going to be removed, then I don't think we can  
really remove it in 3.0.


Barry


-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.9 (Darwin)

iQCVAwUBSX+S4nEjvBPtnXfVAQIjuQQAucsAp79ZtlcOq1GPiwDaEoYMKTEgkkNp
hLgdDW85ktmFf0xHl/KAU8lcxeaiWGepefsRxsx7c5fX6UIVZPUHDvkDkf5rImx6
wg7Nin2MirLT/lXY7a8//N+5TwLqIBTLLEfAIAFvDhrQT/CuMfZej7leB7BAd7Ti
puLWYYYUL+M=
=pK8E
-END PGP SIGNATURE-


Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-27 Thread Bill Janssen
> - fix the _ssl bug which prevents some tests from passing (issue #4967)

I see you've already got a patch for this.  I'll try it out.

Bill


Re: [Python-Dev] Python 3.0.1

2009-01-27 Thread Raymond Hettinger


[Martin]
> I would fear that then 3.1 gets the same fate as 3.0. In May, we will
> all think "what piece of junk was that 3.1 release, let's put it to
> history", and replace it with 3.2. By then, users will wonder if there
> is ever a 3.x release that is any good.


I thought the gist of Guido's idea was to label 3.0.1 as 3.1 to emphasize
the magnitude of differences from 3.0.   That seemed like a good idea
to me.  But I'm happy no matter what you want to call it.  The important
thing is that the bugfixes go in and the half-started removals get finished.
I would like the next release (whatever it is called) to include the IO
speedups which will help remove a barrier to adoption for serious use.

I do hope the next release goes out as soon as possible.  I use 3.0 daily
and my impression is that the current version needs to be replaced as soon
as possible.

If it gets called 3.1, the nice side effect for me is that my itertools updates
get fielded a bit sooner.  But that is a somewhat unimportant consideration.
I really have no opinion on what the next release gets called.


Raymond





Re: [Python-Dev] Python 3.0.1

2009-01-27 Thread Raymond Hettinger

If something gets left in 3.0.1 and then ripped-out in 3.1, I think we're
doing more harm than good.  Very little code has been ported to 3.0
so far.  Once there is a base, all changes become more difficult.

In the interests of our users, I vote for sooner than later.

Also, 3.0 is a special case because it is IMO a broken release.
AFAICT, it is not in any distro yet.  Hopefully, no one will keep it around
and it will vanish silently.


Raymond

- Original Message - 
I have no problem with removing things that were advertised and/or  
documented to be removed in 3.0 but accidentally were not.  That seems  
like a reasonable policy to me.  However, if we did not tell people  
that something was going to be removed, then I don't think we can  
really remove it in 3.0.


Barry


Re: [Python-Dev] Python 3.0.1

2009-01-27 Thread Antoine Pitrou
Raymond Hettinger writes:
> 
> Also, 3.0 is a special case because it is IMO a broken release.
> AFAICT, it is not in any distro yet.

I have access to an Ubuntu 8.10 box and:

$ apt-cache search python3.0
idle-python3.0 - An IDE for Python (v3.0) using Tkinter
libpython3.0 - Shared Python runtime library (version 3.0)
python3-all - Package depending on all supported Python runtime versions
python3-all-dbg - Package depending on all supported Python debugging packages
python3-all-dev - Package depending on all supported Python development packages
python3-dbg - Debug Build of the Python Interpreter (version 3.0)
python3.0 - An interactive high-level object-oriented language (version 3.0)
python3.0-dbg - Debug Build of the Python Interpreter (version 3.0)
python3.0-dev - Header files and a static library for Python (v3.0)
python3.0-doc - Documentation for the high-level object-oriented language Python (v3.0)
python3.0-examples - Examples for the Python language (v3.0)
python3.0-minimal - A minimal subset of the Python language (version 3.0)


But it's not installed by default.

Regards

Antoine.




Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-27 Thread Daniel Stutzbach
On Tue, Jan 27, 2009 at 4:54 PM, Antoine Pitrou  wrote:

> Daniel Stutzbach writes:
> > Would it be much trouble to also compare performance with Python 2.6?
>
> Here are the results on trunk.
>

Thanks, Antoine!  To make comparison easier, I put together the results into
a Google Spreadsheet:
http://spreadsheets.google.com/pub?key=pbqSxQEo4UXwPlifXmvPHGQ

> Keep in mind Text IO, while it's still `open(filename, "r")`, does not mean
> the same thing.


That's because in Python 3, the Text IO has to convert to Unicode, correct?


--
Daniel Stutzbach, Ph.D.
President, Stutzbach Enterprises, LLC 


Re: [Python-Dev] Python 3.0.1

2009-01-27 Thread Georg Brandl
Raymond Hettinger wrote:
> [Martin]
>> I would fear that then 3.1 gets the same fate as 3.0. In May, we will
>> all think "what piece of junk was that 3.1 release, let's put it to
>> history", and replace it with 3.2. By then, users will wonder if there
>> is ever a 3.x release that is any good.
> 
> I thought the gist of Guido's idea was to label 3.0.1 as 3.1 to emphasize
> the magnitude of differences from 3.0.   That seemed like a good idea
> to me.  But I'm happy no matter what you want to call it.  The important
> thing is that the bugfixes go in and the half-started removals get finished.
> I would like the next release (whatever it is called) to include the IO
> speedups which will help remove a barrier to adoption for serious use.

FWIW, I completely agree here.

> I do hope the next release goes out as soon as possible.  I use 3.0 daily
> and my impression is that the current version needs to be replaced as soon
> as possible.

That's important to note: I do not use Python 3.x productively in any way,
other than trying to port a bit of a library every now and then, and I expect
that many others here are in the same position.  In these matters, we should
give more weight to what *actual users* like Raymond think.

It's a great thing that we actually got 3.0 out, and didn't stall somewhere
along the way, but the next step is to make sure it gets accepted and used,
and doesn't get abandoned for a long time because of policies that come from
the 2.x branch but might not be healthy for 3.x.

Georg



Re: [Python-Dev] Python 3.0.1

2009-01-27 Thread Benjamin Peterson
On Tue, Jan 27, 2009 at 5:04 PM, Barry Warsaw  wrote:
> I have no problem with removing things that were advertised and/or
> documented to be removed in 3.0 but accidentally were not.  That seems like
> a reasonable policy to me.  However, if we did not tell people that
> something was going to be removed, then I don't think we can really remove
> it in 3.0.

As others have said, this would technically include cmp() removal. In
the 2.x docs, there are big warnings by the operator functions and a
suggestion to use ABCs. We also already have a 2to3 fixer for the
module.
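
A minimal sketch of what "use ABCs" can look like in place of the operator-module type checks. It assumes the functions meant here are the `operator.isSequenceType()` / `operator.isMappingType()` family; the helper names are hypothetical. In 3.0 the ABCs were importable from `collections`; current Python versions provide them in `collections.abc`, which is what the sketch uses.

```python
from collections.abc import Mapping, Sequence


def is_sequence(obj):
    """ABC-based stand-in for a check like operator.isSequenceType()."""
    return isinstance(obj, Sequence)


def is_mapping(obj):
    """ABC-based stand-in for a check like operator.isMappingType()."""
    return isinstance(obj, Mapping)


print(is_sequence([1, 2, 3]), is_sequence({"a": 1}))  # True False
print(is_mapping({"a": 1}), is_mapping([1, 2, 3]))    # True False
```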



-- 
Regards,
Benjamin


Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-27 Thread Antoine Pitrou
Daniel Stutzbach writes:
> 
> Thanks, Antoine!  To make comparison easier, I put together the results into a
> Google Spreadsheet:
> http://spreadsheets.google.com/pub?key=pbqSxQEo4UXwPlifXmvPHGQ

Thanks, that's much more readable indeed.

> That's because in Python 3, the Text IO has to convert to Unicode, correct?  

Yes, exactly.
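
To illustrate the difference, here is a minimal sketch for Python 3. The helper name `read_both_ways` and the sample payload are made up for the example; binary mode hands back the raw bytes, while text mode decodes them to `str` and applies universal newline translation.

```python
import os
import tempfile


def read_both_ways(payload):
    """Write bytes to a temp file, then read them back in binary and text mode."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(payload)
        with open(path, "rb") as f:
            raw = f.read()    # bytes, no decoding at all
        with open(path, "r", encoding="utf-8") as f:
            text = f.read()   # str: decoded, and "\r\n" translated to "\n"
    finally:
        os.remove(path)
    return raw, text


raw, text = read_both_ways("caf\u00e9\r\n".encode("utf-8"))
print(type(raw).__name__, len(raw))    # bytes 7
print(type(text).__name__, len(text))  # str 5
```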

Regards

Antoine.




Re: [Python-Dev] Python 3.0.1

2009-01-27 Thread Barry Warsaw

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Jan 27, 2009, at 6:21 PM, Raymond Hettinger wrote:

> If something gets left in 3.0.1 and then ripped-out in 3.1, I think we're
> doing more harm than good.  Very little code has been ported to 3.0
> so far.  Once there is a base, all changes become more difficult.
>
> In the interests of our users, I vote for sooner than later.
>
> Also, 3.0 is a special case because it is IMO a broken release.
> AFAICT, it is not in any distro yet.  Hopefully, no one will keep it around
> and it will vanish silently.


I stand by my opinion about the right way to do this.  I also think  
that a 3.1 release 6 months after 3.0 is perfectly fine and serves our  
users just as well.


Barry

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.9 (Darwin)

iQCVAwUBSX+fNnEjvBPtnXfVAQJO1QQAmRVH0tslNfRfpQsC+2jlJu5uljOVvuvN
uE3/HFktxLUr6NPdOk+Ir1r2p4mQ5iXFlZbJvOSNckM3UYSFkeKmS/T0nVJzqx89
+23sv7UC2Qf8zJRJBEhzuePT1iAE8OybRH1Vxql9ka8FVzCrZHt2JhnRZUmHNblT
Y2d92iL7eqE=
=Qzdr
-END PGP SIGNATURE-


[Python-Dev] PyCon 2009 registration is now open!

2009-01-27 Thread David Goodger
Register here:
http://us.pycon.org/2009/register/

Information (rates etc.):
http://us.pycon.org/2009/registration/

Hotel information & reservations:
http://us.pycon.org/2009/about/hotel/

Early bird registration ends February 21, so don't delay!

-- David Goodger, PyCon 2009 Chair


Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-27 Thread Daniel Stutzbach
On Tue, Jan 27, 2009 at 5:44 PM, Antoine Pitrou  wrote:

> Daniel Stutzbach writes:
> > That's because in Python 3, the Text IO has to convert to Unicode, correct?
>
> Yes, exactly.
>

What kind of input are you using for the Text tests?  I'm kind of surprised
that the conversion to Unicode results in such a dramatic slowdown, if
you're feeding it plain text (characters 0x00 through 0x7f).

--
Daniel Stutzbach, Ph.D.
President, Stutzbach Enterprises, LLC 


Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-27 Thread Antoine Pitrou
Daniel Stutzbach writes:
> 
> What kind of input are you using for the Text tests?  I'm kind of surprised
> that the conversion to Unicode results in such a dramatic slowdown, if you're
> feeding it plain text (characters 0x00 through 0x7f).

It's some arbitrary text composed of 95% ASCII characters and 5% non-ASCII. On
this specific example, utf8 decodes at around 250 MB/s, latin1 at almost 1 GB/s
(on the same machine on which I ran the benchmarks).

You can find the test here:
http://svn.python.org/view/sandbox/trunk/iobench/
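
For a rough feel of such decode rates, here is a minimal sketch. It is not the iobench code: the helper name `decode_rate` and the synthetic sample (95 ASCII characters for every 5 non-ASCII ones) are assumptions made for illustration, and the numbers it prints depend entirely on the machine and Python version.

```python
import time


def decode_rate(data, encoding, repeat=5):
    """Return the best observed decode throughput for `data`, in MB/s."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        data.decode(encoding)
        best = min(best, time.perf_counter() - start)
    # Guard against a zero reading from a coarse timer.
    return len(data) / max(best, 1e-9) / 1e6


# Synthetic sample: 95% ASCII characters, 5% non-ASCII (an assumption).
sample_text = ("a" * 95 + "\u00e9" * 5) * 20000
utf8_bytes = sample_text.encode("utf-8")
latin1_bytes = sample_text.encode("latin-1")

print("utf-8  : %.0f MB/s" % decode_rate(utf8_bytes, "utf-8"))
print("latin-1: %.0f MB/s" % decode_rate(latin1_bytes, "latin-1"))
```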





Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-27 Thread Daniel Stutzbach
On Tue, Jan 27, 2009 at 6:15 PM, Antoine Pitrou  wrote:

> It's some arbitrary text composed of 95% ASCII characters and 5% non-ASCII. On
> this specific example, utf8 decodes at around 250 MB/s, latin1 at almost 1 GB/s
> (on the same machine on which I ran the benchmarks).
>

For the "10MB whole contents at once" test, we then have:
(assuming the code does no pipelining of disk I/O with decoding)

10MB / 980MB/s to read from disk = 10 ms
10MB / 250MB/s to decode to utf8 = 40 ms
10MB / (10ms + 40ms) = 200 MB/s

In practice, your results show around 90 MB/s.  That's at least vaguely in
the same ballpark.
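
The back-of-the-envelope estimate above can be written out as a small sketch. The function name `combined_rate` is hypothetical; it assumes, as stated, that the stages run strictly one after another with no overlap.

```python
def combined_rate(size_mb, *stage_rates_mb_s):
    """Throughput in MB/s when each stage processes the data sequentially."""
    total_seconds = sum(size_mb / rate for rate in stage_rates_mb_s)
    return size_mb / total_seconds


read_ms = 10 / 980 * 1000     # time to read 10MB at 980 MB/s
decode_ms = 10 / 250 * 1000   # time to decode 10MB at 250 MB/s
print(round(read_ms, 1), round(decode_ms, 1))  # 10.2 40.0
print(round(combined_rate(10, 980, 250)))      # 199
```

Without the rounding to 10 ms, the combined figure comes out at about 199 MB/s rather than 200 MB/s, which does not change the comparison.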

--
Daniel Stutzbach, Ph.D.
President, Stutzbach Enterprises, LLC 


Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-27 Thread Antoine Pitrou
Daniel Stutzbach writes:
> For the "10MB whole contents at once" test, we then have:
> (assuming the code does no pipelining of disk I/O with decoding)
> 
> 10MB / 980MB/s to read from disk = 10 ms
> 10MB / 250MB/s to decode to utf8 = 40 ms
> 10MB / (10ms + 40ms) = 200 MB/s 
> 
> In practice, your results show around 90 MB/s.  That's at least vaguely in
> the same ballpark.

Yes, the remaining CPU time is spent in the IncrementalNewlineDecoder (which
does universal newline translation).
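
A minimal sketch of what that translation step does, using the `io.IncrementalNewlineDecoder` class as it exists in current Python versions. The wrapper function name and the sample chunks are made up for illustration.

```python
import codecs
import io


def decode_with_universal_newlines(chunks, encoding="utf-8"):
    """Decode byte chunks incrementally, translating CR and CRLF to LF."""
    byte_decoder = codecs.getincrementaldecoder(encoding)()
    # Second argument is `translate`: True turns every newline form into "\n".
    decoder = io.IncrementalNewlineDecoder(byte_decoder, True)
    parts = [decoder.decode(chunk) for chunk in chunks]
    # Flush anything held back, such as a "\r" at the end of the last chunk.
    parts.append(decoder.decode(b"", final=True))
    return "".join(parts), decoder.newlines


text, seen = decode_with_universal_newlines([b"one\r", b"\ntwo\rthree\n"])
print(repr(text))  # 'one\ntwo\nthree\n'
```

The first chunk ends in a bare "\r", so the decoder has to hold it back until it knows whether a "\n" follows; this per-chunk bookkeeping is part of the extra work on top of the plain utf-8 decode.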


Antoine.




Re: [Python-Dev] Python 3.0.1

2009-01-27 Thread Steve Holden
Barry Warsaw wrote:
> On Jan 27, 2009, at 6:21 PM, Raymond Hettinger wrote:
> 
>> If something gets left in 3.0.1 and then ripped-out in 3.1, I think we're
>> doing more harm than good.  Very little code has been ported to 3.0
>> so far.  Once there is a base, all changes become more difficult.
> 
>> In the interests of our users, I vote for sooner than later.
> 
>> Also, 3.0 is a special case because it is IMO a broken release.
>> AFAICT, it is not in any distro yet.  Hopefully, no one will keep it
>> around
>> and it will vanish silently.
> 
> I stand by my opinion about the right way to do this.  I also think that
> a 3.1 release 6 months after 3.0 is perfectly fine and serves our users
> just as well.
> 
+1

regards
 Steve
-- 
Steve Holden        +1 571 484 6266   +1 800 494 3119
Holden Web LLC  http://www.holdenweb.com/



Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-27 Thread Victor Stinner

Benjamin Peterson wrote:
> There are also several IO bugs that should be fixed before it becomes
> official like #5006.

I looked at this one, but I discovered another bug with f.tell(): it's
now issue #5008. That issue is now closed, so I will look at #5006 again.

See also #5016 (f.seekable() bug).

Victor


Re: [Python-Dev] Python 3.0.1

2009-01-27 Thread Matthew Wilkes


On 27 Jan 2009, at 23:56, Barry Warsaw wrote:

>> Also, 3.0 is a special case because it is IMO a broken release.
>> AFAICT, it is not in any distro yet.  Hopefully, no one will keep
>> it around and it will vanish silently.
>
> I stand by my opinion about the right way to do this.  I also think
> that a 3.1 release 6 months after 3.0 is perfectly fine and serves
> our users just as well.



I'm lurking here, as I usually have nothing to contribute, but here's  
my take on this:



I'm generally a Python 2.4 user, but have recently been able to tinker  
in 2.6.  I hope to be using 2.6 as my main language within a year.  I  
anticipate dropping all 2.4 projects within 5 years.  We have not yet  
dropped 2.3.


I didn't know 3.0 is considered a broken release, but teething  
troubles are to be expected.  Knowing this, I would be reluctant to  
use 3.0.1, it sounds like too small a change.  If you put a lot of  
things into a minor point release you risk setting expectations about  
future ones.  From the 2.x series I expect 2.x.{y,y+1} to be seamless, but
2.{x,x+1} to be more performant, include new features and potentially
break complex code.


I personally would see a 3.1 with C based IO support as being more  
sensible than a 3.0.1 with lots of changes.  I wouldn't worry about  
3.x being seen as a dead duck, as you say it's not in wide use yet.   
We trust you guys; if there have been big fixes there should be a big  
version update.  Broadcast what's been made better and it'll encourage  
us to try it.



Matt


Re: [Python-Dev] undesireable unpickle behavior, proposed fix

2009-01-27 Thread Alexandre Vassalotti
On Tue, Jan 27, 2009 at 5:16 PM, Jake McGuire  wrote:
> Another vaguely related change would be to store string and unicode objects
> in the pickler memo keyed as themselves rather than their object ids.

That wouldn't be difficult to do--i.e., simply add a type check in
Pickler.memoize and another in Pickler.save().  But I am not sure if
that would be a good idea, since you would end up hashing every string
pickled. And that would probably be expensive if you are pickling
long strings.
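
The current behaviour is easy to observe from outside with the standard pickle module. The sketch below is only an illustration, not a patch: it relies on the fact that the pickler memo is keyed by object identity, so two equal but distinct strings are each written in full, while a repeated reference to the same object becomes a short memo lookup. The variable names are made up for the example.

```python
import pickle

# Build the strings at run time so that a and b are equal but distinct objects.
a = "".join(["spam"] * 1000)
b = "".join(["spam"] * 1000)
assert a == b and a is not b

shared = pickle.dumps([a, a], protocol=2)    # second item is a memo reference
distinct = pickle.dumps([a, b], protocol=2)  # b is written out in full again

print(len(shared), len(distinct))

# Unpickling the shared form gives back one string object referenced twice.
first, second = pickle.loads(shared)
print(first is second)
```

Keying the memo by value would shrink the second pickle to roughly the size of the first, at the price of hashing (and, on a hash match, comparing) every string that passes through the pickler.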

-- Alexandre


Re: [Python-Dev] Python 3.0.1

2009-01-27 Thread Raymond Hettinger


[Matthew Wilkes]
> I didn't know 3.0 is considered a broken release, but teething
> troubles are to be expected.  Knowing this, I would be reluctant to
> use 3.0.1, it sounds like too small a change.


Not to worry.  Many of the major language features are stable
and many of the rough edges are quickly getting ironed-out.
Over time, anything that's slow will get optimized and all will be well.

What we're discussing are subtleties of major vs minor releases.
When the tp_compare change goes in, will it affect third-party
C extensions enough to warrant a 3.1 name instead of 3.0.1?
Are users better served by removing operator.isSequenceType()
in 3.0.1 while there are still few early adopters and few converted
third-party modules, or will we help them more by warning them
in advance and waiting for 3.1?


The nice thing about the IO speedups is that the API is already
set and won't change.  So, the speedup doesn't really affect whether
the release gets named 3.0.1 or 3.1.  The important part is that
we get it out as soon as it's solid so that we don't preclude adoption
by users who need fast IO.


Raymond


Re: [Python-Dev] Python 3.0.1

2009-01-27 Thread Steve Holden
Steve Holden wrote:
> Barry Warsaw wrote:
[...]
>> I stand by my opinion about the right way to do this.  I also think that
>> a 3.1 release 6 months after 3.0 is perfectly fine and serves our users
>> just as well.
>>
> +1
> 
I should have been more explicit. I think that stuff that was slated for
removal in 3.0 should be removed as soon as possible, and a micro
release is fine for that.

ISTM that if we really cared about our users we would have got this
right before we released 3.0. Since we clearly didn't, it behooves us to
make sure that any 3.1 release isn't a repeat performance.

There are changes that should clearly have been made before 3.0 saw the
light of day, which are now being discussed for incorporation. If those
changes were *supposed* to be made before 3.0 came out then they should
be made as soon as possible. Waiting for a major release only encourages
people to use them, and once they get used further changes will be seen
as introducing incompatibilities that we have promised would not occur.
So it seems that the operator functions should stand not on the order of
their going, but depart.

While a quick 3.1 release might look like the best compromise for now,
it cannot then be followed with a quick 3.2 release, and then we are in
the territory Martin warned about. Quality is crucial after a poor
initial release: we have to engender confidence in the user base that we
are not dicking them around with ill-thought-out changes.

So on balance I think it might be better to live with the known
inadequacies of 3.0, making small changes for 3.0.1 and possibly
ignoring the policy that says we don't remove features in point releases
(since they apparently should have been taken out of 3.0 but weren't).
But this is only going to work if the quality of 3.1 is considerably
higher than 3.0, making it worth the wait.

I think that both 3.0 and 2.6 were rushed releases. 2.6 showed it in the
inclusion (later recognizable as somewhat ill-advised so late in the
day) of multiprocessing; 3.0 shows it in the very fact that this
discussion has become necessary. So we face an important turning point:
is 3.1 going to be serious production quality or not?

Given that we have just been presented with a fabulous resource that
could help improve delivered quality (I am talking about snakebite.org,
of course) we might be well-advised to use the 3.1 release as a
demonstration of how much it is going to improve the quality of
delivered releases.

regards
 Steve
-- 
Steve Holden        +1 571 484 6266   +1 800 494 3119
Holden Web LLC  http://www.holdenweb.com/



Re: [Python-Dev] Python 3.0.1 (io-in-c)

2009-01-27 Thread Scott David Daniels

Raymond Hettinger wrote:
>> [Antoine Pitrou]
>> Now here are some performance figures. Text I/O is done in utf-8 with
>> universal newlines enabled:
>
> That's a substantial boost.
> How does it compare to Py2.x equivalents?


Comparison of three cases (including performance ratios):

Test                                         in C  in py3k    in 2.7   C/3k  2.7/3k
                                           (MB/s)   (MB/s)    (MB/s)
** Binary append **
 10M write 1e6 units at a time            1529.00  728.000  1523.000   2.10    2.09
 20K write one unit at a time               0.668    0.150     0.887   4.45    5.91
400K write 20 units at a time              12.200    2.880    15.800   4.24    5.49
400K write 4096 units at a time            722.00  346.000  1071.000   2.09    3.10
** Binary input **
 10M read whole contents at once           980.00  274.000   966.000   3.58    3.53
 20K read whole contents at once           924.00  443.000  1145.000   2.09    2.58
400K alternate read & seek 1000 units     490.000   81.200   563.000   6.03    6.93
400K alternate read & seek one unit         1.330    0.082      1.11  16.20   13.52
400K read 20 units at a time               27.200    3.440    29.200   7.91    8.49
400K read 4096 units at a time             845.00  246.000  1038.000   3.43    4.22
400K read one unit at a time                 1.64    0.174     1.480   9.43    8.51
400K read whole contents at once           883.00  216.000   891.000   4.09    4.13
400K seek forward 1000 units at a time     516.00  182.000   568.000   2.84    3.12
400K seek forward one unit at a time        0.528    0.188     0.893   2.81    4.75
** Binary overwrite **
 20K modify one unit at a time              0.677    0.123     0.867   5.50    7.05
400K alternate read & write 1000 units    276.000   41.100   153.000   6.72    3.72
400K alternate read & write one unit        0.827    0.045      0.22  18.46    4.93
400K alternate write & seek 1000 units    173.000   71.400   151.000   2.42    2.11
400K alternate write & seek one unit        0.212    0.082     0.237   2.60    2.90
400K modify 20 units at a time             12.100    2.340    15.300   5.17    6.54
400K modify 4096 units at a time           382.00  213.000   446.000   1.79    2.09
** Text append **
 10M write 1e6 units at a time             261.00  218.000  1540.000   1.20    7.06
 20K write one unit at a time               0.983    0.081      1.33  12.08   16.34
400K write 20 units at a time              16.000    1.510     22.90  10.60   15.17
400K write 4096 units at a time            236.00  118.000  1244.000   2.00   10.54
** Text input **
 10M read whole contents at once           89.700   68.700   966.000   1.31   14.06
 20K read whole contents at once          108.000   70.500  1196.000   1.53   16.96
400K read 20 units at a time               29.200    3.800    28.400   7.68    7.47
400K read 4096 units at a time             97.400   34.900  1060.000   2.79   30.37
400K read one line at a time               71.700    3.690    207.00  19.43   56.10
400K read one unit at a time                2.280    0.218      1.41  10.46    6.47
400K read whole contents at once          112.000   81.000   841.000   1.38   10.38
400K seek forward 1000 units at a time     87.400   67.300   589.000   1.30    8.75
400K seek forward one unit at a time        0.090    0.071     0.873   1.28   12.31
** Text overwrite **
 20K modify one unit at a time              0.296    0.072     1.320   4.09   18.26
400K modify 20 units at a time              5.690    1.360    22.500   4.18   16.54
400K modify 4096 units at a time          151.000   88.300   509.000   1.71    5.76


--Scott David Daniels
scott.dani...@acm.org



Re: [Python-Dev] Python 3.0.1

2009-01-27 Thread Jeroen Ruigrok van der Werven
-On [20090128 00:21], Raymond Hettinger (pyt...@rcn.com) wrote:
>Also, 3.0 is a special case because it is IMO a broken release.
>AFAICT, it is not in any distro yet.  Hopefully, no one will keep it around
>and it will vanish silently.

It is in FreeBSD's ports since December. Fairly good chance it is in pkgsrc
also by now. Might even be that it is part of FreeBSD's 7.1-RELEASE.

So I reckon with 'distro' you were speaking of Linux only?

-- 
Jeroen Ruigrok van der Werven  / asmodai
イェルーン ラウフロック ヴァン デル ウェルヴェン
http://www.in-nomine.org/ | http://www.rangaku.org/ | GPG: 2EAC625B
Earth to earth, ashes to ashes, dust to dust...


Re: [Python-Dev] Python 3.0.1

2009-01-27 Thread Jeroen Ruigrok van der Werven
-On [20090128 00:57], Barry Warsaw (ba...@python.org) wrote:
>I stand by my opinion about the right way to do this.  I also think  
>that a 3.1 release 6 months after 3.0 is perfectly fine and serves our  
>users just as well.

When API fixes were mentioned, does that mean changes in the API which
influence the C extension? If so, then I think a minor number update (3.1)
is more warranted than a revision number update (3.0.1).

-- 
Jeroen Ruigrok van der Werven  / asmodai
イェルーン ラウフロック ヴァン デル ウェルヴェン
http://www.in-nomine.org/ | http://www.rangaku.org/ | GPG: 2EAC625B
Earth to earth, ashes to ashes, dust to dust...


Re: [Python-Dev] [PSF-Board] I've got a surprise for you!

2009-01-27 Thread Terry Reedy

Steve Holden wrote:
> We now have zone servers in the OpenSolaris test farm, and
> I plan to add guest os servers in the next few weeks using
> ldoms (sparc) and xvm (x64). The zone servers provide whole
> root zones, which should be a good development environment
> for most projects. Check it out:
>
> http://test.opensolaris.org/testfarm

Requires sign-in.

> http://www.opensolaris.org/os/community/testing/testfarm/zones/

Freely readable.
