Re: [Python-Dev] Pickling problems are hard to debug

2006-03-26 Thread Martin v. Löwis
Greg Ewing wrote:
 Anyone have any ideas how the situation could
 be improved?

As always: on a case-by-case basis. If you find a specific
case where you think the diagnosis should be better, make it
better for this case. Perhaps some generalization arises while
doing so, but if not, at least this specific case gets improved.

While doing so, try to cover similar cases in the process, e.g.
different pickle protocols, and different pickle implementations.

Regards,
Martin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] Class decorators

2006-03-26 Thread Greg Ewing
I've just been playing around with metaclasses, and
I think I've stumbled across a reason for having
class decorators as an alternative to metaclasses
for some purposes.

The metaclass I wrote was for the purpose of
adding a class to a registry, the reason for which
isn't important here. It worked, but I was surprised
to find that it not only registered the classes that
I made it the metaclass of, but all subclasses of
those classes as well.

I'm not sure whether that's really the behaviour I
want, and I can imagine some cases in which it's
definitely not what I'd want.

The general principle brought out here is that when
you use a metaclass, it gets inherited by subclasses,
but if we had class decorators, they would only affect
the classes that you explicitly applied them to.
I think there are uses for both behaviours.
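
A minimal sketch of the difference, assuming Python 3 syntax (class decorators were only a proposal when this was posted). The registry lists and the names RegisteringMeta and register are hypothetical, made up for illustration:

```python
# Hypothetical registries; not taken from the original post.
meta_registry = []
decorator_registry = []


class RegisteringMeta(type):
    """Metaclass that records every class it creates."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        meta_registry.append(name)


def register(cls):
    """Class decorator that records only the class it is applied to."""
    decorator_registry.append(cls.__name__)
    return cls


class MetaBase(metaclass=RegisteringMeta):
    pass


class MetaChild(MetaBase):  # inherits the metaclass, so it is registered too
    pass


@register
class DecoratedBase:
    pass


class DecoratedChild(DecoratedBase):  # not decorated, so not registered
    pass


print(meta_registry)       # ['MetaBase', 'MetaChild']
print(decorator_registry)  # ['DecoratedBase']
```

The subclass shows up in the metaclass registry without asking for it, while the decorator touches only the class it is written above.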

Greg


Re: [Python-Dev] PySet API

2006-03-26 Thread Raymond Hettinger
[Alex]
  And I'm on the fence regarding the specific issue  of PySet_Next.

 So, having carefully staked out a position smack in the middle, I
 cheerfully now expect to be fired upon from both sides!-)

Okay, here's the first cheap shot ;-)  Which of the following pieces of code is 
preferable?  The first loops with the iterator protocol and the second loops 
with the _next protocol.


static long
frozenset_hash(PyObject *self)
{
    PySetObject *so = (PySetObject *)self;
    long h, hash = 0;
    PyObject *it, *key;

    if (so->hash != -1)
        return so->hash;

    it = PyObject_GetIter(self);
    if (it == NULL)
        return -1;

    while ((key = PyIter_Next(it)) != NULL) {
        h = PyObject_Hash(key);
        Py_DECREF(key);
        if (h == -1) {
            Py_DECREF(it);
            return -1;
        }
        hash ^= h * 3644798167;
    }
    Py_DECREF(it);
    if (PyErr_Occurred())
        return -1;

    if (hash == -1)
        hash = 590923713L;
    so->hash = hash;
    return hash;
}

static long
frozenset_hash(PyObject *self)
{
    PySetObject *so = (PySetObject *)self;
    long h, hash = 0;
    PyObject *key;
    Py_ssize_t pos = 0;

    if (so->hash != -1)
        return so->hash;

    while (set_next(so, &pos, &key)) {
        h = PyObject_Hash(key);
        if (h == -1) {
            return -1;
        }
        hash ^= h * 3644798167;
    }

    if (hash == -1)
        hash = 590923713L;
    so->hash = hash;
    return hash;
}



Re: [Python-Dev] PySet API

2006-03-26 Thread Aahz
On Sun, Mar 26, 2006, Raymond Hettinger wrote:
 [Alex]

  And I'm on the fence regarding the specific issue  of PySet_Next.

 So, having carefully staked out a position smack in the middle, I
 cheerfully now expect to be fired upon from both sides!-)
 
 Okay, here's the first cheap shot ;-) Which of the following pieces of
 code is preferable?  The first loops with the iterator protocol and
 the second loops with the _next protocol.

Speaking as a person who does relatively little C programming, I don't
see much difference between them.  The first example is more Pythonic --
for Python.  I agree with Barry that it's not much of a virtue for C
code.

However, I do have one nitpick with both your examples; I don't know
whether this is an artifact of them being examples:

 hash ^= h * 3644798167;

Seems to me that magic numbers like this need to be made constants and
explained with a comment.
-- 
Aahz ([EMAIL PROTECTED])   * http://www.pythoncraft.com/

Look, it's your affair if you want to play with five people, but don't
go calling it doubles.  --John Cleese anticipates Usenet


Re: [Python-Dev] PySet API

2006-03-26 Thread Raymond Hettinger
[Aahz]
 Speaking as a person who does relatively little C programming, I don't
 see much difference between them.  The first example is more Pythonic --
 for Python.  I agree with Barry that it's not much of a virtue for C
 code.

It was a trick question.  Everyone is supposed to be attracted to the _next 
version because it is shorter, faster, and takes less ref counting management. 
However, the _next version has a hard-to-find bug.  The call to PyObject_Hash() 
can trigger arbitrary Python code and possibly mutate the table, leaving 
pointers to invalid memory addresses.  It would likely take Armin less than
five minutes to write a pure Python crasher for the code.  And THAT is why
PySet_Next() should never come into being.

The iterator form is more duck-typable and re-usable than the set-specific
_next version, but the example was chosen to take that issue off of the table
and just focus on mutation issues.


 However, I do have one nitpick with both your examples; I don't know
 whether this is an artifact of them being examples:

 hash ^= h * 3644798167;

 Seems to me that magic numbers like this need to be made constants and
 explained with a comment

FWIW, the actual code does have comments.  I stripped them out of the posting 
because they weren't relevant to the code comparison.



Raymond 



Re: [Python-Dev] PySet API

2006-03-26 Thread Alex Martelli

On Mar 26, 2006, at 8:43 AM, Raymond Hettinger wrote:

 [Aahz]
 Speaking as a person who does relatively little C programming, I don't
 see much difference between them.  The first example is more Pythonic --
 for Python.  I agree with Barry that it's not much of a virtue for C
 code.

 It was a trick question.  Everyone is supposed to be attracted to the _next
 version because it is shorter, faster, and takes less ref counting management.
 However, the _next version has a hard-to-find bug.  The call to PyObject_Hash()
 can trigger arbitrary Python code and possibly mutate the table, leaving
 pointers to invalid memory addresses.  It would likely take Armin less than
 five minutes to write a pure Python crasher for the code.  And THAT is why
 PySet_Next() should never come into being.

Sure, accidentally mutating underlying iterables is a subtle (but  
alas frequent) bug, but I don't see why it should be any harsher when  
the loop is using a hypothetical PySet_Next than when it is using  
PyIter_Next -- whatever precautions the latter takes to detect the  
bug and raise an exception instead of crashing, wouldn't it be at  
least as feasible for PySet_Next to take similar precautions  
(probably easier, since PySet_Next need only worry about one concrete  
case rather than an arbitrary variety)? What does PyDict_Next do in  
similar cases, and why couldn't PySet_Next behave similarly?  (Yes, I  
could/should look it up myself, but I'm supposed to be working on the  
2nd Ed of the Nutshell, whose deadline is getting worryingly  
close...;-).


Alex



Re: [Python-Dev] PySet API

2006-03-26 Thread Raymond Hettinger
[Alex]
 Sure, accidentally mutating underlying iterables is a subtle (but alas
 frequent) bug, but I don't see why it should be any harsher when the loop is
 using a hypothetical PySet_Next than when it is using PyIter_Next -- whatever
 precautions the latter takes to detect the bug and raise an exception instead
 of crashing, wouldn't it be at least as feasible for PySet_Next to take
 similar precautions

The difference is that PySet_Next returns pointers to the table keys and
that the mutation occurs AFTER the call to PySet_Next, leaving pointers to
invalid addresses.  IOW, the function cannot detect the mutation.

PyIter_Next, on the other hand, returns an object (not a pointer to an object
such as those in the hash table).  If the table has mutated before the function
is called, then it simply raises an exception instead of returning an object.
If the table mutates afterwards, it is no big deal because the returned object
is still valid.
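
The detection side of this can be observed from pure Python, as a rough analogy only (it exercises the set iterator, not the C-level calls discussed here). The sketch assumes Python 3; Mutator and hash_all are hypothetical names:

```python
class Mutator:
    """Hypothetical element whose __hash__ mutates a set once armed."""

    def __init__(self, target):
        self.target = target
        self.armed = False

    def __hash__(self):
        if self.armed:
            # Arbitrary Python code running inside hash(): grow the set.
            self.target.add(object())
        return 42


def hash_all(items):
    """Hash every element via the iterator protocol; report detected mutation."""
    try:
        for item in items:
            hash(item)
    except RuntimeError as exc:
        return "detected: %s" % exc
    return "completed"


s = set()
m = Mutator(s)
s.add(m)        # __hash__ runs here while still unarmed
m.armed = True
result = hash_all(s)
print(result)   # reports a RuntimeError raised by the set iterator
```

The object handed out by the iterator stays valid after the mutation, and the next step of the iteration notices the size change and raises instead of continuing over a reorganized table.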

FWIW, here's an easier to understand example of the same ilk (taken from real 
code):

    s = PyString_AS_STRING(item);
    Py_DECREF(item);
    if (s == NULL)
        break;
    x = strtol(s, &endptr, 10);

The problem, of course, is that the decref can render the string pointer 
invalid.  The correct code moves the decref after the strtol() call and inside 
the conditional. This is at the core of the issue.  I don't want the set 
iteration API to return pointers inside the table.  The PyIter_Next API takes a 
couple more lines but is easy to get correct and has nice duck-typing 
properties.

For dicts, the _next api is worth the risk because it saves a double lookup and 
because there are legitimate use cases for changing the contents of the value 
field directly inside the hash table.  For sets, those arguments don't apply. 
We have a safe way that takes a couple more lines and a proposed 
second-way-to-do-it that is dangerously attractive, yet somewhat unsafe.  For 
that reason, I say no to PySet_Next().

Hopefully, as the module author and principal maintainer, I get some say in the 
matter.


Raymond


Nothing is more conducive to peace of mind than not having any opinions at all.
-- Georg Christoph Lichtenberg 



Re: [Python-Dev] PySet_Next (Was: PySet API)

2006-03-26 Thread Martin v. Löwis
Raymond Hettinger wrote:
 The difference is that the PySet_Next returns pointers to the table keys and 
 that the mutation occurs AFTER the call to PySet_Next, leaving pointers to 
 invalid addresses.  IOW, the function cannot detect the mutation.

I'm coming late to the discussion: where did anybody ever suggest that
PySet_Next should return a pointer into the set? Looking over the entire
discussion, I could not find any mention of a specific API.

If it is similar to PyDict_Next, it will have PyObject** /input/
variables, which are really meant as PyObject* /output/ variables.
But yes, PyDict_Next returns a borrowed reference, so if the dictionary
mutates between calls, your borrowed reference might become stale.

 PyIter_Next on the other hand returns an object (not a pointer to an
 object such as those in the hash table).

PyIter_Next behaves identically to PyDict_Next with respect to result types.
The difference is that PyIter_Next always returns a new reference
(or NULL in case of an exception).

For the caller, a clear usage strategy follows from this: either discard
the references before making a potentially-mutating call, or Py_INCREF
the set element before making that mutating call.

Of course, *after* you made the mutating call, your iteration position
might be bogus, as the set might have been reorganized. If the position
is represented as a Py_ssize_t (as it is for PyDict_Next), the only
consequence of continuing the iteration is that you might see elements
twice or not at all - you cannot cause a crash with that.

Regards,
Martin


Re: [Python-Dev] PySet_Next (Was: PySet API)

2006-03-26 Thread Raymond Hettinger
 The difference is that the PySet_Next returns pointers to the table keys and
 that the mutation occurs AFTER the call to PySet_Next, leaving pointers to
 invalid addresses.  IOW, the function cannot detect the mutation.

 I'm coming late to the discussion: where did anybody ever suggest that
 PySet_Next should return a pointer into the set? Looking over the entire
 discussion, I could not find any mentioning of a specific API.

Pardon, I bungled the terminology.  PySet_Next returns a borrowed reference.
That is problematic if arbitrary Python code can be run afterwards (such as
PyObject_Hash in the example).  We could make a version that returns a new
reference or immediately Py_INCREF the reference, but then PySet_Next() loses
its charm and you might as well be using PyIter_Next().

Aside from bad pointers, mid-stream table mutation raises other reliability
issues, stemming from the contents of the table potentially changing in an
arbitrary way as the iteration proceeds.  That means you can make very few
guarantees about the meaningfulness of the results even if you don't crash
due to a bad pointer.

We have a perfectly good way to iterate with PyIter_Next().  It may take a
couple of extra lines, but it is easy to get correct and has no surprises.  It
seems that the only issue is that Barry says that he refuses to use the
iterator protocol.  Heck, just turn it into a list and index directly.  There
is no need to muck up the set API for this.
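
The "turn it into a list and index directly" suggestion, sketched in Python 3 as an analogy for the C-level approach (Grower and xor_hashes are hypothetical names, and the XOR is a stand-in for real work, not the frozenset hash):

```python
class Grower:
    """Hypothetical element whose __hash__ adds to a set once armed."""

    def __init__(self, target):
        self.target = target
        self.armed = False

    def __hash__(self):
        if self.armed:
            self.target.add(object())
        return 7


def xor_hashes(items):
    """Combine element hashes while walking a list snapshot by index."""
    snapshot = list(items)  # the list keeps its own references to the elements
    combined = 0
    for index in range(len(snapshot)):
        combined ^= hash(snapshot[index])  # may mutate `items`, not `snapshot`
    return combined


s = set()
g = Grower(s)
s.add(g)
g.armed = True
total = xor_hashes(s)
print(total, len(s))  # 7 2
```

The set grows during the loop, but the loop only ever looks at the snapshot, so the walk finishes over exactly the elements that were present when it started.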


Raymond 



Re: [Python-Dev] Prevalence of low-level memory abuse?

2006-03-26 Thread Tim Peters
[Edward Loper]
 Could the debug build's macros for PyMem/PyObject_new/free be modified
 to check for mismatches?  Or would storing information about which
 method was used to allocate each pointer be too expensive?  Perhaps a
 special build could be used to check for mismatches?

It's partly possible (e.g., it's impossible to know whether a blob of
memory was obtained by calling malloc() directly).

If someone wants to do it (I do not), the debug build adds 8 bytes to
each side of each memory block obtained via each PyMem and PyObject
malloc/realloc call, and one of the (current) 8 FORBIDDEN_BYTEs could
be used to store flags without significant loss of functionality.  It
would make a decent enhancement.


[Python-Dev] daily releases?

2006-03-26 Thread Neal Norwitz
Now that the buildbot is in place and seems to be running relatively
smoothly, maybe we should consider making daily (or periodic) builds and
releasing them.  We've got a system in place to build on many
platforms automatically.  How much more difficult would it be to
package up the results and make them available for download?  Maybe we
could get some users testing these builds?  Would users accept a pure
debug build?  If not, what about a build with asserts enabled?

n


[Python-Dev] Changing -Q to warn for 2.5?

2006-03-26 Thread Neal Norwitz
http://python.org/sf/1458927 asks if the -Q warn option should become the
default in 2.5.  PEP 238 (http://www.python.org/dev/peps/pep-0238/)
says:


The -Q command line option takes a string argument that can take four
values: old, warn, warnall, or new.  The default is old in
Python 2.2 but will change to warn in later 2.x versions.


Is this still accurate?  Do we want to change the default in 2.x?  If
so, does x == 5?

I'm not sure this is worth it in 2.x.  If we aren't going to change it,
we should update the PEP.  OTOH, people ask how they can find integer
division in their code.  Even though they can use the flag themselves,
I wonder if anyone who wants help finding integer division really uses
the flag.
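
For people who want a list of candidate sites without running the code, a rough static alternative can be sketched with the standard ast module. This is not what -Q warn does (that is a run-time warning that knows the operand types); it assumes a modern Python 3, and find_slash_divisions is a hypothetical helper name:

```python
import ast


def find_slash_divisions(source):
    """Return line numbers of every `/` binary operator in `source`.

    Static approximation: it cannot tell whether the operands are
    integers, and it does not look at `/=` augmented assignments.
    """
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Div)
    )


sample = "a = 7 / 2\nb = 7 // 2\nc = a / b\n"
print(find_slash_divisions(sample))  # [1, 3]
```

Every hit still needs a human to decide whether integer operands can reach it, which is the part the run-time flag answers directly.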

n


Re: [Python-Dev] daily releases?

2006-03-26 Thread Martin v. Löwis
Neal Norwitz wrote:
 Now that the buildbot is in place and seems to be running relatively
 smoothly, maybe should consider making daily (or periodic) builds and
 releasing them.  We've got a system in place to build on many
 platforms automatically.  How much more difficult would it be to
 package up the results and make them available for download?  Maybe we
 could get some users testing these builds?  Would users accept a pure
 debug build?  If not, what about a build with asserts enabled?

Depends on what precisely you mean by release. If it is a tarball of
the sources, we already have these:

http://svn.python.org/snapshots/

In a release, the most prominent binary is the Windows installer. While
it would be possible to define buildbot build steps to create a Windows
installer (and msi.py even has the notion of snapshot builds already),
fetching the results from the buildbot slave isn't really supported.

Whether daily RPMs or Mac installers would be possible, I don't know.

Regards,
Martin


[Python-Dev] TRUNK FREEZE for 2.5a1: 0000 UTC, Thursday 30th

2006-03-26 Thread Anthony Baxter
Ok, it's time to rock and roll. 

   The SVN trunk is FROZEN for 2.5a1 from 00:00 UTC on 
   Thursday 30th of March. 

I'll post again once it's open. Note that new features can keep going 
in during the alpha cycle, the feature freeze only happens once we 
hit beta. And we're not going to hit beta until the features we want 
are in. 

Please help in making this release as painless as possible by not 
checking in while the trunk is frozen.

Thanks!
Anthony

-- 
Anthony Baxter [EMAIL PROTECTED]
It's never too late to have a happy childhood.


Re: [Python-Dev] Changing -Q to warn for 2.5?

2006-03-26 Thread Giovanni Bajo
Neal Norwitz [EMAIL PROTECTED] wrote:

 
 The -Q command line option takes a string argument that can take four
 values: old, warn, warnall, or new.  The default is old in
 Python 2.2 but will change to warn in later 2.x versions.
 

 I'm not sure this is worth in 2.x.  If we aren't going to change it,
 we should update the PEP.  OTOH, people ask how they can find integer
 division in their code.  Even though they can use the flag themselves,
 I wonder if anyone who wants help finding integer division really uses
 the flag.

-1 gratuitous breakage. There's nothing really wrong or dangerous about the old
semantics; they just won't be the ones used by Python 3.0. While it's nice to
have an option to help forward porting, I don't think we should force it.

Giovanni Bajo



Re: [Python-Dev] Changing -Q to warn for 2.5?

2006-03-26 Thread Anthony Baxter
On Monday 27 March 2006 16:04, Neal Norwitz wrote:
 http://python.org/sf/1458927 asks if -Q warn option should become
 the default in 2.5.  PEP 238
 (http://www.python.org/dev/peps/pep-0238/) says:

 
 The -Q command line option takes a string argument that can take
 four values: old, warn, warnall, or new.  The default is
 old in Python 2.2 but will change to warn in later 2.x
 versions. 

 Is this still accurate?  Do we want to change the default in 2.x? 
 If so, does x == 5?

-1 PITA.

;)


-- 
Anthony Baxter [EMAIL PROTECTED]
It's never too late to have a happy childhood.


[Python-Dev] Inconsistency in 2.4.3 for __repr__() returning unicode

2006-03-26 Thread Hye-Shik Chang
We got an inconsistency for __repr__() returning unicode as
reported in http://python.org/sf/1459029 :

class s1:
    def __repr__(self):
        return '\\n'

class s2:
    def __repr__(self):
        return u'\\n'

print repr(s1()), repr(s2())

Until 2.4.2: \n \n
2.4.3: \n \\n

\\n looks a bit weird, but it's correct.  As discussed[1] on
python-dev before, if __repr__ returns a unicode object,
PyObject_Repr encodes it via the unicode-escape codec, so
non-Latin characters can also be carried in a repr in a neutral way.

But our unicode-escape codec has had a bug ever since it was introduced:
it didn't escape backslashes.  Therefore, backslashes weren't escaped
in the repr, although they should have been, because we used the
unicode-escape codec.
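
The fixed codec behaviour can be checked directly. This sketch assumes a modern Python 3 interpreter, where the codec is spelled unicode_escape and encoding a str yields bytes; it does not reproduce the 2.4.x repr() behaviour itself:

```python
text = "\\n"  # two characters: a backslash followed by the letter n
escaped = text.encode("unicode_escape")

print(len(text))     # 2
print(len(escaped))  # 3: the backslash has been doubled
print(escaped.decode("unicode_escape") == text)  # True
```

Doubling the backslash is what makes the round trip lossless: without it, decoding would turn the two characters into a single newline.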

So fixing the bug introduced a behavioral inconsistency.
How do we resolve the problem?

Hye-Shik


[1] http://mail.python.org/pipermail/python-dev/2000-July/005353.html