Re: Why GIL? (was Re: what's the point of rpython?)

2009-01-31 Thread Aahz
In article 7xr62ufv1c@ruckus.brouhaha.com,
Paul Rubin  http://phr...@nospam.invalid wrote:
a...@pythoncraft.com (Aahz) writes:

 CPython's primitive storage management has a lot to do with the
 simplicity of interfacing CPython with external libraries.  Any solution
 that proposes to get rid of the GIL needs to address that.

This, I don't understand.  Other languages like Lisp and Java and
Haskell have foreign function interfaces that are easier to program than
Python's, -and- they don't use reference counts.  There's usually some
primitive to protect objects from garbage collection while the foreign
function is using them, etc.  The Java Native Interface (JNI) and the
Haskell FFI are pretty well documented.  The Emacs Lisp system is not
too hard to figure out from examining the source code, etc.

This is the first time I've heard about Java being easier to interface
than Python.  I don't work at that level myself, so I rely on the
informed opinions of other people; can you provide a summary of what
makes those FFIs easier than Python?
-- 
Aahz (a...@pythoncraft.com)   * http://www.pythoncraft.com/

Weinberg's Second Law: If builders built buildings the way programmers wrote 
programs, then the first woodpecker that came along would destroy civilization.
--
http://mail.python.org/mailman/listinfo/python-list


Re: Why GIL? (was Re: what's the point of rpython?)

2009-01-27 Thread Bryan Olson

Paul Rubin wrote:

Bryan Olson fakeaddr...@nowhere.org writes:

An object's __dict__ slot is *not* mutable; thus we could gain some
efficiency by protecting the object and its dict with the same lock. I
do not see a major win in Mr. Banks' point that we do not need to lock
the object, just its dict.


If the dict contents don't change often, maybe we could use an
STM-like approach to eliminate locks when reading.  That would of
course require rework to just about every C function that accesses
Python objects.


I'm a fan of lock-free data structures and software transactional memory, 
but I'm also a realist. Heck, I'm one of this group's outspoken 
advocates of threaded architectures. Theoretical breakthroughs will 
happen, but in the real world of today, threads are great but GIL-less 
Python is a loser.


Wherever Python is going, let's recognize that a scripting language that 
rocks is better than any other kind of language that sucks.



--
--Bryan


Re: Why GIL? (was Re: what's the point of rpython?)

2009-01-27 Thread Steve Holden
Bryan Olson wrote:
 Paul Rubin wrote:
 Bryan Olson fakeaddr...@nowhere.org writes:
 An object's __dict__ slot is *not* mutable; thus we could gain some
 efficiency by protecting the object and its dict with the same lock. I
 do not see a major win in Mr. Banks' point that we do not need to lock
 the object, just its dict.

 If the dict contents don't change often, maybe we could use an
 STM-like approach to eliminate locks when reading.  That would of
 course require rework to just about every C function that accesses
 Python objects.
 
 I'm a fan of lock-free data structures and software transactional memory,
 but I'm also a realist. Heck, I'm one of this group's outspoken
 advocates of threaded architectures. Theoretical breakthroughs will
 happen, but in the real world of today, threads are great but GIL-less
 Python is a loser.
 
 Wherever Python is going, let's recognize that a scripting language that
 rocks is better than any other kind of language that sucks.
 
 
Guido, IIRC, has said that he's against any GIL-removal policy that
lowers performance on single-processor systems. Personally I'd be happy
if there were an *alternative* multi-processor implementation that was
slower for single-processor architectures and faster for
multi-processor, but I'm not about to start developing it.

regards
 Steve
-- 
Steve Holden+1 571 484 6266   +1 800 494 3119
Holden Web LLC  http://www.holdenweb.com/



Re: Why GIL? (was Re: what's the point of rpython?)

2009-01-27 Thread Paul Rubin
Bryan Olson fakeaddr...@nowhere.org writes:
 I'm a fan of lock-free data structures and software transactional
 memory, but I'm also a realist. Heck, I'm one of this group's
 outspoken advocates of threaded architectures. Theoretical
 breakthroughs will happen, but in the real world of today, threads are
 great but GIL-less Python is a loser.

GIL-less Python (i.e. Jython) already exists and beats CPython in
performance a lot of the time, including on single processors.
Whether the GIL can be eliminated from CPython without massive rework
to every extension module ever written is a separate question, of
course.  Jython can be viewed as a proof of concept.


Re: Why GIL? (was Re: what's the point of rpython?)

2009-01-27 Thread Steve Holden
Paul Rubin wrote:
 Bryan Olson fakeaddr...@nowhere.org writes:
 I'm a fan of lock-free data structures and software transactional
 memory, but I'm also a realist. Heck, I'm one of this group's
 outspoken advocates of threaded architectures. Theoretical
 breakthroughs will happen, but in the real world of today, threads are
 great but GIL-less Python is a loser.
 
 GIL-less Python (i.e. Jython) already exists and beats CPython in
 performance a lot of the time, including on single processors.
 Whether the GIL can be eliminated from CPython without massive rework
 to every extension module ever written is a separate question, of
 course.  Jython can be viewed as a proof of concept.

nods. I think probably the GIL will never be extracted successfully.

Also IronPython and PyPy (though the latter only in concept for now, I
believe). Even Guido admits that CPython doesn't necessarily represent
the dominant future strain ...

regards
 Steve
-- 
Steve Holden+1 571 484 6266   +1 800 494 3119
Holden Web LLC  http://www.holdenweb.com/



Re: Why GIL? (was Re: what's the point of rpython?)

2009-01-27 Thread Rhamphoryncus
On Jan 27, 12:47 pm, Steve Holden st...@holdenweb.com wrote:
 Paul Rubin wrote:
  GIL-less Python (i.e. Jython) already exists and beats CPython in
  performance a lot of the time, including on single processors.
  Whether the GIL can be eliminated from CPython without massive rework
  to every extension module ever written is a separate question, of
  course.  Jython can be viewed as a proof of concept.

 nods. I think probably the GIL will never be extracted successfully.

 Also IronPython and PyPy (though the latter only in concept for now, I
 believe). Even Guido admits that CPython doesn't necessarily represent
 the dominant future strain ...

IMO it's possible to rewrite only the core while keeping the refcount
API for external compatibility, but a tracing GC API in portable C is
hideous.  Enough to make me want to find or make a better
implementation language.


Re: Why GIL? (was Re: what's the point of rpython?)

2009-01-27 Thread Paul Rubin
Rhamphoryncus rha...@gmail.com writes:
 IMO it's possible to rewrite only the core while keeping the refcount
 API for external compatibility, but a tracing GC API in portable C is
 hideous. 

It's done all the time for other languages, and is less hassle than
the incref/decref stuff and having to remember the difference between
owned and borrowed references, etc.

 Enough to make me want to find or make a better implementation language.

There is a lot to be said for this, including the self-respect that
comes from a language being able to host its own implementation.


Re: Why GIL? (was Re: what's the point of rpython?)

2009-01-24 Thread Carl Banks
On Jan 23, 11:45 pm, Bryan Olson fakeaddr...@nowhere.org wrote:
 Carl Banks wrote:
  Classes in Python are mutable types, usually.  Class instances are
  (except for the refcount) immutable objects, usually.

 There's where we disagree. I assert that class instances are usually
 mutable objects.

Nope, you're dead wrong, nothing more to it.  The bits of a class
instance never change.  The __dict__ is a mutable object.  The class
instance itself isn't.  It's not reasonable to call an object whose
bits can't change a mutable object.

Anyway, all you're doing is distracting attention from my claim that
instance objects wouldn't need to be locked.  They wouldn't, no matter
how mutable you insist these objects whose bits would never change
are.


  BTW, here's a minor brain bender: immutable types are mutable objects.

 Some brains are too easily bent.
[Snip attempt to take this comment seriously]

And some brains are so stodgy they can't even take a lighthearted
comment lightheartedly.


Carl Banks


Re: Why GIL?

2009-01-24 Thread Hrvoje Niksic
Carl Banks pavlovevide...@gmail.com writes:

 On Jan 23, 11:45 pm, Bryan Olson fakeaddr...@nowhere.org wrote:
 Carl Banks wrote:
  Classes in Python are mutable types, usually.  Class instances are
  (except for the refcount) immutable objects, usually.

 There's where we disagree. I assert that class instances are usually
 mutable objects.

 Nope, you're dead wrong, nothing more to it.  The bits of a class
 instance never change.  The __dict__ is a mutable object.  The class
 instance itself isn't.  It's not reasonable to call an object whose
 bits can't change a mutable object.

The bits of class instances can very well change.

>>> class X(object): pass
...
>>> x = X()
>>> d = x.__dict__
>>> x.__dict__ = {}
>>> map(id, [d, x.__dict__])
[170329876, 170330012]

The Python cookbook even describes patterns that depend on this
operation working.  A class instance's contents can also change if
__slots__ is in use, or when its __class__ is assigned to (admittedly the
latter being a rare operation, but still).

 Anyway, all you're doing is distracting attention from my claim that
 instance objects wouldn't need to be locked.  They wouldn't, no
 matter how mutable you insist these objects whose bits would never
 change are.

Only if you're not implementing Python, but another language that
doesn't support __slots__ and assignment to instance.__dict__.
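
For readers following along in a current Python, here is a minimal sketch of the two other cases mentioned above: storage held in __slots__ and assignment to __class__. The class names Point, A and B are invented for illustration and are not from the thread.

```python
class Point(object):
    # With __slots__, attribute values are stored in the instance itself
    # rather than in a separate __dict__.
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


class A(object):
    def who(self):
        return "A"


class B(object):
    def who(self):
        return "B"


p = Point(1, 2)
p.x = 10                            # changes storage inside the instance
has_dict = hasattr(p, "__dict__")   # no dict exists to absorb the change

obj = A()
before = obj.who()
obj.__class__ = B                   # rebinds the instance's type
after = obj.who()
```

In both cases something belonging to the instance itself changes, which is the situation a per-instance lock would have to cover.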


Re: Why GIL? (was Re: what's the point of rpython?)

2009-01-24 Thread Gabriel Genellina
En Sat, 24 Jan 2009 06:06:02 -0200, Carl Banks pavlovevide...@gmail.com  
escribió:

On Jan 23, 11:45 pm, Bryan Olson fakeaddr...@nowhere.org wrote:

Carl Banks wrote:
 Classes in Python are mutable types, usually.  Class instances are
 (except for the refcount) immutable objects, usually.

There's where we disagree. I assert that class instances are usually
mutable objects.


Nope, you're dead wrong, nothing more to it.  The bits of a class
instance never change.  The __dict__ is a mutable object.  The class
instance itself isn't.  It's not reasonable to call an object whose
bits can't change a mutable object.

Anyway, all you're doing is distracting attention from my claim that
instance objects wouldn't need to be locked.  They wouldn't, no matter
how mutable you insist these objects whose bits would never change
are.


Me too, I don't get what you mean. Consider a list instance, it contains a  
count of allocated elements, and a pointer to some memory block. They  
change when the list is resized. This counts as mutable to me. I really  
don't understand your claim.
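
The list case can be observed from Python itself. The sketch below assumes CPython, where sys.getsizeof() on a list includes the currently allocated element array; other implementations may report sizes differently.

```python
import sys

items = []
identity_before = id(items)
size_before = sys.getsizeof(items)

# Appending forces the list to grow its element array several times.
for i in range(1000):
    items.append(i)

identity_after = id(items)
size_after = sys.getsizeof(items)
```

The identity stays the same while the reported size grows, so the list object's own fields (its length and its pointer to the element block) must have been updated in place.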


--
Gabriel Genellina



Re: Why GIL?

2009-01-24 Thread Paul Rubin
Hrvoje Niksic hnik...@xemacs.org writes:
 Not only registered at the beginning of the function, but also (since
 CPython uses C, not C++) explicitly unregistered at every point of
 exit from the function.  Emacs implements these as macros called GCPRO
 and UNGCPRO, and they're very easy to get wrong.  In a way, they are
 even worse than the current Python INCREF/DECREF.

That's a fairly natural style in Lisp implementation and it is not
that difficult to code in.  I've hacked inside Emacs and have written
another interpreter with a similar setup; it's certainly easier than
keeping track of refcounts in my experience.  For one thing, you can
raise exceptions anywhere you want, and the stack unwind can clean up
the gc protection, but it can't know nearly as easily which refcounts
to adjust, unless you record all the increfs the same way as the
GCPROs.
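
A rough model of that point, written in Python rather than C: protection is kept on a stack, so unwinding only has to truncate the stack back to where the function started. The names gc_roots, gcpro and primitive are hypothetical and are not the Emacs macros themselves.

```python
from contextlib import contextmanager

gc_roots = []        # hypothetical stack of objects protected from collection
depth_inside = []    # records how deep the stack was while protected


@contextmanager
def gcpro(*objects):
    """Protect objects for the duration of a block (stand-in for GCPRO)."""
    mark = len(gc_roots)
    gc_roots.extend(objects)
    try:
        yield
    finally:
        # Unwinding truncates the stack to its earlier depth, on normal
        # exit and on exceptions alike; no per-object bookkeeping is needed.
        del gc_roots[mark:]


def primitive(arg):
    with gcpro(arg):
        depth_inside.append(len(gc_roots))
        raise ValueError("error raised while arg is protected")


try:
    primitive(object())
except ValueError:
    pass
```

With reference counts there is no such single marker to reset; each incref made before the error needs its own matching decref.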


Re: Why GIL? (was Re: what's the point of rpython?)

2009-01-24 Thread Steve Holden
Carl Banks wrote:
 On Jan 23, 8:22 pm, Bryan Olson fakeaddr...@nowhere.org wrote:
 Paul Rubin wrote:
 Bryan Olson writes:
 BTW, class instances are usually immutable and thus don't require a
 mutex in the system I described.
 Then you are describing a language radically different from Python.
 That one threw me for a minute too, but I think the idea is that the
 class instance itself is immutable, while its slots (specifically the
 attribute dictionary) point to mutable objects.
 The meaning of 'immutable' is well-established in the Python literature.
 Python's immutable types include tuple, frozenset, and various kinds of
 numbers and strings. Class instances, not so much.
 
 Of course class instances aren't immutable types: they're not even
 types.  Let me suggest that there is a distinction between an
 immutable type and an immutable object.
 
 Immutable types are what you are talking about: it means that the type
 provides no usable mutator methods.  (Whether they mutate the object
 itself or some associated object doesn't matter.)  Immutable objects
 are a different thing: it means the object cannot change in memory.
 
 Classes in Python are mutable types, usually.  Class instances are
 (except for the refcount) immutable objects, usually.
 
 We usually talk about mutability of types, but mutability of objects
 is appropriate for discussion as well.  So I can't really agree with
 your assessment that I was wrong to call class instances immutable objects
 aside from refcounts.
 
 BTW, here's a minor brain bender: immutable types are mutable objects.
 
 
 What's more, this matters when considering a GIL-less implementation.
 Typical method calls can traverse lots of mutable stuff just to find the
 function to invoke.
 
 Now that doesn't make sense at all.  What is all this mutable stuff
 you have to go through, and what does it have to do with the GIL-less
 implementation?  Can you explain further?  Or are you just saying
 it'll be slow.
 
OK, so we have recently discussed whether objects are values, whether
function arguments are passed by reference, whether names are
references, and now we are, I suspect, about to have a huge further
discussion on the meaning of immutable.

Sometimes I start to find this eternal pedantry a little tedious. I
suspect it's time I once more dropped out of c.l.py for a while.

regards
 Steve
-- 
Steve Holden+1 571 484 6266   +1 800 494 3119
Holden Web LLC  http://www.holdenweb.com/



Re: Why GIL? (was Re: what's the point of rpython?)

2009-01-24 Thread Carl Banks
On Jan 24, 12:40 am, Gabriel Genellina gagsl-...@yahoo.com.ar
wrote:
 En Sat, 24 Jan 2009 06:06:02 -0200, Carl Banks pavlovevide...@gmail.com  
 escribió:



  On Jan 23, 11:45 pm, Bryan Olson fakeaddr...@nowhere.org wrote:
  Carl Banks wrote:
   Classes in Python are mutable types, usually.  Class instances are
   (except for the refcount) immutable objects, usually.

  There's where we disagree. I assert that class instances are usually
  mutable objects.

  Nope, you're dead wrong, nothing more to it.  The bits of a class
  instance never change.  The __dict__ is a mutable object.  The class
  instance itself isn't.  It's not reasonable to call an object whose
  bits can't change a mutable object.

  Anyway, all you're doing is distracting attention from my claim that
  instance objects wouldn't need to be locked.  They wouldn't, no matter
  how mutable you insist these objects whose bits would never change
  are.

 Me too, I don't get what you mean. Consider a list instance, it contains a  
 count of allocated elements, and a pointer to some memory block. They  
 change when the list is resized. This counts as mutable to me. I really  
 don't understand your claim.


Yeah, yeah, I know that, and in the bickering that ensued some aspects
of the original context were lost.  I should really not have been
pulled into Bryan's strawman over the definition of immutable, since
it's just a label, I oughtn't give a damn what it's called, I only
care what it does.  I didn't handle this repartee very well.

Anyway, it goes back to the original vision for a mark-and-sweep
Python language as I presented what seems like a long time ago.

I presented the type system that had three base metatypes instead of
the one base metatype we have now: immutable_type, mutable_type, and
mutable_dict_type.  The default metatype for Python classes would be
mutable_dict_type, which is a type wherein the object itself would be
mutable but it would still have all the mutator methods __init__,
__setattr__, etc., but they could only act on the __dict__.
mutable_dict_types would not be allowed to define any slots, and
__dict__ wouldn't be reassignable.  (However, it seems reasonable to
allow the base tp_new to accept a dict argument.)
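
The restrictions on mutable_dict_type can be approximated in today's Python, purely as an illustration. MutableDictType, Instance and Account below are hypothetical names, and this is a sketch of the two rules (no slots, no reassigning __dict__), not the proposed C-level metatype.

```python
class MutableDictType(type):
    """Hypothetical stand-in for the proposed mutable_dict_type metatype."""

    def __new__(mcls, name, bases, namespace):
        if "__slots__" in namespace:
            raise TypeError("mutable_dict_type classes may not define slots")
        return super().__new__(mcls, name, bases, namespace)


class Instance(metaclass=MutableDictType):
    """Base class whose instances keep one __dict__ for their whole life."""

    def __setattr__(self, name, value):
        if name == "__dict__":
            raise AttributeError("__dict__ is not reassignable")
        object.__setattr__(self, name, value)


class Account(Instance):
    def __init__(self, owner):
        self.owner = owner        # stored in the dict, as usual


acct = Account("alice")
d = acct.__dict__
acct.balance = 100                # mutates the dict, not the instance
try:
    acct.__dict__ = {}
    reassigned = True
except AttributeError:
    reassigned = False
```

Ordinary attribute assignment still works, but every change lands in the one dict the instance was created with, which is the property the locking argument relies on.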

OTOH, list's metatype would be mutable_type, so the type object
itself would be mutable.

Bryan claimed that that would be a very different language from
Python, apparently because it hadn't occurred to him that by-and-
large, the instance itself doesn't change, only the dict does.
Perhaps Bryan was thinking of __dict__'s reassignability (that
certainly didn't occur to me); if he was I apologize for my snideness.

HAVING SAID THAT, I still say what I proposed would not be a
radically different language from Python.  A little different, of
course.  Much slower, almost certainly.


Carl Banks


Re: Why GIL?

2009-01-24 Thread Carl Banks
On Jan 24, 12:33 am, Hrvoje Niksic hnik...@xemacs.org wrote:
 Carl Banks pavlovevide...@gmail.com writes:
  Anyway, all you're doing is distracting attention from my claim that
  instance objects wouldn't need to be locked.  They wouldn't, no
  matter how mutable you insist these objects whose bits would never
  change are.

 Only if you're not implementing Python, but another language that
 doesn't support __slots__ and assignment to instance.__dict__.

I am only going to say all Python types prior to 3.0 support classes
without __slots__, so while I agree that this would be a different
language, it wouldn't necessarily be not Python.

(Python, of course, is what GvR says Python is, and he isn't going to
say that the language I presented is Python.  No worries there! :)
I'm only saying that it is conceivably similar enough to be a
different version of Python.  It would be a different language in the
same way that Python 2.6 is a different language from Python 3.0.)

Incidentally, the proposal does allow slots to be defined, but only
for actual mutable types, not for ordinary class instances.


Carl Banks


Re: Why GIL?

2009-01-24 Thread Carl Banks
On Jan 24, 12:24 pm, Carl Banks pavlovevide...@gmail.com wrote:
 On Jan 24, 12:33 am, Hrvoje Niksic hnik...@xemacs.org wrote:

  Carl Banks pavlovevide...@gmail.com writes:
   Anyway, all you're doing is distracting attention from my claim that
   instance objects wouldn't need to be locked.  They wouldn't, no
   matter how mutable you insist these objects whose bits would never
   change are.

  Only if you're not implementing Python, but another language that
  doesn't support __slots__ and assignment to instance.__dict__.

 I am only going to say all Python types prior to 3.0 support classes
 without __slots__,

I made a mistake, and I don't want to risk confusion at this point.

all Python ***versions*** prior to 3.0

and I am talking about old-style classes, of course.  Prior to 2.2 no
classes at all supported slots.


Carl Banks


Re: Why GIL? (was Re: what's the point of rpython?)

2009-01-24 Thread Carl Banks
On Jan 24, 12:05 pm, Carl Banks pavlovevide...@gmail.com wrote:
 The default metatype for Python classes would be
 mutable_dict_type, which is a type wherein the object itself would be
 mutable but it would still have all the mutator methods __init__,
 __setattr__, etc., but they could only act on the __dict__.


Not wanting to risk confusion.

The default metatype for Python classes would be mutable_dict_type,
which is a type wherein the object itself would be ***immutable*** but
it would still have all the mutator methods __init__, __setattr__,
etc., but they could only act on the __dict__.


Carl Banks


Re: Why GIL? (was Re: what's the point of rpython?)

2009-01-23 Thread Steve Holden
Paul Rubin wrote:
 Carl Banks pavlovevide...@gmail.com writes:
 3. If you are going to use the low-level API on a mutable object, or
 are going to access the object structure directly, you need to acquire
 the object's mutex. Macros such as Py_LOCK(), Py_LOCK2(), Py_UNLOCK()
 would be provided.
 
 You mean every time you access a list or dictionary or class instance,
 you have to acquire a mutex?  That sounds like a horrible slowdown.

Indeed it would, but hey, let's not let that stop us repeating the
thinking that's gone into CPython over the last fifteen years. Those
who cannot remember the past are condemned to repeat it.

regards
 Steve
-- 
Steve Holden+1 571 484 6266   +1 800 494 3119
Holden Web LLC  http://www.holdenweb.com/



Re: Why GIL? (was Re: what's the point of rpython?)

2009-01-23 Thread skip

 You mean every time you access a list or dictionary or class
 instance, you have to acquire a mutex?  That sounds like a horrible
 slowdown.

Steve> Indeed it would, but hey, let's not let that stop us repeating
Steve> the thinking that's gone into CPython over the last fifteen
Steve> years. Those who cannot remember the past are condemned to
Steve> repeat it.

Also, every object is mutable at some level.  Tuples, ints and floats are
definitely mutable at creation time.  You need to hold a mutex then, so
Carl's notion of three types of objects breaks down then.

Skip


Re: Why GIL? (was Re: what's the point of rpython?)

2009-01-23 Thread Paul Rubin
s...@pobox.com writes:
 Also, every object is mutable at some level.  Tuples, ints and floats are
 definitely mutable at creation time.  You need to hold a mutex then, so
 Carl's notion of three types of objects breaks down then.

Hopefully, at creation time, they will usually be in a scope where
other threads can't see them.


Re: Why GIL? (was Re: what's the point of rpython?)

2009-01-23 Thread Rhamphoryncus
On Jan 22, 11:09 pm, Carl Banks pavlovevide...@gmail.com wrote:
 On Jan 22, 9:38 pm, Rhamphoryncus rha...@gmail.com wrote:



  On Jan 22, 9:38 pm, Carl Banks pavlovevide...@gmail.com wrote:

   On Jan 22, 6:00 am, a...@pythoncraft.com (Aahz) wrote:

In article 7xd4ele060@ruckus.brouhaha.com,
Paul Rubin  http://phr...@nospam.invalid wrote:

alex23 wuwe...@gmail.com writes:

 Here's an article by Guido talking about the last attempt to remove
 the GIL and the performance issues that arose:

 I'd welcome a set of patches into Py3k *only if* the performance for
 a single-threaded program (and for a multi-threaded but I/O-bound
 program) *does not decrease*.

The performance decrease is an artifact of CPython's rather primitive
storage management (reference counts in every object).  This is
pervasive and can't really be removed.  But a new implementation
(e.g. PyPy) can and should have a real garbage collector that doesn't
suffer from such effects.

CPython's primitive storage management has a lot to do with the
simplicity of interfacing CPython with external libraries.  Any solution
that proposes to get rid of the GIL needs to address that.

   I recently was on a long road trip, and was not the driver, and with
   nothing better to do thought quite a bit about this.

   I concluded that, aside from one major trap, it wouldn't really be
   more difficult to interface Python to external libraries, just
   differently difficult.  Here is briefly what I came up with:

   1. Change the singular Python type into three metatypes:
   immutable_type, mutable_type, and mutable_dict_type.  (In the latter
   case, the object itself is immutable but the dict can be modified.
   This, of course, would be the default metaclass in Python.)  Only
   mutable_types would require a mutex when accessing.

   2. API wouldn't have to change much.  All regular API would assume
   that objects are unlocked (if mutable) and in a consistent state.
   It'll lock any mutable objects it needs to access.  There would also
   be a low-level API that assumes the objects are locked (if mutable)
   and does not require objects to be consistent.  I imagine most
   extensions would call the standard API most of the time.

   3. If you are going to use the low-level API on a mutable object, or
   are going to access the object structure directly, you need to acquire
   the object's mutex. Macros such as Py_LOCK(), Py_LOCK2(), Py_UNLOCK()
   would be provided.

   4. Objects would have to define a method, to be called by the GC, that
   marks every object it references.  This would be a lot like the
   current tp_visit, except it has to be defined for any object that
   references another object, not just objects that can participate in
   cycles.  (A conservative garbage collector wouldn't suffice for Python
   because Python quite often allocates blocks but sets the pointer to an
   offset within the block.  In fact, that's true of almost any Python-
   defined type.)  Unfortunately, references on the stack would need to
   be registered as well, so PyObject* p; might have to be replaced
   with something like Py_DECLARE_REF(PyObject,p); which magically
   registers it.  Ugly.

   5. Py_INCREF and Py_DECREF are gone.

   6. GIL is gone.

   So, you gain the complexity of a two-level API, having to lock mutable
   objects sometimes, and defining more visitor methods than before, but
   you don't have to keep INCREFs and DECREFs straight, which is no small
   thing.
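
To make point 4 concrete, here is a toy mark phase in Python. It is only a model under stated assumptions: Node, its mark() method and collect() are invented for illustration, and a real collector would work on C structures and also need the stack roots discussed above.

```python
class Node:
    """Toy heap object; `children` plays the role of outgoing references."""

    def __init__(self, name):
        self.name = name
        self.children = []
        self.marked = False

    def mark(self):
        # Analogue of the per-type visitor: mark this object, then
        # everything it references.
        if self.marked:
            return
        self.marked = True
        for child in self.children:
            child.mark()


def collect(heap, roots):
    """Return the names of the objects reachable from the given roots."""
    for obj in heap:
        obj.marked = False
    for root in roots:
        root.mark()
    return [obj.name for obj in heap if obj.marked]


a, b, c, d = (Node(n) for n in "abcd")
a.children.append(b)
c.children.append(d)    # c and d reference each other,
d.children.append(c)    # but no root reaches either of them
live = collect([a, b, c, d], roots=[a])
```

The cycle between c and d is reclaimed without any special handling, but only because every object that holds references takes part in marking.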

   The major trap is the possibility of deadlock.  To help minimize the
   risk there would be macros to lock multiple objects at once, such as
   Py_LOCK2(a,b), which guarantees that if another thread is calling
   Py_LOCK2(b,a) at the same time, it won't result in a deadlock.  What's
   disappointing is that the deadlocking possibility is always with you,
   much like the reference counts are.
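
One common way to get the guarantee described for Py_LOCK2 is to acquire the two locks in a fixed global order. The sketch below models that in Python with threading.Lock; lock2 and unlock2 are hypothetical helpers, and ordering by id() is an assumption made here, not something the proposal specifies.

```python
import threading


def lock2(first, second):
    """Acquire two distinct locks in a globally consistent order."""
    ordered = sorted((first, second), key=id)
    for lock in ordered:
        lock.acquire()
    return ordered


def unlock2(ordered):
    """Release the locks in the reverse of the order they were taken."""
    for lock in reversed(ordered):
        lock.release()


lock_a = threading.Lock()
lock_b = threading.Lock()
completed = []


def worker(x, y, label):
    # Each thread asks for the locks in its own argument order.
    for _ in range(1000):
        held = lock2(x, y)
        unlock2(held)
    completed.append(label)


t1 = threading.Thread(target=worker, args=(lock_a, lock_b, "ab"))
t2 = threading.Thread(target=worker, args=(lock_b, lock_a, "ba"))
t1.start()
t2.start()
t1.join()
t2.join()
```

Because both threads end up taking the locks in the same order regardless of how they were named in the call, neither can hold one lock while waiting on the other. The helper assumes the two locks are different objects; passing the same non-reentrant lock twice would block.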

  IMO, locking of the object is a secondary problem.  Python-safethread
  provides one solution, but it's not the only conceivable one.  For the
  sake of discussion it's easier to assume somebody else is solving it
  for you.

 That assumption might be good for the sake of the discussion *you*
 want to have, but it's not for the discussion I was having, which was to
 address Aahz's claim that GIL makes extension writing simple by
 presenting a vision of what Python might be like if it had a mark-and-
 sweep collector.  The details of the GC are a small part of that and
 wouldn't affect my main point even if they are quite different than I
 described.  Also, extension writers would have to worry about locking
 issues here, so it's not acceptable to assume somebody else will solve
 that problem.

  Instead, focus on just the garbage collection.

 [snip rest of threadjack]

 You can ignore most of what I was talking about and focus on
 technicalities of garbage collection if you want to.  I will not be
 joining you in that discussion, 

Re: Why GIL?

2009-01-23 Thread Hrvoje Niksic
Carl Banks pavlovevide...@gmail.com writes:

 Unfortunately, references on the stack would need to be registered
 as well, so PyObject* p; might have to be replaced with something
 like Py_DECLARE_REF(PyObject,p); which magically registers it.
 Ugly.

Not only registered at the beginning of the function, but also (since
CPython uses C, not C++) explicitly unregistered at every point of
exit from the function.  Emacs implements these as macros called GCPRO
and UNGCPRO, and they're very easy to get wrong.  In a way, they are
even worse than the current Python INCREF/DECREF.

See description at, for example,
http://www.xemacs.org/Documentation/beta/html/internals_19.html#SEC78


Re: Why GIL? (was Re: what's the point of rpython?)

2009-01-23 Thread Carl Banks
On Jan 23, 7:33 am, s...@pobox.com wrote:
      You mean every time you access a list or dictionary or class
      instance, you have to acquire a mutex?  That sounds like a horrible
      slowdown.

     Steve> Indeed it would, but hey, let's not let that stop us repeating
     Steve> the thinking that's gone into CPython over the last fifteen
     Steve> years. Those who cannot remember the past are condemned to
     Steve> repeat it.

 Also, every object is mutable at some level.  Tuples, ints and floats are
 definitely mutable at creation time.  You need to hold a mutex then, so
 Carl's notion of three types of objects breaks down then.

immutable_type objects wouldn't exist at all until their
PyWhatever_New or their tp_new member is called.  After that, the
reference exists only on the local stack, which is accessible only to
one thread.  As long as you finish initializing the object while it's
still only on the stack, there is no possibility of a conflict.
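
The same idea can be modelled at the Python level: build the object using only local names, and hand it to other threads only once it is complete. The names published and build_record are made up for this sketch, and queue.Queue stands in for whatever publication mechanism an implementation would use.

```python
import queue
import threading

published = queue.Queue()


def build_record(name):
    # Until put() below, `parts` and `record` are reachable only from
    # this thread's locals, so no other thread can see a half-built value.
    parts = [name, len(name)]
    record = tuple(parts)
    published.put(record)       # publication point


worker = threading.Thread(target=build_record, args=("spam",))
worker.start()
worker.join()
result = published.get()
```

No lock is needed around the construction itself; only the hand-off is synchronized.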

What about tp_init, then, you ask?  Well it's simple: immutable_type
doesn't call it.  In fact, it requires that tp_init, tp_setattro,
tp_mapping->mp_setitem, etc., are all null.

immutable_objects have no instance dict, so if you want to create
attributes in Python you have to use slots.  immutable_object.__new__
accepts keyword arguments and initializes the slots with the value.

class Record(immutable_object,slots=['name','number']):
    def __new__(cls,name):
        number = db.lookup_number(name)
        return immutable_object.__new__(cls,name=name,number=number)


Carl Banks


Re: Why GIL? (was Re: what's the point of rpython?)

2009-01-23 Thread Steve Holden
Rhamphoryncus wrote:
[... eighty-eight quoted lines ...]
 
 I'm sorry, you're right, I misunderstood your context.

Perhaps you could trim your posts to quote only the relevant context?
Thanks.
-- 
Steve Holden+1 571 484 6266   +1 800 494 3119
Holden Web LLC  http://www.holdenweb.com/



Re: Why GIL? (was Re: what's the point of rpython?)

2009-01-23 Thread Bryan Olson

Carl Banks wrote:
[...]

BTW, class instances are usually immutable and thus don't require a
mutex in the system I described.


Then you are describing a language radically different from Python.


--
--Bryan


Re: Why GIL? (was Re: what's the point of rpython?)

2009-01-23 Thread Carl Banks
On Jan 23, 5:48 pm, Bryan Olson fakeaddr...@nowhere.org wrote:
 Carl Banks wrote:

 [...]

  BTW, class instances are usually immutable and thus don't require a
  mutex in the system I described.

 Then you are describing a language radically different from Python.

Bzzt.

Hint: aside from the reference count, most class instances are
immutable in Python *today*.


Carl Banks


Re: Why GIL? (was Re: what's the point of rpython?)

2009-01-23 Thread Paul Rubin
Bryan Olson fakeaddr...@nowhere.org writes:
  BTW, class instances are usually immutable and thus don't require a
  mutex in the system I described.
 Then you are describing a language radically different from Python.

That one threw me for a minute too, but I think the idea is that the
class instance itself is immutable, while its slots (specifically the
attribute dictionary) point to mutable objects.


Re: Why GIL? (was Re: what's the point of rpython?)

2009-01-23 Thread Steven D'Aprano
On Fri, 23 Jan 2009 18:54:18 -0800, Carl Banks wrote:

 On Jan 23, 5:48 pm, Bryan Olson fakeaddr...@nowhere.org wrote:
 Carl Banks wrote:

 [...]

  BTW, class instances are usually immutable and thus don't require a
  mutex in the system I described.

 Then you are describing a language radically different from Python.
 
 Bzzt.
 
 Hint: aside from the reference count, most class instances are immutable
 in Python *today*.


That seems so utterly wrong that either you're an idiot or you're talking 
at cross purposes to what Bryan and I think you're saying. Since I know 
you're not an idiot, I can only imagine you have a different 
understanding of what it means to be immutable than I do.

For example... is this instance immutable?

class Foo:
    bar = None

f = Foo()
f.baz = True



If so, what do you mean by immutable?



-- 
Steven


Re: Why GIL? (was Re: what's the point of rpython?)

2009-01-23 Thread Paul Rubin
Steven D'Aprano st...@remove-this-cybersource.com.au writes:
 For example... is this instance immutable?
 
 class Foo:
     bar = None
 
 f = Foo()
 f.baz = True
 If so, what do you mean by immutable?

If I understand Carl, yes, f is immutable.  When you set f.baz, the
contents of f.__dict__ change but f itself does not change.  It still
points to the same dictionary, etc.
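
This reading can be checked from Python itself.  A small sketch using the Foo
class from the question (how the attributes are laid out in memory is an
implementation detail that varies between CPython versions; only the
observable behaviour is shown here):

```python
class Foo(object):
    bar = None

f = Foo()
attrs = f.__dict__            # the dict the instance points to

f.baz = True                  # changes the dict's contents

print(f.__dict__ is attrs)    # still the very same dict object
print(attrs)                  # the new attribute landed in that dict
print("bar" in attrs)         # class attributes never enter the instance dict
```

In this sense the assignment mutates the dictionary, while the instance keeps
referring to the same dictionary and the same class.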


Re: Why GIL? (was Re: what's the point of rpython?)

2009-01-23 Thread Bryan Olson

Paul Rubin wrote:
 Bryan Olson writes:
   BTW, class instances are usually immutable and thus don't require a
   mutex in the system I described.

  Then you are describing a language radically different from Python.

 That one threw me for a minute too, but I think the idea is that the
 class instance itself is immutable, while its slots (specifically the
 attribute dictionary) point to mutable objects.


The meaning of 'immutable' is well-established in the Python literature. 
Python's immutable types include tuple, frozenset, and various kinds of 
numbers and strings. Class instances, not so much.


What's more, this matters when considering a GIL-less implementation. 
Typical method calls can traverse lots of mutable stuff just to find the 
function to invoke.



--
--Bryan





Re: Why GIL? (was Re: what's the point of rpython?)

2009-01-23 Thread Carl Banks
On Jan 23, 7:19 pm, Paul Rubin http://phr...@nospam.invalid wrote:
 Bryan Olson fakeaddr...@nowhere.org writes:
   BTW, class instances are usually immutable and thus don't require a
   mutex in the system I described.
  Then you are describing a language radically different from Python.

 That one threw me for a minute too, but I think the idea is that the
 class instance itself is immutable, while its slots (specifically the
 attribute dictionary) point to mutable objects.

Correct, and, getting back to the point, an instance itself would not
require a mutex.  The dict would need it, of course.

It's customary to gloss over this technicality for convenience's sake
in most discussions, but it matters in this case.


Carl Banks


Re: Why GIL? (was Re: what's the point of rpython?)

2009-01-23 Thread Paul Rubin
Bryan Olson fakeaddr...@nowhere.org writes:
 The meaning of 'immutable' is well-established in the Python
 literature. Python's immutable types include tuple, frozenset, and
 various kinds of numbers and strings. Class instances, not so much.

But we are talking about objects as they live in the C implementation,
not at the level where Python code deals with them.


Re: Why GIL? (was Re: what's the point of rpython?)

2009-01-23 Thread Carl Banks
On Jan 23, 8:22 pm, Bryan Olson fakeaddr...@nowhere.org wrote:
 Paul Rubin wrote:
  Bryan Olson writes:
  BTW, class instances are usually immutable and thus don't require a
  mutex in the system I described.
  Then you are describing a language radically different from Python.

  That one threw me for a minute too, but I think the idea is that the
  class instance itself is immutable, while its slots (specifically the
  attribute dictionary) point to mutable objects.

 The meaning of 'immutable' is well-established in the Python literature.
 Python's immutable types include tuple, frozenset, and various kinds of
 numbers and strings. Class instances, not so much.

Of course class instances aren't immutable types: they're not even
types.  Let me suggest that there is a distinction between an
immutable type and an immutable object.

Immutable types are what you are talking about: it means that the type
provides no usable mutator methods.  (Whether they mutate the object
itself or some associated object doesn't matter.)  Immutable objects
are a different thing: it means the object cannot change in memory.

Classes in Python are mutable types, usually.  Class instances are
(except for the refcount) immutable objects, usually.

We usually talk about mutability of types, but mutability of objects
is appropriate for discussion as well.  So I can't really agree with
your assessment that I was wrong to call class instances immutable objects
aside from refcounts.

BTW, here's a minor brain bender: immutable types are mutable objects.


 What's more, this matters when considering a GIL-less implementation.
 Typical method calls can traverse lots of mutable stuff just to find the
 function to invoke.

Now that doesn't make sense at all.  What is all this mutable stuff
you have to go through, and what does it have to do with the GIL-less
implementation?  Can you explain further?  Or are you just saying
it'll be slow?


Carl Banks


Re: Why GIL? (was Re: what's the point of rpython?)

2009-01-23 Thread Paul Rubin
Carl Banks pavlovevide...@gmail.com writes:
  What's more, this matters when considering a GIL-less implementation.
  Typical method calls can traverse lots of mutable stuff just to find the
  function to invoke.
 
 Now that doesn't make sense at all.  What is all this mutable stuff
 you have to go through, and what does it have to do with the GIL-less
 implementation? 

foo.bar() has to look up bar in foo's attribute dictionary.


Re: Why GIL? (was Re: what's the point of rpython?)

2009-01-23 Thread Bryan Olson

Carl Banks wrote:
 Paul Rubin wrote:
  Bryan Olson writes:
    BTW, class instances are usually immutable and thus don't require a
    mutex in the system I described.

   Then you are describing a language radically different from Python.

  That one threw me for a minute too, but I think the idea is that the
  class instance itself is immutable, while its slots (specifically the
  attribute dictionary) point to mutable objects.

 Correct, and, getting back to the point, an instance itself would not
 require a mutex.  The dict would need it, of course.


The dict is part of the object and some important slots are mutable. 
What's more, if your point was to do away with the GIL without changing 
Python semantics nor requiring heaping masses of locking, I fear you've 
not fully grasped the problem.


Languages such as Java, C++, and C# do not require nearly as much 
locking as Python because they are not nearly as dynamic. Consider how a 
method is invoked. Java / C++ / C# can always resolve the method with no 
locking; the data they need is fixed at link time. Python is much more 
dynamic. A demo:



from __future__ import print_function

# A simple class hierarchy:

class Foo (object):
    title = "Mr. Foo"

    def identify(self):
        print("I'm called", self.title)

class Bar (Foo):
    title = "Ms. Bar"

class Jafo (Bar):
    title = "Major Jafo"

dude = Jafo()


# Searches 5 dicts to find the function to call:

dude.identify()


# Class dicts are mutable:

def id(self):
    print("I'm still called", self.title)

Jafo.identify = id
dude.identify()


# An object's class can change:

dude.__class__ = Bar
dude.identify()


# A class's base classes can change:

class Fu (object):
    def identify(self):
        print("Call me", self.title)

Bar.__bases__ = (Fu,)
dude.identify()


Result:

I'm called Major Jafo
I'm still called Major Jafo
I'm called Ms. Bar
Call me Ms. Bar



In that first simple call of dude.identify(), Python looked up "dude" in
the module's (mutable) dict to find the object. Then it looked in
the object's (mutable) dict, and did not find "identify". So it looked at
the object's (mutable) __class__ slot, and in that class's (mutable)
dict. It still did not find "identify", so it looked in the class's
(mutable) __bases__ slot, following Python's depth-first object
protocol and thus looking in what other (mutable) class dicts and
(mutable) __bases__ slots were required.
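
The search described above can be imitated in a few lines.  This is a
simplified sketch, not the interpreter's algorithm: find_attribute is a
made-up helper, it ignores descriptors, __getattr__ and metaclasses, and it
walks the precomputed __mro__ tuple that new-style classes carry rather than
chasing __bases__ by hand.

```python
def find_attribute(obj, name):
    """Look in the instance dict, then in each class dict along the MRO.

    Returns the raw value found and the list of places searched.
    """
    searched = ["instance"]
    instance_dict = getattr(obj, "__dict__", {})
    if name in instance_dict:
        return instance_dict[name], searched
    for klass in type(obj).__mro__:
        searched.append(klass.__name__)
        if name in klass.__dict__:
            return klass.__dict__[name], searched
    raise AttributeError(name)

class Foo(object):
    def identify(self):
        return "Foo.identify"

class Bar(Foo):
    pass

class Jafo(Bar):
    pass

func, searched = find_attribute(Jafo(), "identify")
print(searched)           # every dict consulted before the hit
print(func(Jafo()))       # the plain function stored in Foo's dict
```

Every container consulted along the way can be changed by another thread,
which is the concern for a GIL-less implementation.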


An object's __dict__ slot is *not* mutable; thus we could gain some 
efficiency by protecting the object and its dict with the same lock. I 
do not see a major win in Mr. Banks' point that we do not need to lock 
the object, just its dict.



--
--Bryan


Re: Why GIL? (was Re: what's the point of rpython?)

2009-01-23 Thread Paul Rubin
Bryan Olson fakeaddr...@nowhere.org writes:
 An object's __dict__ slot is *not* mutable; thus we could gain some
 efficiency by protecting the object and its dict with the same lock. I
 do not see a major win in Mr. Banks' point that we do not need to lock
 the object, just its dict.

If the dict contents don't change often, maybe we could use an
STM-like approach to eliminate locks when reading.  That would of
course require rework to just about every C function that accesses
Python objects.
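
One way to picture lock-free reads for rarely changing data is a version
counter that readers check before and after reading.  The sketch below is a
seqlock-style retry loop, not a real STM, and VersionedDict is an invented
name; it only illustrates the control flow in pure Python, whereas a C
implementation would also have to deal with memory ordering and with reads
that race against a resize.

```python
import threading

class VersionedDict(object):
    """Dict wrapper whose readers retry instead of taking the lock."""

    def __init__(self):
        self._data = {}
        self._version = 0                   # even: stable, odd: write in progress
        self._write_lock = threading.Lock()

    def set(self, key, value):
        with self._write_lock:              # writers still serialize
            self._version += 1              # now odd: readers will retry
            self._data[key] = value
            self._version += 1              # even again: write is complete

    def get(self, key, default=None):
        while True:
            start = self._version
            if start % 2:                   # a write is in progress, try again
                continue
            value = self._data.get(key, default)
            if self._version == start:      # nothing changed while we read
                return value

d = VersionedDict()
d.set("title", "Major Jafo")
print(d.get("title"))
```

Readers pay two counter reads instead of a lock acquisition, at the price of
occasionally repeating the read when a writer was active.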


Re: Why GIL? (was Re: what's the point of rpython?)

2009-01-23 Thread Carl Banks
On Jan 23, 10:55 pm, Bryan Olson fakeaddr...@nowhere.org wrote:
 Carl Banks wrote:
  Paul Rubin wrote:
  Bryan Olson writes:
  BTW, class instances are usually immutable and thus don't require a
  mutex in the system I described.
  Then you are describing a language radically different from Python.
  That one threw me for a minute too, but I think the idea is that the
  class instance itself is immutable, while its slots (specifically the
  attribute dictionary) point to mutable objects.

  Correct, and, getting back to the point, an instance itself would not
  require a mutex.  The dict would need it, of course.

 The dict is part of the object and some important slots are mutable.
 What's more, if your point was to do away with the GIL without changing
 Python semantics nor requiring heaping masses of locking, I fear you've
 not fully grasped the problem.

If that's what you think I thought, I fear you haven't read anything
I've written.

[snip]
 An object's __dict__ slot is *not* mutable; thus we could gain some
 efficiency by protecting the object and its dict with the same lock. I
 do not see a major win in Mr. Banks' point that we do not need to lock
 the object, just its dict.

I'm not sure where you got the idea that I was claiming this was a
major win.  I'm not sure where you got the idea that I claimed that
having to lock all mutable objects wouldn't be slow.  For Pete's sake,
you followed up to a post where I *agreed* that it would be slow.


Carl Banks


Re: Why GIL? (was Re: what's the point of rpython?)

2009-01-23 Thread Bryan Olson

Carl Banks wrote:
 Bryan Olson wrote:
  Paul Rubin wrote:
   Bryan Olson writes:
     BTW, class instances are usually immutable and thus don't require a
     mutex in the system I described.

    Then you are describing a language radically different from Python.

   That one threw me for a minute too, but I think the idea is that the
   class instance itself is immutable, while its slots (specifically the
   attribute dictionary) point to mutable objects.

  The meaning of 'immutable' is well-established in the Python literature.
  Python's immutable types include tuple, frozenset, and various kinds of
  numbers and strings. Class instances, not so much.

 Of course class instances aren't immutable types: they're not even
 types.

Class instances may or may not be types, but that has nothing to do with
any point at issue here. I'm saying that class instances are usually
mutable, contrary to your claim that class instances are usually immutable.

 Let me suggest that there is a distinction between an
 immutable type and an immutable object.

Let me further suggest that Python's documentation is entirely clear:
instances of immutable types are immutable objects. Instances of mutable
types are generally mutable objects. For example, tuple is an immutable
type, and thus tuples are immutable; list is a mutable type, and thus
lists are mutable.

 Immutable types are what you are talking about: it means that the type
 provides no usable mutator methods.  (Whether they mutate the object
 itself or some associated object doesn't matter.)  Immutable objects
 are a different thing: it means the object cannot change in memory.

 Classes in Python are mutable types, usually.  Class instances are
 (except for the refcount) immutable objects, usually.

There's where we disagree. I assert that class instances are usually
mutable objects.

 We usually talk about mutability of types, but mutability of objects
 is appropriate for discussion as well.  So I can't really agree with
 your assessment that I was wrong to call class instances immutable objects
 aside from refcounts.

That confusion disappears once one grasps that instances of immutable
types are immutable objects.

 BTW, here's a minor brain bender: immutable types are mutable objects.

Some brains are too easily bent. Python is one of the many
object-oriented languages that reify types as run-time objects. I see
no point in going through Python's immutable types to examine if there
is any way to mutate the corresponding type objects.

  What's more, this matters when considering a GIL-less implementation.
  Typical method calls can traverse lots of mutable stuff just to find the
  function to invoke.

 Now that doesn't make sense at all.  What is all this mutable stuff
 you have to go through, and what does it have to do with the GIL-less
 implementation?  Can you explain further?  Or are you just saying
 it'll be slow.

I elaborated at some length in another strand of this thread.


--
--Bryan


Why GIL? (was Re: what's the point of rpython?)

2009-01-22 Thread Aahz
In article 7xd4ele060@ruckus.brouhaha.com,
Paul Rubin  http://phr...@nospam.invalid wrote:
 alex23 wuwe...@gmail.com writes:

  Here's an article by Guido talking about the last attempt to remove
  the GIL and the performance issues that arose:

  I'd welcome a set of patches into Py3k *only if* the performance for
  a single-threaded program (and for a multi-threaded but I/O-bound
  program) *does not decrease*.

 The performance decrease is an artifact of CPython's rather primitive
 storage management (reference counts in every object).  This is
 pervasive and can't really be removed.  But a new implementation
 (e.g. PyPy) can and should have a real garbage collector that doesn't
 suffer from such effects.

CPython's primitive storage management has a lot to do with the
simplicity of interfacing CPython with external libraries.  Any solution
that proposes to get rid of the GIL needs to address that.
-- 
Aahz (a...@pythoncraft.com)   * http://www.pythoncraft.com/

Weinberg's Second Law: If builders built buildings the way programmers wrote 
programs, then the first woodpecker that came along would destroy civilization.


Re: Why GIL? (was Re: what's the point of rpython?)

2009-01-22 Thread Carl Banks
On Jan 22, 6:00 am, a...@pythoncraft.com (Aahz) wrote:
 In article 7xd4ele060@ruckus.brouhaha.com,
 Paul Rubin  http://phr...@nospam.invalid wrote:

 alex23 wuwe...@gmail.com writes:

  Here's an article by Guido talking about the last attempt to remove
  the GIL and the performance issues that arose:

  I'd welcome a set of patches into Py3k *only if* the performance for
  a single-threaded program (and for a multi-threaded but I/O-bound
  program) *does not decrease*.

 The performance decrease is an artifact of CPython's rather primitive
 storage management (reference counts in every object).  This is
 pervasive and can't really be removed.  But a new implementation
 (e.g. PyPy) can and should have a real garbage collector that doesn't
 suffer from such effects.

 CPython's primitive storage management has a lot to do with the
 simplicity of interfacing CPython with external libraries.  Any solution
 that proposes to get rid of the GIL needs to address that.

I recently was on a long road trip, and was not the driver, and with
nothing better to do thought quite a bit about this.

I concluded that, aside from one major trap, it wouldn't really be
more difficult to interface Python to external libraries, just
differently difficult.  Here is briefly what I came up with:

1. Change the singular Python type into three metatypes:
immutable_type, mutable_type, and mutable_dict_type.  (In the latter
case, the object itself is immutable but the dict can be modified.
This, of course, would be the default metaclass in Python.)  Only
mutable_types would require a mutex when accessing.

2. API wouldn't have to change much.  All regular API would assume
that objects are unlocked (if mutable) and in a consistent state.
It'll lock any mutable objects it needs to access.  There would also
be a low-level API that assumes the objects are locked (if mutable)
and does not require objects to be consistent.  I imagine most
extensions would call the standard API most of the time.

3. If you are going to use the low-level API on a mutable object, or
are going to access the object structure directly, you need to acquire
the object's mutex. Macros such as Py_LOCK(), Py_LOCK2(), Py_UNLOCK()
would be provided.

4. Objects would have to define a method, to be called by the GC, that
marks every object it references.  This would be a lot like the
current tp_traverse, except it has to be defined for any object that
references another object, not just objects that can participate in
cycles.  (A conservative garbage collector wouldn't suffice for Python
because Python quite often allocates blocks but sets the pointer to an
offset within the block.  In fact, that's true of almost any Python-
defined type.)  Unfortunately, references on the stack would need to
be registered as well, so "PyObject* p;" might have to be replaced
with something like "Py_DECLARE_REF(PyObject,p);" which magically
registers it.  Ugly.

5. Py_INCREF and Py_DECREF are gone.

6. GIL is gone.
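
The visitor idea in point 4 can be sketched as a toy mark-and-sweep pass.
Everything here is hypothetical (Node, visit_references, mark and sweep are
invented names, and the roots are passed in explicitly instead of being
registered from the C stack); it only shows why each object must report all
of its references and why cycles need no special handling.

```python
class Node(object):
    """Toy heap object; 'children' plays the role of owned references."""

    def __init__(self, name):
        self.name = name
        self.children = []
        self.marked = False

    def visit_references(self, visit):
        # Analogue of the per-type visitor: report every reference held.
        for child in self.children:
            visit(child)

def mark(roots):
    """Mark everything reachable from the given roots."""
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj.marked:
            continue
        obj.marked = True
        obj.visit_references(stack.append)

def sweep(heap):
    """Return the survivors and clear their mark bits for the next cycle."""
    live = [obj for obj in heap if obj.marked]
    for obj in live:
        obj.marked = False
    return live

a, b, c = Node("a"), Node("b"), Node("c")
a.children.append(b)
b.children.append(a)        # a cycle is harmless for a tracing collector
heap = [a, b, c]
mark(roots=[a])             # only 'a' is registered as a root
heap = sweep(heap)
print([n.name for n in heap])
```

An object that forgets to report one reference in its visitor would have
that referent swept while still in use, which is the counterpart of a missed
INCREF today.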

So, you gain the complexity of a two-level API, having to lock mutable
objects sometimes, and defining more visitor methods than before, but
you don't have to keep INCREFs and DECREFs straight, which is no small
thing.

The major trap is the possibility of deadlock.  To help minimize the
risk there would be macros to lock multiple objects at once, such as
Py_LOCK2(a,b), which guarantees that if another thread is calling
Py_LOCK2(b,a) at the same time, it won't result in a deadlock.  What's
disappointing is that the deadlocking possibility is always with you,
much like the reference counts are.
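
One common way to get that guarantee is to always acquire the two mutexes in
a fixed global order.  The following is a sketch of that technique in Python,
not the proposed macro: Lockable, lock2 and unlock2 are invented names, and
ordering by id() is just one possible choice of global order.

```python
import threading

class Lockable(object):
    """Hypothetical stand-in for an object that carries its own mutex."""

    def __init__(self, name):
        self.name = name
        self.mutex = threading.Lock()

def lock2(a, b):
    """Acquire both mutexes, always in the same global order (by id)."""
    first, second = sorted((a, b), key=id)
    first.mutex.acquire()
    if second is not first:
        second.mutex.acquire()

def unlock2(a, b):
    """Release in the reverse of the acquisition order."""
    first, second = sorted((a, b), key=id)
    if second is not first:
        second.mutex.release()
    first.mutex.release()

x, y = Lockable("x"), Lockable("y")
counter = [0]

def worker(p, q):
    for _ in range(1000):
        lock2(p, q)
        counter[0] += 1          # protected by both mutexes
        unlock2(p, q)

t1 = threading.Thread(target=worker, args=(x, y))
t2 = threading.Thread(target=worker, args=(y, x))   # opposite argument order
t1.start()
t2.start()
t1.join()
t2.join()
print(counter[0])
```

The ordering only helps when every code path that needs both objects goes
through the same helper; taking one lock directly and then calling into code
that takes the other can still deadlock.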


Carl Banks


Re: Why GIL? (was Re: what's the point of rpython?)

2009-01-22 Thread Paul Rubin
a...@pythoncraft.com (Aahz) writes:
 CPython's primitive storage management has a lot to do with the
 simplicity of interfacing CPython with external libraries.  Any solution
 that proposes to get rid of the GIL needs to address that.

This, I don't understand.  Other languages like Lisp and Java and
Haskell have foreign function interfaces that are easier to program than
Python's, -and- they don't use reference counts.  There's usually some
primitive to protect objects from garbage collection while the foreign
function is using them, etc.  The Java Native Interface (JNI) and the
Haskell FFI are pretty well documented.  The Emacs Lisp system is not
too hard to figure out from examining the source code, etc.


Re: Why GIL? (was Re: what's the point of rpython?)

2009-01-22 Thread Rhamphoryncus
On Jan 22, 9:38 pm, Carl Banks pavlovevide...@gmail.com wrote:
 On Jan 22, 6:00 am, a...@pythoncraft.com (Aahz) wrote:



  In article 7xd4ele060@ruckus.brouhaha.com,
  Paul Rubin  http://phr...@nospam.invalid wrote:

  alex23 wuwe...@gmail.com writes:

   Here's an article by Guido talking about the last attempt to remove
   the GIL and the performance issues that arose:

   I'd welcome a set of patches into Py3k *only if* the performance for
   a single-threaded program (and for a multi-threaded but I/O-bound
   program) *does not decrease*.

  The performance decrease is an artifact of CPython's rather primitive
  storage management (reference counts in every object).  This is
  pervasive and can't really be removed.  But a new implementation
  (e.g. PyPy) can and should have a real garbage collector that doesn't
  suffer from such effects.

  CPython's primitive storage management has a lot to do with the
  simplicity of interfacing CPython with external libraries.  Any solution
  that proposes to get rid of the GIL needs to address that.

 I recently was on a long road trip, and was not the driver, and with
 nothing better to do thought quite a bit about this.

 I concluded that, aside from one major trap, it wouldn't really be
 more difficult to interface Python to external libraries, just
 differently difficult.  Here is briefly what I came up with:

 1. Change the singular Python type into three metatypes:
 immutable_type, mutable_type, and mutable_dict_type.  (In the latter
 case, the object itself is immutable but the dict can be modified.
 This, of course, would be the default metaclass in Python.)  Only
 mutable_types would require a mutex when accessing.

 2. API wouldn't have to change much.  All regular API would assume
 that objects are unlocked (if mutable) and in a consistent state.
 It'll lock any mutable objects it needs to access.  There would also
 be a low-level API that assumes the objects are locked (if mutable)
 and does not require objects to be consistent.  I imagine most
 extensions would call the standard API most of the time.

 3. If you are going to use the low-level API on a mutable object, or
 are going to access the object structure directly, you need to acquire
 the object's mutex. Macros such as Py_LOCK(), Py_LOCK2(), Py_UNLOCK()
 would be provided.

 4. Objects would have to define a method, to be called by the GC, that
 marks every object it references.  This would be a lot like the
 current tp_visit, except it has to be defined for any object that
 references another object, not just objects that can participate in
 cycles.  (A conservative garbage collector wouldn't suffice for Python
 because Python quite often allocates blocks but sets the pointer to an
 offset within the block.  In fact, that's true of almost any Python-
 defined type.)  Unfortunately, references on the stack would need to
 be registered as well, so PyObject* p; might have to be replaced
 with something like Py_DECLARE_REF(PyObject,p); which magically
 registers it.  Ugly.

 5. Py_INCREF and Py_DECREF are gone.

 6. GIL is gone.

 So, you gain the complexity of a two-level API, having to lock mutable
 objects sometimes, and defining more visitor methods than before, but
 you don't have to keep INCREFs and DECREFs straight, which is no small
 thing.

 The major trap is the possibility of deadlock.  To help minimize the
 risk there would be macros to lock multiple objects at once, such as
 Py_LOCK2(a,b), which guarantees that if another thread is calling
 Py_LOCK2(b,a) at the same time, it won't result in a deadlock.  What's
 disappointing is that the deadlocking possibility is always with you,
 much like the reference counts are.

IMO, locking of the object is a secondary problem.  Python-safethread
provides one solution, but it's not the only conceivable one.  For the
sake of discussion it's easier to assume somebody else is solving it
for you.

Instead, focus on just the garbage collection.  What are the practical
issues of modifying CPython to use a tracing GC throughout?  It
certainly is possible to write an exact GC in C, but the stack
manipulation would be hideous.  It'd also require significant rewrites
of the entire code base.  Add to that the fact that the performance is
unclear (it could be far worse for a single-threaded program), with no
straightforward way to make it a compile-time option.

Got any ideas for that?


Re: Why GIL? (was Re: what's the point of rpython?)

2009-01-22 Thread Carl Banks
On Jan 22, 9:38 pm, Rhamphoryncus rha...@gmail.com wrote:
 On Jan 22, 9:38 pm, Carl Banks pavlovevide...@gmail.com wrote:



  On Jan 22, 6:00 am, a...@pythoncraft.com (Aahz) wrote:

   In article 7xd4ele060@ruckus.brouhaha.com,
   Paul Rubin  http://phr...@nospam.invalid wrote:

   alex23 wuwe...@gmail.com writes:

Here's an article by Guido talking about the last attempt to remove
the GIL and the performance issues that arose:

I'd welcome a set of patches into Py3k *only if* the performance for
a single-threaded program (and for a multi-threaded but I/O-bound
program) *does not decrease*.

   The performance decrease is an artifact of CPython's rather primitive
   storage management (reference counts in every object).  This is
   pervasive and can't really be removed.  But a new implementation
   (e.g. PyPy) can and should have a real garbage collector that doesn't
   suffer from such effects.

   CPython's primitive storage management has a lot to do with the
   simplicity of interfacing CPython with external libraries.  Any solution
   that proposes to get rid of the GIL needs to address that.

  I recently was on a long road trip, and was not the driver, and with
  nothing better to do thought quite a bit about this.

  I concluded that, aside from one major trap, it wouldn't really be
  more difficult to interface Python to external libraries, just
  differently difficult.  Here is briefly what I came up with:

  1. Change the singular Python type into three metatypes:
  immutable_type, mutable_type, and mutable_dict_type.  (In the latter
  case, the object itself is immutable but the dict can be modified.
  This, of course, would be the default metaclass in Python.)  Only
  mutable_types would require a mutex when accessing.

  2. API wouldn't have to change much.  All regular API would assume
  that objects are unlocked (if mutable) and in a consistent state.
  It'll lock any mutable objects it needs to access.  There would also
  be a low-level API that assumes the objects are locked (if mutable)
  and does not require objects to be consistent.  I imagine most
  extensions would call the standard API most of the time.

  3. If you are going to use the low-level API on a mutable object, or
  are going to access the object structure directly, you need to acquire
  the object's mutex. Macros such as Py_LOCK(), Py_LOCK2(), Py_UNLOCK()
  would be provided.

  4. Objects would have to define a method, to be called by the GC, that
  marks every object it references.  This would be a lot like the
  current tp_visit, except it has to be defined for any object that
  references another object, not just objects that can participate in
  cycles.  (A conservative garbage collector wouldn't suffice for Python
  because Python quite often allocates blocks but sets the pointer to an
  offset within the block.  In fact, that's true of almost any Python-
  defined type.)  Unfortunately, references on the stack would need to
  be registered as well, so PyObject* p; might have to be replaced
  with something like Py_DECLARE_REF(PyObject,p); which magically
  registers it.  Ugly.

  5. Py_INCREF and Py_DECREF are gone.

  6. GIL is gone.

  So, you gain the complexity of a two-level API, having to lock mutable
  objects sometimes, and defining more visitor methods than before, but
  you don't have to keep INCREFs and DECREFs straight, which is no small
  thing.

  The major trap is the possibility of deadlock.  To help minimize the
  risk there would be macros to lock multiple objects at once, such as
  Py_LOCK2(a,b), which guarantees that if another thread is calling
  Py_LOCK2(b,a) at the same time, it won't result in a deadlock.  What's
  disappointing is that the deadlocking possibility is always with you,
  much like the reference counts are.

 IMO, locking of the object is a secondary problem.  Python-safethread
 provides one solution, but it's not the only conceivable one.  For the
 sake of discussion it's easier to assume somebody else is solving it
 for you.

That assumption might be good for the sake of the discussion *you*
want to have, but it's not for the discussion I was having, which was to
address Aahz's claim that the GIL makes extension writing simple by
presenting a vision of what Python might be like if it had a mark-and-
sweep collector.  The details of the GC are a small part of that and
wouldn't affect my main point even if they are quite different than I
described.  Also, extension writers would have to worry about locking
issues here, so it's not acceptable to assume somebody else will solve
that problem.


 Instead, focus on just the garbage collection.
[snip rest of threadjack]

You can ignore most of what I was talking about and focus on
technicalities of garbage collection if you want to.  I will not be
joining you in that discussion, however.


Carl Banks


Re: Why GIL? (was Re: what's the point of rpython?)

2009-01-22 Thread Paul Rubin
Carl Banks pavlovevide...@gmail.com writes:
 3. If you are going to use the low-level API on a mutable object, or
 are going to access the object structure directly, you need to acquire
 the object's mutex. Macros such as Py_LOCK(), Py_LOCK2(), Py_UNLOCK()
 would be provided.

You mean every time you access a list or dictionary or class instance,
you have to acquire a mutex?  That sounds like a horrible slowdown.


Re: Why GIL? (was Re: what's the point of rpython?)

2009-01-22 Thread Carl Banks
On Jan 22, 10:15 pm, Paul Rubin http://phr...@nospam.invalid wrote:
 Carl Banks pavlovevide...@gmail.com writes:
  3. If you are going to use the low-level API on a mutable object, or
  are going to access the object structure directly, you need to acquire
  the object's mutex. Macros such as Py_LOCK(), Py_LOCK2(), Py_UNLOCK()
  would be provided.

 You mean every time you access a list or dictionary or class instance,
 you have to acquire a mutex?  That sounds like a horrible slowdown.

Yes, and it's never going to happen in CPython any other way.  It's
considered a bug if Python code can segfault the interpreter; all
runtime errors are supposed to raise exceptions.  The only way to
ensure that won't happen is to make sure that only one thread can
access the internals of a mutable object at a time.
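
In Python terms, the per-object mutex amounts to something like the sketch
below.  GuardedList is an invented class, and a real implementation would
live in C inside the list type itself; the point is only that every
operation touching the internals pays for an acquire and a release.

```python
import threading

class GuardedList(object):
    """Hypothetical list-like object with a per-object mutex."""

    def __init__(self):
        self._items = []
        self._mutex = threading.Lock()

    def append(self, item):
        with self._mutex:                # one thread at a time inside
            self._items.append(item)

    def snapshot(self):
        with self._mutex:
            return list(self._items)

shared = GuardedList()

def fill(start):
    for i in range(start, start + 500):
        shared.append(i)

threads = [threading.Thread(target=fill, args=(n * 500,)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(shared.snapshot()))
```

The cost is one lock round-trip per access even when no other thread is
anywhere near the object, which is where the slowdown comes from.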

BTW, class instances are usually immutable and thus don't require a
mutex in the system I described.


Carl Banks