Re: [Python-Dev] Pythonic concurrency

2005-10-07 Thread Josiah Carlson

Michael Sparks [EMAIL PROTECTED] wrote:
 On Thursday 06 October 2005 23:15, Josiah Carlson wrote:
 [... 6 specific use cases ...]
  If Kamaelia is able to handle all of the above mechanisms in both a
  blocking and non-blocking fashion, then I would guess it has the basic
  requirements for most concurrent applications.
 It can. I can easily knock up examples for each if required :-)

That's cool, I trust you.  One thing I notice is absent from the
Kamaelia page is benchmarks.

On the one hand, benchmarks are technically useless, as one can tend to
benchmark those things that a system does well, and ignore those things
that it does poorly (take, for example, how PyLinda's speed test only
ever inserts and removes one tuple at a time...try inserting 100k and
use wildcards to extract those 100k, and you'll note how poorly it
performs, or database benchmarks, etc.).  However, if one's benchmarks
provide examples from real use, then it shows that at least someone has
gotten some X performance from the system.

I'm personally interested in latency and throughput for varying sizes of
data being passed through the system.

 That said, a more interesting example implemented this week (as part of
 a rapid prototyping project to look at collaborative community radio)
 implements a networked audio mixer matrix. That allows multiple sources of
 audio to be mixed, sent on to multiple destinations, may be duplicate mixes
 of each other, but also may select different mixes. The same system also
 includes point to point communications for network control of the mix.

Very neat.  How much data?  What kind of throughput?  What kinds of

 That application covers ( I /think/ ) 1, 2, 3, 4,  and 6 on your list of
 things as I understand what you mean. 5 is fairly trivial though.


 Regarding blocking and non-blocking, links can be marked as synchronous, which
 forces blocking style behaviour. Since generally we're using generators, we
 can't block for real which is why we throw an exception there. However,
 threaded components can block. The reason for this was due to the
 architecture being inspired by noting the similarities between asynchronous
 hardware systems/languages and network systems.

On the client side, I was lazy and used synchronous/blocking sockets to
block on read/write (every client thread gets its own connection,
meaning that tuple puts are never sitting in a queue).  I've also got
server-side timeouts for when you don't want to wait too long for data.
rslt = tplspace.get(PATTERN, timeout=None)

  into my tuple space implementation before it is released. 
 I'd be interested in hearing more about that BTW. One thing we've found is
 that, much as organic systems have a neural system for communications between
 things (hence Axon :), you also need the equivalent of a hormonal system.
 In the unix shell world, IMO the environment acts as that for pipelines, and
 similarly that's why we have an assistant system. (Which has key/value lookup

I have two recent posts about the performance and features of a (hacked
together) tuple space system I worked on (for two afternoons) in my blog.
Feel Lucky for Josiah Carlson in google and you will find it.

 It's a less obvious requirement, but is a useful one nonetheless, so I don't
 really see a message passing style as excluding a linda approach - since
 they're orthogonal approaches.

Indeed.  For me, the idea of being able to toss a tuple into memory
somewhere and being able to find it later maps into my mind as:
('name', arg1, ...) -> name(arg1, ...), which is, quite literally, an
RPC semantic (which seems a bit more natural to me than subscribing to
the 'name' queue).  With the ability to send to either single or
multiple listeners, you get message passing, broadcast messages, and a
standard job/result queueing semantic. The only thing that it is missing
is a prioritization mechanism (fifo, numeric priority, etc.), which
would get us a job scheduling kernel. Not bad for a message
passing/tuple space/IPC library.  (all of the above described have
direct algorithms for implementation).
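
The mapping described above can be sketched roughly as follows. This is only an illustration, not the tuple space from the blog posts: TupleSpace, ANY and serve_one are hypothetical names, the space lives in a single process, and it is written for Python 3. put() stores a tuple, get() blocks until a tuple matching the pattern arrives or the timeout expires, and a worker turns ('call', name, arg) into name(arg).

```python
import threading
import time

ANY = object()  # hypothetical wildcard marker used in patterns


class TupleSpace:
    """Toy in-process tuple space; every name here is illustrative."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def put(self, tup):
        with self._cond:
            self._tuples.append(tuple(tup))
            self._cond.notify_all()

    @staticmethod
    def _matches(pattern, tup):
        return len(pattern) == len(tup) and all(
            p is ANY or p == t for p, t in zip(pattern, tup)
        )

    def get(self, pattern, timeout=None):
        """Remove and return the first matching tuple, or None on timeout."""
        deadline = None if timeout is None else time.monotonic() + timeout
        with self._cond:
            while True:
                for i, tup in enumerate(self._tuples):
                    if self._matches(pattern, tup):
                        return self._tuples.pop(i)
                remaining = None if deadline is None else deadline - time.monotonic()
                if remaining is not None and remaining <= 0:
                    return None
                self._cond.wait(remaining)


def serve_one(space, functions):
    """Take one ('call', name, arg) tuple and post ('result', name, value)."""
    _, name, arg = space.get(("call", ANY, ANY))
    space.put(("result", name, functions[name](arg)))


space = TupleSpace()
worker = threading.Thread(target=serve_one, args=(space, {"square": lambda x: x * x}))
worker.start()
space.put(("call", "square", 7))
rslt = space.get(("result", "square", ANY), timeout=5)
worker.join()
print(rslt)
```

A prioritization mechanism would replace the linear scan in get() with an ordered structure; that part is left out here.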

 - Josiah

Python-Dev mailing list

Re: [Python-Dev] Python 2.5 and ast-branch

2005-10-07 Thread Michael Hudson
Guido van Rossum [EMAIL PROTECTED] writes:

 How does this sound to the non-AST-branch developers who have to
 suffer the inevitable post-merge instability? I think it's now or
 never -- waiting longer isn't going to make this thing easier (not
 with several more language changes approved: with-statement, extended
 import, what else...)

It sounds OK to me.


  To summarise the summary of the summary:- people are a problem.
   -- The Hitch-Hikers Guide to the Galaxy, Episode 12

[Python-Dev] PyObject_Init documentation

2005-10-07 Thread Martin v. Löwis

If type  indicates that the object participates in the cyclic garbage 
detector, it is added to the detector's set of observed objects.

Is this really correct? I thought you need to invoke PyObject_GC_TRACK


[Python-Dev] Proposed changes to PEP 343

2005-10-07 Thread Nick Coghlan
Based on Jason's comments regarding decimal.Context, and to explicitly cover 
the terminology agreed on during the documentation discussion back in July, 
I'm proposing a number of changes to PEP 343. I'll be updating the checked in 
PEP assuming there aren't any objections in the next week or so (and assuming 
I get CVS access sorted out ;).

The idea of dropping __enter__/__exit__ and defining the with statement solely 
in terms of coroutines is *not* included in the suggested changes, but I added 
a new item under Resolved Open Issues to cover some of the reasons why.


1. Amend the statement specification such that:

   with EXPR as VAR:

is translated as:

   abc = (EXPR).__with__()
   exc = (None, None, None)
   VAR = abc.__enter__()
   exc = sys.exc_info()

2. Add the following to the subsequent explanation:

 The call to the __with__ method serves a similar purpose to the __iter__
   method for iterables and iterators. An object such as threading.Lock may
   provide its own __enter__ and __exit__ methods, and simply return 'self'
   from its __with__ method. A more complex object such as decimal.Context may
   return a distinct context manager which takes care of setting and restoring
   the appropriate decimal context in the thread.

3. Update ContextWrapper in the Generator Decorator section to include:

   def __with__(self):
       return self

4. Add a paragraph to the end of the Generator Decorator section:

 By applying the @contextmanager decorator to a context's __with__ method,
   it is as easy to write a generator-based context manager for the context as
   it is to write a generator-based iterator for an iterable (see the
   decimal.Context example below).

5. Add three items under Resolved Open Issues:

 2.  After this PEP was originally approved, a subsequent discussion on
   python-dev [4] settled on the term "context manager" for objects which
   provide __enter__ and __exit__ methods, and "context management
   protocol" for the protocol itself. With the addition of the __with__
   method to the protocol, a natural extension is to call objects which
   provide only a __with__ method "contexts" (or "manageable contexts" in
   situations where the general term "context" would be ambiguous).
   The distinction between a context and a context manager is very
   similar to the distinction between an iterable and an iterator.

 3.  The originally approved version of this PEP did not include a __with__
   method - the method was only added to the PEP after Jason Orendorff
   pointed out the difficulty of writing appropriate __enter__ and __exit__
   methods for decimal.Context [5].
   This approach allows a class to use the @contextmanager decorator
   to define a native context manager using generator syntax. It also
   allows a class to use an existing independent context manager as its
   native context manager by applying the independent context manager to
   'self' in its __with__ method. It even allows a class written in C to
   use a coroutine based context manager written in Python.
  The __with__ method parallels the __iter__ method which forms part of
   the iterator protocol.

 4.  The suggestion was made by Jason Orendorff that the __enter__ and
   __exit__ methods could be removed from the context management protocol,
   and the protocol instead defined directly in terms of the coroutine
   interface described in PEP 342 (or a cleaner version of that interface
   with start() and finish() convenience methods) [6].
   Guido rejected this idea [7]. The following are some of the benefits of
   keeping the __enter__ and __exit__ methods:
   - it makes it easy to implement a simple context manager in C
     without having to rely on a separate coroutine builder
   - it makes it easy to provide a low-overhead implementation for
     context managers which don't need to maintain any special state
     between the __enter__ and __exit__ methods (having to use a
     coroutine for these would impose unnecessary overhead without any
     compensating benefit)
   - it makes it possible to understand how the with statement works
     without having to first understand the concept of a coroutine

6. Add new references:


7. Update Example 4 to include a __with__ method:

  def  __with__(self):

Re: [Python-Dev] Proposed changes to PEP 343

2005-10-07 Thread Fredrik Lundh
Nick Coghlan wrote:

 9. Here's a proposed native context manager for decimal.Context:

 # This would be a new decimal.Context method
 def __with__(self):

wouldn't it be better if the ContextWrapper class (or some variation thereof)
were used as a base class for the decimal.Context class?  using decorators
to provide "is a" behaviour for the class doesn't really feel pythonic...



[Python-Dev] PEP 342 suggestion: start(), __call__() and unwind_call() methods

2005-10-07 Thread Nick Coghlan
I'm lifting Jason's PEP 342 suggestions out of the recent PEP 343 thread, in 
case some of the folks interested in coroutines stopped following that 

Jason suggested two convenience methods, .start() and .finish().

start() simply asserted that the generator hadn't been started yet, and I find 
the parallel with Thread.start() appealing:

    def start(self):
        """Convenience method -- exactly like next(), but
        assert that this coroutine hasn't already been started.
        """
        if self.__started:
            raise RuntimeError("Coroutine already started")

I've embellished Jason's suggested finish() method quite a bit though.
   1. Use send() rather than next()
   2. Call it __call__() rather than finish()
   3. Add an unwind_call() variant that gives similar semantics for throw()
   4. Support getting a return value from the coroutine
  using the syntax raise StopIteration(val)
   5. Add an exception ContinueIteration that is used to indicate the
  generator hasn't finished yet, rather than expecting the generator to
  finish and raising RuntimeError if it doesn't

It ends up looking like this:

    def __call__(self, value=None):
        """Call a generator as a coroutine

        Returns the first argument supplied to StopIteration or
        None if no argument was supplied.
        Raises ContinueIteration with the value yielded as the
        argument if the generator yields a value
        """
        if not self.__started:
            raise RuntimeError("Coroutine not started")
        try:
            if exc:
                yield_val = self.throw(value, *exc)
            else:
                yield_val = self.send(value)
        except (StopIteration), ex:
            if ex.args:
                return ex.args[0]
        else:
            raise ContinueIteration(yield_val)

    def unwind_call(self, *exc):
        """Raise an exception in a generator used as a coroutine.

        Returns the first argument supplied to StopIteration or
        None if no argument was supplied.
        Raises ContinueIteration if the generator yields a value
        with the value yielded as the argument
        """
        try:
            yield_val = self.throw(*exc)
        except (StopIteration), ex:
            if ex.args:
                return ex.args[0]
        else:
            raise ContinueIteration(yield_val)

Now here's the trampoline scheduler from PEP 342 using this idea:

    import collections
    import sys

    class Trampoline:
        """Manage communications between coroutines"""

        running = False

        def __init__(self):
            self.queue = collections.deque()

        def add(self, coroutine):
            """Request that a coroutine be executed"""

        def run(self):
            result = None
            self.running = True
            try:
                while self.running and self.queue:
                    func = self.queue.popleft()
                    result = func()
                return result
            finally:
                self.running = False

        def stop(self):
            self.running = False

        def schedule(self, coroutine, stack=(), call_result=None, *exc):
            # Define the new pseudothread
            def pseudothread():
                try:
                    if exc:
                        result = coroutine.unwind_call(call_result, *exc)
                    else:
                        result = coroutine(call_result)
                except (ContinueIteration), ex:
                    # Called another coroutine
                    callee = ex.args[0]
                    self.schedule(callee, (coroutine,stack))
                except:
                    if stack:
                        # send the error back to the caller
                        caller = stack[0]
                        prev_stack = stack[1]
                        self.schedule(
                            caller, prev_stack, *sys.exc_info()
                        )
                    else:
                        # Nothing left in this pseudothread to
                        # handle it, let it propagate to the
                        # run loop
                        raise
                else:
                    if stack:
                        # Finished, so pop the stack and send the
                        # result to the caller
                        caller = stack[0]
                        prev_stack = stack[1]
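
For readers who want something runnable, here is a rough Python 3 sketch of the same trampoline idea. It is an assumption-laden simplification rather than the code from this message: it follows the PEP 342 style of checking whether the yielded value is a generator instead of using the ContinueIteration exception, it relies on Python 3's "return value" inside generators, and the double() and main() coroutines are made up for the demonstration.

```python
import collections
import types


class Trampoline:
    """Hypothetical minimal scheduler: a yielded generator is treated as a
    call, and its return value is sent back into the caller."""

    def __init__(self):
        self.queue = collections.deque()

    def add(self, coroutine):
        self.schedule(coroutine)

    def run(self):
        result = None
        while self.queue:
            result = self.queue.popleft()()
        return result

    def schedule(self, coroutine, stack=(), value=None, exc=None):
        def pseudothread():
            try:
                if exc is not None:
                    yielded = coroutine.throw(exc)
                else:
                    yielded = coroutine.send(value)
            except StopIteration as stop:
                # Finished: pop the stack and hand the result to the caller.
                if stack:
                    self.schedule(stack[0], stack[1], stop.value)
                return stop.value
            except Exception as err:
                if not stack:
                    raise  # nobody left to handle it; propagate to run()
                self.schedule(stack[0], stack[1], exc=err)
                return None
            if isinstance(yielded, types.GeneratorType):
                # Called another coroutine: run it with us on the stack.
                self.schedule(yielded, (coroutine, stack))
            else:
                # Plain value: resume the same coroutine with it later.
                self.schedule(coroutine, stack, yielded)
            return None

        self.queue.append(pseudothread)


def double(x):
    yield  # give other pseudothreads a turn
    return 2 * x


def main():
    a = yield double(10)
    b = yield double(a)
    return a + b


t = Trampoline()
t.add(main())
print(t.run())
```

The stack is the same nested (caller, previous_stack) pair used in the message above, so returning to a caller is just a matter of unpacking it.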

Re: [Python-Dev] Proposed changes to PEP 343

2005-10-07 Thread Nick Coghlan
Fredrik Lundh wrote:
 Nick Coghlan wrote:
9. Here's a proposed native context manager for decimal.Context:

# This would be a new decimal.Context method
def __with__(self):
 wouldn't it be better if the ContextWrapper class (or some variation thereof)
 were used as a base class for the decimal.Context class?  using decorators
 to provide "is a" behaviour for the class doesn't really feel pythonic...

That's not what the decorator is for - it's there to turn the generator used 
to implement the __with__ method into a context manager, rather than saying 
anything about decimal.Context as a whole.

However, requiring a decorator to get a slot to work right looks pretty ugly 
to me, too.

What if we simply special-cased the __with__ slot in type(), such that if it 
is populated with a generator object, that object is automatically wrapped 
using the @contextmanager decorator? (Jason actually suggested this idea 

I initially didn't like the idea because of EIBTI, but I've realised that def 
__with__(self): is pretty darn explicit in its own right. I've also realised 
that defining __with__ using a generator, but forgetting to add the 
@contextmanager to the front would be a lovely source of bugs, particularly if 
generators are given a default __exit__() method that simply invokes 

On the other hand, if __with__ is special-cased, then the slot definition 
wouldn't look ugly, and we'd still be free to define a generator's normal with 
statement semantics as:

   def __exit__(self, *exc):


Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia

Re: [Python-Dev] PEP 342 suggestion: start(), __call__() and unwind_call() methods

2005-10-07 Thread Nick Coghlan
Nick Coghlan wrote:
 It ends up looking like this:
  def __call__(self, value=None):
      """Call a generator as a coroutine

      Returns the first argument supplied to StopIteration or
      None if no argument was supplied.
      Raises ContinueIteration with the value yielded as the
      argument if the generator yields a value
      """
      if not self.__started:
          raise RuntimeError("Coroutine not started")
      try:
          if exc:
              yield_val = self.throw(value, *exc)
          else:
              yield_val = self.send(value)
      except (StopIteration), ex:
          if ex.args:
              return args[0]
      else:
          raise ContinueIteration(yield_val)

Oops, I didn't finish fixing this after I added unwind_call(). Try this 
version instead:

    def __call__(self, value=None):
        """Call a generator as a coroutine

        Returns the first argument supplied to StopIteration or
        None if no argument was supplied.
        Raises ContinueIteration with the value yielded as the
        argument if the generator yields a value
        """
        try:
            yield_val = self.send(value)
        except (StopIteration), ex:
            if ex.args:
                return ex.args[0]
        else:
            raise ContinueIteration(yield_val)
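
As a rough illustration of how these methods behave together, here is a Python 3 sketch that wraps a generator rather than adding methods to the generator type. The Coroutine wrapper class and the adder() generator are hypothetical; in Python 3 the return value is written as "return val" inside the generator and read from StopIteration.value, rather than being raised with "raise StopIteration(val)".

```python
class ContinueIteration(Exception):
    """Raised by the wrapper when the generator yields instead of finishing."""


class Coroutine:
    """Hypothetical wrapper giving a generator the start()/__call__()/
    unwind_call() interface discussed in this thread."""

    def __init__(self, gen):
        self._gen = gen
        self._started = False

    def start(self):
        if self._started:
            raise RuntimeError("Coroutine already started")
        self._started = True
        return next(self._gen)

    def __call__(self, value=None):
        try:
            yield_val = self._gen.send(value)
        except StopIteration as ex:
            # Python 3: `return x` in a generator sets StopIteration.value
            return ex.value
        raise ContinueIteration(yield_val)

    def unwind_call(self, exc):
        try:
            yield_val = self._gen.throw(exc)
        except StopIteration as ex:
            return ex.value
        raise ContinueIteration(yield_val)


def adder():
    first = yield "need first"
    second = yield "need second"
    return first + second


co = Coroutine(adder())
prompt = co.start()
try:
    co(1)
except ContinueIteration as ex:
    second_prompt = ex.args[0]
total = co(2)
print(prompt, second_prompt, total)
```

The caller has to wrap every call in try/except to tell "yielded" apart from "finished", which is the cost of signalling the normal case with an exception.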


Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia

Re: [Python-Dev] Proposed changes to PEP 343

2005-10-07 Thread Michael Hudson
Nick Coghlan [EMAIL PROTECTED] writes:

 What if we simply special-cased the __with__ slot in type(), such that if it 
 is populated with a generator object, that object is automatically wrapped 
 using the @contextmanager decorator? (Jason actually suggested this idea 

You don't want to check if it's a generator, you want to check if it's
a function whose func_code has the relevant bit set.

Seems a bit magical to me, but haven't thought about it hard.
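
A small sketch of the check being described, written for Python 3, where the code object is reached through __code__ (func_code in the Python of this thread). The helper name is_generator_function is made up for illustration; inspect.CO_GENERATOR and inspect.isgeneratorfunction are the standard library's own spellings of the same test.

```python
import inspect


def is_generator_function(func):
    """True if func's code object has the generator flag set."""
    code = getattr(func, "__code__", None)  # spelled func_code in Python 2
    return code is not None and bool(code.co_flags & inspect.CO_GENERATOR)


def plain():
    return 1


def gen():
    yield 1


print(is_generator_function(plain), is_generator_function(gen))
print(inspect.isgeneratorfunction(gen))
```

The point of testing the function rather than its result is that the slot holds the function; a generator object only exists once the function has been called.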


  I think my standards have lowered enough that now I think ``good
  design'' is when the page doesn't irritate the living fuck out of 

Re: [Python-Dev] Proposed changes to PEP 343

2005-10-07 Thread Anders J. Munch
Nick Coghlan did a +1 job to write:
 1. Amend the statement specification such that:
with EXPR as VAR:
 is translated as:
abc = (EXPR).__with__()
exc = (None, None, None)
VAR = abc.__enter__()
exc = sys.exc_info()

Note that __with__ and __enter__ could be combined into one with no
loss of functionality:

abc,VAR = (EXPR).__with__()
exc = (None, None, None)
exc = sys.exc_info()
- Anders

Re: [Python-Dev] Proposed changes to PEP 343

2005-10-07 Thread Eric Nieuwland
Nick Coghlan wrote:

 1. Amend the statement specification such that:

with EXPR as VAR:

 is translated as:

abc = (EXPR).__with__()
exc = (None, None, None)
VAR = abc.__enter__()
exc = sys.exc_info()

Is this correct?
What happens to

with 40*13+2 as X:
print X




Re: [Python-Dev] Proposed changes to PEP 343

2005-10-07 Thread Nick Coghlan
Eric Nieuwland wrote:
 What happens to
 with 40*13+2 as X:
 print X

It would fail with a TypeError because the relevant slot in the type object 
was NULL - the TypeError checks aren't shown for simplicity's sake.

This behaviour isn't really any different from the existing PEP 343 - the only 
difference is that the statement looks for a __with__ slot on the original 
EXPR, rather than looking directly for an __enter__ slot.
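
To make the lookup order concrete, here is a sketch in Python 3 that emulates the proposed translation as an ordinary function. It rests on several assumptions: __with__ is the method proposed in this thread and is looked up as a plain attribute on the type rather than a type slot, run_with and Resource are hypothetical names, and the try/finally structure follows the PEP 343 translation under discussion.

```python
import sys


def run_with(expr, block):
    """Emulate `with expr as var: block(var)` under the proposed protocol.

    `__with__` is the method proposed in this thread; it is looked up as an
    ordinary attribute here, whereas the proposal checks a type slot.
    """
    get_manager = getattr(type(expr), "__with__", None)
    if get_manager is None:
        raise TypeError("%s object has no __with__ method" % type(expr).__name__)
    abc = get_manager(expr)
    exc = (None, None, None)
    var = abc.__enter__()
    try:
        try:
            block(var)
        except:
            exc = sys.exc_info()
            raise
    finally:
        abc.__exit__(*exc)


class Resource:
    """Hypothetical object acting as its own context manager."""

    def __init__(self):
        self.log = []

    def __with__(self):
        return self

    def __enter__(self):
        self.log.append("enter")
        return self

    def __exit__(self, exc_type, exc_value, tb):
        self.log.append("exit")


res = Resource()
run_with(res, lambda r: r.log.append("body"))
print(res.log)

try:
    run_with(40 * 13 + 2, print)
except TypeError as err:
    print("TypeError:", err)
```

The integer case fails before __enter__ is ever consulted, which is the only behavioural difference from looking for __enter__ directly.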


Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia

Re: [Python-Dev] Proposed changes to PEP 343

2005-10-07 Thread Nick Coghlan
Anders J. Munch wrote:
 Note that __with__ and __enter__ could be combined into one with no
 loss of functionality:
 abc,VAR = (EXPR).__with__()

They can't be combined, because they're invoked on different objects. It would 
be like trying to combine __iter__() and next() into the same method for 
iterators. . .
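
The iterator analogy can be seen in a short Python 3 sketch. The Countdown classes are made up for illustration: __iter__() is invoked on the iterable and returns a separate object, and the per-loop state lives on that object, so the two methods cannot be folded into one.

```python
class Countdown:
    """An iterable: each __iter__() call returns a fresh iterator."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return CountdownIterator(self.start)


class CountdownIterator:
    """The iterator: holds the per-loop state and implements __next__()."""

    def __init__(self, current):
        self.current = current

    def __iter__(self):
        return self

    def __next__(self):  # spelled next() in the Python of this thread
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1


c = Countdown(3)
pairs = [(a, b) for a in c for b in c]
print(list(c), len(pairs))
```

The nested loop only works because each loop gets its own iterator; a context with a separate context manager gets the same independence for nested with statements.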


Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia

Re: [Python-Dev] Proposed changes to PEP 343

2005-10-07 Thread Nick Coghlan
Michael Hudson wrote:
 You don't want to check if it's a generator, you want to check if it's
 a function whose func_code has the relevant bit set.

Fair point :)

 Seems a bit magical to me, but haven't thought about it hard.

Same here - I'm just starting to think that the alternative is worse, because 
it leaves open the nonsensical possibility of writing a __with__ method as a 
generator *without* applying the contextmanager decorator, and that would just 
be bizarre - if you want to get an iterable, why aren't you writing an 
__iter__ method instead?


Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia

[Python-Dev] Sourceforge CVS access

2005-10-07 Thread Nick Coghlan
Could one of the Sourceforge powers-that-be grant me check in access so I can 
update PEP 343 directly?


Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia

Re: [Python-Dev] Proposed changes to PEP 343

2005-10-07 Thread Eric Nieuwland
Nick Coghlan wrote:
 Eric Nieuwland wrote:
 What happens to

 with 40*13+2 as X:
 print X

 It would fail with a TypeError because the relevant slot in the type 
 was NULL - the TypeError checks aren't shown for simplicity's sake.

 This behaviour isn't really any different from the existing PEP 343 - 
 the only
 difference is that the statement looks for a __with__ slot on the 
 EXPR, rather than looking directly for an __enter__ slot.

Hmmm I hadn't noticed that.
In my memory a partial implementation of the protocol was possible.
Thus, __enter__/__exit__ would only be called if they exist.

Oh well, I'll just add some empty methods.



Re: [Python-Dev] Sourceforge CVS access

2005-10-07 Thread Guido van Rossum
I will, if you tell me your sourceforge username.

On 10/7/05, Nick Coghlan [EMAIL PROTECTED] wrote:
 Could one of the Sourceforge powers-that-be grant me check in access so I can
 update PEP 343 directly?

--Guido van Rossum

Re: [Python-Dev] Proposed changes to PEP 343

2005-10-07 Thread Fredrik Lundh
Nick Coghlan wrote:

 That's not what the decorator is for - it's there to turn the generator used
 to implement the __with__ method into a context manager, rather than saying
 anything about decimal.Context as a whole.

possibly, but using a decorated __with__ method doesn't make much
sense if the purpose isn't to turn the class into something that can be
used with the with statement.

 However, requiring a decorator to get a slot to work right looks pretty ugly
 to me, too.

the whole concept might be perfectly fine on the "this construct
corresponds to this code" level, but if you immediately end up with things that
are not what they seem, and names that don't mean what they say, either
the design or the description of it needs work.

 (yes, I know you can use this class to manage the context, but it's not
really a context manager, because it's that method that's a manager, not
the class itself.  yes, all the information that belongs to the context are
managed by the class, but that doesn't make... oh, shut up and read the



[Python-Dev] Pythonic concurrency

2005-10-07 Thread Bruce Eckel
Early in this thread there was a comment to the effect that "if you
don't know how to use threads, don't use them," which I pointedly
avoided responding to because it seemed to me to simply be
inflammatory. But Ian Bicking just posted a weblog entry where he
says "threads aren't as hard as they imply" and "An especially poor
argument is one that tells me that I'm currently being beaten with a
stick, but apparently don't know it."

I always have a problem with this. After many years of studying
concurrency on-and-off, I continue to believe that threading is very
difficult (indeed, the more I study it, the more difficult I
understand it to be). And I admit this. The comments I sometimes get
back are to the effect that threading really isn't that hard. Thus,
I am just too dense to get it.

It's hard to know how to answer. I've met enough brilliant people to
know that it's just possible that the person posting really does
easily grok concurrency issues and thus I must seem irreconcilably
thick. This may actually be one of those people for whom threading is
obvious (and Ian has always seemed like a smart guy, for example).

But. I do happen to have contact with a lot of people who are at the
forefront of the threading world, and *none* of them (many of whom
have written the concurrency libraries for Java 5, for example) ever
imply that threading is easy. In fact, they generally go out of their
way to say that it's insanely difficult.

And Java has taken until version 5 to (apparently) get it right,
partly by defining a new memory model in order to accurately describe
what goes on with threading issues. This same model is being adapted
for the next version of C++. This is not stuff that was already out
there, that everyone knew about -- this is new stuff.

Also, look at the work that Scott Meyers, Andrei Alexandrescu, et al
did on the Double Checked Locking idiom, showing that it was broken
under threading. That was by no means trivial and obvious during all
the years that people thought that it worked.

My own experience in discussions with folks who think that threading
is transparent usually uncovers, after a few appropriate questions,
that said person doesn't actually understand the depth of the issues
involved. A common story is someone who has written a few programs and
convinced themselves that these programs work (the "it works for me"
proof of correctness). Thus, concurrency must be easy.

I know about this because I have learned the hard way throughout many
years, over and over again. Every time I've thought that I understood
concurrency, something new has popped up and shown me a whole new
aspect of things that I have heretofore missed. Then I start thinking
"OK, now I finally understand concurrency."

One example: when I was rewriting the threading chapter for the 3rd
(previous) edition of Thinking in Java, I decided to get a
dual-processor machine so I could really test things. This way, I
discovered that the behavior of a program on a single-processor
machine could be dramatically different than the same program on a
multiprocessor machine. That seems obvious, now, but at the time I
thought I was writing pretty reasonable code. In addition, it turns
out that some things in Java concurrency were broken (even the people
who were creating thread support in the language weren't getting it
right) so that threw in extra monkey wrenches. And when you start
studying the new memory model, which takes into account instruction
reordering and cache coherency issues, you realize that it's
mind-numbingly far from trivial.

Or maybe not, for those who think it's easy. But my experience is that
the people who really do understand concurrency never suggest that
it's easy.

Bruce Eckel   mailto:[EMAIL PROTECTED]
Contains electronic books: Thinking in Java 3e  Thinking in C++ 2e


[Python-Dev] Extending tuple unpacking

2005-10-07 Thread Gustavo Niemeyer
Not sure if this has been proposed before, but one thing
I occasionally miss regarding tuple unpack is being able
to do:

  first, second, *rest = something

Also in for loops:

  for first, second, *rest in iterator:

This seems to match the current meaning for starred
variables in other contexts.

What do you think?

Gustavo Niemeyer

Re: [Python-Dev] Python 2.5 and ast-branch

2005-10-07 Thread Jeremy Hylton
On 10/6/05, Phillip J. Eby [EMAIL PROTECTED] wrote:
 At 07:34 PM 10/6/2005 -0700, Guido van Rossum wrote:
 How does this sound to the non-AST-branch developers who have to
 suffer the inevitable post-merge instability? I think it's now or
 never -- waiting longer isn't going to make this thing easier (not
 with several more language changes approved: with-statement, extended
 import, what else...)

 Do the AST branch changes affect the interface of the parser module?  Or
 do they just add new functionality?

It doesn't affect the parser module.  For now, the same parser is
used, so the parser module can still work the way it does.  If we
changed the parser in the future, well, the parser module would
change, too.  I'd also like to add an analogous ast module that
exposed the abstract syntax tree for manipulation, along the lines of
the parser module.  Not sure if we'll actually get to it for this


Re: [Python-Dev] Extending tuple unpacking

2005-10-07 Thread Guido van Rossum
On 10/7/05, Gustavo Niemeyer [EMAIL PROTECTED] wrote:
 Not sure if this has been proposed before, but one thing
 I occasionally miss regarding tuple unpack is being able
 to do:

   first, second, *rest = something

 Also in for loops:

   for first, second, *rest in iterator:

 This seems to match the current meaning for starred
 variables in other contexts.

Someone should really write up a PEP -- this was just discussed a week
or two ago.

I personally think this is adequately handled by writing:

  (first, second), rest = something[:2], something[2:]

I believe that this wish is an example of hypergeneralization -- an
incorrect generalization based on a misunderstanding of the underlying

Argument lists are not tuples [*] and features of argument lists
should not be confused with features of tuple unpackings.

[*] Proof: f(1) is equivalent to f(1,) even though (1) is an int but
(1,) is a tuple.
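
Both points can be tried out in a few lines. This sketch assumes Python 3, where the starred form was later accepted (PEP 3132); the list "something" and the function f are placeholders for illustration.

```python
something = [1, 2, 3, 4]

# Slicing form from the message above; requires a sequence.
(first, second), rest = something[:2], something[2:]
print(first, second, rest)

# Starred form, accepted by Python 3 (PEP 3132); works on any iterable
# and always binds the remainder to a list.
a, b, *others = iter(something)
print(a, b, others)


def f(*args):
    return args


# The footnote's point: a call with a trailing comma passes the same arguments.
print(f(1) == f(1,), type((1)).__name__, type((1,)).__name__)
```

Note that the slicing form cannot be applied to a bare iterator, which is one practical difference between the two spellings.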

--Guido van Rossum

Re: [Python-Dev] Pythonic concurrency

2005-10-07 Thread Aahz
On Fri, Oct 07, 2005, Bruce Eckel wrote:

 I always have a problem with this. After many years of studying
 concurrency on-and-off, I continue to believe that threading is very
 difficult (indeed, the more I study it, the more difficult I
 understand it to be). And I admit this. The comments I sometimes get
 back are to the effect that threading really isn't that hard. Thus,
 I am just too dense to get it.

What I generally say is that threading isn't too hard if you stick with
some fairly simple idioms and tools -- and make absolutely certain to
follow some rules about sharing data.  But it's certainly true that
threading (and concurrency) in general is mind-numbingly complex.

If you think it's expensive to hire a professional to do the job, wait
until you hire an amateur.  --Red Adair

Re: [Python-Dev] PEP 342 suggestion: start(), __call__() and unwind_call() methods

2005-10-07 Thread Phillip J. Eby
At 09:50 PM 10/7/2005 +1000, Nick Coghlan wrote:
Notice how a non-coroutine callable can be yielded, and it will still work
happily with the scheduler, because the desire to continue execution is
indicated by the ContinueIteration exception, rather than by the type of the
returned value.

Wh?  You raise an exception to indicate the *normal* case?  That seems, 
um...  well, a Very Bad Idea.

I also don't see any point to start(), or understand what finish() does or 
why you'd want it.

Last, but far from least, as far as I can tell you can implement all of 
these semantics using PEP 342 as it sits.  That is, it's very simple to 
make decorators or classes that add those semantics.  I don't see anything 
that requires them to be part of Python.


Re: [Python-Dev] Extending tuple unpacking

2005-10-07 Thread Gustavo Niemeyer
 Someone should really write up a PEP -- this was just discussed a week
 or two ago.

Heh.. I should follow the list more closely.

 I personally think this is adequately handled by writing:
   (first, second), rest = something[:2], something[2:]

That's an alternative indeed. But the proposed way does look better:

  for item in iterator:
      (first, second), rest = item[:2], item[2:]


  for first, second, *rest in iterator:

 I believe that this wish is an example of hypergeneralization -- an
 incorrect generalization based on a misunderstanding of the underlying

Thanks for trying so hard to say in a nice way that this is not
a good idea. :-)

 Argument lists are not tuples [*] and features of argument lists
 should not be confused with features of tuple unpackings.

Do you agree that the concepts are related?

For instance:

   >>> def f(first, second, *rest):
   ...   print first, second, rest
   ...
   >>> f(1,2,3,4)
   1 2 (3, 4)

   >>> first, second, *rest = (1,2,3,4)
   >>> print first, second, rest
   1 2 (3, 4)

 [*] Proof: f(1) is equivalent to f(1,) even though (1) is an int but
 (1,) is a tuple.

Extended *tuple* unpacking was a wrong subject indeed. This is
general unpacking, since it's supposed to work with any sequence.

Gustavo Niemeyer

Re: [Python-Dev] Pythonic concurrency

2005-10-07 Thread Phillip J. Eby
At 10:47 AM 10/7/2005 -0600, Bruce Eckel wrote:
Also, look at the work that Scott Meyers, Andrei Alexandrescu, et al
did on the Double Checked Locking idiom, showing that it was broken
under threading. That was by no means trivial and obvious during all
the years that people thought that it worked.

One of the nice things about the GIL is that it means double-checked 
locking *does* work in Python.  :)
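
A minimal sketch of the idiom in question, written for Python 3 and assuming CPython's GIL makes each attribute assignment visible to other threads as a whole. The class name `Lazy` and its attributes are hypothetical, chosen only for this example.

```python
import threading


class Lazy:
    """Build a value at most once, using double-checked locking."""

    def __init__(self, factory):
        self._factory = factory
        self._lock = threading.Lock()
        self._value = None
        self._ready = False

    def get(self):
        if not self._ready:              # first check, no lock taken
            with self._lock:
                if not self._ready:      # second check, under the lock
                    self._value = self._factory()
                    self._ready = True   # publish only after the value is set
        return self._value
```

Once the value has been built, callers skip the lock entirely, which is the whole point of the idiom.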

My own experience in discussions with folks who think that threading
is transparent usually uncovers, after a few appropriate questions,
that said person doesn't actually understand the depth of the issues
involved. A common story is someone who has written a few programs and
convinced themselves that these programs work (the "it works for me"
proof of correctness). Thus, concurrency must be easy.

I know about this because I have learned the hard way throughout many
years, over and over again. Every time I've thought that I understood
concurrency, something new has popped up and shown me a whole new
aspect of things that I have heretofore missed. Then I start thinking
"OK, now I finally understand concurrency."

The day when I knew, beyond all shadow of a doubt, that the people who say 
threading is easy are full of it, is when I wrote an event-driven 
co-operative multitasking system in Python and managed to create a race 
condition in *single-threaded code*.

Of course, due to its nature, a race condition in an event-driven system is 
at least reproducible given the same sequence of events, and it's fixable 
using turns (as described in a paper posted here yesterday).  With 
threads, it's not anything like reproducible, because pre-emptive threading 
is non-deterministic.
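
To make the single-threaded case concrete, here is a small sketch (Python 3, all names hypothetical) of a lost update between two cooperative tasks. Each task reads shared state, yields to the scheduler, then writes back a value computed from its stale read. This is not the system described above, only an illustration of the kind of race it refers to.

```python
def deposit(account, amount):
    """A cooperative task: read, yield to the scheduler, then write back."""
    seen = account["balance"]            # read
    yield                                # other tasks run here
    account["balance"] = seen + amount   # write based on a stale read


def run_round_robin(tasks):
    """Resume each generator in turn until all are exhausted."""
    tasks = list(tasks)
    while tasks:
        for task in list(tasks):
            try:
                next(task)
            except StopIteration:
                tasks.remove(task)


account = {"balance": 0}
run_round_robin([deposit(account, 10), deposit(account, 5)])
print(account["balance"])  # 5, not 15: the first deposit was overwritten
```

Because the scheduler is deterministic, the same interleaving and the same lost update occur on every run.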

What the GIL-ranters don't get is that the GIL actually gives you just 
enough determinism to be able to write threaded programs that don't crash, 
and that maybe will even work if you treat every point of interaction 
between threads as a minefield and program with appropriate care.  So, if 
threads are easy in Python compared to other languages, it's *because of* 
the GIL, not in spite of it.


Re: [Python-Dev] Pythonic concurrency

2005-10-07 Thread Shane Hathaway
Bruce Eckel wrote:
 But. I do happen to have contact with a lot of people who are at the
 forefront of the threading world, and *none* of them (many of whom
 have written the concurrency libraries for Java 5, for example) ever
 imply that threading is easy. In fact, they generally go out of their
 way to say that it's insanely difficult.

What's insanely difficult is really locking, and locking is driven by 
concurrency in general, not just threads.  It's hard to reason about 
locks.  There are only general rules about how to apply locking 
correctly, efficiently, and without deadlocks.  Personally, to be 
absolutely certain I've applied locks correctly, I have to think for 
hours.  Even then, it's hard to express my conclusions, so it's hard to 
be sure future maintainers will keep the locking correct.

Java uses locks very liberally, which is to be expected of a language 
that provides locking using a keyword.  This forces Java programmers to 
deal with the burden of locking everywhere.  It also forces the 
developers of the language and its core libraries to make locking 
extremely fast yet safe.  Java threads would be easy if there wasn't so 
much locking going on.

Zope, OTOH, is far more conservative with locks.  There is some code 
that dispatches HTTP requests to a worker thread, and other code that 
reads and writes an object database, but most Zope code isn't aware of 
concurrency.  Thus locking is hardly an issue in Zope, and as a result, 
threading is quite easy in Zope.

Recently, I've been simulating high concurrency on a PostgreSQL 
database, and I've discovered that the way you reason about row and 
table locks is very similar to the way you reason about locking among 
threads.  The big difference is the consequence of incorrect locking: in 
PostgreSQL, using the serializable mode, incorrect locking generally 
only leads to aborted transactions; while in Python and most programming 
languages, incorrect locking instantly causes corruption and chaos. 
That's what hurts developers.  I want a concurrency model in Python that 
acknowledges the need for locking while punishing incorrect locking with 
an exception rather than corruption.  *That* would be cool, IMHO.
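
One possible reading of "punish incorrect locking with an exception" is sketched below for Python 3. The `Guarded` wrapper and `LockingError` are hypothetical names invented for this example, not an existing API. The wrapper refuses access unless the calling thread currently holds its lock.

```python
import threading


class LockingError(RuntimeError):
    """Raised when shared state is touched without holding its lock."""


class Guarded:
    """Wrap a value so that unlocked access raises instead of racing."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()
        self._owner = None  # ident of the thread holding the lock, if any

    def __enter__(self):
        self._lock.acquire()
        self._owner = threading.get_ident()
        return self

    def __exit__(self, exc_type, exc, tb):
        self._owner = None
        self._lock.release()
        return False

    def _check(self):
        if self._owner != threading.get_ident():
            raise LockingError("access without holding the lock")

    def get(self):
        self._check()
        return self._value

    def set(self, value):
        self._check()
        self._value = value
```

This catches only forgotten locks, not deadlocks or wrong lock ordering.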


Re: [Python-Dev] Pythonic concurrency

2005-10-07 Thread Barry Warsaw
On Fri, 2005-10-07 at 14:42, Shane Hathaway wrote:

 What's insanely difficult is really locking, and locking is driven by 
 concurrency in general, not just threads.  It's hard to reason about 

I think that's a very interesting observation!  I have not built a
tremendous number of concurrent apps, but even the dumb locking that
Mailman does (which is not a great model of granularity ;) has burned
many bch's (brain cell hours) to get right.

Where I have used more concurrency, I generally try to structure my apps
into the one-producer-many-independent-consumers architecture that was
outlined in a previous message.  In that case, if you can narrow your
touch points to the Queue module for example, then yeah, threading is
easy.  A gaggle of independent workers isn't that hard to get right in
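
A minimal sketch of that architecture in Python 3, where the Queue module is spelled `queue`. The function name, the squaring "work" and the use of `None` as a shutdown sentinel are assumptions made for illustration.

```python
import queue
import threading


def run_workers(items, worker_count=4):
    """One producer feeds a queue; independent workers square each item."""
    tasks = queue.Queue()
    results = queue.Queue()

    def worker():
        while True:
            item = tasks.get()
            if item is None:        # sentinel: no more work
                break
            results.put(item * item)

    workers = [threading.Thread(target=worker) for _ in range(worker_count)]
    for w in workers:
        w.start()
    for item in items:              # the single producer
        tasks.put(item)
    for _ in workers:               # one sentinel per worker
        tasks.put(None)
    for w in workers:
        w.join()
    return sorted(results.get() for _ in range(results.qsize()))
```

The two queues are the only touch points between threads, so no explicit locks appear in the worker code.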



Re: [Python-Dev] Pythonic concurrency

2005-10-07 Thread Antoine Pitrou


(my 2 cents, probably not very constructive)

 Recently, I've been simulating high concurrency on a PostgreSQL 
 database, and I've discovered that the way you reason about row and 
 table locks is very similar to the way you reason about locking among 
 threads.  The big difference is the consequence of incorrect locking: in 
 PostgreSQL, using the serializable mode, incorrect locking generally 
 only leads to aborted transactions; while in Python and most programming 
 languages, incorrect locking instantly causes corruption and chaos. 
 That's what hurts developers.  I want a concurrency model in Python that 
 acknowledges the need for locking while punishing incorrect locking with 
 an exception rather than corruption.  *That* would be cool, IMHO.

A relational database has a very strict and regular data model. Also, it
has transactions. This makes it easy to precisely define concurrency at
the engine level.

To apply the same thing to Python you would at least need :
  1. a way to define a subset of the current bag of reachable objects
which has to stay consistent w.r.t. transactions that are applied to it
(of course, you would have several such subsets in any non-trivial
application)
  2. a way to start and end a transaction on a bag of objects (begin /
commit / rollback)
  3. a precise definition of the semantics of consistency here : for
example, only one thread could modify a bag of objects at any given
time, and other threads would continue to see the frozen, stable version
of that bag until the next version is committed by the writing thread

For 1), a helpful paradigm would be to define an object as being the
root of a bag, and all its properties would automatically and
recursively (or not ?) belong to this bag. One has to be careful that no
property leaks and makes the bag become the set of all reachable
Python objects (one could provide a means to say that a specific
property must not be transitively put in the bag). Then, use
my_object.begin_transaction() and my_object.commit_transaction().

The implementation of 3) does not look very obvious ;-S
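
Points 1) and 2) can be sketched in a few lines of Python 3, under the assumption that a deep copy of the bag is an acceptable snapshot. The `Bag` class and `rollback_transaction()` are hypothetical. Point 3), what other threads see while a transaction is open, is deliberately left out.

```python
import copy


class Bag:
    """A root object whose contents form one transactional unit."""

    def __init__(self, **items):
        self.items = dict(items)
        self._snapshot = None

    def begin_transaction(self):
        if self._snapshot is not None:
            raise RuntimeError("transaction already open")
        self._snapshot = copy.deepcopy(self.items)

    def commit_transaction(self):
        self._require_open()
        self._snapshot = None

    def rollback_transaction(self):
        self._require_open()
        self.items = self._snapshot
        self._snapshot = None

    def _require_open(self):
        if self._snapshot is None:
            raise RuntimeError("no open transaction")
```

The deep copy is also where the "leaking property" concern shows up: anything reachable from the bag gets copied.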



Re: [Python-Dev] __doc__ behavior in class definitions

2005-10-07 Thread Fredrik Lundh
Martin Maly wrote:

 I came across a case which I am not sure if by design or a bug in Python
 (Python 2.4.1 (#65, Mar 30 2005, 09:13:57)). Consider following Python

 # module begin
 "module doc"

 class c:
     print __doc__
     __doc__ = "class doc"  # (1)
     print __doc__

 print c.__doc__
 # module end

 When ran, it prints:

 module doc
 class doc
 class doc

 Based on the binding rules described in the Python documentation, I
 would expect the code to throw because binding created on the line (1)
 is local to the class block and all the other __doc__ uses should
 reference that binding. Apparently, it is not the case.

 Is this bug in Python or are __doc__ strings in classes subject to some
 additional rules?

it's not limited to __doc__ strings, or, for that matter, to attributes:

    spam = "spam"

    class c:
        print spam
        spam = "bacon"
        print spam

        print len(spam)

        def len(self):
            return 10

    print c.spam

the language reference uses the term "local scope" for both class and
def-statements, but it's not really the same thing.  the former is more
like a temporary extra global scope with a (class, global) search path,
names are resolved when they are found (just as in the global scope);
there's no preprocessing step.

for additional class issues, see the Discussion in the nested scopes

hope this helps!



Re: [Python-Dev] __doc__ behavior in class definitions

2005-10-07 Thread Phillip J. Eby
At 12:15 PM 10/7/2005 -0700, Martin Maly wrote:
Based on the binding rules described in the Python documentation, I
would expect the code to throw because binding created on the line (1)
is local to the class block and all the other __doc__ uses should
reference that binding. Apparently, it is not the case.

Correct - the scoping rules about local bindings causing a symbol to be 
local only apply to *function* scopes.  Class scopes are able to refer to 
module-level names until the name is shadowed in the class scope.

Is this bug in Python or are __doc__ strings in classes subject to some
additional rules?

Neither; the behavior you're seeing doesn't have anything to do with 
docstrings per se, it's just normal Python binding behavior, coupled with 
the fact that the class' docstring isn't set until the class suite is 

It's currently acceptable (if questionable style) to do things like this in 
today's Python:

 X = 1

 class X:
     X = X + 1

 print X.X  # this will print 2

More commonly, and less questionably, this would manifest as something like:

 def function_taking_foo(foo, bar):
     pass  # body elided

 class Foo(blah):
     function_taking_foo = function_taking_foo

This makes it possible to call 'function_taking_foo(aFooInstance, someBar)' 
or 'aFooInstance.function_taking_foo(someBar)'.  I've used this pattern a 
couple times myself, and I believe there may actually be cases in the 
standard library that do something like this, although maybe not binding 
the method under the same name as the function.
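
A runnable Python 3 version of that pattern, with hypothetical names (`describe`, `Foo`, `name`) standing in for the placeholders above:

```python
def describe(foo, prefix):
    """Module-level function whose first argument is a Foo."""
    return prefix + foo.name


class Foo:
    def __init__(self, name):
        self.name = name

    # The right-hand side is looked up before the class-level name exists,
    # so it resolves to the module-level function defined above.
    describe = describe


thing = Foo("spam")
print(describe(thing, "> "))   # called as a function
print(thing.describe("> "))    # called as a method
```

Both calls run the same function object; only the way the first argument is supplied differs.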


Re: [Python-Dev] __doc__ behavior in class definitions

2005-10-07 Thread Steve Holden
Martin Maly wrote:
 Hello Python-Dev,
 My name is Martin Maly and I am a developer at Microsoft, working on the
 IronPython project with Jim Hugunin. I am spending lot of time making
 IronPython compatible with Python to the extent possible.
 I came across a case which I am not sure if by design or a bug in Python
 (Python 2.4.1 (#65, Mar 30 2005, 09:13:57)). Consider following Python
 # module begin
 "module doc"
 class c:
     print __doc__
     __doc__ = "class doc"  # (1)
     print __doc__
 print c.__doc__
 # module end
 When ran, it prints:
 module doc
 class doc
 class doc
 Based on the binding rules described in the Python documentation, I
 would expect the code to throw because binding created on the line (1)
 is local to the class block and all the other __doc__ uses should
 reference that binding. Apparently, it is not the case.
 Is this bug in Python or are __doc__ strings in classes subject to some
 additional rules?
Well, it's nothing to do with __doc__, as the following example shows:

crud = "module crud"

class c:
    print crud
    crud = "class crud"
    print crud

print c.crud

As you might by now expect, this outputs

module crud
class crud
class crud

Clearly the rules for class scopes aren't quite the same as those for 
function scopes, as the module

crud = "module crud"

def f():
    print crud
    crud = "function crud"
    print crud

f()

does indeed raise an UnboundLocalError exception.

I'm not enough of a language lawyer to determine exactly why this is, 
but it's clear that class variables aren't scoped in the same way as 
function locals.
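
The same contrast, restated as one Python 3 script that records what each scope sees. It is meant to be run as a module-level script; the `seen` list is an addition for the example.

```python
crud = "module crud"
seen = []


class C:
    seen.append(crud)      # not yet bound in the class body: finds the global
    crud = "class crud"
    seen.append(crud)      # now bound in the class namespace


def f():
    seen.append(crud)      # crud is local to f because of the assignment below
    crud = "function crud"


try:
    f()
except UnboundLocalError:
    seen.append("UnboundLocalError")

print(seen)
```

The class body resolves each name at the moment it runs, while the function decides at compile time that `crud` is a local.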

Steve Holden   +44 150 684 7255  +1 800 494 3119
Holden Web LLC
PyCon TX 2006


Re: [Python-Dev] __doc__ behavior in class definitions

2005-10-07 Thread Jack Diederich
On Fri, Oct 07, 2005 at 12:15:04PM -0700, Martin Maly wrote:
 Hello Python-Dev,
 My name is Martin Maly and I am a developer at Microsoft, working on the
 IronPython project with Jim Hugunin. I am spending lot of time making
 IronPython compatible with Python to the extent possible.
 I came across a case which I am not sure if by design or a bug in Python
 (Python 2.4.1 (#65, Mar 30 2005, 09:13:57)). Consider following Python
 # module begin
 "module doc"
 class c:
     print __doc__
     __doc__ = "class doc"  # (1)
     print __doc__


 Based on the binding rules described in the Python documentation, I
 would expect the code to throw because binding created on the line (1)
 is local to the class block and all the other __doc__ uses should
 reference that binding. Apparently, it is not the case.
 Is this bug in Python or are __doc__ strings in classes subject to some
 additional rules?

Classes behave just like you would expect them to, for proper variations
of what to expect *wink*.

The class body is evaluated first with the same local/global name lookups
as would happen inside another scope (e.g. a function).  The results
of that evaluation are then passed to the class constructor as a dict.
The __new__ method of metaclasses and the less used 'new' module highlight
the final step that turns a bucket of stuff in a namespace into a class.

 >>> import new
 >>> A = new.classobj('w00t', (object,), {'__doc__':"no help at all",
 ...                  'myself':lambda x:x})
 >>> a = A()
 <__main__.w00t object at 0xb7bc32cc>
 <__main__.w00t object at 0xb7bc32cc>
Help on w00t in module __main__ object:

class w00t(__builtin__.object)
 |  no help at all
 |  Methods defined here:
 |  lambdax
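
The `new` module no longer exists in Python 3, but the same final step can be shown with the three-argument form of `type()`. This is a sketch of the equivalent call, not a transcript of the session above.

```python
namespace = {
    "__doc__": "no help at all",
    "myself": lambda self: self,
}

# Equivalent to a class statement whose body produced `namespace`.
W00t = type("w00t", (object,), namespace)

a = W00t()
print(a.myself() is a)
print(W00t.__doc__)
```

The dict plays the role of the evaluated class body; `type()` is the constructor that turns it into a class.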

Hope that helps,


Re: [Python-Dev] Pythonic concurrency

2005-10-07 Thread Michael Sparks
[ Possibly overlengthy reply. However, given multiple sets of cans of
  worms... ]
On Friday 07 October 2005 07:25, Josiah Carlson wrote:
 One thing I notice is absent from the Kamaelia page is benchmarks.

That's largely for one simple reason: we haven't done any yet. 

At least not anything I'd call a benchmark. There's lies, damn lies,
statistics and then there's benchmarks.

//Theoretically// I suspect that the system /could/ perform as well as
traditional approaches to dealing with concurrent problems single threaded 
(and multi-thread/process). This is based on the recognition of two things:

   * Event systems (often implementing state machine type behaviour, not
     always though), often have intermediate buffers between states &
     operations. Some systems divide a problem into multiple reactors and
     stages and have communication between them, though this can sometimes
     be hidden. All we've done is make this much more explicit.

   * Event systems (and state machine based approaches) can often be used
     to effectively say "I want to stop and wait here, come back to me later"
     or simply "I'm doing something processor intensive, but I'm being nice
     and letting something else have a go". The use of generators here simply
     makes that particular behaviour more explicit. This is a nice bonus of
[neither is a negative really, just different. The first bullet has implicit
 buffers in the system, the latter has a more implicit state machine
 in the system. ICBVW here of course.]

However, "COULD" is not "is", and whilst I say "in theory", I am painfully aware 
that theory and practice often have a big gulf between them.

Also, I'm certain that at present our performance is nowhere near optimal.
We've focussed on trying to find what works from a few perspectives rather
than performance (one possible definition of correctness here, but certainly
not the only one). Along the way we've made compromises in favour of clarity
as to what's going on, rather than performance.

For example, one area we know we can optimise is the handling of
message delivery. The mini-axon tutorial represents delivery between
active components as being performed by an independent party - a
postman. This is precisely what happens in the current system.

That can be optimised for example by collapsing outboxes into inboxes
(ie removing one of the lists when a linkage is made and changing the
reference), and at that point you have a single intermediate buffer (much
like an event/state system communicating between subsystems). We haven't
done this yet. Whilst it would partly simplify things, it makes other
areas more complex, and seems like premature optimisation.
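
The difference between the two delivery schemes can be sketched as follows in Python 3. This is not Kamaelia or Axon code: `Component`, `postman_deliver` and `link_collapsed` are hypothetical names, and plain lists stand in for the boxes.

```python
class Component:
    """Hypothetical stand-in for a component with one inbox and one outbox."""

    def __init__(self):
        self.inbox = []
        self.outbox = []


def postman_deliver(source, sink):
    """Explicit delivery: move every pending message from outbox to inbox."""
    while source.outbox:
        sink.inbox.append(source.outbox.pop(0))


def link_collapsed(source, sink):
    """Collapsed linkage: the outbox *is* the inbox, so no postman is needed."""
    source.outbox = sink.inbox
```

With the collapsed linkage a message appended by the sender is immediately visible to the receiver, at the cost of the two components sharing one list object.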

However I have performed an //informal comparison// between the use of a 
Kamaelia type approach and a traditional approach not using any framework at 
all for implementing a trivial game. (Cats bouncing around the screen 
scaling, rotating, etc, controlled by a user) The reason I say Kamaelia-type 
approach is because it was a mini-axon based experiment using collapsed 
outboxes to inboxes (as above).

The measure I used was simply framerate. This is a fair real value and has a 
real use - if it drops too low, the system is simply unusable. I measured the 
framerate before transforming the simplistic game to work well in the 
framework, and after transforming it. The differences were:
   * 5% drop in performance/framerate
   * The ability to reuse much of the code in other systems and environments.

From that perspective it seems acceptable (for now). This *isn't* as you would
probably say a rigorous or trustable benchmark, but was a useful smoke test
if you like of the approach.

From a *pragmatic* perspective, currently the system is fast enough for simple 
games (say a hundred, two hundred, maybe more, sprites active at once),
for interactive applications, video players, realtime audio mixing and a
variety of other things, so currently we're leaving that aside.

Also from an even more pragmatic perspective, I would say if you're after 
performance and throughput then I'd say use Twisted, since it's a proven 

**If** our stuff turns out to be useful, we'd like to find a way of making our
stuff available inside twisted -- if they'd like it (*) --  since we're not 
the least bit interested in competing with anyone :-) So far *we're* finding 
it useful, which is all I'd personally claim, and hope that it's useful to 
   (*) The all too brief conversation I had with Tommi Virtanen at Europython
   suggested that he at least thought the pipeline/graphline idea was
   worth taking - so I'd like to do that at some point, even if it 
   sidelines our work to date.

Once we've validated the model though (which I expect to take some time,
you only learn if it's validated by building things IMO), then we'll look at
optimisation.  (if the model is validated :-)

All that said, I'm open to suggestion as to what sort of 

Re: [Python-Dev] Pythonic concurrency

2005-10-07 Thread Jim Fulton
Shane Hathaway wrote:
 Antoine Pitrou wrote:
A relational database has a very strict and regular data model. Also, it
has transactions. This makes it easy to precisely define concurrency at
the engine level.

To apply the same thing to Python you would at least need :
  1. a way to define a subset of the current bag of reachable objects
which has to stay consistent w.r.t. transactions that are applied to it
(of course, you would have several such subsets in any non-trivial
application)
  2. a way to start and end a transaction on a bag of objects (begin /
commit / rollback)
  3. a precise definition of the semantics of consistency here : for
example, only one thread could modify a bag of objects at any given
time, and other threads would continue to see the frozen, stable version
of that bag until the next version is committed by the writing thread

For 1), a helpful paradigm would be to define an object as being the
root of a bag, and all its properties would automatically and
recursively (or not ?) belong to this bag. One has to be careful that no
property leaks and makes the bag become the set of all reachable
Python objects (one could provide a means to say that a specific
property must not be transitively put in the bag). Then, use
my_object.begin_transaction() and my_object.commit_transaction().

The implementation of 3) does not look very obvious ;-S
 Well, I think you just described ZODB. ;-)  I'd be happy to explain how 
 ZODB solves those problems, if you're interested.
 However, ZODB doesn't provide locking, and that bothers me somewhat.  If 
 two threads try to modify an object at the same time, one of the threads 
 will be forced to abort, unless a method has been defined for resolving 
 the conflict.  If there are too many writers, ZODB crawls.  ZODB's 
 strategy works fine when there aren't many conflicting, concurrent 
 changes, but the complex locking done by relational databases seems to 
 be required for handling a lot of concurrent writers.

I don't think it would be all that hard to use a locking (rather than
a time-stamp) strategy for ZODB, although ZEO would make this
extra challenging.

In any case, the important thing to agree on here is that transactions
provide a useful approach to concurrency control in the case where

- separate control flows are independent, and

- we need to mediate access to shared resources.

Someone else pointed out essentially the same thing at the beginning
of this thread.


Jim Fulton   mailto:[EMAIL PROTECTED]   Python Powered!
CTO  (540) 361-1714
Zope Corporation

Re: [Python-Dev] Pythonic concurrency

2005-10-07 Thread Antoine Pitrou

 Well, I think you just described ZODB. ;-)


 I'd be happy to explain how 
 ZODB solves those problems, if you're interested.

Well, yes, I'm interested :)
(I don't know anything about Zope internals though, and I've never even used


Re: [Python-Dev] Pythonic concurrency

2005-10-07 Thread Shane Hathaway
Antoine Pitrou wrote:
I'd be happy to explain how 
ZODB solves those problems, if you're interested.
 Well, yes, I'm interested :)
 (I don't know anything about Zope internals though, and I've never even used

Ok.  Quoting your list:

  To apply the same thing to Python you would at least need :
1. a way to define a subset of the current bag of reachable objects
  which has to stay consistent w.r.t. transactions that are applied
  to it (of course, you would have several such subsets in any
  non-trivial application)

ZODB holds a tree of objects.  When you add an attribute to an object 
managed by ZODB, you're expanding the tree.  Consistency comes from 
several features:

   - Each thread has its own lazy copy of the object tree.

   - The application doesn't see changes to the object tree except at 
transaction boundaries.

   - The ZODB store keeps old revisions, and the new MVCC feature lets 
the application see the object system as it was at the beginning of the 

   - If you make a change to the object tree that conflicts with a 
concurrent change, all changes to that copy of the object tree are aborted.

2. a way to start and end a transaction on a bag of objects (begin /
  commit / rollback)

ZODB includes a transaction module that does just that.  In fact, the 
module is so useful that I think it belongs in the standard library.

3. a precise definition of the semantics of consistency here : for
  example, only one thread could modify a bag of objects at any given
  time, and other threads would continue to see the frozen,
  stable version of that bag until the next version is committed by the
  writing thread

As mentioned above, the key is that ZODB maintains a copy of the objects 
per thread.  A fair amount of RAM is lost that way, but the benefit in 
simplicity is tremendous.
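
The general shape of "private copy per thread, abort on conflict" can be sketched as below in Python 3. This is not ZODB's API or implementation: `Store`, `Connection` and `ConflictError` here are hypothetical classes for a single-threaded illustration, and a real store would also have to serialise the commit step itself.

```python
import copy


class ConflictError(Exception):
    """Raised when a commit is based on a stale revision."""


class Store:
    """Hypothetical shared store holding one committed state and its revision."""

    def __init__(self, state):
        self.state = state
        self.revision = 0

    def open(self):
        return Connection(self)


class Connection:
    """A private working copy, as each thread would hold."""

    def __init__(self, store):
        self._store = store
        self._base = store.revision
        self.state = copy.deepcopy(store.state)

    def commit(self):
        if self._base != self._store.revision:
            raise ConflictError("another writer committed first")
        self._store.state = copy.deepcopy(self.state)
        self._store.revision += 1
        self._base = self._store.revision
```

A losing writer gets an exception and an untouched store rather than a half-applied change.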

You also talked about the risk that applications would accidentally pull 
a lot of objects into the tree just by setting an attribute.  That can 
and does happen, but the most common case is already solved by the 
pickle machinery: if you pickle something global like a class, the 
pickle stores the name and location of the class instead of the class 
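
The pickle behaviour mentioned here is easy to check in Python 3. The example uses a standard-library class so that it can be re-imported by name on loading.

```python
import collections
import pickle

# Pickling a class records where to import it from, not its definition.
payload = pickle.dumps(collections.OrderedDict)
restored = pickle.loads(payload)

print(payload)
print(restored is collections.OrderedDict)
```

The payload contains little more than the module and class names, and unpickling hands back the very same class object.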


Re: [Python-Dev] Pythonic concurrency

2005-10-07 Thread Bruce Eckel
 //Theoretically// I suspect that the system /could/ perform as well as
 traditional approaches to dealing with concurrent problems single threaded
 (and multi-thread/process).

I also think it's important to factor in the possibility of
multiprocessors. If Kamaelia (for example) has a very safe and
straightforward programming model so that more people are easily able
to use it, but it has some performance impact over more complex
systems, I think the ease of use issue opens up far greater
possibilities if you include multiprocessing -- because if you can
easily write concurrent programs in Python, then Python could gain a
significant advantage over less agile languages when multiprocessors
become common. That is, with multiprocessors, it could be way easier
to write a program in Python that also runs way faster than the
competition. Yes, of course given enough time they might theoretically
be able to write a program that is as fast or faster using their
threading mechanism, but it would be so hard by comparison that
they'll either never get it done or never be sure if it's reliable.

That's what I'm looking for.

Bruce Eckel   mailto:[EMAIL PROTECTED]
Contains electronic books: Thinking in Java 3e  Thinking in C++ 2e

Re: [Python-Dev] Pythonic concurrency

2005-10-07 Thread Michael Sparks
On Friday 07 October 2005 23:26, Bruce Eckel wrote:
  I think the ease of use issue opens up far greater possibilities if you
  include multiprocessing  
 That's what I'm looking for.

In which case that's an area we need to push our work into sooner rather
than later. After all, the PS3 and CELL arrive next year. Sun already has
some interesting stuff shipping. I'd like to use that kit effectively, and
more importantly make using that kit effectively available to colleagues
sooner rather than later. That really means multiprocess now not later.

BTW, I hope it's clear that I'm not saying concurrency is easy per se (noting
your previous post ;-) but rather that it /should/ be made as simple as is
humanly possible.


Michael Sparks, Senior R&D Engineer, Digital Media Group
British Broadcasting Corporation, Research and Development
Kingswood Warren, Surrey KT20 6NP

This e-mail may contain personal views which are not the views of the BBC.

Re: [Python-Dev] Pythonic concurrency

2005-10-07 Thread Michael Sparks
On Thursday 06 October 2005 21:06, Bruce Eckel wrote:
 So yes indeed, this is quite high on my list to research. Looks like
 people there have been doing some interesting work.

 Right now I'm just trying to cast a net, so that people can put in
 ideas, for when the Java book is done and I can spend more time on it.

Thanks for your kind words. Hopefully it's of use!



Re: [Python-Dev] __doc__ behavior in class definitions

2005-10-07 Thread Jason Orendorff

These two cases generate different bytecode.

def foo():          # foo.func_code.co_flags == 0x43
    print x         # LOAD_FAST 0
    x = 3

class Foo:          # <code object>.co_flags == 0x40
    print x         # LOAD_NAME 'x'
    x = 3

In functions, local variables are just numbered slots. (co_flags bits
1 and 2 indicate this.)  The LOAD_FAST opcode is used.  If the slot is
empty, LOAD_FAST throws.

In other code, the local variables are actually stored in a
dictionary.  LOAD_NAME is used.  This does a locals dictionary lookup;
failing that, it falls back on the globals dictionary; and failing
that, it falls back on builtins.
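
The opcode difference can still be observed in Python 3 with the `dis` module. The sketch below compiles both bodies without running them and lists the instructions that load `x`. Exact opcode names vary between CPython versions (recent releases may report a LOAD_FAST variant such as LOAD_FAST_CHECK), so the helper only collects names starting with LOAD.

```python
import dis

source = '''
def foo():
    print(x)
    x = 3

class Foo:
    print(x)
    x = 3
'''

module_code = compile(source, "<example>", "exec")

# The function and class bodies are nested code objects among the constants.
bodies = {c.co_name: c for c in module_code.co_consts if hasattr(c, "co_code")}


def load_ops_for_x(code):
    """Opcode names of the instructions that read the name 'x'."""
    return [ins.opname for ins in dis.get_instructions(code)
            if ins.argval == "x" and ins.opname.startswith("LOAD")]


print(load_ops_for_x(bodies["foo"]))
print(load_ops_for_x(bodies["Foo"]))
```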

Why the discrepancy?  Beats me.  I would definitely implement what
CPython does up to this point, if that's your question.

Btw, functions that use 'exec' are in their own category way out

def foo2():         # foo2.func_code.co_flags == 0x42
    print x         # LOAD_NAME 'x'
    exec "x=3"      # don't ever do this, it screws everything up
    print x

Pretty weird.  Jython seems to implement this.


Re: [Python-Dev] Pythonic concurrency

2005-10-07 Thread Nick Coghlan
Bruce Eckel wrote:
 I always have a problem with this. After many years of studying
 concurrency on-and-off, I continue to believe that threading is very
 difficult (indeed, the more I study it, the more difficult I
 understand it to be). And I admit this. The comments I sometimes get
 back are to the effect that threading really isn't that hard. Thus,
 I am just too dense to get it.

The few times I have encountered anyone saying anything resembling "threading 
is easy", it was because the full sentence went something like "threading is 
easy if you use message passing and copy-on-send or release-reference-on-send 
to communicate between threads, and limit the shared data structures to those 
required to support the messaging infrastructure". And most of the time there 
was an implied "compared to using semaphores and locks directly," at the start.
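
As an illustration of the copy-on-send variant, here is a small Python 3 sketch. The `send` helper and the message layout are assumptions made for the example; the only structure shared between the threads is the queue itself.

```python
import copy
import queue
import threading


def send(channel, message):
    """Copy-on-send: the receiver gets its own copy, so nothing is shared."""
    channel.put(copy.deepcopy(message))


def consumer(channel, out):
    message = channel.get()
    message["items"].append("seen by consumer")
    out.append(message)


channel = queue.Queue()
received = []
original = {"items": ["a"]}

thread = threading.Thread(target=consumer, args=(channel, received))
thread.start()
send(channel, original)
thread.join()

print(original)   # unchanged by the consumer's mutation
print(received)
```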

Which is obviously a far cry from simply saying "threading is easy". If I 
encountered anyone who thought it was easy *in general*, then I would fear any 
threaded code they wrote, because they clearly weren't thinking about the 
problem hard enough ;)


Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia

Re: [Python-Dev] Proposed changes to PEP 343

2005-10-07 Thread Nick Coghlan
Fredrik Lundh wrote:
 Nick Coghlan wrote:
However, requiring a decorator to get a slot to work right looks pretty ugly
to me, too.
 the whole concept might be perfectly fine on the "this construct
 corresponds to this code" level, but if you immediately end up with things that
 are not what they seem, and names that don't mean what they say, either
 the design or the description of it needs work.
  (yes, I know you can use this class to manage the context, but it's not
 really a context manager, because it's that method that's a manager, not
 the class itself.  yes, all the information that belongs to the context are
 managed by the class, but that doesn't make... oh, shut up and read the

Heh. OK, my current inclination is to make the new paragraph at the end of 
the Generator Decorator section read like this:

4. Add a paragraph to the end of the Generator Decorator section:

  If a generator is used to write a context's __with__ method, then
Python's type machinery will automatically take care of applying this
decorator. This means that it is just as easy to write a generator-based
context manager for a context as it is to write a generator-based iterator
for an iterable (see the decimal.Context example below).
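
For comparison, this is roughly what a generator-based context manager looks like in the Python that eventually shipped, where the decorator lives in `contextlib` and is applied explicitly rather than by the type machinery proposed here. The `extra_precision` helper is a hypothetical example, not the decimal.Context example from the PEP.

```python
import decimal
from contextlib import contextmanager


@contextmanager
def extra_precision(extra=2):
    """Temporarily raise the current decimal precision by `extra` digits."""
    ctx = decimal.getcontext()
    saved = ctx.prec
    ctx.prec = saved + extra
    try:
        yield ctx
    finally:
        ctx.prec = saved   # restored even if the body raises


with extra_precision(3) as ctx:
    print(ctx.prec)
print(decimal.getcontext().prec)
```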

And then update the decimal.Context example to remove the @contextmanager 


Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia

[Python-Dev] Sandboxed Threads in Python

2005-10-07 Thread Adam Olsen
Okay, basic principle first.  You start with a sandboxed thread that
has access to nothing.  No modules, no builtins, *nothing*.  This
means it can run without the GIL but it can't do any work.  To make it
do something useful we need to give it two things: first, immutable
types that can be safely accessed without locks, and second a
thread-safe queue to coordinate.  With those you can bring modules and
builtins back into the picture, either by making them immutable or
using a proxy that handles all the methods in a single thread.

Unfortunately python has a problem with immutable types.  For the most
part it uses an honor system, trusting programmers not to make a class
that claims to be immutable yet changes state anyway.  We need more
than that, and freezing a dict would work well enough, so it's not
the problem.  The problem is the reference counting, and even if we do
it safely all the memory writes just kill performance so we need to
avoid it completely.

Turns out it's quite easy and it doesn't harm performance of existing
code or require modification (but a recompile is necessary).  The idea
is to only use a cyclic garbage collector for cleaning them up, which
means we need to disable the reference counting.  That requires we
modify Py_INCREF and Py_DECREF to be a no-op if ob_refcnt is set to a
magic constant (probably a negative value).
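
As a rough illustration of the idea (not the actual patch, which changes C macros), here is a toy Python model of a refcount with a magic "do not count" value. The names `IMMORTAL`, `ToyObject`, `incref` and `decref` are all hypothetical.

```python
IMMORTAL = -1  # hypothetical magic refcount value, standing in for the C constant


class ToyObject:
    """Stand-in for a PyObject header: just a name and a reference count."""

    def __init__(self, name, refcnt=1):
        self.name = name
        self.refcnt = refcnt
        self.freed = False


def incref(obj):
    # No-op for objects marked with the magic constant: no memory write at all.
    if obj.refcnt != IMMORTAL:
        obj.refcnt += 1


def decref(obj):
    if obj.refcnt == IMMORTAL:
        return
    obj.refcnt -= 1
    if obj.refcnt == 0:
        # A real interpreter would deallocate here.
        obj.freed = True
```

In this model an object created with `refcnt=IMMORTAL` is never freed by `decref`, which is exactly why some other collector would have to reclaim it.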

That's all it takes.  Modify Py_INCREF and Py_DECREFs to check for a
magic constant.  Ahh, but the performance?  See for yourself.

[EMAIL PROTECTED]:~/src/Python-2.4.1$ ./python Lib/test/ 50
Pystone(1.1) time for 50 passes = 13.34
This machine benchmarks at 37481.3 pystones/second

Modified Py_INCREF/Py_DECREF with magic constant
[EMAIL PROTECTED]:~/src/Python-2.4.1-sandbox$ ./python Lib/test/ 
Pystone(1.1) time for 50 passes = 13.38
This machine benchmarks at 37369.2 pystones/second

The numbers aren't significantly different.  In fact the second one is
often slightly faster, which shows the difference is smaller than the
statistical noise.

So to sum up, by prohibiting mutable objects from being transferred
between sandboxes we can achieve scalability on multiple CPUs, making
threaded programming easier and more reliable, as a bonus get secure
sandboxes[1], and do that all while maintaining single-threaded
performance and requiring minimal changes to existing C modules.

A proof of concept patch to Py_INCREF/Py_DECREF (only demonstrates
performance effects, does not create or utilize any new functionality)
can be found here:

[1] We need to remove any backdoor methods of getting to mutable
objects outside of your sandbox, which gets us most of the way towards
a restricted execution environment.

Adam Olsen, aka Rhamphoryncus

Re: [Python-Dev] Sandboxed Threads in Python

2005-10-07 Thread Phillip J. Eby
At 06:12 PM 10/7/2005 -0600, Adam Olsen wrote:
Okay, basic principle first.  You start with a sandboxed thread that
has access to nothing.  No modules, no builtins, *nothing*.  This
means it can run without the GIL but it can't do any work.

It sure can't.  You need at least the threadstate and a builtins dictionary 
to do any work.

   To make it
do something useful we need to give it two things: first, immutable
types that can be safely accessed without locks,

This is harder than it sounds.  Integers, for example, have a custom 
allocator and a free list, not to mention a small-integer cache.  You would 
somehow need to duplicate all that for each sandbox, or else you have to 
make those integers immortal using your magic constant.

Turns out it's quite easy and it doesn't harm performance of existing
code or require modification (but a recompile is necessary).  The idea
is to only use a cyclic garbage collector for cleaning them up,

Um, no, actually.  You need a mark-and-sweep GC or something of that 
ilk.  Python's GC only works with objects that *have refcounts*, and it 
works by clearing objects that are in cycles.  The clearing causes 
DECREF-ing, which then causes objects to be freed.  If you have objects 
without refcounts, they would be immortal and utterly unrecoverable.
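
A small sketch of the behaviour described here, using only the standard `gc` and `weakref` modules in CPython: reference counting alone never frees a cycle, and the cycle collector is what breaks it. The helper names `Node`, `make_cycle` and `cycle_demo` are hypothetical.

```python
import gc
import weakref


class Node:
    """Plain object that can hold a reference to another Node."""


def make_cycle():
    """Build a two-object reference cycle and return a weak reference into it."""
    a, b = Node(), Node()
    a.other, b.other = b, a
    return weakref.ref(a)


def cycle_demo():
    """Return (alive before collection, alive after collection)."""
    was_enabled = gc.isenabled()
    gc.disable()  # make sure no automatic collection runs in between
    try:
        probe = make_cycle()
        alive_before = probe() is not None  # refcounts never reached zero
        gc.collect()                        # cycle collector clears the cycle
        alive_after = probe() is not None
    finally:
        if was_enabled:
            gc.enable()
    return alive_before, alive_after
```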

means we need to disable the reference counting.  That requires we
modify Py_INCREF and Py_DECREF to be a no-op if ob_refcnt is set to a
magic constant (probably a negative value).

And any object with the magic refcount will live *forever*, unless you 
manually deallocate it.

That's all it takes.  Modify Py_INCREF and Py_DECREFs to check for a
magic constant.  Ahh, but the performance?  See for yourself.

First, you need to implement a garbage collection scheme that can deal with 
not having refcounts.  Otherwise you're not comparing apples to apples 
here, and your programs will leak like crazy.

Note that implementing a root-based GC for Python is non-trivial, since 
extension modules can store pointers to PyObjects anywhere they 
like.  Further, many Python objects don't even support being tracked by the 
current cycle collector.

So, changing this would probably require a lot of C extensions to be 
rewritten to support the needed API changes for the new garbage collection 

So to sum up, by prohibiting mutable objects from being transferred
between sandboxes we can achieve scalability on multiple CPUs, making
threaded programming easier and more reliable, as a bonus get secure
sandboxes[1], and do that all while maintaining single-threaded
performance and requiring minimal changes to existing C modules

Unfortunately, you have only succeeded in restating the problem, not 
reducing its complexity.  :)  In fact, you may have increased the 
complexity, since now you need a threadsafe garbage collector, too.

Oh, and don't forget - newstyle classes keep weak references to all their 
subclasses, which means for example that every time you subclass 'dict', 
you're modifying the immutable 'dict' class.  So, unless you recreate all 
the classes in each sandbox, you're back to needing locking.  And if you 
recreate everything in each sandbox, well, I think you've just reinvented 
processes.  :)
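
The point about subclassing can be observed from pure Python through `type.__subclasses__()`. This sketch only shows that defining a subclass changes state reachable from `dict`; the class name `SandboxDict` is hypothetical.

```python
def subclass_registry_grows():
    """Return True if defining a dict subclass changes dict's subclass list."""
    before = set(dict.__subclasses__())

    class SandboxDict(dict):
        pass

    after = set(dict.__subclasses__())
    # The new class is now reachable from the supposedly immutable 'dict' type.
    return SandboxDict in after and SandboxDict not in before
```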


Re: [Python-Dev] Sandboxed Threads in Python

2005-10-07 Thread Nick Coghlan
Phillip J. Eby wrote:
 Oh, and don't forget - newstyle classes keep weak references to all their 
 subclasses, which means for example that every time you subclass 'dict', 
 you're modifying the immutable 'dict' class.  So, unless you recreate all 
 the classes in each sandbox, you're back to needing locking.  And if you 
 recreate everything in each sandbox, well, I think you've just reinvented 
 processes.  :)

After all, there's a reason Bruce Eckel's recent post about multi-processing 
attracted a fair amount of interest.


Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia

Re: [Python-Dev] Sandboxed Threads in Python

2005-10-07 Thread Adam Olsen
On 10/7/05, Phillip J. Eby [EMAIL PROTECTED] wrote:
 At 06:12 PM 10/7/2005 -0600, Adam Olsen wrote:
 Okay, basic principle first.  You start with a sandboxed thread that
 has access to nothing.  No modules, no builtins, *nothing*.  This
 means it can run without the GIL but it can't do any work.

 It sure can't.  You need at least the threadstate and a builtins dictionary
 to do any work.

To make it
 do something useful we need to give it two things: first, immutable
 types that can be safely accessed without locks,

 This is harder than it sounds.  Integers, for example, have a custom
 allocator and a free list, not to mention a small-integer cache.  You would
 somehow need to duplicate all that for each sandbox, or else you have to
 make those integers immortal using your magic constant.

Yes, we'd probably want some per-sandbox allocators.  I'm no expert on
that but I know it can be done.

 Turns out it's quite easy and it doesn't harm performance of existing
 code or require modification (but a recompile is necessary).  The idea
 is to only use a cyclic garbage collector for cleaning them up,

 Um, no, actually.  You need a mark-and-sweep GC or something of that
 ilk.  Python's GC only works with objects that *have refcounts*, and it
 works by clearing objects that are in cycles.  The clearing causes
 DECREF-ing, which then causes objects to be freed.  If you have objects
 without refcounts, they would be immortal and utterly unrecoverable.

Perhaps I wasn't clear enough, I was assuming appropriate changes to
the GC would be done.  The important thing is it can be done without
changing the interface that the existing modules use.

 means we need to disable the reference counting.  That requires we
 modify Py_INCREF and Py_DECREF to be a no-op if ob_refcnt is set to a
 magic constant (probably a negative value).

 And any object with the magic refcount will live *forever*, unless you
 manually deallocate it.

See above.

 That's all it takes.  Modify Py_INCREF and Py_DECREFs to check for a
 magic constant.  Ahh, but the performance?  See for yourself.

 First, you need to implement a garbage collection scheme that can deal with
 not having refcounts.  Otherwise you're not comparing apples to apples
 here, and your programs will leak like crazy.

 Note that implementing a root-based GC for Python is non-trivial, since
 extension modules can store pointers to PyObjects anywhere they
 like.  Further, many Python objects don't even support being tracked by the
 current cycle collector.

 So, changing this would probably require a lot of C extensions to be
 rewritten to support the needed API changes for the new garbage collection

They only need to be rewritten if you want them to provide an
immutable type that can be transferred between sandboxes.  Short of
that you can make the module object itself immutable, and from it
create mutable instances that are private to each sandbox and not

If you make no changes at all the module still works, but is only
usable from the main thread.  That allows us to transition

 So to sum up, by prohibiting mutable objects from being transferred
 between sandboxes we can achieve scalability on multiple CPUs, making
 threaded programming easier and more reliable, as a bonus get secure
 sandboxes[1], and do that all while maintaining single-threaded
 performance and requiring minimal changes to existing C modules

 Unfortunately, you have only succeeded in restating the problem, not
 reducing its complexity.  :)  In fact, you may have increased the
 complexity, since now you need a threadsafe garbage collector, too.

 Oh, and don't forget - newstyle classes keep weak references to all their
 subclasses, which means for example that every time you subclass 'dict',
 you're modifying the immutable 'dict' class.  So, unless you recreate all
 the classes in each sandbox, you're back to needing locking.  And if you
 recreate everything in each sandbox, well, I think you've just reinvented
 processes.  :)

I was aware that weakrefs needed some special handling (I just forgot
to mention it), but I didn't know it was used by subclassing. 
Unfortunately I don't know what purpose it serves so I can't
contemplate how to deal with it.

I need to stress that *only* the new, immutable and thread-safe
mark-and-sweep types would be affected by these changes.  Everything
else would continue to exist as it did before, and the benchmark
exists to show they can coexist without killing performance.

Adam Olsen, aka Rhamphoryncus

Re: [Python-Dev] PEP 342 suggestion: start(), __call__() and unwind_call() methods

2005-10-07 Thread Nick Coghlan
Phillip J. Eby wrote:
 At 09:50 PM 10/7/2005 +1000, Nick Coghlan wrote:
 Notice how a non-coroutine callable can be yielded, and it will still work
 happily with the scheduler, because the desire to continue execution is
 indicated by the ContinueIteration exception, rather than by the type of the
 returned value.
 Wh?  You raise an exception to indicate the *normal* case?  That 
 seems, um...  well, a Very Bad Idea.

The sheer backwardness of my idea occurred to me after I'd got some sleep :)

 Last, but far from least, as far as I can tell you can implement all of 
 these semantics using PEP 342 as it sits.  That is, it's very simple to 
 make decorators or classes that add those semantics.  I don't see 
 anything that requires them to be part of Python.

Yeah, I've now realised that you can do all of this more simply by doing it 
directly in the scheduler using StopIteration to indicate when the coroutine 
is done, and using yield to indicate "I'm not done yet".

So with a bit of thought, I came up with a scheduler that has all the benefits 
I described, and only uses the existing PEP 342 methods.

When writing a coroutine for this scheduler, you can do 6 things via the 

   1. Raise StopIteration to indicate "I'm done" and return None to your caller
   2. Raise StopIteration with a single argument to return a value other than
      None to your caller
   3. Raise a different exception and have that exception propagate up to your
      caller
   4. Yield None to allow other coroutines to be executed
   5. Yield a coroutine to request a call to that coroutine
   6. Yield a callable to request an asynchronous call using that object

Yielding anything else, or trying to raise StopIteration with more than one 
argument results in a TypeError being raised *at the point of the offending 
yield or raise statement*, rather than taking out the scheduler itself.
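
To make the rules above concrete, here is a deliberately simplified sketch for current Python 3. It is not the scheduler from this message. It assumes that a generator returns a value with `return` (surfacing as `StopIteration.value`) instead of raising StopIteration itself, and it leaves out the "yield a callable" case entirely. The names `MiniTrampoline`, `double` and `main` are hypothetical.

```python
import collections
import types


class MiniTrampoline:
    """Toy round-robin scheduler for generator-based coroutines."""

    def __init__(self):
        # Each entry: (coroutine, caller stack, value to send, exception to throw)
        self.queue = collections.deque()

    def add(self, coroutine):
        self.queue.append((coroutine, (), None, None))

    def run(self):
        result = None
        while self.queue:
            coro, stack, value, exc = self.queue.popleft()
            try:
                if exc is not None:
                    yielded = coro.throw(exc)
                else:
                    yielded = coro.send(value)
            except StopIteration as stop:
                # Finished cleanly: hand the return value to the caller, if any.
                if stack:
                    caller, caller_stack = stack
                    self.queue.append((caller, caller_stack, stop.value, None))
                else:
                    result = stop.value
            except Exception as err:
                # Finished with an error: propagate to the caller, or to run().
                if not stack:
                    raise
                caller, caller_stack = stack
                self.queue.append((caller, caller_stack, None, err))
            else:
                if yielded is None:
                    # Cooperative yield: go to the back of the queue.
                    self.queue.append((coro, stack, None, None))
                elif isinstance(yielded, types.GeneratorType):
                    # Call request: run the callee with us pushed on its stack.
                    self.queue.append((yielded, (coro, stack), None, None))
                else:
                    # Report the mistake at the offending yield.
                    error = TypeError("unsupported yield: %r" % (yielded,))
                    self.queue.append((coro, stack, None, error))
        return result


def double(n):
    yield  # let other coroutines run
    return n * 2


def main(log):
    value = yield double(3)  # "call" another coroutine
    log.append(value)
    try:
        yield 42  # not None and not a coroutine
    except TypeError:
        log.append("TypeError")
    return value + 1
```

The caller stack is kept as a nested tuple `(caller, rest)`, the same shape the original code indexes with `stack[0]` and `stack[1]`.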

The more I explore the possibilities of PEP 342, the more impressed I am by 
the work that went into it!


P.S. Here's the Trampoline scheduler described above:

    import collections
    import sys

    class Trampoline:
        """Manage communications between coroutines"""

        running = False

        def __init__(self):
            self.queue = collections.deque()

        def add(self, coroutine):
            """Request that a coroutine be executed"""
            self.schedule(coroutine)

        def run(self):
            result = None
            self.running = True
            try:
                while self.running and self.queue:
                    func = self.queue.popleft()
                    result = func()
                return result
            finally:
                self.running = False

        def stop(self):
            self.running = False

        def schedule(self, coroutine, stack=(), call_result=None, *exc):
            # Define the new pseudothread
            def pseudothread():
                try:
                    if exc:
                        callee = coroutine.throw(call_result, *exc)
                    else:
                        callee = coroutine(call_result)
                except (StopIteration), ex:
                    # Coroutine finished cleanly
                    if stack:
                        # Send the result to the caller
                        caller = stack[0]
                        prev_stack = stack[1]
                        if len(ex.args) > 1:
                            # Raise a TypeError in the current coroutine
                            self.schedule(coroutine, stack,
                                TypeError,
                                "Too many arguments to StopIteration"
                            )
                        elif ex.args:
                            self.schedule(caller, prev_stack, ex.args[0])
                        else:
                            self.schedule(caller, prev_stack)
                except:
                    # Coroutine finished with an exception
                    if stack:
                        # send the error back to the caller
                        caller = stack[0]
                        prev_stack = stack[1]
                        self.schedule(
                            caller, prev_stack, *sys.exc_info()
                        )
                    else:
                        # Nothing left in this pseudothread to
                        # handle it, let it propagate to the
                        # run loop
                        raise
                else:
                    # Coroutine isn't finished yet
                    if callee is None:
                        # Reschedule the current 

Re: [Python-Dev] Sandboxed Threads in Python

2005-10-07 Thread Josiah Carlson

Adam Olsen [EMAIL PROTECTED] wrote:
 I need to stress that *only* the new, immutable and thread-safe
 mark-and-sweep types would be affected by these changes.  Everything
 else would continue to exist as it did before, and the benchmark
 exists to show they can coexist without killing performance.

All the benchmark showed was that checking for a constant in the
refcount during in/decrefing, and not garbage collecting those objects,
didn't adversely affect performance.

As an aside, there's also the ugly bit about being able to guarantee
that an object is immutable.  I personally mutate Python strings in my C
code all the time (long story, not to be discussed here), and if I can
do it now, then any malicious or inventive person can do the same in
this sandboxed thread Python of the future.

At least in the case of integers, one could use the tagged integer idea
to bypass the freelist issue that Phillip raised, but in general, I
don't believe there exists a truly immutable type as long as there are C
extensions and/or ctypes.  Further, the work to actually implement a new
garbage collector for Python in order to handle these 'immutable' types
seems to me to be more trouble than it is worth.

 - Josiah


Re: [Python-Dev] Proposal for 2.5: Returning values from PEP 342 enhanced generators

2005-10-07 Thread James Y Knight
On Oct 3, 2005, at 1:53 AM, Piet Delport wrote:
 For generators written in this style, "yield" means "suspend execution of
 the current call until the requested result/resource can be provided", and
 "return" regains its full conventional meaning of "terminate the current
 call with a given result".

 The simplest / most straightforward implementation would be for
 "return Foo" to translate to "raise StopIteration, Foo". This is consistent
 with "return" translating to "raise StopIteration", and does not break any
 existing generator code.

 (Another way to think about this change is that if a plain StopIteration
 means "the iterator terminated", then a valued StopIteration, by extension,
 means "the iterator terminated with the given value".)

It sounds like a nice idea to me. Of course, it is only useful to  
functions calling .next() explicitly; in something like a for loop,  
the return value would just be ignored.
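
For readers on current Python: generators can now `return` a value, and it travels on `StopIteration.value`, so the behaviour described here can be checked directly. This is a sketch against today's semantics, not the 2005 proposal as written; the helper names are hypothetical.

```python
def countdown(n):
    """Yield n, n-1, ..., 1 and then return a final value."""
    while n > 0:
        yield n
        n -= 1
    return "liftoff"


def drive_explicitly(gen):
    """Call next() by hand so the return value can be captured."""
    items = []
    while True:
        try:
            items.append(next(gen))
        except StopIteration as stop:
            return items, stop.value


def drive_with_for(gen):
    """A for loop swallows StopIteration, so the return value is dropped."""
    items = []
    for item in gen:
        items.append(item)
    return items
```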


Re: [Python-Dev] Pythonic concurrency

2005-10-07 Thread Josiah Carlson

Michael Sparks [EMAIL PROTECTED] wrote:
 [ Possibly overlengthy reply. However given a multiple sets of cans of
   worms... ]
 On Friday 07 October 2005 07:25, Josiah Carlson wrote:
  One thing I notice is absent from the Kamaelia page is benchmarks.
 That's largely for one simple reason: we haven't done any yet. 

Perfectly reasonable.  If you ever do, I'd be happy to know!

 At least not anything I'd call a benchmark. There's lies, damn lies,
 statistics and then there's benchmarks.

Indeed.  But it does allow people to get an idea whether a system could
handle their workload.

 The measure I used was simply framerate. This is a fair real value and has a 
 real use - if it drops too low, the system is simply unusable. I measured the 
 framerate before transforming the simplistic game to work well in the 
 framework, and after transforming it. The differences were:
* 5% drop in performance/framerate
* The ability to reuse much of the code in other systems and environments.

Single process?  Multi-process single machine?  Multiprocess multiple

 Also from an even more pragmatic perspective, I would say if you're after 
 performance and throughput then I'd say use Twisted, since it's a proven 

I'm just curious.  I keep my fingers away from Twisted as a matter of
personal taste (I'm sure it's great, but it's not for me).

 All that said, I'm open to suggestion as to what sort of benchmark you'd like 
 to see. I'm more interested in benchmarks that actually mean something rather 
 than say X is better than Y though.

I wouldn't dream of saying that X was better or worse than Y, unless one
was obvious crap (since it works for you, and you've gotten new users to
use it successfully, that is obviously not the case).

There are five benchmarks that I think would be interesting to see:
1. Send ~500 bytes of data round-trip from process A to process B and
back on the same machine as fast as you can (simulates synchronous
message passing and discovers transfer latencies) a few (tens of)
thousands of times (A doesn't send message i until it has received
message i-1 back from B).

2. Increase the number of processes that round trip with B.  A quick
chart of #senders vs. messages/second would be far more than adequate.

3. Have process B send ~500 byte messages to many listening processes
via whatever is the fastest method (direct connections, multiple
subscriptions to a 'channel', etc.).  Knowing #listeners vs.
messages/second would be cool.

4. Send blocks of data from process A to process B (any size you want).
B immediately discards the data, but you pay attention to how much
data/second B receives (a dual processor machine with proper processor
affinities would be fine here).

5. Start increasing the number of processes that send data to B.  A
quick chart of #senders vs. total bytes/second would be far more than
adequate.

I'm just offering the above as example benchmarks (you certainly don't
need to do them to satisfy me, but I'll be doing those when my tuple
space implementation is closer to being done). They are certainly not
exhaustive, but they do offer a method by which one can measure
latencies, message volume throughput, data volume throughput, and
ability to handle many senders and/or recipients.
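
As a rough sketch of benchmark 1 only, here is one way to time strict round trips between two processes on one machine, using a child Python process connected by pipes. This says nothing about Kamaelia or any tuple space implementation; the names `ECHO_SCRIPT` and `round_trip_benchmark` are hypothetical, and pipes are just one possible transport.

```python
import subprocess
import sys
import time

# Process B: echo back every fixed-size message it receives on stdin.
ECHO_SCRIPT = """
import sys
size = int(sys.argv[1])
inp, out = sys.stdin.buffer, sys.stdout.buffer
while True:
    msg = inp.read(size)
    if len(msg) < size:
        break
    out.write(msg)
    out.flush()
"""


def round_trip_benchmark(count=10000, size=500):
    """Return (messages per second, mean round-trip latency in seconds)."""
    payload = b"x" * size
    child = subprocess.Popen(
        [sys.executable, "-c", ECHO_SCRIPT, str(size)],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    try:
        start = time.perf_counter()
        for _ in range(count):
            # A does not send message i until message i-1 has come back.
            child.stdin.write(payload)
            child.stdin.flush()
            if child.stdout.read(size) != payload:
                raise RuntimeError("echo process returned unexpected data")
        elapsed = time.perf_counter() - start
    finally:
        child.stdin.close()
        child.stdout.close()
        child.wait()
    return count / elapsed, elapsed / count
```

The timing includes pipe and interpreter overhead on both sides, so the numbers are only comparable between runs of the same harness.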

 [ Network controlled Networked Audio Mixing Matrix ]
  Very neat.  How much data?  What kind of throughput?  What kinds of
 For the test system we tested with 3 raw PCM audio data streams. That's
 3 x 44.1kHz, 16 bit stereo - which is around 4.2Mbit/s of data from the
 network being processed realtime and output back to the network at
 1.4Mbit/s. So, not huge numbers, but not insignificant amounts of data
 either. I suppose one thing I can take more time with now is to look at
 the specific latency of the mixer. It didn't *appear* to be large however.
 (there appeared to be similar latency in the system with or without the

530Kbytes/second in, 176kbytes/second out.  Not bad (I imagine you are
using a C library/extension of some sort to do the mixing...perhaps
numarray, Numeric, ...).  How large are the blocks of data that you are
shuffling around at one time?  1,5,10,50,150kbytes?
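
The conversion between the quoted Mbit/s figures and these byte rates is plain arithmetic. A quick check, assuming uncompressed PCM and that the output is a single stream in the same format as the inputs (the function name is hypothetical):

```python
def pcm_bytes_per_second(sample_rate_hz, bits_per_sample, channels, streams=1):
    """Raw PCM data rate in bytes per second."""
    return sample_rate_hz * (bits_per_sample // 8) * channels * streams


# Three 44.1 kHz, 16-bit stereo streams coming in.
incoming = pcm_bytes_per_second(44100, 16, 2, streams=3)  # 529200 bytes/s
# One mixed stream going out.
outgoing = pcm_bytes_per_second(44100, 16, 2)             # 176400 bytes/s

incoming_mbit = incoming * 8 / 1e6  # about 4.23 Mbit/s
outgoing_mbit = outgoing * 8 / 1e6  # about 1.41 Mbit/s
```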

 A more interesting effect we found was dealing with mouse movement in pygame
 where we found that *huge* numbers of messages being sent one at a time and
 processed one at a time (with yields after each) became a huge bottleneck.

I can imagine.

 The reason I like using pygame for these things is because a) it's relatively 
 raw and fast b) games are another often /naturally/ concurrent system. Also 
 it normally allows other senses beyond reading numbers/graphs to kick in when 
 evaluating changes that looks better/worse, There's something wrong 

Indeed.  I should get my fingers into PyGame, but haven't yet due to
other responsibilities.

  I have two recent posts about the performance and features of a (hacked
  together) tuple space system 

Re: [Python-Dev] Sandboxed Threads in Python

2005-10-07 Thread Phillip J. Eby
At 07:17 PM 10/7/2005 -0600, Adam Olsen wrote:
On 10/7/05, Phillip J. Eby [EMAIL PROTECTED] wrote:
  At 06:12 PM 10/7/2005 -0600, Adam Olsen wrote:
  Turns out it's quite easy and it doesn't harm performance of existing
  code or require modification (but a recompile is necessary).  The idea
  is to only use a cyclic garbage collector for cleaning them up,
  Um, no, actually.  You need a mark-and-sweep GC or something of that
  ilk.  Python's GC only works with objects that *have refcounts*, and it
  works by clearing objects that are in cycles.  The clearing causes
  DECREF-ing, which then causes objects to be freed.  If you have objects
  without refcounts, they would be immortal and utterly unrecoverable.

Perhaps I wasn't clear enough, I was assuming appropriate changes to
the GC would be done.  The important thing is it can be done without
changing the interface that the existing modules use.

No, it can't.  See more below.

  That's all it takes.  Modify Py_INCREF and Py_DECREFs to check for a
  magic constant.  Ahh, but the performance?  See for yourself.
  First, you need to implement a garbage collection scheme that can deal with
  not having refcounts.  Otherwise you're not comparing apples to apples
  here, and your programs will leak like crazy.
  Note that implementing a root-based GC for Python is non-trivial, since
  extension modules can store pointers to PyObjects anywhere they
  like.  Further, many Python objects don't even support being tracked by the
  current cycle collector.
  So, changing this would probably require a lot of C extensions to be
  rewritten to support the needed API changes for the new garbage collection

They only need to be rewritten if you want them to provide an
immutable type that can be transferred between sandboxes.

No.  You're missing my point.  If they are able to *reference* these 
objects, then the garbage collector has to know about it, or else it can't 
know when to reclaim them.  Ergo, these objects will leak, or else 
extensions will crash when they refer to the deallocated memory.

In other words, you can't handwave the whole problem away by assuming a 
garbage collector.  The garbage collector has to actually be able to work, 
and you haven't specified *how* it can work without changing the C API.

I was aware that weakrefs needed some special handling (I just forgot
to mention it), but I didn't know it was used by subclassing.
Unfortunately I don't know what purpose it serves so I can't
contemplate how to deal with it.

It allows changes to a supertype's C-level slots to propagate to subclasses.
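
The visible effect can be sketched in pure Python. My understanding is that CPython walks a type's subclass list to update slots when a special method is assigned, but the example below only demonstrates the observable behaviour; the class and function names are hypothetical.

```python
class Base:
    pass


class Child(Base):
    pass


def add_len_to_base():
    """Give Base a __len__ after Child already exists, then use it via Child."""
    Base.__len__ = lambda self: 3
    # len() goes through the type's length slot, which Child now sees as filled.
    return len(Child())
```

Before the assignment, `len(Child())` raises TypeError; afterwards it works without Child being touched directly.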


Re: [Python-Dev] Sourceforge CVS access

2005-10-07 Thread Nick Coghlan
Guido van Rossum wrote:
 I will, if you tell me your sourceforge username.

Sorry, forgot about that little detail ;)

Anyway, it's ncoghlan, same as the gmail account.


Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia