Re: [Python-Dev] Active Objects in Python

2005-10-01 Thread Michael Sparks
On Friday 30 September 2005 22:13, Michael Sparks (home address) wrote:
> I wrote a white paper based on my Python UK talk, which is here:
>     * http://www.bbc.co.uk/rd/pubs/whp/whp11.shtml

Oops that URL isn't right. It should be:
   * http://www.bbc.co.uk/rd/pubs/whp/whp113.shtml

Sorry! (Thanks to LD 'Gus' Landis for pointing that out!)

Regards,


Michael.
--
"Though we are not now that which in days of old moved heaven and earth, 
   that which we are, we are: one equal temper of heroic hearts made 
 weak by time and fate but strong in will to strive, to seek, 
  to find and not to yield" -- "Ulysses", Tennyson


[Python-Dev] Tests and unicode

2005-10-01 Thread Reinhold Birkenfeld
Hi,

I looked into whether I could make the test suite pass again
when Python is compiled with --disable-unicode.

One problem is that no Unicode escapes can be used since compiling
the file raises ValueErrors for them. Such strings would have to
be produced using unichr().
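
For what it's worth, here is the kind of substitution I mean (a minimal
sketch; the EURO name and the NameError guard are just illustrative):

    # A literal Unicode escape fails to compile on a --disable-unicode build:
    #     EURO = u"\u20ac"
    # Building the string at runtime keeps the module importable:
    try:
        EURO = unichr(0x20AC)    # unicode-enabled builds
    except NameError:
        EURO = None              # --disable-unicode: skip unicode-only tests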

Is this the right way? Or is disabling Unicode not supported any more?

Reinhold

-- 
Mail address is perfectly valid!



Re: [Python-Dev] Pythonic concurrency - cooperative MT

2005-10-01 Thread Martin Blais
Hi.

I hear a confusion that is annoying me a bit in some of the
discussions on concurrency, and I thought I'd flush out my thoughts
here to help me clarify some of that stuff, because some people
on the list appear to discuss generators as a concurrency scheme,
and as far as I know (and please correct me if I'm wrong) they
really are not addressing that at all (full explanation below).
Before I go on, I must say that I am not in any way an authority
on concurrent programming, I'm just a guy who happens to have
done a fair amount of threaded programming, so if any of the
smart people on the list notice something completely stupid and
off the mark that I might be saying here, please feel free to
bang on it with the thousand-pound hammer of your hacker-fu and
put me to shame (I love to learn).

As far as I understand, generators are just a convenient way to
program apparently "independent" control flows (which are not the
same as "concurrent" control flows) in a constrained, structured
way, a way that is more powerful than what is allowed by using a
stack.  By giving up using the stack concept as a fast way to
allocate local function variables, it becomes possible to exit
and enter chunks of code multiple times, at specific points,
within an automatically restored local context (i.e. the local
variables, stored on the heap).  Generators make it more
convenient to do just that:  enter and re-enter some code that is
expressed as if it would be running in a single execution
flow (with explicit points of exit/re-entry, "yields").
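
To make that concrete, here is a tiny illustrative example of what I
mean by exiting and re-entering the same code with its locals restored:

    def chatty():
        x = 1                  # local state lives on the generator object
        print "first entry, x =", x
        yield x                # explicit exit point 1
        x += 1
        print "re-entered, x restored and now", x
        yield x                # explicit exit point 2

    g = chatty()
    g.next()    # runs up to the first yield
    g.next()    # resumes right after it, with x intact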

The full monty version of that, is what you get when you write
assembly code (*memories of adolescent assembly programming on
the C=64 abound here now*): you can JMP anywhere anytime, and a
chunk of code (a function) can be reentered anywhere anytime as
well, maybe even reentered somewhere else than where it left off.
The price to pay for this is one of complexity: in assembly you
have to manage restoring the local context yourself (i.e. in
assembly code this just means restoring the values of some
registers which are assumed set and used by the code, like the
local variables of a function), and there is no clear grouping of
the local scope that is saved.  Generators give you that for
free: they automatically organize all that local context as
belonging to the generator object, and it expresses clear points
of exit/re-entry with the yield calls.  They are really just a
fancy goto, with some convenient assumptions about how control
should flow.  This happens to be good enough for simplifying a
whole class of problems and I suppose the Python and Ruby
communities are all learning to love them and use them more and
more.

(I think the more fundamental consequence of generators is to
raise questions about the definition of what "a function" is:  if
I have a single chunk of code in which different parts use two
disjoint sets of variables, and it can be entered via a few
entry/exit points, is it really one or two or
multiple "functions"?  What if different parts share some of the
local scope only?  Where does the function begin and end?  And
more importantly, is there a more complex yet still manageable
abstraction that would allow even more flexible control flow than
generators allow, straddling the boundaries of what a function
is?)

You could easily implement something very similar to generators
by encapsulating the local scope explicitly in the form of a
class, with instance attributes, and having a normal
method "step()" that would be careful about saving state in the
object's attributes every time it returns and restoring state from
those attributes every time it gets called.  This is what
iterators do.  Whenever you want to "schedule" your object to be
running, you call the step() method.  So just in that sense
generators really aren't all that exciting or "new".  The main
problem that generators solve is that they make this save/restore
mechanism automatic, thus allowing you to write a single flow of
execution as a normal function with explicit exit points (yield).
It's much nicer having that in the language than having to write
code that can be restored (especially when you have to write a
loop with complex conditions/flow which must run and return only
one iteration every time it becomes runnable).
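
To illustrate (this is just a sketch, not anyone's real code), here are
the two forms side by side:

    # Hand-rolled resumable code: state is saved in instance attributes
    # on every return and restored on the next call.
    class Counter(object):
        def __init__(self, n):
            self.i = 0
            self.n = n
        def step(self):
            if self.i >= self.n:
                raise StopIteration
            value = self.i
            self.i += 1        # remember where we got to for the next call
            return value

    # The generator form: the save/restore is automatic.
    def counter(n):
        i = 0
        while i < n:
            yield i
            i += 1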

Therefore, as far as I understand it, generators themselves DO
NOT implement any form of concurrency.

I feel that where generators and concurrency come together is
often met with confusion in the discussions I see about them, but
maybe that's just me.  I see two aspects that allow generators to
participate in the elaboration of a concurrency scheme:

1. The convenience of expression of a single execution flow (with
   explicit interruption points) makes it easy to implement
   pseudo-concurrency IF AND ONLY IF you consider a generator as
   an independent unit of control flow (i.e. a task).  Whether
   those generators can run asynchronously is yet undefined and
   depends on who calls them

Re: [Python-Dev] Pythonic concurrency - cooperative MT

2005-10-01 Thread Antoine

Hi Martin,

[snip]

The "confusion" stems from the fact that two issues are mixed up in this
discussion thread:
- improving concurrency schemes to make it easier to write well-behaving
applications with independent parallel flows
- improving concurrency schemes to improve performance when there are
several hardware threads available

The respective solutions to these problems do not necessarily go hand in
hand.

> To implement that explicitly, you would need an
> asynchronous version of all the functions that may block on
> resources (e.g. file open, socket write, etc.), in order to be
> able to insert a yield statement at that point, after the async
> call, and there should be a way for the scheduler to check if the
> resource is "ready" to be able to put your generator back in the
> runnable queue.

You can also use a helper thread and signal the scheduling loop when some
action in the helper thread has finished. It is an elegant solution
because it helps you keep a small generic scheduling loop instead of
putting select()-like calls in it.
(this is how I've implemented timers in my little cooperative
multi-threading system, for example)
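
A rough sketch of that idea (the names and the two-second delay are made
up for illustration): the helper thread does the blocking and drops a
message on a Queue that the scheduling loop drains on each pass.

    import threading, Queue, time

    wakeups = Queue.Queue()        # the scheduling loop drains this

    def timer(delay, task_id):
        time.sleep(delay)          # the helper thread does the waiting
        wakeups.put(task_id)       # signal: this task is runnable again

    threading.Thread(target=timer, args=(2.0, "some-task")).start()

    # In the generic scheduling loop, once per pass:
    #     while not wakeups.empty():
    #         runnable.append(wakeups.get())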

> (A question comes to mind here: Twisted must be doing something
> like this with their "deferred objects", no?  I figure they would
> need to do something like this too.  I will have to check.)

A Deferred object is just the abstraction of a callback - or, rather, two
callbacks: one for success and one for failure. Twisted is architected
around an event loop, which calls your code back when a registered event
happens (for example when an operation is finished, or when some data
arrives on the wire). Compared to generators, it is a different way of
expressing cooperative multi-threading.
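
In its simplest form it looks something like this (a bare sketch, not
real application code):

    from twisted.internet import defer

    def on_success(result):
        print "got:", result

    def on_failure(failure):
        print "failed:", failure

    d = defer.Deferred()
    d.addCallbacks(on_success, on_failure)   # one callback per outcome
    d.callback("some data")                  # fires the success chain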

Regards

Antoine.




Re: [Python-Dev] Pythonic concurrency - cooperative MT

2005-10-01 Thread Michael Sparks
On Saturday 01 October 2005 22:50, Martin Blais wrote:
...
> because some people on the list appear to discuss generators as
> a concurrency scheme, and as far as I know they really are not
> addressing that at all.

Our project started in the context of a naturally concurrent
environment. Specifically, the task is that of dealing with large
numbers of concurrent connections to a server. As a result, when I've
mentioned concurrency it's been from that viewpoint.

In the past I have worked with systems essentially structured in a
similar way to Twisted for this kind of problem, but decided against that
style for our current project. (Note: some people misunderstand my opinions
here due to a badly phrased lightning talk ~16 months ago at EuroPython
2004 - I think Twisted is very much best of breed in Python for what
it does. I just think there //might// be a nicer way. (might :) )

Since I now work in an R&D dept, I wondered what would happen if,
instead of the basic approach that underlies systems like Twisted,
you took a much more CSP-like approach to building such systems,
but using generators rather than threads or explicit state
machines.

A specific goal was to try to make code simpler for people to work with,
with simpler maintenance as the main intended by-product.
I hadn't heard of anyone trying this approach then and hypothesised it
*might* achieve that goal.

As a result, from day 1 it became clear that where an event-based
system would normally use a reactor/proactor based approach, you
instead use a scheduler that repeatedly calls a next() method on the
objects given to it to schedule.
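
A deliberately tiny sketch of that shape of scheduler (an illustration
only, not the actual Axon code):

    def scheduler(tasks):
        tasks = list(tasks)
        while tasks:
            for task in tasks[:]:
                try:
                    task.next()        # run the generator to its next yield
                except StopIteration:
                    tasks.remove(task) # finished; drop it from the run queue

    def printer(tag, count):
        for i in range(count):
            print tag, i
            yield 1                    # hand control back to the scheduler

    scheduler([printer("A", 3), printer("B", 2)])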

In terms of concurrency that is clearly a co-operative multitasking system
in the same way as a simplistic event-based system is. (Both get more
complex in reality when you can't avoid blocking, forcing the use of
threads for some tasks.)

So when you say this:
> explicitly frame the discussion in the context of a single
> process cooperative scheme that runs on a single processor (at
> any one time).  

This is spot on. However, any generator can be farmed off and run in
a thread, and any communication you do with the generator can then be
wrapped via Queues, forming a controlled bridge between the threads.
Similarly, we're currently looking at using non-blocking pipes and
pickling to communicate with generators running in a forked environment.
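
Roughly along these lines (a sketch only, not our implementation):

    import threading, Queue

    def run_in_thread(gen):
        outbox = Queue.Queue()
        def pump():
            for item in gen:           # drive the generator in its own thread
                outbox.put(item)
            outbox.put(StopIteration)  # sentinel: the generator has finished
        threading.Thread(target=pump).start()
        return outbox                  # the bridge back to the caller

    results = run_in_thread(iter([1, 2, 3]))
    print results.get(), results.get(), results.get()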

As a result if you write your code as generators it can migrate to a
threaded or process based environment, and scale across multiple
processes (and hence processors) if tools to perform this migration
are put in place. We're a little way off doing that, but this looks
to be highly reasonable.

> As far as I understand, generators are just a convenient way to

They give you code objects that can do a return and continue later.
This isn't really the same as the ability to just do a goto into
random points in a function. You can only go back to the point the
generator yielded at (unless someone has a perverse trick :-).

> You could easily implement something very similar to generators
> by encapsulating the local scope explicitly in the form of a
> class, with instance attributes, and having an normal
> method "step()" that would be careful about saving state in the
> object's attributes everytime it returns and restoring state from
> those attributes everytime it gets called. 

For a more explicit version of this we have a (deliberately naive) C++
version of generators & our core concurrency system. Mechanism is here:
http://tinyurl.com/7gaol , example use here: http://tinyurl.com/bgwro
That does precisely that. (except we use a next() method there :-)

> Therefore, as far as I understand it, generators themselves DO
> NOT implement any form of concurrency.

By themselves, they don't. They can be used to deal with concurrency though.

> 2. In a more advanced framework/language, perhaps some generators
>could be considered to always be possible to run
>asynchronously, ruled by a system of true concurrency with
>some kind of scheduling algorithm that oversees that process.
>Whether this has been implemented by many is still a mystery
>to me, 

This is what we do. The tutorial we've given to trainees (one of whom
had very little experience of even programming) enabled them to pick up
our approach quickly. It requires them to implement a mini-version of
the framework, which might actually aid the discussion here since it
very clearly shows the core of our system. (NB: it is, however, a
simplified version.) I previously posted a link to it, which is here:
http://kamaelia.sourceforge.net/MiniAxon/

>but I can see how a low-level library that provides
>asynchronously running execution vehicles for each CPU could
>be used to manage and run a pool of shared generator objects
>in a way that is better (for a specific application) than the
>relatively un

[Python-Dev] Why does __getitem__ slot of builtin call sequence methods first?

2005-10-01 Thread Travis Oliphant

The new ndarray object of scipy core (successor to Numeric Python) is a 
C extension type that has a getitem defined in both the as_mapping and 
the as_sequence structure. 

The as_sequence version is just so PySequence_GetItem will work correctly.

As exposed to Python, the ndarray object has a .__getitem__ wrapper method.

Why does this wrapper call the sequence getitem instead of the mapping 
getitem method?

Is there any way to get at a mapping-style __getitem__ method from Python?

This looks like a bug to me (which is why I'm posting here...)

Thanks for any help or insight.

-Travis Oliphant





Re: [Python-Dev] Why does __getitem__ slot of builtin call sequence methods first?

2005-10-01 Thread Guido van Rossum
On 10/1/05, Travis Oliphant <[EMAIL PROTECTED]> wrote:
>
> The new ndarray object of scipy core (successor to Numeric Python) is a
> C extension type that has a getitem defined in both the as_mapping and
> the as_sequence structure.
>
> The as_sequence mapping is just so PySequence_GetItem will work correctly.
>
> As exposed to Python the ndarray object has a .__getitem__  wrapper method.
>
> Why does this wrapper call the sequence getitem instead of the mapping
> getitem method?
>
> Is there anyway to get at a mapping-style __getitem__ method from Python?

Hmm... I'm sure the answer is in typeobject.c, but that is one of the
more obfuscated parts of Python's guts. I wrote it four years ago and
since then I've apparently lost enough brain cells (or migrated them
from language implementation to language design service :) that I
don't understand it inside out any more like I did while I was in the
midst of it.

However, I wonder if the logic isn't such that if you define both
sq_item and mp_subscript, __getitem__ calls sq_item; I wonder if by
removing sq_item it might call mp_subscript? Worth a try, anyway.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Pythonic concurrency - cooperative MT

2005-10-01 Thread Christopher Armstrong
On 10/2/05, Martin Blais <[EMAIL PROTECTED]> wrote:
> One of the problems that you have with using generators like
> this, is that automatic "yield" on resource access does not occur
> automatically, like it does in threading.  With threads, the
> kernel is invoked when access to a low-level resource is
> requested, and may decide to put your process in the wait queue
> when it judges necessary.  I don't know how you would do that
> with generators.  To implement that explicitly, you would need an
> asynchronous version of all the functions that may block on
> resources (e.g. file open, socket write, etc.), in order to be
> able to insert a yield statement at that point, after the async
> call, and there should be a way for the scheduler to check if the
> resource is "ready" to be able to put your generator back in the
> runnable queue.
>
> (A question comes to mind here: Twisted must be doing something
> like this with their "deferred objects", no?  I figure they would
> need to do something like this too.  I will have to check.)

As I mentioned in the predecessor of this thread (I think), I've
written a thing called "Defgen" or "Deferred Generators" which allows
you to write a generator to yield control when waiting for a Deferred
to fire. So this is basically "yield or resource access". In the
Twisted universe, every asynchronous resource-retrieval is done by
returning a Deferred and later firing that Deferred. Generally, you
add callbacks to get the value, but if you use defgen you can say
stuff like (in Python 2.5 syntax)
try:
    x = yield getPage('http://python.org/')
except PageNotFound:
    print "Where did Python go!"
else:
    assert "object-oriented" in x

Many in the Twisted community get itchy about over-use of defgen,
since it makes it easier to assume too much consistency in state, but
it's still light-years beyond pre-emptive shared-memory threading when
it comes to that.

--
  Twisted   |  Christopher Armstrong: International Man of Twistery
   Radix|-- http://radix.twistedmatrix.com
|  Release Manager, Twisted Project
  \\\V///   |-- http://twistedmatrix.com
   |o O||
wvw-+


Re: [Python-Dev] Why does __getitem__ slot of builtin call sequence methods first?

2005-10-01 Thread Travis Oliphant
Guido van Rossum wrote:

>On 10/1/05, Travis Oliphant <[EMAIL PROTECTED]> wrote:
>  
>
>>The new ndarray object of scipy core (successor to Numeric Python) is a
>>C extension type that has a getitem defined in both the as_mapping and
>>the as_sequence structure.
>>
>>The as_sequence mapping is just so PySequence_GetItem will work correctly.
>>
>>As exposed to Python the ndarray object has a .__getitem__  wrapper method.
>>
>>Why does this wrapper call the sequence getitem instead of the mapping
>>getitem method?
>>
>>Is there anyway to get at a mapping-style __getitem__ method from Python?
>>
>>
>
>Hmm... I'm sure the answer is in typeobject.c, but that is one of the
>more obfuscated parts of Python's guts. I wrote it four years ago and
>since then I've apparently lost enough brain cells (or migrated them
>from language implementation to language design service :) that I
>don't understand it inside out any more like I did while I was in the
>midst of it.
>
>However, I wonder if the logic isn't such that if you define both
>sq_item and mp_subscript, __getitem__ calls sq_item; I wonder if by
>removing sq_item it might call mp_subscript? Worth a try, anyway.
>
>  
>

Thanks for the tip.  I think I figured out the problem: it was my
misunderstanding of how types inherit in C that was the source of it.

Basically, Python is doing what you would expect: mp_subscript is used
for __getitem__ if both mp_subscript and sq_item are present.  However,
the addition of these descriptors (and therefore the resolution of any
competition for __getitem__ calls) is done *before* the inheritance of
any slots takes place.

The new ndarray object inherits from a "big" array object that doesn't
define the sequence and buffer protocols (which have the size-limiting
int dependency in their interfaces).  The ndarray object has standard
tp_as_sequence and tp_as_buffer slots filled.

Figuring the array object would inherit its tp_as_mapping protocol from
"big" array (which it does just fine), I did not explicitly set that
slot in its Type object.  Thus, when PyType_Ready was called on the
ndarray object, the tp_as_mapping was NULL and so __getitem__ mapped to
the sequence-defined version.  Later the tp_as_mapping slots were
inherited, but too late for __getitem__ to be what I expected.

The easy fix was to initialize the tp_as_mapping slot before calling
PyType_Ready.  Hopefully, somebody else searching in the future for an
answer to this problem will find this discussion useful.

Thanks for your help,

-Travis






Re: [Python-Dev] Why does __getitem__ slot of builtin call sequence methods first?

2005-10-01 Thread Nick Coghlan
Guido van Rossum wrote:
> Hmm... I'm sure the answer is in typeobject.c, but that is one of the
> more obfuscated parts of Python's guts. I wrote it four years ago and
> since then I've apparently lost enough brain cells (or migrated them
> from language implementation to language design service :) that I
> don't understand it inside out any more like I did while I was in the
> midst of it.
> 
> However, I wonder if the logic isn't such that if you define both
> sq_item and mp_subscript, __getitem__ calls sq_item; I wonder if by
> removing sq_item it might call mp_subscript? Worth a try, anyway.

As near as I can tell, the C API documentation is silent on how slots are
populated when multiple methods mapping to the same slot are defined by a C
object, but this is a quote from the comment describing add_operators() in
typeobject.c:
>   In the latter case, the first slotdef entry encountered wins.  Since
>   slotdef entries are sorted by the offset of the slot in the
>   PyHeapTypeObject, this gives us some control over disambiguating
>   between competing slots: the members of PyHeapTypeObject are listed
>   from most general to least general, so the most general slot is
>   preferred.  In particular, because as_mapping comes before as_sequence,
>   for a type that defines both mp_subscript and sq_item, mp_subscript
>   wins.

Further, in PyObject_GetItem (in abstract.c), tp_as_mapping->mp_subscript is
checked first, with tp_as_sequence->sq_item only being checked if mp_subscript
isn't found. Importantly, this is the function invoked by the BINARY_SUBSCR
opcode.

So, the *intent* certainly appears to be that mp_subscript should be preferred 
both by the C abstract object API and from normal Python code.

*However*, the precedence applied by add_operators() is governed by the 
slotdefs structure in typeobject.c, which, according to the above comment, is 
meant to match the order the slots appear in memory in the _typeobject 
structure in object.h, and favour the mapping methods over the sequence methods.

There are actually two serious problems with the description in this comment:

Firstly, the two orders don't actually match. In the object layout, the 
ordering of the abstract object methods is as follows:
PyNumberMethods *tp_as_number;
PySequenceMethods *tp_as_sequence;
PyMappingMethods *tp_as_mapping;

But in the slotdefs table, the PySequence and PyMapping slots are listed 
first, followed by the PyNumber methods.

Secondly, in both the object layout and the slotdefs table, the PySequence 
methods appear *before* the PyMapping methods, which means that 
tp_as_sequence->sq_item appears as "__getitem__" even though a subscript 
operation will actually invoke "tp_as_mapping->mp_subscript".

In short, I think Travis is right in calling this behaviour a bug. There's a 
similar problem with the methods that exist in both tp_as_number and 
tp_as_sequence - the abstract C API and the Python interpreter will favour the 
tp_as_number methods, but the slot definitions will favour tp_as_sequence.

The fix is actually fairly simple: reorder the slotdefs table so that the 
sequence of slots is "Number, Mapping, Sequence" rather than adhering strictly 
to the sequence of methods given in the definition of _typeobject.

The only objects affected by this change would be C extension objects which 
define two C-level methods which map to the same Python-level slot name. The 
observed behavioural change is that the methods accessible via the 
Python-level slot names would change (either from the Sequence method to the 
Mapping method, or from the Sequence method to the Number method).

Given that the only documentation I can find of the behaviour in that scenario 
is a comment in typeobject.c, that the implementation doesn't currently match 
the comment, and that the current implementation means that the methods 
accessed via the slot names don't match the methods normal Python syntax 
actually invokes, I find it hard to see how fixing it could cause any 
significant problems.

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://boredomandlaziness.blogspot.com


Re: [Python-Dev] Why does __getitem__ slot of builtin call sequence methods first?

2005-10-01 Thread Nick Coghlan
Nick Coghlan wrote:
[A load of baloney]

Scratch everything I said in my last message - init_slotdefs() sorts the 
slotdefs table correctly, so that the order it is written in the source is 
irrelevant.

Travis found the real answer to his problem.

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://boredomandlaziness.blogspot.com


Re: [Python-Dev] IDLE development

2005-10-01 Thread Kurt B. Kaiser
Noam Raphael <[EMAIL PROTECTED]> writes:

> More than a year and a half ago, I posted a big patch to IDLE which
> adds support for completion and much better calltips, along with some
> other improvements. 

I have responded on idle-dev.

-- 
KBK