On 05/01/2012 08:35 AM Armin Rigo wrote:
Hi Holger,

On Tue, May 1, 2012 at 16:48, holger krekel <hol...@merlinux.eu> wrote:
Maybe "atomic" could become a __pypy__ builtin and there could be a "ame" or
so package which atomic-using apps could depend on? In any case,
I really like the twist of "To remedy the GIL use AME" :)

Yes, indeed, a private name in the __pypy__ module looks fine.  The
applications are supposed to use the "ame" module or package (or
whatever name makes sense, but I'm getting convinced that
"transaction" is not a good one).  The "ame" module re-exports
__pypy__._atomic as ame.atomic for general use, but also offers more
stuff like the Runner class with add()/run() methods.

Also, again, it doesn't necessarily make sense to force a lexically
nested usage of ame.atomic, so we could replace __pypy__._atomic with
two primitives __pypy__._atomic_start and _atomic_stop, re-exported in
the "ame" module, and write the "ame.atomic" context manager in pure
Python.
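For concreteness, such a pure-Python "ame.atomic" context manager could be
only a few lines. In this sketch the proposed __pypy__._atomic_start /
_atomic_stop primitives (which don't exist under these names yet) are stubbed
out as no-ops, so it runs on any Python:

```python
from contextlib import contextmanager

def _atomic_start():
    # stand-in for the proposed __pypy__._atomic_start
    pass

def _atomic_stop():
    # stand-in for the proposed __pypy__._atomic_stop
    pass

@contextmanager
def atomic():
    """Run the enclosed block as a single atomic transaction."""
    _atomic_start()
    try:
        yield
    finally:
        _atomic_stop()   # stop even if the block raised

# usage
with atomic():
    counter = 1 + 1
```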

I am wondering how this all applies to the execnet-execution model, btw.
(http://codespeak.net/execnet for those who wonder what i mean)
remote_exec()s on the same gateway currently run in different threads,
and thus only send/receive needs to use "with atomic", right?

In my proposal, existing applications run fine, using multiple cores
if they are based on multiple threads.  You use "with atomic" to have
an additional degree of synchronization when you don't want to worry
about locks & friends (which should be *always*, but is still an
optional benefit in this model).  Maybe you're just talking about
simplifying the implementation of execnet's channels to use "with
atomic" instead of acquiring and releasing locks.  Then yes, could be,
as long as you remember that "with atomic" gives neither more nor less
than its naive implementation: "don't release the GIL there".
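To make the channel idea concrete: send/receive guarded by "with atomic"
instead of an explicit per-channel lock. Since the real atomic primitive is
PyPy-specific, this sketch substitutes a plain RLock, which gives the same
"don't run anything else in between" effect:

```python
import threading
from collections import deque

atomic = threading.RLock()   # stand-in for ame.atomic on CPython

class Channel:
    """A toy channel in the spirit of execnet's, minus the I/O."""
    def __init__(self):
        self.items = deque()

    def send(self, item):
        with atomic:         # no per-channel lock needed
            self.items.append(item)

    def receive(self):
        with atomic:         # raises IndexError if empty; a real
            return self.items.popleft()  # channel would block instead

c = Channel()
c.send('hello')
assert c.receive() == 'hello'
```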


I am looking at
_____________________________________________________________________

def add(f, *args, **kwds):
    """Register the call 'f(*args, **kwds)' as running a new
    transaction.  If we are currently running in a transaction too, the
    new transaction will only start after the end of the current
    transaction.  Note that if the same or another transaction raises an
    exception in the meantime, all pending transactions are cancelled.
    """
    r = random.random()
    assert r not in _pending    # very bad luck if it is
    _pending[r] = (f, args, kwds)
_____________________________________________________________________

from https://bitbucket.org/pypy/pypy/raw/stm-gc/lib_pypy/transaction.py

and wondering about atomicity guarantees in the evaluation of
*args and **kwds, and maybe even of f itself -- i.e., it seems there is
an opportunity to pass arguments effectively by value (once compiled),
by reference, or by name, and arguments can be complex composites
or simple constants. To prepare them, they will be evaluated according
to their expressions, either at call time, or maybe partly at def time,
construction time, method-binding time, or some combination.
(When/how might multiple processors cooperate to evaluate arguments for
passing into the transaction context? Never?)

So how does one think about the state of arguments/values being
accessed by f when it runs in its transaction context?

I.e., if some arguments need to be version-synced, are there new ways
to program that? How would you put the evaluation of function arguments
into the inside of a transaction, e.g. if the arguments derive from stateful
stuff that is updated as a side effect? Wrap it with an outer transaction?

From the definition of transaction.run it appears that f must have
its effect as a global side effect, identified either through its
arguments or built into its code (and presumably one could pass a bound
method in place of f for an atomic update of instance attributes?).

Seems like it might be nice to be able to pass a function and get back
a list of function results? What should the common convention for
accumulating results be, with run as it is now? Should one pass f an
additional queue argument to append to? Or an index selecting a slot in
a global list, if the ordering is predetermined?

BTW, is side-by-side parallelism the only concern in the current attempt
to run programs on many cores? What about pipelining-type parallelism,
like nested generators with inner loops feeding outer ones running on
different processors?

I've played with the idea of generators in the form of classes that can
be glued with '|' so they become logically one generator (I've got a toy
that will do this):
    for v in G(foo, seq)|G(bar)|G(baz): print v  # pipe sequence items through foo, then bar, then baz
having the effect of
    for v in (baz(z) for z in (bar(r) for r in (foo(o) for o in seq))): print v
or, I suppose,
    for o in seq: print baz(bar(foo(o)))
but that's factored differently.
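(The actual toy isn't shown here; one guess at a minimal implementation of
such '|'-gluable generator classes:)

```python
class G:
    """A pipeline stage: applies func to each item of its source."""
    def __init__(self, func, source=None):
        self.func = func
        self.source = source

    def __or__(self, other):
        other.source = self       # feed our output into the next stage
        return other

    def __iter__(self):
        for item in self.source:
            yield self.func(item)

foo = lambda o: o + 1
bar = lambda r: r * 2
baz = lambda z: z - 3
seq = [1, 2, 3]

piped = list(G(foo, seq) | G(bar) | G(baz))
assert piped == [baz(bar(foo(o))) for o in seq]
```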

Wondering how the transaction stuff might play out re transaction
guarantees for generator states and serializing the feed from one
generator to the next. Does it even make sense, or does pipelining need
a different kind of support?

Re names: what about a mnemonic like pasifras == [p]arallel [as] [if]
[ra]ndomly [s]erial? Thence pasifras.add?

Regards,
Bengt Richter

_______________________________________________
pypy-dev mailing list
pypy-dev@python.org
http://mail.python.org/mailman/listinfo/pypy-dev