Re: [Python-ideas] New PEP 550: Execution Context

2017-08-15 Thread Yury Selivanov
Hi Nick,

Thanks for writing this!  You reminded me that it's crucial to have an
ability to fully recreate generator behaviour in an iterator. Besides
this being a requirement for a complete EC model, it is something that
compilers like Cython absolutely need.

I'm still working on a rewrite (which is now a completely different
PEP), will probably finish it today.

Yury
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-15 Thread Nick Coghlan
On 15 August 2017 at 05:25, Yury Selivanov  wrote:
> Nick, you nailed it with your example.
>
> In short: the current PEP 550 defines Execution Context in such a way
> that generators and iterators will interact differently with it. That
> means that it won't be possible to refactor an iterator class to a
> generator and that's not acceptable.
>
> I'll be rewriting the whole specification section of the PEP today.

Trying to summarise something I thought of this morning regarding
ec_back and implicitly isolating iterator contexts:

With the notion of generators running with their own private context
by default, that means the state needed to call __next__ on the
generator is as follows:

- current thread EC
- generator's private EC (stored on the generator)
- the generator's __next__ method

This means that if the EC manipulation were to live in the next()
builtin rather than in the individual __next__() methods, then this
can be made a general context isolation protocol:

- provide a `sys.create_execution_context()` interface
- set `__private_context__` on your iterable if you want `next()` to
use `ec.run()` (and update __private_context__  afterwards)
- set `__private_context__ = None` if you want `next()` to just call
`obj.__next__()` directly
- generators have __private_context__ set by default, but wrappers
like contextlib.contextmanager can clear it

That would also suggest that ec.run() will need to return a 2-tuple:

def run(self, f: Callable, *args, **kwds) -> Tuple[Any, ExecutionContext]:
    """Run the given function in this execution context.

    Returns a 2-tuple containing the function result and the
    execution context that was active when the function returned.
    """

That way next(itr) will be able to update itr.__private_context__
appropriately if it was initially set and the call changes the active
context.
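A rough sketch of how such a context-aware `next()` builtin could behave under this protocol. Everything here is hypothetical: `__private_context__` and a `run()` returning a `(result, context)` pair are the proposals in this message, and `FakeContext` is a toy stand-in for the proposed `sys.ExecutionContext`:

```python
# Toy stand-in for the proposed sys.ExecutionContext; run() returns the
# 2-tuple (result, context-after-call) suggested above.
class FakeContext:
    def __init__(self, values=None):
        self.values = dict(values or {})

    def run(self, f, *args, **kwds):
        # Run f "inside" this context and return the context that was
        # active when f returned (here: a fresh copy, for illustration).
        result = f(*args, **kwds)
        return result, FakeContext(self.values)

def context_aware_next(itr):
    # Sketch of a next() that honours the __private_context__ protocol.
    ctx = getattr(itr, '__private_context__', None)
    if ctx is None:
        # No private context: plain call, shares the caller's context.
        return itr.__next__()
    result, new_ctx = ctx.run(itr.__next__)
    itr.__private_context__ = new_ctx  # keep the updated private context
    return result
```

With `__private_context__ = None`, the call degrades to a plain `obj.__next__()`, which is what wrappers like `contextlib.contextmanager` would rely on.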

We could then give send(), throw() and their asynchronous counterparts
the builtin+protocol method treatment, and put the EC manipulation in
their builtins as well.

Anyway, potentially a useful option to consider as you work on
revising the proposal - I'll refrain from further comments until you
have an updated draft available :)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-14 Thread Yury Selivanov
Nick, you nailed it with your example.

In short: the current PEP 550 defines Execution Context in such a way
that generators and iterators will interact differently with it. That
means that it won't be possible to refactor an iterator class to a
generator and that's not acceptable.

I'll be rewriting the whole specification section of the PEP today.

Yury


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-14 Thread Yury Selivanov
Hi Barry,

Yes, i18n is another use-case for execution context, and ec should be
a perfect fit for it.

Yury


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-14 Thread Barry Warsaw
Yury Selivanov wrote:

> This is a new PEP to implement Execution Contexts in Python.

It dawns on me that I might be able to use ECs to do a better job of
implementing flufl.i18n's translation contexts.  I think this is another
example of what the PEP's abstract describes as "Context managers like
decimal contexts, numpy.errstate, and warnings.catch_warnings;"

The _ object maintains a stack of the language codes being used, and you
can push a new code onto the stack (typically using `with` so they get
automatically popped when exiting).  The use case for this is
translating say a notification to multiple recipients in the same
request, one who speaks French, one who speaks German, and another that
speaks English.

The problem is that _ is usually a global in a typical application, so
in an async environment, if one request is translating to 'fr', another
might be translating to 'de', or even a deferred context (e.g. because
you want to mark a string but not translate it until some later use).

While I haven't used it in an async environment yet, the current
approach probably doesn't work very well, or at all.  I'd probably start
by recommending a separate _ object in each thread, but that's less
convenient to use in practice.  It seems like it would be better to
either attach an _ object to each EC, or to implement the stack of codes
in the EC and let the global _ access that stack.

It feels a lot like `let` in lisp, but without the implicit addition of
the contextual keys into the local namespace.  E.g. in a PEP 550 world,
you'd have to explicitly retrieve the key/values from the EC rather than
have them magically appear in the local namespace, the former of course
being the Pythonic way to do it.

Cheers,
-Barry



Re: [Python-ideas] New PEP 550: Execution Context

2017-08-14 Thread Yury Selivanov
On Mon, Aug 14, 2017 at 12:56 PM, Guido van Rossum  wrote:
> Could someone (perhaps in a new thread?) summarize the current proposal,
> with some examples of how typical use cases would look? This is an important
> topic but the discussion is way too voluminous for me to follow while I'm on
> vacation with my family, and the PEP spends too many words on motivation and
> not enough on crisply explaining how the proposed feature works (what state
> is stored where how it's accessed, and how it's manipulated behind the
> scenes).

I'm working on it. Will start a new thread today.

Yury


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-14 Thread Guido van Rossum
Could someone (perhaps in a new thread?) summarize the current proposal,
with some examples of how typical use cases would look? This is an
important topic but the discussion is way too voluminous for me to follow
while I'm on vacation with my family, and the PEP spends too many words on
motivation and not enough on crisply explaining how the proposed feature
works (what state is stored where how it's accessed, and how it's
manipulated behind the scenes).

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-14 Thread Nick Coghlan
On 14 August 2017 at 02:33, Yury Selivanov  wrote:
> On Sat, Aug 12, 2017 at 10:09 PM, Nick Coghlan  wrote:
>> That similarity makes me wonder whether the "isolated or not"
>> behaviour could be moved from the object being executed and directly
>> into the key/value pairs themselves based on whether or not the values
>> were mutable, as that's the way function calls work: if the argument
>> is immutable, the callee *can't* change it, while if it's mutable, the
>> callee can mutate it, but it still can't rebind it to refer to a
>> different object.
>
> I'm afraid that if we design EC context to behave differently for
> mutable/immutable values, it will be an even harder thing to
> understand to end users.

There's nothing to design, as storing a list (or other mutable object)
in an EC will necessarily be the same as storing one in a tuple: the
fact you acquired the reference via an immutable container will do
*nothing* to keep you from mutating the referenced object.

And for use cases like web requests, that's exactly the behaviour we
want - changing the active web request is an EC level operation, but
making changes to the state of the currently active request (e.g. in a
middleware processor) won't require anything special.

[I'm going to snip the rest of the post, as it sounds pretty
reasonable to me, and my questions about the interaction between
sys.set_execution_context() and ec_back go away if
sys.set_execution_context() doesn't exist as you're currently
proposing]

> (gi_isolated_execution_context flag is still here for contextmanager).

This hidden flag variable on the types managing suspendable frames is
still the piece of the proposal that strikes me as being the most
potentially problematic, as it at least doubles the number of flows of
control that need to be tested.

Essentially what we're aiming to model is:

1. Performing operations in a way that modifies the active execution context
2. Performing them in a way that saves & restores the execution context

For synchronous calls, this distinction is straightforward:

- plain calls may alter the active execution context via state mutation
- use ec.run() to save/restore the execution context around the operation

(The ec_back idea means we may also need an "ec.run()" variant that
sets ec_back appropriately before making the call - for example,
"ec.run()" could set ec_back, while a separate "ec.run_isolated()"
could skip setting it. Alternatively, full isolation could be the
default, and "ec.run_shared()" would set ec_back. If we go with the
latter option, then "ec_shared" might be a better attribute name than
"ec_back")

A function can be marked as always having its own private context
using a decorator like so:

def private_context(f):
    @functools.wraps(f)
    def wrapper(*args, **kwds):
        ec = sys.get_active_context()
        return ec.run(f, *args, **kwds)
    return wrapper

For next/send/throw and anext/asend/athrow, however, the proposal is
to bake the save/restore into the *target objects*, rather than having
to request it explicitly in the way those objects get called.

This means that unless we apply some implicit decorator magic to the
affected slot definitions, there's now going to be a major behavioural
difference between:

some_state = sys.new_context_item()

def local_state_changer(x):
    for i in range(x):
        some_state.set(i)
        yield i

class ParentStateChanger:
    def __init__(self, x):
        self._itr = iter(range(x))
    def __iter__(self):
        return self
    def __next__(self):
        x = next(self._itr)
        some_state.set(x)
        return x

The latter would need the equivalent of `@private_context` on the
`__next__` method definition to get the behaviour that generators
would have by default (and similarly for __anext__ and asynchronous
generators).

I haven't fully thought through the implications of this problem yet,
but some initial unordered thoughts:

- implicit method decorators are always suspicious, but skipping them
in this case feels like we'd be setting up developers of custom
iterators for really subtle context management bugs
- contextlib's own helper classes would be fine, since they define
__enter__ & __exit__, which wouldn't be affected by this
- for lru_cache, we rely on `__wrapped__` to get access to the
underlying function without caching applied. Might it make sense to do
something similar for these implicitly context-restoring methods? If
so, should we use a dedicated name so that additional wrapper layers
don't overwrite it?

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-13 Thread Yury Selivanov
On Sun, Aug 13, 2017 at 3:14 PM, Nathaniel Smith  wrote:
> On Sun, Aug 13, 2017 at 9:57 AM, Yury Selivanov  
> wrote:
>> 2. ContextItem.has(), ContextItem.get(), ContextItem.set(),
>> ContextItem.delete() -- pretty self-explanatory.
>
> It might make sense to simplify even further and declare that context
> items are initialized to None to start, and the only operations are
> set() and get(). And then get() can't fail, b/c there is no "value
> missing" state.

I like this idea! It aligns with what I wanted to do in PEP 550
initially, but without the awkwardness of "delete on None".  Will add
this to the PEP.

Yury


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-13 Thread Nathaniel Smith
On Sun, Aug 13, 2017 at 9:57 AM, Yury Selivanov  wrote:
> 2. ContextItem.has(), ContextItem.get(), ContextItem.set(),
> ContextItem.delete() -- pretty self-explanatory.

It might make sense to simplify even further and declare that context
items are initialized to None to start, and the only operations are
set() and get(). And then get() can't fail, b/c there is no "value
missing" state.
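A minimal sketch of the simplified API described above: items start out as None, only `get()` and `set()` exist, and `get()` can never fail. The `ContextItem` class below is a toy model that ignores the actual EC chaining and copy-on-write machinery:

```python
# Minimal model of the simplified context-item API: no has(), no
# delete(), and "missing" is indistinguishable from None by design.
class ContextItem:
    def __init__(self):
        self._value = None  # initialized to None: the "missing" state

    def get(self):
        return self._value  # cannot raise: there is no missing state

    def set(self, value):
        self._value = value

timeout = ContextItem()
assert timeout.get() is None   # never a KeyError, just the default
timeout.set(30)
assert timeout.get() == 30
```

"Deleting" a value collapses into `set(None)`, which is exactly the awkwardness-avoiding simplification being proposed here.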

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-13 Thread Yury Selivanov
I'll start a new thread soon to discuss whether we want this specific
semantics change (with some updates).

Yury


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-13 Thread Yury Selivanov
[replying to the list]

On Sun, Aug 13, 2017 at 6:14 AM, Nick Coghlan  wrote:
> On 13 August 2017 at 16:01, Yury Selivanov  wrote:
>> On Sat, Aug 12, 2017 at 10:56 PM, Nick Coghlan  wrote:
>> [..]
>>> As Nathaniel suggested, getting/setting/deleting individual items in
>>> the current context would be implemented as methods on the ContextItem
>>> objects, allowing the return value of "get_context_items" to be a
>>> plain dictionary, rather than a special type that directly supported
>>> updates to the underlying context.
>>
>> The current PEP 550 design returns a "snapshot" of the current EC with
>> sys.get_execution_context().
>>
>> I.e. if you do
>>
>> ec = sys.get_execution_context()
>> ec['a'] = 'b'
>>
>> # sys.get_execution_context_item('a') will return None
>>
>> You did get a snapshot and you modified it -- but your modifications
>> are not visible anywhere. You can run a function in that modified EC
>> with `ec.run(function)` and that function will see that new 'a' key,
>> but that's it. There's no "magical" updates to the underlying context.
>
> In that case, I think "get_execution_context()" is quite misleading as
> a name, and is going to be prone to exactly the confusion we currently
> have with the mapping returned by locals(), which is that regardless
> of whether writes to it affect the target namespace or not, it's going
> to be surprising in at least some situations.
>
> So despite being initially in favour of exposing a mapping-like API at
> the Python level, I'm now coming around to Armin Ronacher's point of
> view: the copy-on-write semantics for the active context are
> sufficiently different from any other mapping type in Python that we
> should just avoid the use of __setitem__ and __delitem__ as syntactic
> sugar entirely.

I agree. I'll be redesigning the PEP to use the following API (please
ignore the naming peculiarities, there are so many proposals at this
point that I'll just stick to something I have in my head):

1. sys.new_execution_context_key('description') -> sys.ContextItem (or
maybe we should just expose the sys.ContextItem type and let people
instantiate it?)

A key (or "token") to use with the execution context. Besides
eliminating the name collision issue, it'll also have slightly
better performance, because its __hash__ method will always return a
constant. (Strings cache their __hash__, but other types don't.)

2. ContextItem.has(), ContextItem.get(), ContextItem.set(),
ContextItem.delete() -- pretty self-explanatory.

3. sys.get_active_context() -> sys.ExecutionContext -- an immutable
object, has no methods to modify the context.

3a. sys.ExecutionContext.run(callable, *args) -- run a callable(*args)
in some execution context.

3b. sys.ExecutionContext.items() -- an iterator of ContextItem ->
value for introspection and debugging purposes.

4. No sys.set_execution_context() method.  At this point I'm not sure
it's a good idea to allow users to change the current execution
context to something else entirely.  For use cases like enabling
concurrent.futures to run your function within the current EC, you
just use the sys.get_active_context()/ExecutionContext.run
combination. If anything, we can add this function later.

> Instead, we'd lay out the essential primitive operations that *only*
> the interpreter can provide and define procedural interfaces for
> those, and if anyone wanted to build a higher level object-oriented
> interface on top of those primitives, they'd be free to do so, with
> the procedural API acting as the abstraction layer that decouples "how
> interpreters actually implement it" (e.g. copy-on-write mappings) from
> "how libraries and frameworks model it for their own use" (e.g. rich
> application context objects). That way, each interpreter would also be
> free to define their *internal* object model in whichever way made the
> most sense for them, rather than enshrining a point-in-time snapshot of
> CPython's preferred implementation model as part of the language
> definition.

I agree. I like that this idea gives us more flexibility with the
exact implementation strategy.

[..]
> The essential capabilities for active context manipulation would then be:
>
> - get_active_context_token()
> - set_active_context(context_token)

As I mentioned above, at this point I'm not entirely sure that we even
need "set_active_context".  The only useful thing for it that I can
imagine is creating a decorator that isolates any changes to the
context, but the only use case for that I see is unit tests.

But even for unit tests, a better solution is to use a decorator that
detects keys that were added but not deleted during the test (leaks).

> - implicitly saving and reverting the active context around various operations

Usually we need to save/revert one particular context item, not the
whole context.

> - accessing the active context id for suspended coroutines and
> generators (so parent contexts can opt-in to seeing changes made in
> child contexts)

Yes, t

Re: [Python-ideas] New PEP 550: Execution Context

2017-08-13 Thread Yury Selivanov
On Sat, Aug 12, 2017 at 10:09 PM, Nick Coghlan  wrote:
> On 13 August 2017 at 03:53, Yury Selivanov  wrote:
>> On Sat, Aug 12, 2017 at 1:09 PM, Nick Coghlan  wrote:
>>> Now that you raise this point, I think it means that generators need
>>> to retain their current context inheritance behaviour, simply for
>>> backwards compatibility purposes. This means that the case we need to
>>> enable is the one where the generator *doesn't* dynamically adjust its
>>> execution context to match that of the calling function.
>>
>> Nobody *intentionally* iterates a generator manually in different
>> decimal contexts (or any other contexts). This is an extremely error
>> prone thing to do, because one refactoring of generator -- rearranging
>> yields -- would wreck your custom iteration/context logic. I don't
>> think that any real code relies on this, and I don't think that we are
>> breaking backwards compatibility here in any way. How many users even
>> know about this?
>
> I think this is a reasonable stance for the PEP to take, but the
> hidden execution state around the "isolated or not" behaviour still
> bothers me.
>
> In some ways it reminds me of the way function parameters work: the
> bound parameters are effectively a *shallow* copy of the passed
> arguments, so callers can decide whether or not they want the callee
> to be able to modify them based on the arguments' mutability (or lack
> thereof).

Mutable default values for function arguments are one of the most
confusing things for users.  I've seen numerous threads on
StackOverflow/Reddit with people complaining about it.

> That similarity makes me wonder whether the "isolated or not"
> behaviour could be moved from the object being executed and directly
> into the key/value pairs themselves based on whether or not the values
> were mutable, as that's the way function calls work: if the argument
> is immutable, the callee *can't* change it, while if it's mutable, the
> callee can mutate it, but it still can't rebind it to refer to a
> different object.

I'm afraid that if we design EC context to behave differently for
mutable/immutable values, it will be an even harder thing to
understand to end users.

> 1. If a parent context wants child contexts to be able to make
> changes, then it should put a *mutable* object in the context (e.g. a
> list or class instance)
> 2. If a parent context *does not* want child contexts to be able to
> make changes, then it should put an *immutable* object in the context
> (e.g. a tuple or number)
> 3. If a child context *wants* to share a context key with its
> then it should *mutate* it in place
> 4. If a child context *does not* want to share a context key with its
> parent, then it should *rebind* it to a different object

It's possible to put mutable values even with the current PEP 550 API.
The issue Nathaniel has with it is that he actually wants the API to
behave exactly as it does in order to implement his timeouts logic,
but there's a corner case where isolating generator state at creation
time doesn't work in his favor.

FWIW I believe that I now have a complete solution for the
generator.send() problem that will make it possible for Nathaniel to
implement his Trio APIs.

The functional PoC is here: https://github.com/1st1/cpython/tree/pep550_gen

The key change is to make generators and asynchronous generators:

1. Have their own empty execution context when created. It will be
used for whatever local modifications they do to it, ensuring that
their state never escapes to the outside world
(gi_isolated_execution_context flag is still here for contextmanager).

2. ExecutionContext has a new internal pointer called ec_back. In the
Generator.send/throw method, ec_back is dynamically set to the current
execution context.

3. This makes it possible for generators to see any outside changes in
the execution context *and* have their own, where they can make
*local* changes.

So (pseudo-code):

def gen():
    print('1', context)
    yield
    print('2', context)
    with context(spam=ham):
        yield
        print('3', context)
        yield
    print('4', context)
    yield

g = gen()
context(foo=1, spam='bar')
next(g)
context(foo=2)
next(g)
context(foo=3)
next(g)
context(foo=4)
next(g)

will print:

1 {foo=1, spam=bar}
2 {foo=2, spam=bar}
3 {foo=3, spam=ham}
4 {foo=4, spam=bar}

There are some downsides to the approach, mainly from the performance
standpoint, but in a common case they will be negligible, if
detectable at all.

Yury


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-13 Thread Yury Selivanov
>> This is a new PEP to implement Execution Contexts in Python.

> The idea is of course great!

Thanks!

> A couple of issues for decimal:
>
>> Moreover, passing the context explicitly does not work at all for
>> libraries like ``decimal`` or ``numpy``, which use operator overloading.
>
> Instead of "with localcontext() ...", each coroutine can create a new
> Context() and use its methods, without any loss of functionality.
>
> All one loses is the inline operator syntax sugar.
>
> I'm aware you know all this, but the entire decimal paragraph sounds a bit
> as if this option did not exist.

The problem is that almost everybody does use the Decimal type
directly, as overloaded operators make it so convenient. It's not
apparent that using decimal this way has a dangerous flaw.

>
>> Fast C API for packages like ``decimal`` and ``numpy``.
>
> _decimal relies on caching the most recently used thread-local context,
> which gives a speedup of about 25% for inline operators:
>
> https://github.com/python/cpython/blob/master/Modules/_decimal/_decimal.c#L1639

I've seen that, it's a clever trick!  With the current PEP 550
semantics it's possible to replicate this trick: you just store a
reference to the latest EC in your decimal context for cache
invalidation. Because ECs are immutable, it's a safe thing to do.
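The cache-invalidation pattern described here can be sketched as follows. Because an immutable EC is replaced rather than mutated on every change, comparing the identity of a saved EC reference tells you whether a cached derived value is still valid. The `EC` class and attribute names below are illustrative stand-ins, not the PEP 550 objects:

```python
# Illustrative sketch of caching most-recently-used context state and
# invalidating it by EC identity, analogous to _decimal's thread-local
# context cache. EC here is a toy immutable container.
class EC:
    def __init__(self, values):
        self.values = values

    def with_value(self, key, value):
        # "Mutation" creates a new EC object, so its identity changes.
        new = dict(self.values)
        new[key] = value
        return EC(new)

_cached_ec = None
_cached_prec = None

def current_precision(ec):
    global _cached_ec, _cached_prec
    if ec is _cached_ec:              # cheap identity check, no lookup
        return _cached_prec
    _cached_ec = ec                   # cache miss: do the full lookup
    _cached_prec = ec.values.get('prec', 28)
    return _cached_prec

ec1 = EC({'prec': 100})
assert current_precision(ec1) == 100
assert current_precision(ec1) == 100  # served from the identity cache
ec2 = ec1.with_value('prec', 50)
assert current_precision(ec2) == 50   # new EC object invalidates cache
```

The immutability is what makes the `is` check safe: an EC that compares identical can never have been modified behind the cache's back.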

Yury


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-13 Thread Yury Selivanov
Hi Pau,

Re string keys collisions -- I decided to update the PEP to follow
Nathaniel's suggestion to use a get_context_key api, which will
eliminate this problem entirely.

Re call_soon in asyncio.Task -- yes, it does use ec.run() to invoke
coroutine.send(). However, this has almost no visible effect, as
ExecutionContext.run() is a very cheap operation (think 1-2 function
calls). It's possible to add a new keyword arg to call_soon like
"ignore_execution_context" to eliminate even this small overhead, but
this is something we can easily do later.

Yury


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-13 Thread Yury Selivanov
Hi Jonathan,

Thanks for the feedback. I'll update the PEP to use Nathaniel's idea
of `sys.get_context_key`. It will be a pretty similar API to what
you currently have in prompt_toolkit.

Yury


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-13 Thread Pau Freixes
Finally got an almost decent internet connection.

Looking at the changes related to the PEP, I can confirm that the
context will be saved twice on any "task switch" in an asyncio
environment: once by the run-in-context call made by the Handler [1],
and immediately after by the send() [2] method of the coroutine
belonging to that task.

From my understanding, asyncio itself makes no use of the context, at
least nowadays. Saving the context at the moment a Task is scheduled
is, at first sight, redundant and might have a performance impact.

Don't you think that this case, which happens a lot, could somehow be
optimized?

Am I missing something?


[1] https://github.com/1st1/cpython/blob/pep550/Lib/asyncio/events.py#L124
[2] https://github.com/1st1/cpython/blob/pep550/Lib/asyncio/tasks.py#L176

On Sat, Aug 12, 2017 at 11:03 PM, Pau Freixes  wrote:
> Good work Yury, going for all-in-one will help to not increase the
> differences between the async and sync worlds in Python.
>
> I really like the idea of the immutable dicts; it makes it easy to inherit
> the context between tasks/threads/whatever without putting consistency at
> risk if there are further key collisions.
>
> I've just taken a look at the asyncio modifications. Correct me if I'm
> wrong, but the handler strategy has a side effect: the work done to save
> and restore the context will be done twice in some situations. It would
> happen when the callback is in charge of executing a task step, once by
> the run-in-context method and the other by the coroutine. Is that correct?
>
> El 12/08/2017 00:38, "Yury Selivanov"  escribió:
>
> Hi,
>
> This is a new PEP to implement Execution Contexts in Python.
>
> The PEP is in-flight to python.org, and in the meanwhile can
> be read on GitHub:
>
> https://github.com/python/peps/blob/master/pep-0550.rst
>
> (it contains a few diagrams and charts, so please read it there.)
>
> Thank you!
> Yury
>
>
> PEP: 550
> Title: Execution Context
> Version: $Revision$
> Last-Modified: $Date$
> Author: Yury Selivanov 
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 11-Aug-2017
> Python-Version: 3.7
> Post-History: 11-Aug-2017
>
>
> Abstract
> 
>
> This PEP proposes a new mechanism to manage execution state--the
> logical environment in which a function, a thread, a generator,
> or a coroutine executes.
>
> A few examples of where having a reliable state storage is required:
>
> * Context managers like decimal contexts, ``numpy.errstate``,
>   and ``warnings.catch_warnings``;
>
> * Storing request-related data such as security tokens and request
>   data in web applications;
>
> * Profiling, tracing, and logging in complex and large code bases.
>
> The usual solution for storing state is to use a Thread-local Storage
> (TLS), implemented in the standard library as ``threading.local()``.
> Unfortunately, TLS does not work for isolating state of generators or
> asynchronous code because such code shares a single thread.
>
>
> Rationale
> =
>
> Traditionally a Thread-local Storage (TLS) is used for storing the
> state.  However, the major flaw of using the TLS is that it works only
> for multi-threaded code.  It is not possible to reliably contain the
> state within a generator or a coroutine.  For example, consider
> the following generator::
>
> def calculate(precision, ...):
> with decimal.localcontext() as ctx:
> # Set the precision for decimal calculations
> # inside this block
> ctx.prec = precision
>
> yield calculate_something()
> yield calculate_something_else()
>
> Decimal context is using a TLS to store the state, and because TLS is
> not aware of generators, the state can leak.  The above code will
> not work correctly, if a user iterates over the ``calculate()``
> generator with different precisions in parallel::
>
> g1 = calculate(100)
> g2 = calculate(50)
>
> items = list(zip(g1, g2))
>
> # items[0] will be a tuple of:
> #   first value from g1 calculated with 100 precision,
> #   first value from g2 calculated with 50 precision.
> #
> # items[1] will be a tuple of:
> #   second value from g1 calculated with 50 precision,
> #   second value from g2 calculated with 50 precision.
>
> An even scarier example would be using decimals to represent money
> in an async/await application: decimal calculations can suddenly
> lose precision in the middle of processing a request.  Currently,
> bugs like this are extremely hard to find and fix.
>
> Another common need for web applications is to have access to the
> current request object, or security context, or, simply, the request
> URL for logging or submitting performance tracing data::
>
> async def handle_http_request(request):
> context.current_http_request = request
>
> await ...
> # Invoke your framework code, render templates,
> # make DB queries, 

Re: [Python-ideas] New PEP 550: Execution Context

2017-08-13 Thread Jonathan Slenders
For what it's worth, as part of prompt_toolkit 2.0, I implemented something
very similar to Nathaniel's idea some time ago.
It works pretty well, but I don't have a strong opinion against an
alternative implementation.

- The active context is identified by a monotonically increasing integer.
- For each local, the actual values are stored in a dictionary that maps
the context ID to the value. (Could cause a GC issue - I'm not sure.)
- Every time when an executor is started, I have to wrap the callable in a
context manager that applies the current context to that thread.
- When a new 'Future' is created, I grab the context ID and apply it to the
callbacks when the result is set.

https://github.com/jonathanslenders/python-prompt-toolkit/blob/5c9ceb42ad9422a3c6a218a939843bdd2cc76f16/prompt_toolkit/eventloop/context.py
https://github.com/jonathanslenders/python-prompt-toolkit/blob/5c9ceb42ad9422a3c6a218a939843bdd2cc76f16/prompt_toolkit/eventloop/future.py
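A rough pure-Python sketch of the scheme described above (the names here are invented; the actual code is in the linked prompt_toolkit files):

```python
import threading
from contextlib import contextmanager

_last_id = 0
_current = threading.local()  # holds the active context id, per thread

def new_context_id():
    # The active context is identified by a monotonically increasing integer.
    global _last_id
    _last_id += 1
    return _last_id

class Local:
    """One context-sensitive variable: values keyed by context id."""
    def __init__(self):
        self._values = {}  # context id -> value (never GC'd in this sketch)

    def set(self, value):
        self._values[getattr(_current, 'id', 0)] = value

    def get(self, default=None):
        return self._values.get(getattr(_current, 'id', 0), default)

@contextmanager
def context(ctx_id):
    # Callables handed to an executor are wrapped in this, so the worker
    # thread runs them under the caller's context id.
    prev = getattr(_current, 'id', 0)
    _current.id = ctx_id
    try:
        yield
    finally:
        _current.id = prev

app = Local()
with context(new_context_id()):
    app.set('telnet-session-1')
    assert app.get() == 'telnet-session-1'
assert app.get() is None  # outside the context, no value is visible
```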

FYI: In my case, I did not want to pass the currently active "Application"
object around all of the code. But when I started supporting telnet,
multiple applications could be alive at once, each with a different I/O
backend. Therefore the active application needed to be stored in a kind of
executing context.

When PEP 550 gets approved I'll probably make this compatible. It should at
least be possible to run prompt_toolkit on the asyncio event loop.

Jonathan

2017-08-13 1:35 GMT+02:00 Nathaniel Smith :

> I had an idea for an alternative API that exposes the same
> functionality/semantics as the current draft, but that might have some
> advantages. It would look like:
>
> # a "context item" is an object that holds a context-sensitive value
> # each call to create_context_item creates a new one
> ci = sys.create_context_item()
>
> # Set the value of this item in the current context
> ci.set(value)
>
> # Get the value of this item in the current context
> value = ci.get()
> value = ci.get(default)
>
> # To support async libraries, we need some way to capture the whole context
> # But an opaque token representing "all context item values" is enough
> state_token = sys.current_context_state_token()
> sys.set_context_state_token(state_token)
> coro.cr_state_token = state_token
> # etc.
>
> The advantages are:
> - Eliminates the current PEP's issues with namespace collision; every
> context item is automatically distinct from all others.
> - Eliminates the need for the None-means-del hack.
> - Lets the interpreter hide the details of garbage collecting context
> values.
> - Allows for more implementation flexibility. This could be
> implemented directly on top of Yury's current prototype. But it could
> also, for example, be implemented by storing the context values in a
> flat array, where each context item is assigned an index when it's
> allocated. In the current draft this is suggested as a possible
> extension for particularly performance-sensitive users, but this way
> we'd have the option of making everything fast without changing or
> extending the API.
>
> As precedent, this is basically the API that low-level thread-local
> storage implementations use; see e.g. pthread_key_create,
> pthread_getspecific, pthread_setspecific. (And the
> allocate-an-index-in-a-table is the implementation that fast
> thread-local storage implementations use too.)
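For comparison, here is a minimal per-thread model of the quoted API in pure Python, ignoring the generator/coroutine machinery the PEP adds (names chosen to mirror the sketch above, not a real ``sys`` API):

```python
import threading

class ContextItem:
    # Analogue of a pthread_key_t: the object's identity is the key,
    # so distinct items can never collide on a name.
    def __init__(self):
        self._tls = threading.local()

    def set(self, value):            # cf. pthread_setspecific
        self._tls.value = value

    def get(self, default=None):     # cf. pthread_getspecific
        return getattr(self._tls, 'value', default)

def create_context_item():           # cf. pthread_key_create
    return ContextItem()

ci = create_context_item()
assert ci.get() is None              # unset: falls back to the default
ci.set(42)
assert ci.get() == 42
assert create_context_item().get('missing') == 'missing'  # no collisions
```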
>
> -n
>
> On Fri, Aug 11, 2017 at 3:37 PM, Yury Selivanov 
> wrote:
> > Hi,
> >
> > This is a new PEP to implement Execution Contexts in Python.
> >
> > The PEP is in-flight to python.org, and in the meanwhile can
> > be read on GitHub:
> >
> > https://github.com/python/peps/blob/master/pep-0550.rst
> >
> > (it contains a few diagrams and charts, so please read it there.)
> >
> > Thank you!
> > Yury
> >
> >
> > PEP: 550
> > Title: Execution Context
> > Version: $Revision$
> > Last-Modified: $Date$
> > Author: Yury Selivanov 
> > Status: Draft
> > Type: Standards Track
> > Content-Type: text/x-rst
> > Created: 11-Aug-2017
> > Python-Version: 3.7
> > Post-History: 11-Aug-2017
> >
> >
> > Abstract
> > 
> >
> > This PEP proposes a new mechanism to manage execution state--the
> > logical environment in which a function, a thread, a generator,
> > or a coroutine executes.
> >
> > A few examples of where having a reliable state storage is required:
> >
> > * Context managers like decimal contexts, ``numpy.errstate``,
> >   and ``warnings.catch_warnings``;
> >
> > * Storing request-related data such as security tokens and request
> >   data in web applications;
> >
> > * Profiling, tracing, and logging in complex and large code bases.
> >
> > The usual solution for storing state is to use a Thread-local Storage
> > (TLS), implemented in the standard library as ``threading.local()``.
> > Unfortunately, TLS does not work for isolating state of generators or
> > asynchronous code because such code shares a single thread.
> >
> >
> > Rationale
> > =
> >
> > Traditionally a Thread-local Storage (TLS) is us

Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread Yury Selivanov
On Sat, Aug 12, 2017 at 10:12 AM, Nick Coghlan  wrote:
[..]
>
> 1. Are you sure you want to expose the CoW type to pure Python code?

Ultimately, why not? The execution context object you get with
sys.get_execution_context() is yours to change. Any change to it won't
be propagated anywhere, unless you execute something in that context
with ExecutionContext.run or set it as the current one.

>
> The draft API looks fairly error prone to me, as I'm not sure of the
> intended differences in behaviour between the following:
>
> @contextmanager
> def context(x):
> old_x = sys.get_execution_context_item('x')
> sys.set_execution_context_item('x', x)
> try:
> yield
> finally:
> sys.set_execution_context_item('x', old_x)
>
> @contextmanager
> def context(x):
> old_x = sys.get_execution_context().get('x')
> sys.get_execution_context()['x'] = x
> try:
> yield
> finally:
> sys.get_execution_context()['x'] = old_x

This one (the second example) won't do anything.

>
> @contextmanager
> def context(x):
> ec = sys.get_execution_context()
> old_x = ec.get('x')
> ec['x'] = x
> try:
> yield
> finally:
> ec['x'] = old_x

This one (the third one) won't do anything either.

You can do this:

ec = sys.get_execution_context()
ec['x'] = x
ec.run(my_function)

or `sys.set_execution_context(ec)`


>
> It seems to me that everything would be a lot safer if the *only*
> Python level API was a live dynamic view that completely hid the
> copy-on-write behaviour behind an "ExecutionContextProxy" type, such
> that the last two examples were functionally equivalent to each other
> and to the current PEP's get/set functions (rendering the latter
> redundant, and allowing it to be dropped from the PEP).

So there's no copy-on-write exposed to Python actually. What I am
thinking about, though, is that we might not need the
sys.set_execution_context() function. If you want to run something
with a modified or empty execution context, do it through the
ExecutionContext.run method.

> 2. Do we need an ag_isolated_execution_context for asynchronous
> generators? (Modify this question as needed for the answer to the next
> question)

Yes, we'll need it for contextlib.asynccontextmanager at least.

>
> 3. It bothers me that *_execution_context points to an actual
> execution context, while *_isolated_execution_context is a boolean.
> With names that similar I'd expect them to point to the same kind of
> object.

I think we touched upon this in a parallel thread. But I think we can
rename "gi_isolated_execution_context" to
"gi_execution_context_isolated" or something more readable/obvious.

Yury
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread Yury Selivanov
On Sat, Aug 12, 2017 at 10:56 PM, Nick Coghlan  wrote:
[..]
> As per Nathaniel's suggestion, getting/setting/deleting individual items in
> the current context would be implemented as methods on the ContextItem
> objects, allowing the return value of "get_context_items" to be a
> plain dictionary, rather than a special type that directly supported
> updates to the underlying context.

The current PEP 550 design returns a "snapshot" of the current EC with
sys.get_execution_context().

I.e. if you do

ec = sys.get_execution_context()
ec['a'] = 'b'

# sys.get_execution_context_item('a') will return None

You did get a snapshot and you modified it -- but your modifications
are not visible anywhere. You can run a function in that modified EC
with `ec.run(function)` and that function will see that new 'a' key,
but that's it. There's no "magical" updates to the underlying context.
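A toy model of these snapshot semantics, as an illustration only (this is not the implementation the PEP proposes):

```python
class ExecutionContext(dict):
    """Toy model: a context is just a dict of items."""
    def run(self, func, *args):
        # Make this context the current one for the duration of the call.
        global _current_ec
        saved, _current_ec = _current_ec, self
        try:
            return func(*args)
        finally:
            _current_ec = saved

_current_ec = ExecutionContext()

def get_execution_context():
    return ExecutionContext(_current_ec)   # a snapshot, not a live view

def get_execution_context_item(key):
    return _current_ec.get(key)

ec = get_execution_context()
ec['a'] = 'b'
assert get_execution_context_item('a') is None            # no write-through
assert ec.run(lambda: get_execution_context_item('a')) == 'b'
```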

Yury


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread Nathaniel Smith
On Sat, Aug 12, 2017 at 9:05 PM, Nick Coghlan  wrote:
> On 13 August 2017 at 12:15, Nathaniel Smith  wrote:
>> On Sat, Aug 12, 2017 at 6:27 PM, Yury Selivanov  
>> wrote:
>>> Yes, I considered this idea myself, but ultimately rejected it because:
>>>
>>> 1. Current solution makes it easy to introspect things. Get the
>>> current EC and print it out.  Although the context item idea could be
>>> extended to `sys.create_context_item('description')` to allow that.
>>
>> My first draft actually had the description argument :-). But then I
>> deleted it on the grounds that there's also no way to introspect a
>> list of all threading.local objects, and no-one seems to be bothered
>> by that, so why should we bother here.
>
> In the TLS/TSS case, we have the design constraint of wanting to use
> the platform provided TLS/TSS implementation when available, and
> standard C APIs generally aren't designed to support rich runtime
> introspection from regular C code - instead, they expect the debugger,
> compiler, and standard library to be co-developed such that the
> debugger knows how to figure out where the latter two have put things
> at runtime.

Excellent point.

>> Obviously it'd be trivial to
>> add though, yeah; I don't really care either way.
>
> As noted in my other email, I like the idea of making the context
> dependent state introspection API clearly distinct from the core
> context dependent state management API.
>
> That way the API implementation can focus on using the most efficient
> data structures for the purpose, rather than being limited to the most
> efficient data structures that can readily export a Python-style
> mapping interface. The latter can then be provided purely for
> introspection purposes.

Also an excellent point :-).

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread Nick Coghlan
On 13 August 2017 at 12:15, Nathaniel Smith  wrote:
> On Sat, Aug 12, 2017 at 6:27 PM, Yury Selivanov  
> wrote:
>> Yes, I considered this idea myself, but ultimately rejected it because:
>>
>> 1. Current solution makes it easy to introspect things. Get the
>> current EC and print it out.  Although the context item idea could be
>> extended to `sys.create_context_item('description')` to allow that.
>
> My first draft actually had the description argument :-). But then I
> deleted it on the grounds that there's also no way to introspect a
> list of all threading.local objects, and no-one seems to be bothered
> by that, so why should we bother here.

In the TLS/TSS case, we have the design constraint of wanting to use
the platform provided TLS/TSS implementation when available, and
standard C APIs generally aren't designed to support rich runtime
introspection from regular C code - instead, they expect the debugger,
compiler, and standard library to be co-developed such that the
debugger knows how to figure out where the latter two have put things
at runtime.

> Obviously it'd be trivial to
> add though, yeah; I don't really care either way.

As noted in my other email, I like the idea of making the context
dependent state introspection API clearly distinct from the core
context dependent state management API.

That way the API implementation can focus on using the most efficient
data structures for the purpose, rather than being limited to the most
efficient data structures that can readily export a Python-style
mapping interface. The latter can then be provided purely for
introspection purposes.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread Yury Selivanov
[replying to list]

On Sat, Aug 12, 2017 at 10:56 PM, Nick Coghlan  wrote:
> On 13 August 2017 at 11:27, Yury Selivanov  wrote:
>> Yes, I considered this idea myself, but ultimately rejected it because:
>>
>> 1. Current solution makes it easy to introspect things. Get the
>> current EC and print it out.  Although the context item idea could be
>> extended to `sys.create_context_item('description')` to allow that.
>
> I think the TLS/TSS precedent means we should seriously consider the
> ContextItem + ContextStateToken approach for the core low level API.

I actually like the idea and am fully open to it. I'm also curious if
it's possible to adapt the flat-array/fast access ideas that Nathaniel
mentioned.

Yury


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread Nick Coghlan
On 13 August 2017 at 11:27, Yury Selivanov  wrote:
> Yes, I considered this idea myself, but ultimately rejected it because:
>
> 1. Current solution makes it easy to introspect things. Get the
> current EC and print it out.  Although the context item idea could be
> extended to `sys.create_context_item('description')` to allow that.

I think the TLS/TSS precedent means we should seriously consider the
ContextItem + ContextStateToken approach for the core low level API.

We also have a long history of pain and quirks arising from the
locals() builtin being defined as returning a mapping even though
function locals are managed as a linear array, so if we can avoid that
for the execution context, it will likely be beneficial for both end
users (due to less quirky runtime behaviour, especially across
implementations) and language implementation developers (due to a
reduced need to make something behave like an ordinary mapping when it
really isn't).

If we decide we want a separate context introspection API (akin to
inspect.getcoroutinelocals() and inspect.getgeneratorlocals()), then
an otherwise opaque ContextStateToken would be sufficient to enable
that. Even if we don't need it for any other reason, having such an
API available would be desirable for the regression test suite.

For example, if context items are hashable, we could have the
following arrangement:

# Create new context items
sys.create_context_item(name)
# Opaque token for the current execution context
sys.get_context_token()
# Switch the current execution context to the given one
sys.set_context(context_token)
# Snapshot mapping context items to their values in given context
sys.get_context_items(context_token)

As per Nathaniel's suggestion, getting/setting/deleting individual items in
the current context would be implemented as methods on the ContextItem
objects, allowing the return value of "get_context_items" to be a
plain dictionary, rather than a special type that directly supported
updates to the underlying context.

> 2. What if we want to pickle the EC? If all items in it are
> pickleable, it's possible to dump the EC, send it over the network,
> and re-use in some other process. It's not something I want to
> consider in the PEP right now, but it's something that the current
> design theoretically allows. AFAIU, `ci = sys.create_context_item()`
> context item wouldn't be possible to pickle/unpickle correctly, no?

As Nathaniel notes, cooperative partial pickling will be possible
regardless of how the low level API works, and starting with a simpler
low level API still doesn't rule out adding features like this at a
later date.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread Kevin Conway
As far as providing a thread-local like surrogate for coroutine based
systems in Python, we had to solve this for Twisted with
https://bitbucket.org/hipchat/txlocal. Because of the way the Twisted
threadpooling works we also had to make a context system that was both
coroutine and thread safe at the same time.

We have a similar setup for asyncio but it seems we haven't open sourced
it. I'll ask around for it if this group feels that an asyncio example
would be beneficial. We implemented both of these in plain-old Python so
they should be compatible beyond CPython.

It's been over a year since I was directly involved with either of these
projects, but added memory and CPU consumption were stats we watched
closely and we found a negligible increase in both as we rolled out async
context.

On Sat, Aug 12, 2017 at 9:16 PM Nathaniel Smith  wrote:

> On Sat, Aug 12, 2017 at 6:27 PM, Yury Selivanov 
> wrote:
> > Yes, I considered this idea myself, but ultimately rejected it because:
> >
> > 1. Current solution makes it easy to introspect things. Get the
> > current EC and print it out.  Although the context item idea could be
> > extended to `sys.create_context_item('description')` to allow that.
>
> My first draft actually had the description argument :-). But then I
> deleted it on the grounds that there's also no way to introspect a
> list of all threading.local objects, and no-one seems to be bothered
> by that, so why should we bother here. Obviously it'd be trivial to
> add though, yeah; I don't really care either way.
>
> > 2. What if we want to pickle the EC? If all items in it are
> > pickleable, it's possible to dump the EC, send it over the network,
> > and re-use in some other process. It's not something I want to
> > consider in the PEP right now, but it's something that the current
> > design theoretically allows. AFAIU, `ci = sys.create_context_item()`
> > context item wouldn't be possible to pickle/unpickle correctly, no?
>
> That's true. In this API, supporting pickling would require some kind
> of opt-in on the part of EC users.
>
> But... pickling would actually need to be opt-in anyway. Remember, the
> set of all EC items is a piece of global shared state; we expect new
> entries to appear when random 3rd party libraries are imported. So we
> have no idea what is in there or what it's being used for. Blindly
> pickling the whole context will lead to bugs (when code unexpectedly
> ends up with context that wasn't designed to go across processes) and
> crashes (there's no guarantee that all the objects are even
> pickleable).
>
> If we do decide we want to support this in the future then we could
> add a generic opt-in mechanism something like:
>
> MY_CI = sys.create_context_item(__name__, "MY_CI", pickleable=True)
>
> But I'm not sure that it even makes sense to have a global flag
> enabling pickle. Probably it's better to have separate flags to opt-in
> to different libraries that might want to pickle in different
> situations for different reasons: pickleable-by-dask,
> pickleable-by-curio.run_in_process, ... And that's doable without any
> special interpreter support. E.g. you could have
> curio.Local(pickle=True) coordinate with curio.run_in_process.
>
> > Some more comments:
> >
> > On Sat, Aug 12, 2017 at 7:35 PM, Nathaniel Smith  wrote:
> > [..]
> >> The advantages are:
> >> - Eliminates the current PEP's issues with namespace collision; every
> >> context item is automatically distinct from all others.
> >
> > TBH I think that the collision issue is slightly exaggerated.
> >
> >> - Eliminates the need for the None-means-del hack.
> >
> > I consider Execution Context to be an API, not a collection. It's an
> > important distinction: if you view it that way, deletion on None
> > doesn't look that esoteric.
>
> Deletion on None is still a special case that API users need to
> remember, and it's a small footgun that you can't just take an
> arbitrary Python object and round-trip it through the context.
> Obviously these are both APIs and they can do anything that makes
> sense, but all else being equal I prefer APIs that have fewer special
> cases :-).
>
> >> - Lets the interpreter hide the details of garbage collecting context
> values.
> >
> > I'm not sure I understand how the current PEP design is bad from the
> > GC standpoint. Or how this proposal can be different, FWIW.
>
> When the ContextItem object becomes unreachable and is collected, then
> the interpreter knows that all of the values associated with it in
> different contexts are also unreachable and can be collected.
>
> I mentioned this in my email yesterday -- look at the hoops
> threading.local jumps through to avoid breaking garbage collection.
>
> This is closely related to the previous point, actually -- AFAICT the
> only reason why it *really* matters that None deletes the item is that
> you need to be able to delete to free the item from the dictionary,
> which only matters if you want to dynamically allocate keys an

Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread Nathaniel Smith
On Sat, Aug 12, 2017 at 6:27 PM, Yury Selivanov  wrote:
> Yes, I considered this idea myself, but ultimately rejected it because:
>
> 1. Current solution makes it easy to introspect things. Get the
> current EC and print it out.  Although the context item idea could be
> extended to `sys.create_context_item('description')` to allow that.

My first draft actually had the description argument :-). But then I
deleted it on the grounds that there's also no way to introspect a
list of all threading.local objects, and no-one seems to be bothered
by that, so why should we bother here. Obviously it'd be trivial to
add though, yeah; I don't really care either way.

> 2. What if we want to pickle the EC? If all items in it are
> pickleable, it's possible to dump the EC, send it over the network,
> and re-use in some other process. It's not something I want to
> consider in the PEP right now, but it's something that the current
> design theoretically allows. AFAIU, `ci = sys.create_context_item()`
> context item wouldn't be possible to pickle/unpickle correctly, no?

That's true. In this API, supporting pickling would require some kind
of opt-in on the part of EC users.

But... pickling would actually need to be opt-in anyway. Remember, the
set of all EC items is a piece of global shared state; we expect new
entries to appear when random 3rd party libraries are imported. So we
have no idea what is in there or what it's being used for. Blindly
pickling the whole context will lead to bugs (when code unexpectedly
ends up with context that wasn't designed to go across processes) and
crashes (there's no guarantee that all the objects are even
pickleable).

If we do decide we want to support this in the future then we could
add a generic opt-in mechanism something like:

MY_CI = sys.create_context_item(__name__, "MY_CI", pickleable=True)

But I'm not sure that it even makes sense to have a global flag
enabling pickle. Probably it's better to have separate flags to opt-in
to different libraries that might want to pickle in different
situations for different reasons: pickleable-by-dask,
pickleable-by-curio.run_in_process, ... And that's doable without any
special interpreter support. E.g. you could have
curio.Local(pickle=True) coordinate with curio.run_in_process.

> Some more comments:
>
> On Sat, Aug 12, 2017 at 7:35 PM, Nathaniel Smith  wrote:
> [..]
>> The advantages are:
>> - Eliminates the current PEP's issues with namespace collision; every
>> context item is automatically distinct from all others.
>
> TBH I think that the collision issue is slightly exaggerated.
>
>> - Eliminates the need for the None-means-del hack.
>
> I consider Execution Context to be an API, not a collection. It's an
> important distinction: if you view it that way, deletion on None
> doesn't look that esoteric.

Deletion on None is still a special case that API users need to
remember, and it's a small footgun that you can't just take an
arbitrary Python object and round-trip it through the context.
Obviously these are both APIs and they can do anything that makes
sense, but all else being equal I prefer APIs that have fewer special
cases :-).

>> - Lets the interpreter hide the details of garbage collecting context values.
>
> I'm not sure I understand how the current PEP design is bad from the
> GC standpoint. Or how this proposal can be different, FWIW.

When the ContextItem object becomes unreachable and is collected, then
the interpreter knows that all of the values associated with it in
different contexts are also unreachable and can be collected.

I mentioned this in my email yesterday -- look at the hoops
threading.local jumps through to avoid breaking garbage collection.

This is closely related to the previous point, actually -- AFAICT the
only reason why it *really* matters that None deletes the item is that
you need to be able to delete to free the item from the dictionary,
which only matters if you want to dynamically allocate keys and then
throw them away again. In the ContextItem approach, there's no need to
manually delete the entry, you can just drop your reference to the
ContextItem and let the garbage collector take care of it.

>> - Allows for more implementation flexibility. This could be
>> implemented directly on top of Yury's current prototype. But it could
>> also, for example, be implemented by storing the context values in a
>> flat array, where each context item is assigned an index when it's
>> allocated.
>
> You still want to have this optimization only for *some* keys. So I
> think a separate API is still needed.

Wait, why is it a requirement that some keys be slow? That seems like
a weird requirement :-).

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread Nick Coghlan
On 13 August 2017 at 03:53, Yury Selivanov  wrote:
> On Sat, Aug 12, 2017 at 1:09 PM, Nick Coghlan  wrote:
>> Now that you raise this point, I think it means that generators need
>> to retain their current context inheritance behaviour, simply for
>> backwards compatibility purposes. This means that the case we need to
>> enable is the one where the generator *doesn't* dynamically adjust its
>> execution context to match that of the calling function.
>
> Nobody *intentionally* iterates a generator manually in different
> decimal contexts (or any other contexts). This is an extremely error
> prone thing to do, because one refactoring of a generator -- rearranging
> yields -- would wreck your custom iteration/context logic. I don't
> think that any real code relies on this, and I don't think that we are
> breaking backwards compatibility here in any way. How many users care
> about this?

I think this is a reasonable stance for the PEP to take, but the
hidden execution state around the "isolated or not" behaviour still
bothers me.

In some ways it reminds me of the way function parameters work: the
bound parameters are effectively a *shallow* copy of the passed
arguments, so callers can decide whether or not they want the callee
to be able to modify them based on the arguments' mutability (or lack
thereof).

The execution context proposal uses copy-on-write semantics for
runtime efficiency, but it's essentially the same shallow copy concept
applied to __next__(), send() and throw() operations (and perhaps
__anext__(), asend(), and athrow() - I haven't wrapped my head around
the implications for async generators and context managers yet).

That similarity makes me wonder whether the "isolated or not"
behaviour could be moved from the object being executed and directly
into the key/value pairs themselves based on whether or not the values
were mutable, as that's the way function calls work: if the argument
is immutable, the callee *can't* change it, while if it's mutable, the
callee can mutate it, but it still can't rebind it to refer to a
different object.

The way I'd see that working with an always-reverted copy-on-write
execution context:

1. If a parent context wants child contexts to be able to make
changes, then it should put a *mutable* object in the context (e.g. a
list or class instance)
2. If a parent context *does not* want child contexts to be able to
make changes, then it should put an *immutable* object in the context
(e.g. a tuple or number)
3. If a child context *wants* to share a context key with its parent,
then it should *mutate* it in place
4. If a child context *does not* want to share a context key with its
parent, then it should *rebind* it to a different object

That way, instead of reverted-or-not-reverted being an all-or-nothing
interpreter level decision, it can be made on a key-by-key basis by
choosing whether or not to use a mutable value.

To make that a little less abstract, consider a concrete example like
setting a "my_web_framework.request" key:

1. The step of *setting* the key will *not* be shared with the parent
context, as that modifies the underlying copy-on-write namespace, and
will hence be reverted when control is passed back to the parent
2. Any *mutation* of the request object *will* be shared, since
mutating the value doesn't have any effect on the copy-on-write
namespace

Nathaniel's example of wanting stack-like behaviour could be modeled
using tuples as values: when the child context appends to the tuple,
it will necessarily have to create a new tuple and rebind the
corresponding key, causing the changes to be invisible to the parent
context.
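Modelling the child context as a shallow copy makes this key-by-key behaviour concrete. Plain dicts here, purely for illustration of the semantics:

```python
# Parent context: one mutable value, one immutable value.
parent = {
    'request': {'url': '/home'},  # mutable: child mutations are shared
    'trace_stack': ('root',),     # immutable: child must rebind, isolated
}

child = dict(parent)  # shallow copy, standing in for copy-on-write

child['request']['user'] = 'alice'                          # mutate: shared
child['trace_stack'] = child['trace_stack'] + ('handler',)  # rebind: private

assert parent['request'] == {'url': '/home', 'user': 'alice'}
assert parent['trace_stack'] == ('root',)           # parent unaffected
assert child['trace_stack'] == ('root', 'handler')
```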

The contextlib.contextmanager use case could then be modeled as a
*separate* method that skipped the save/revert context management step
(e.g. "send_with_shared_context", "throw_with_shared_context")

> If someone does need this, it's possible to flip
> `gi_isolated_execution_context` to `False` (as contextmanager does
> now) and get this behaviour. This might be needed for frameworks like
> Tornado which support coroutines via generators without 'yield from',
> but I'll have to verify this.

Working through this above, I think the key points that bother me
about the stateful revert-or-not setting is that whether or not
context reversion is desirable depends mainly on two things:

- the specific key in question (indicated by mutable vs immutable values)
- the intent of the code in the parent context (which could be
indicated by calling different methods)

It *doesn't* seem to be an inherent property of a given generator or
coroutine, except insofar as there's a correlation between the code
that creates generators & coroutines and the code that subsequently
invokes them.

> Another idea: in one of my initial PEP implementations, I exposed
> gen.gi_execution_context (same for coroutines) to python as read/write
> attribute. That allowed one to
>
> (a) get the execution context out of a generator (for introspection or
> other purposes

Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread Yury Selivanov
Yes, I considered this idea myself, but ultimately rejected it because:

1. Current solution makes it easy to introspect things. Get the
current EC and print it out.  Although the context item idea could be
extended to `sys.create_context_item('description')` to allow that.

2. What if we want to pickle the EC? If all items in it are
pickleable, it's possible to dump the EC, send it over the network,
and re-use in some other process. It's not something I want to
consider in the PEP right now, but it's something that the current
design theoretically allows. AFAIU, `ci = sys.create_context_item()`
context item wouldn't be possible to pickle/unpickle correctly, no?

Some more comments:

On Sat, Aug 12, 2017 at 7:35 PM, Nathaniel Smith  wrote:
[..]
> The advantages are:
> - Eliminates the current PEP's issues with namespace collision; every
> context item is automatically distinct from all others.

TBH I think that the collision issue is slightly exaggerated.

> - Eliminates the need for the None-means-del hack.

I consider Execution Context to be an API, not a collection. It's an
important distinction. If you view it that way, deletion on None
doesn't look that esoteric.

> - Lets the interpreter hide the details of garbage collecting context values.

I'm not sure I understand how the current PEP design is bad from the
GC standpoint. Or how this proposal can be different, FWIW.

> - Allows for more implementation flexibility. This could be
> implemented directly on top of Yury's current prototype. But it could
> also, for example, be implemented by storing the context values in a
> flat array, where each context item is assigned an index when it's
> allocated.

You still want to have this optimization only for *some* keys. So I
think a separate API is still needed.

Yury
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread Nathaniel Smith
I had an idea for an alternative API that exposes the same
functionality/semantics as the current draft, but that might have some
advantages. It would look like:

# a "context item" is an object that holds a context-sensitive value
# each call to create_context_item creates a new one
ci = sys.create_context_item()

# Set the value of this item in the current context
ci.set(value)

# Get the value of this item in the current context
value = ci.get()
value = ci.get(default)

# To support async libraries, we need some way to capture the whole context
# But an opaque token representing "all context item values" is enough
state_token = sys.current_context_state_token()
sys.set_context_state_token(state_token)
coro.cr_state_token = state_token
# etc.

The advantages are:
- Eliminates the current PEP's issues with namespace collision; every
context item is automatically distinct from all others.
- Eliminates the need for the None-means-del hack.
- Lets the interpreter hide the details of garbage collecting context values.
- Allows for more implementation flexibility. This could be
implemented directly on top of Yury's current prototype. But it could
also, for example, be implemented by storing the context values in a
flat array, where each context item is assigned an index when it's
allocated. In the current draft this is suggested as a possible
extension for particularly performance-sensitive users, but this way
we'd have the option of making everything fast without changing or
extending the API.

As precedent, this is basically the API that low-level thread-local
storage implementations use; see e.g. pthread_key_create,
pthread_getspecific, pthread_setspecific. (And the
allocate-an-index-in-a-table is the implementation that fast
thread-local storage implementations use too.)
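The proposed API shape can be sketched in pure Python. The storage strategy below (one module-level dict keyed by item objects, tokens as snapshots) is only an illustration of the semantics, not how a real implementation would work.

```python
_current_context = {}  # maps ContextItem -> value


class ContextItem:
    # Each item is distinct by identity, so there are no key collisions.
    def set(self, value):
        _current_context[self] = value

    def get(self, default=None):
        return _current_context.get(self, default)


def create_context_item():
    return ContextItem()


def current_context_state_token():
    # An opaque snapshot of "all context item values".
    return dict(_current_context)


def set_context_state_token(token):
    global _current_context
    _current_context = dict(token)


ci = create_context_item()
ci.set("stuff")
token = current_context_state_token()  # capture the whole context
ci.set("other")
set_context_state_token(token)         # restore it
print(ci.get())  # stuff
```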

-n

On Fri, Aug 11, 2017 at 3:37 PM, Yury Selivanov  wrote:
> Hi,
>
> This is a new PEP to implement Execution Contexts in Python.
>
> The PEP is in-flight to python.org, and in the meanwhile can
> be read on GitHub:
>
> https://github.com/python/peps/blob/master/pep-0550.rst
>
> (it contains a few diagrams and charts, so please read it there.)
>
> Thank you!
> Yury
>
>
> PEP: 550
> Title: Execution Context
> Version: $Revision$
> Last-Modified: $Date$
> Author: Yury Selivanov 
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 11-Aug-2017
> Python-Version: 3.7
> Post-History: 11-Aug-2017
>
>
> Abstract
> 
>
> This PEP proposes a new mechanism to manage execution state--the
> logical environment in which a function, a thread, a generator,
> or a coroutine executes in.
>
> A few examples of where having a reliable state storage is required:
>
> * Context managers like decimal contexts, ``numpy.errstate``,
>   and ``warnings.catch_warnings``;
>
> * Storing request-related data such as security tokens and request
>   data in web applications;
>
> * Profiling, tracing, and logging in complex and large code bases.
>
> The usual solution for storing state is to use a Thread-local Storage
> (TLS), implemented in the standard library as ``threading.local()``.
> Unfortunately, TLS does not work for isolating state of generators or
> asynchronous code because such code shares a single thread.
>
>
> Rationale
> =
>
> Traditionally a Thread-local Storage (TLS) is used for storing the
> state.  However, the major flaw of using the TLS is that it works only
> for multi-threaded code.  It is not possible to reliably contain the
> state within a generator or a coroutine.  For example, consider
> the following generator::
>
> def calculate(precision, ...):
> with decimal.localcontext() as ctx:
> # Set the precision for decimal calculations
> # inside this block
> ctx.prec = precision
>
> yield calculate_something()
> yield calculate_something_else()
>
> Decimal context is using a TLS to store the state, and because TLS is
> not aware of generators, the state can leak.  The above code will
> not work correctly, if a user iterates over the ``calculate()``
> generator with different precisions in parallel::
>
> g1 = calculate(100)
> g2 = calculate(50)
>
> items = list(zip(g1, g2))
>
> # items[0] will be a tuple of:
> #   first value from g1 calculated with 100 precision,
> #   first value from g2 calculated with 50 precision.
> #
> # items[1] will be a tuple of:
> #   second value from g1 calculated with 50 precision,
> #   second value from g2 calculated with 50 precision.
>
> An even scarier example would be using decimals to represent money
> in an async/await application: decimal calculations can suddenly
> lose precision in the middle of processing a request.  Currently,
> bugs like this are extremely hard to find and fix.
>
> Another common need for web applications is to have access to the
> current request object, or security context, or, simply, the request
> URL for logging or submitting performance tracing data::

Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread Pau Freixes
Good work Yury; going all-in-one will help avoid widening the
differences between the async and sync worlds in Python.

I really like the idea of the immutable dicts: it makes it easy to
inherit the context between tasks/threads/whatever without putting
consistency at risk if there are further key collisions.

I've just taken a look at the asyncio modifications. Correct me if I'm
wrong, but the handler strategy has a side effect: the work done to
save and restore the context will be done twice in some situations. It
would happen when the callback is in charge of executing a task step,
once by the run-in-context method and once by the coroutine. Is that
correct?

On 12/08/2017 00:38, "Yury Selivanov" wrote:

Hi,

This is a new PEP to implement Execution Contexts in Python.

The PEP is in-flight to python.org, and in the meanwhile can
be read on GitHub:

https://github.com/python/peps/blob/master/pep-0550.rst

(it contains a few diagrams and charts, so please read it there.)

Thank you!
Yury


[..]

Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread Yury Selivanov
Nathaniel, Nick,

I'll reply only to point 9 in this email to split this threads into
manageable sub-threads.  I'll cover other points in later emails.

On Sat, Aug 12, 2017 at 3:54 AM, Nathaniel Smith  wrote:
> 9. OK, my big question, about semantics.

FWIW it took me a good hour to fully understand what you are doing
with "fail_after", what you want from PEP 550, and the actual
associated problems with generators :)

>
> The PEP's design is based on the assumption that all context-local
> state is scalar-like, and contexts split but never join. But there are
> some cases where this isn't true, in particular for values that have
> "stack-like" semantics. These are terms I just made up, but let me
> give some examples. Python's sys.exc_info is one. Another I ran into
> recently is for trio's cancel scopes.

As you yourself show below, it's easy to implement stacks with the
proposed EC spec; a linked list will work well enough.
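As a rough sketch of that point, a cancel-scope stack can live under a single context key as immutable `(head, tail)` pairs; since pushing builds a new node instead of mutating the old list, a parent context holding the previous head never observes the change. The `ec` dicts and names below are illustrative stand-ins, not the PEP's API.

```python
def push_scope(ec, scope):
    # Push: build a new (head, tail) node and rebind the key in a new EC.
    return {**ec, "cancel_scopes": (scope, ec.get("cancel_scopes"))}


def pop_scope(ec):
    # Pop: rebind the key to the tail of the current node.
    return {**ec, "cancel_scopes": ec["cancel_scopes"][1]}


def top_scope(ec):
    node = ec.get("cancel_scopes")
    return node[0] if node else None


ec0 = {}
ec1 = push_scope(ec0, "outer: fail_after(20)")
ec2 = push_scope(ec1, "inner: fail_after(10)")

print(top_scope(ec2))            # inner: fail_after(10)
print(top_scope(ec1))            # outer: fail_after(20) -- unchanged
print(top_scope(pop_scope(ec2))) # outer: fail_after(20)
```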

>
> So basically the background is, in trio you can wrap a context manager
> around any arbitrary chunk of code and then set a timeout or
> explicitly cancel that code. It's called a "cancel scope". These are
> fully nestable. Full details here:
> https://trio.readthedocs.io/en/latest/reference-core.html#cancellation-and-timeouts
>
> Currently, the implementation involves keeping a stack of cancel
> scopes in Task-local storage. This works fine for regular async code
> because when we switch Tasks, we also switch the cancel scope stack.
> But of course it falls apart for generators/async generators:
>
> async def agen():
> with fail_after(10):  # 10 second timeout for finishing this block
> await some_blocking_operation()
> yield
> await another_blocking_operation()
>
> async def caller():
> with fail_after(20):
> ag = agen()
> await ag.__anext__()
> # now that cancel scope is on the stack, even though we're not
> # inside the context manager! this will not end well.
> await some_blocking_operation()  # this might get cancelled
> when it shouldn't
> # even if it doesn't, we'll crash here when exiting the context manager
> # because we try to pop a cancel scope that isn't at the top of the stack
>
> So I was thinking about whether I could implement this using PEP 550.
> It requires some cleverness, but I could switch to representing the
> stack as a singly-linked list, and then snapshot it and pass it back
> to the coroutine runner every time I yield.

Right. So the task always knows the EC at the point of "yield". It can
then get the latest timeout from it and act accordingly if that yield
did not resume in time.  This should work.

> That would fix the case
> above. But, I think there's another case that's kind of a showstopper.
>
> async def agen():
> await some_blocking_operation()
> yield
>
> async def caller():
> ag = agen()  # context is captured here
> with fail_after(10):
> await ag.__anext__()
>
> Currently this case works correctly: the timeout is applied to the
> __anext__ call, as you'd expect. But with PEP 550, it wouldn't work:
> the generator's timeouts would all be fixed when it was instantiated,
> and we wouldn't be able to detect that the second call has a timeout
> imposed on it. So that's a pretty nasty footgun. Any time you have
> code that's supposed to have a timeout applied, but in fact has no
> timeout applied, then that's a really serious bug -- it can lead to
> hangs, trivial DoS, pagers going off, etc.

As I tried to explain in my last email, I generally don't believe that
people would do this partial iteration with timeouts or other contexts
around it.  The only use case I can come up with so far is
implementing some sort of receiver using an AG, and then "listening"
on it through "__anext__" calls.

But the case is interesting nevertheless, and maybe we can fix it
without relaxing any guarantees of the PEP.

The idea that I have is to allow linking of ExecutionContext (this is
similar in a way to what Nick proposed, but has a stricter semantics):

1. The internal ExecutionContext object will have a new "back" attribute.

2. For regular code and coroutines everything that is already in the
PEP will stay the same.

3. For generators and asynchronous generators, when a generator is
created, an empty ExecutionContext will be created for it, with its
"back" attribute pointing to the current EC.

4. The lookup function will be adjusted to check the "EC.back" if
the key is not found in the current EC.

5. The max level of "back" chain will be 1.

6. When a generator is created inside another generator, it will
inherit another generator's EC. Because contexts are immutable this
should be OK.

7. When a coroutine is created inside an EC with a "back" link, it
will merge EC and EC.back in one new EC. Merge can be done very
efficiently for HAMT mappings which I believe we will end up using for
this anyways (an O(log32 N) operation).
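Points 1-5 can be sketched as follows. This is a simplified model (plain dicts instead of HAMTs, illustrative names), just to show the one-level fall-through lookup and the isolation of generator writes.

```python
class EC:
    def __init__(self, items=None, back=None):
        self.items = dict(items or {})
        self.back = back  # chain is at most one level deep (point 5)

    def lookup(self, key, default=None):
        if key in self.items:
            return self.items[key]
        # Point 4: fall through to the "back" EC if the key is missing.
        if self.back is not None and key in self.back.items:
            return self.back.items[key]
        return default


caller_ec = EC({"timeout": 10})

# Point 3: a generator gets an empty EC whose "back" is the creator's EC.
gen_ec = EC(back=caller_ec)

print(gen_ec.lookup("timeout"))    # 10 -- found via the back link
gen_ec.items["timeout"] = 5        # writes stay local to the generator
print(caller_ec.lookup("timeout")) # 10 -- the caller is unaffected
```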

An illustration of what it w

Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread Yury Selivanov
On Sat, Aug 12, 2017 at 2:28 PM, rym...@gmail.com  wrote:
> So, I'm hardly an expert when it comes to things like this, but there are
> two things about this that don't seem right to me. (Also, I'd love to
> respond inline, but that's kind of difficult from a mobile phone.)
>
> The first is how set/get_execution_context_item take strings. Inevitably,
> people are going to do things like:

Yes, it accepts any hashable Python object as a key.

>
> CONTEXT_ITEM_NAME = 'foo-bar'
> ...
> sys.set_execution_context_item(CONTEXT_ITEM_NAME, 'stuff')
>
> IMO it would be nicer if there could be a key object used instead, e.g.
>
> my_key = sys.execution_context_key('name-here-for-debugging-purposes')
> sys.set_execution_context_item(my_key, 'stuff')

I thought about this, and decided that this is something that can be
easily designed on top of the PEP and put into the 'contextlib' module.

In practice, this issue can be entirely addressed in the
documentation, asking users to prefix their keys with their
library/framework/program name.

>
> The advantage here would be no need for string constants and no potential
> naming conflicts (the string passed to the key creator would be used just
> for debugging, kind of like Thread names).
>
>
> Second thing is this:
>
> @contextlib.contextmanager
> def context(x):
>     old_x = get_execution_context_item('x')
>     set_execution_context_item('x', x)
>     try:
>         yield
>     finally:
>         set_execution_context_item('x', old_x)
>
>
>
> If this would be done frequently, a context manager would be a *lot* more
> Pythonic, e.g.:
>
> with sys.temp_change_execution_context('x', new_x):
> # ...

Yes, this is a neat idea and I think we can add such a helper to
contextlib.  I want to focus PEP 550 API on correctness, minimalism,
and performance.  Nice APIs can then be easily developed on top of it
later.

Yury


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread Yury Selivanov
Sure, I'll do.

Yury


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread rym...@gmail.com
So, I'm hardly an expert when it comes to things like this, but there are
two things about this that don't seem right to me. (Also, I'd love to
respond inline, but that's kind of difficult from a mobile phone.)

The first is how set/get_execution_context_item take strings. Inevitably,
people are going to do things like:

CONTEXT_ITEM_NAME = 'foo-bar'
...
sys.set_execution_context_item(CONTEXT_ITEM_NAME, 'stuff')

IMO it would be nicer if there could be a key object used instead, e.g.

my_key = sys.execution_context_key('name-here-for-debugging-purposes')
sys.set_execution_context_item(my_key, 'stuff')

The advantage here would be no need for string constants and no potential
naming conflicts (the string passed to the key creator would be used just
for debugging, kind of like Thread names).


Second thing is this:

@contextlib.contextmanager
def context(x):
    old_x = get_execution_context_item('x')
    set_execution_context_item('x', x)
    try:
        yield
    finally:
        set_execution_context_item('x', old_x)



If this would be done frequently, a context manager would be a *lot* more
Pythonic, e.g.:

with sys.temp_change_execution_context('x', new_x):
# ...

--
Ryan (ライアン)
Yoko Shimomura, ryo (supercell/EGOIST), Hiroyuki Sawano >> everyone else
http://refi64.com

On Aug 11, 2017 at 5:38 PM, Yury Selivanov wrote:

Hi,

This is a new PEP to implement Execution Contexts in Python.

The PEP is in-flight to python.org, and in the meanwhile can
be read on GitHub:

https://github.com/python/peps/blob/master/pep-0550.rst

(it contains a few diagrams and charts, so please read it there.)

Thank you!
Yury


[..]

Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread Yury Selivanov
Nick, Nathaniel, I'll be replying in full to your emails when I have
time to do some experiments.  Now I just want to address one point
that I think is important:

On Sat, Aug 12, 2017 at 1:09 PM, Nick Coghlan  wrote:
> On 12 August 2017 at 17:54, Nathaniel Smith  wrote:
>> ...and now that I've written that down, I sort of feel like that might
>> be what you want for all the other sorts of context object too? Like,
>> here's a convoluted example:
>>
>> def gen():
>> a = decimal.Decimal("1.111")
>> b = decimal.Decimal("2.222")
>> print(a + b)
>> yield
>> print(a + b)
>>
>> def caller():
>> # let's pretend this context manager exists, the actual API is
>> more complicated
>> with decimal_context_precision(3):
>> g = gen()
>> with decimal_context_precision(2):
>> next(g)
>> with decimal_context_precision(1):
>> next(g)
>>
>> Currently, this will print "3.3 3", because when the generator is
>> resumed it inherits the context of the resuming site. With PEP 550, it
>> would print "3.33 3.33" (or maybe "3.3 3.3"? it's not totally clear
>> from the text), because it inherits the context when the generator is
>> created and then ignores the calling context. It's hard to get strong
>> intuitions, but I feel like the current behavior is actually more
>> sensible -- each time the generator gets resumed, the next bit of code
>> runs in the context of whoever called next(), and the generator is
>> just passively inheriting context, so ... that makes sense.
>
> Now that you raise this point, I think it means that generators need
> to retain their current context inheritance behaviour, simply for
> backwards compatibility purposes. This means that the case we need to
> enable is the one where the generator *doesn't* dynamically adjust its
> execution context to match that of the calling function.

Nobody *intentionally* iterates a generator manually in different
decimal contexts (or any other contexts). This is an extremely error
prone thing to do, because one refactoring of generator -- rearranging
yields -- would wreck your custom iteration/context logic. I don't
think that any real code relies on this, and I don't think that we are
breaking backwards compatibility here in any way. How many users
actually need this?

If someone does need this, it's possible to flip
`gi_isolated_execution_context` to `False` (as contextmanager does
now) and get this behaviour. This might be needed for frameworks like
Tornado which support coroutines via generators without 'yield from',
but I'll have to verify this.

What I'm saying here, is that any sort of context leaking *into* or
*out of* generator *while* it is iterating will likely cause only bugs
or undefined behaviour. Take a look at the precision example in the
Rationale section of the PEP.

Most of the time generators are created and are iterated in the same
spot, you rarely create generator closures. One way the behaviour
could be changed, however, is to capture the execution context when
it's first iterated (as opposed to when it's instantiated), but I
don't think it makes any real difference.

Another idea: in one of my initial PEP implementations, I exposed
gen.gi_execution_context (same for coroutines) to Python as a
read/write attribute. That allowed one to

(a) get the execution context out of generator (for introspection or
other purposes);

(b) inject execution context for event loops; for instance
asyncio.Task could do that for some purpose.

Maybe this would be useful for someone who wants to mess with
generators and contexts.

[..]
>
> def autonomous_generator(gf):
> @functools.wraps(gf)
> def wrapper(*args, **kwds):
> gi = gf(*args, **kwds)
> gi.gi_back = gi.gi_frame
> return gi
> return wrapper

Nick, I still have to fully grasp the idea of `gi_back`, but one quick
thing: I specifically designed the PEP to avoid touching frames. The
current design only needs TLS and a little help from the
interpreter/core objects adjusting that TLS. It should be very
straightforward to implement the PEP in any interpreter (with JIT or
without) or compilers like Cython.

[..]
> Given that, you'd have the following initial states for "revert
> context" (currently called "isolated context" in the PEP):
>
> * unawaited coroutines: true (same as PEP)
> * awaited coroutines: false (same as PEP)
> * generators (both sync & async): false (opposite of current PEP)
> * autonomous generators: true (set "gi_revert_context" or
> "ag_revert_context" explicitly)

If generators do not isolate their context, then the example in the
Rationale section will not work as expected (or am I missing
something?). Fixing generators state leak was one of the main goals of
the PEP.

Yury


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread Nick Coghlan
On 12 August 2017 at 17:54, Nathaniel Smith  wrote:
> ...and now that I've written that down, I sort of feel like that might
> be what you want for all the other sorts of context object too? Like,
> here's a convoluted example:
>
> def gen():
> a = decimal.Decimal("1.111")
> b = decimal.Decimal("2.222")
> print(a + b)
> yield
> print(a + b)
>
> def caller():
> # let's pretend this context manager exists, the actual API is
> more complicated
> with decimal_context_precision(3):
> g = gen()
> with decimal_context_precision(2):
> next(g)
> with decimal_context_precision(1):
> next(g)
>
> Currently, this will print "3.3 3", because when the generator is
> resumed it inherits the context of the resuming site. With PEP 550, it
> would print "3.33 3.33" (or maybe "3.3 3.3"? it's not totally clear
> from the text), because it inherits the context when the generator is
> created and then ignores the calling context. It's hard to get strong
> intuitions, but I feel like the current behavior is actually more
> sensible -- each time the generator gets resumed, the next bit of code
> runs in the context of whoever called next(), and the generator is
> just passively inheriting context, so ... that makes sense.

Now that you raise this point, I think it means that generators need
to retain their current context inheritance behaviour, simply for
backwards compatibility purposes. This means that the case we need to
enable is the one where the generator *doesn't* dynamically adjust its
execution context to match that of the calling function.

One way that could work (using the cr_back/gi_back convention I suggested):

- generators start with gi_back not set
- if gi_back is NULL/None, gi.send() and gi.throw() set it to the
calling frame for the duration of the synchronous call and *don't*
adjust the execution context (i.e. the inverse of coroutine behaviour)
- if gi_back is already set, then gi.send() and gi.throw() *do* save
and restore the execution context around synchronous calls in to the
generator frame

To create an autonomous generator (i.e. one that didn't dynamically
update its execution context), you'd use a decorator like:

def autonomous_generator(gf):
@functools.wraps(gf)
def wrapper(*args, **kwds):
gi = gf(*args, **kwds)
gi.gi_back = gi.gi_frame
return gi
return wrapper

Asynchronous generators would then work like synchronous generators:
ag_back would be NULL/None by default, and dynamically set for the
duration of each __anext__ call. If you wanted to create an autonomous
one, you'd make its back reference a circular reference to itself to
disable the implicit dynamic updates.

When I put it in those terms though, I think the
cr_back/gi_back/ag_back idea should actually be orthogonal to the
"revert_context" flag (so you can record the link back to the caller
even when maintaining an autonomous context).

Given that, you'd have the following initial states for "revert
context" (currently called "isolated context" in the PEP):

* unawaited coroutines: true (same as PEP)
* awaited coroutines: false (same as PEP)
* generators (both sync & async): false (opposite of current PEP)
* autonomous generators: true (set "gi_revert_context" or
"ag_revert_context" explicitly)

Open question: whether having "yield" inside a with statement implies
the creation of an autonomous generator (synchronous or otherwise), or
whether you'd need a decorator to get your context management right in
such cases.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread Guido van Rossum
Thanks for the explanation. Can you make sure this is explained in the PEP?

On Aug 11, 2017 10:43 PM, "Yury Selivanov"  wrote:

> > On Fri, Aug 11, 2017 at 10:17 PM, Guido van Rossum 
> wrote:
> > > I may have missed this (I've just skimmed the doc), but what's the
> rationale
> > > for making the EC an *immutable* mapping? It's impressive that you
> managed
> > > to create a faster immutable dict, but why does the use case need one?
>
> > In this proposal, you have lots and lots of semantically distinct ECs.
> > Potentially every stack frame has its own (at least in async code). So
> > instead of copying the EC every time they create a new one, they want
> > to copy it when it's written to. This is a win if writes are
> > relatively rare compared to the creation of ECs.
>
> Correct. If we decide to use HAMT, the ratio of writes/reads becomes
> less important though.
>
> Yury
>


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread Nick Coghlan
On 12 August 2017 at 15:45, Yury Selivanov  wrote:
> Thanks Eric!
>
> PEP 408 -- Standard library __preview__ package?

Typo in the PEP number: PEP 406, which was an ultimately failed
attempt to get away from the reliance on process globals to manage the
import system by encapsulating the top level state as an "Import
Engine": https://www.python.org/dev/peps/pep-0406/

We still like the idea in principle (hence the Withdrawn status rather
than being Rejected), but someone needs to find time to take a run at
designing a new version of it atop the cleaner PEP 451 import plugin
API (hence why the *specific* proposal in PEP 406 has been withdrawn).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread Nick Coghlan
On 12 August 2017 at 08:37, Yury Selivanov  wrote:
> Hi,
>
> This is a new PEP to implement Execution Contexts in Python.
>
> The PEP is in-flight to python.org, and in the meanwhile can
> be read on GitHub:
>
> https://github.com/python/peps/blob/master/pep-0550.rst
>
> (it contains a few diagrams and charts, so please read it there.)

The fully rendered version is also up now:
https://www.python.org/dev/peps/pep-0550/

Thanks for this! The general approach looks good to me, so I just have
some questions about specifics of the API:

1. Are you sure you want to expose the CoW type to pure Python code?

The draft API looks fairly error prone to me, as I'm not sure of the
intended differences in behaviour between the following:

    @contextmanager
    def context(x):
        old_x = sys.get_execution_context_item('x')
        sys.set_execution_context_item('x', x)
        try:
            yield
        finally:
            sys.set_execution_context_item('x', old_x)

    @contextmanager
    def context(x):
        old_x = sys.get_execution_context().get('x')
        sys.get_execution_context()['x'] = x
        try:
            yield
        finally:
            sys.get_execution_context()['x'] = old_x

    @contextmanager
    def context(x):
        ec = sys.get_execution_context()
        old_x = ec.get('x')
        ec['x'] = x
        try:
            yield
        finally:
            ec['x'] = old_x

It seems to me that everything would be a lot safer if the *only*
Python level API was a live dynamic view that completely hid the
copy-on-write behaviour behind an "ExecutionContextProxy" type, such
that the last two examples were functionally equivalent to each other
and to the current PEP's get/set functions (rendering the latter
redundant, and allowing it to be dropped from the PEP).

If Python code wanted a snapshot of the current state, it would need
to call sys.get_execution_context().copy(), which would give it a
plain dictionary containing a shallow copy of the execution context at
that particular point in time.
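A toy sketch of what such a proxy could look like: a live mapping view that hides copy-on-write behind the standard MutableMapping interface, plus a plain-dict `.copy()` snapshot. The class name, the one-element-list storage cell, and the method set are all illustrative, not part of the PEP:

```python
from collections.abc import MutableMapping

class ExecutionContextProxy(MutableMapping):
    """Toy live view over a copy-on-write mapping (illustrative only)."""

    def __init__(self, cell):
        self._cell = cell              # one-element list holding the current dict

    def __getitem__(self, key):
        return self._cell[0][key]

    def __setitem__(self, key, value):
        new = dict(self._cell[0])      # copy-on-write: never mutate in place
        new[key] = value
        self._cell[0] = new

    def __delitem__(self, key):
        new = dict(self._cell[0])
        del new[key]
        self._cell[0] = new

    def __iter__(self):
        return iter(self._cell[0])

    def __len__(self):
        return len(self._cell[0])

    def copy(self):
        return dict(self._cell[0])     # plain-dict snapshot, detached from the EC

cell = [{}]
ec = ExecutionContextProxy(cell)
ec['x'] = 1
snapshot = ec.copy()
ec['x'] = 2
assert snapshot == {'x': 1}            # the snapshot is frozen in time
assert ec['x'] == 2                    # the live view sees the newest state
```

The point of the design: user code only ever sees normal mapping operations, while every write quietly produces a fresh underlying dict.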

If there's a genuine need to expose the raw copy-on-write machinery to
Python level code (e.g. for asyncio's benefit), then that could be
more clearly marked as "here be dragons" territory that most folks
aren't going to want to touch (e.g. "sys.get_raw_execution_context()")

2. Do we need an ag_isolated_execution_context for asynchronous
generators? (Modify this question as needed for the answer to the next
question)

3. It bothers me that *_execution_context points to an actual
execution context, while *_isolated_execution_context is a boolean.
With names that similar I'd expect them to point to the same kind of
object.

Would it work to adjust that setting to say that rather than being an
"isolated/not isolated" boolean, we instead made it a cr_back reverse
pointer to the awaiting coroutine (akin to f_back in the frame stack),
such that we had a doubly-linked list that defined the coroutine call
stacks via their cr_await and cr_back attributes?

If we did that, we'd have:

  Top-level Task: cr_back -> NULL (C) or None (Python)
  Awaited coroutine: cr_back -> coroutine that awaited this one (which
would in turn have a cr_await reference back to here)

coroutine.send()/throw() would then save and restore the execution
context around the call if cr_back was NULL/None (equivalent to
isolated==True in the current PEP), and leave it alone otherwise
(equivalent to isolated==False).

For generators, gi_back would normally be NULL/None (since we don't
typically couple regular generators to a single managing object), but
could be set appropriately by types.coroutine when the generator-based
coroutine is awaited, and by contextlib.contextmanager before starting
the underlying generator. (It may even make sense to break the naming
symmetry for that attribute, and call it something like "gi_owner",
since generators don't form a clean await-based logical call chain the
way native coroutines do).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread Nathaniel Smith
Hi Yury,

This is really cool. Some notes on a first read:

1. Excellent work on optimizing dict, that seems valuable independent
of the rest of the details here.

2. The text doesn't mention async generators at all. I assume they
also have an agi_isolated_execution_context flag that can be set, to
enable @asyncontextmanager?

2a. Speaking of which I wonder if it's possible for async_generator to
emulate this flag... I don't know if this matters -- at this point the
main reason to use async_generator is for code that wants to support
PyPy. If PyPy gains native async generator support before CPython 3.7
comes out then async_generator may be entirely irrelevant before PEP
550 matters. But right now async_generator is still quite handy...

2b. BTW, the contextmanager trick is quite nice -- I actually noticed
last week that PEP 521 had a problem here, but didn't think of a
solution :-).

3. You're right that numpy is *very* performance sensitive about
accessing the context -- the errstate object is needed extremely
frequently, even on trivial operations like adding two scalars, so a
dict lookup is very noticeable. (Imagine adding a dict lookup to
float.__add__.) Right now, the errstate object gets stored in the
threadstate dict, and then there are some dubious-looking hacks
involving a global (not thread-local) counter to let us skip the
lookup entirely if we think that no errstate object has been set.
Really what we ought to be doing (currently, in a non PEP 550 world)
is storing the errstate in a __thread variable -- it'd certainly be
worth it. Adopting PEP 550 would definitely be easier if we knew that
it wasn't ruling out that level of optimization.

4. I'm worried that all of your examples use string keys. One of the
great things about threading.local objects is that each one is a new
namespace, which is a honking great idea -- here it prevents
accidental collisions between unrelated libraries. And while it's
possible to implement threading.local in terms of the threadstate dict
(that's how they work now!), it requires some extremely finicky code
to get the memory management right:

https://github.com/python/cpython/blob/dadca480c5b7c5cf425d423316cd695bc5db3023/Modules/_threadmodule.c#L558-L595
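The namespacing benefit described above is easy to see with plain stdlib code (nothing here depends on PEP 550):

```python
import threading

# Each threading.local() instance is its own namespace, so two
# unrelated libraries never collide even if both call it 'state'.
lib_a = threading.local()
lib_b = threading.local()

lib_a.state = 'A'
lib_b.state = 'B'
assert (lib_a.state, lib_b.state) == ('A', 'B')

# Values are also per-thread: a fresh thread sees an empty namespace.
result = {}

def worker():
    result['sees_state'] = hasattr(lib_a, 'state')

t = threading.Thread(target=worker)
t.start()
t.join()
assert result['sees_state'] is False
```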

It seems like you're imagining that this API will be used directly by
user code? Is that true? ...Are you sure that's a good idea? Are we
just assuming that not many keys will be used and the keys will
generally be immortal anyway, so leaking entries is OK? Maybe this is
nit-picking, but this is hooking into the language semantics in such a
deep way that I sorta feel like it would be bad to end up with
something where we can never get garbage collection right.

The suggested index-based API for super fast C lookup also has this
problem, but that would be such a low-level API -- and not part of the
language definition -- that the right answer is probably just to
document that there's no way to unallocate indices so any given C
library should only allocate, like... 1 of them. Maybe provide an
explicit API to release an index, if we really want to get fancy.

5. Is there some performance-related reason that the API for
getting/setting isn't just sys.get_execution_context()[...] = ...? Or
even sys.execution_context[...]?

5a. Speaking of which I'm not a big fan of the None-means-delete
behavior. Not only does Python have a nice standard way to describe
all the mapping operations without such hacks, but you're actually
implementing that whole interface anyway. Why not use it?

6. Should Thread.start inherit the execution context from the spawning thread?

7. Compatibility: it does sort of break 3rd party contextmanager
implementations (contextlib2, asyncio_extras's acontextmanager, trio's
internal acontextmanager, ...). This is extremely minor though.

8. You discuss how this works for asyncio and gevent. Have you looked
at how it will interact with tornado's context handling system? Can
they use this? It's the most important extant context implementation I
can think of (aside from thread local storage itself).

9. OK, my big question, about semantics.

The PEP's design is based on the assumption that all context-local
state is scalar-like, and contexts split but never join. But there are
some cases where this isn't true, in particular for values that have
"stack-like" semantics. These are terms I just made up, but let me
give some examples. Python's sys.exc_info is one. Another I ran into
recently is for trio's cancel scopes.

So basically the background is, in trio you can wrap a context manager
around any arbitrary chunk of code and then set a timeout or
explicitly cancel that code. It's called a "cancel scope". These are
fully nestable. Full details here:
https://trio.readthedocs.io/en/latest/reference-core.html#cancellation-and-timeouts

Currently, the implementation involves keeping a stack of cancel
scopes in Task-local storage. This works fine for regular async code
because when we switch Tasks, we also switch t

Re: [Python-ideas] New PEP 550: Execution Context

2017-08-11 Thread Yury Selivanov
Thanks Eric!

PEP 408 -- Standard library __preview__ package?

Yury


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-11 Thread Yury Selivanov
> On Fri, Aug 11, 2017 at 10:17 PM, Guido van Rossum  wrote:
> > I may have missed this (I've just skimmed the doc), but what's the rationale
> > for making the EC an *immutable* mapping? It's impressive that you managed
> > to create a faster immutable dict, but why does the use case need one?

> In this proposal, you have lots and lots of semantically distinct ECs.
> Potentially every stack frame has its own (at least in async code). So
> instead of copying the EC every time they create a new one, they want
> to copy it when it's written to. This is a win if writes are
> relatively rare compared to the creation of ECs.

Correct. If we decide to use HAMT, the ratio of writes/reads becomes
less important though.

Yury


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-11 Thread Yury Selivanov
[replying to the list]

> I may have missed this (I've just skimmed the doc), but what's the rationale 
> for making the EC an *immutable* mapping?

It's possible to implement Execution Context with a mutable mapping
and copy-on-write (as it's done in .NET).  This is one of the approaches
that I tried and I discovered that it causes a bunch of subtle
inconsistencies in contexts for generators and coroutines. I've tried
to cover this here:
https://www.python.org/dev/peps/pep-0550/#copy-on-write-execution-context

All in all, I believe that the immutable mapping approach gives the
most predictable and easy to reason about model. If its performance on
a large number of items in the EC is a concern, I'll be happy to implement
it using HAMT (also covered in the PEP).

Yury


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-11 Thread Nathaniel Smith
On Fri, Aug 11, 2017 at 10:17 PM, Guido van Rossum  wrote:
> I may have missed this (I've just skimmed the doc), but what's the rationale
> for making the EC an *immutable* mapping? It's impressive that you managed
> to create a faster immutable dict, but why does the use case need one?

In this proposal, you have lots and lots of semantically distinct ECs.
Potentially every stack frame has its own (at least in async code). So
instead of copying the EC every time they create a new one, they want
to copy it when it's written to. This is a win if writes are
relatively rare compared to the creation of ECs.

You could probably optimize it a bit more by checking the refcnt
before writing, and skipping the copy if it's exactly 1. But even
simpler is to just always copy and throw away the old version.
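The refcount trick can be sketched in pure Python, though it is CPython-specific (it relies on reference counting) and `cow_set` is an illustrative name, not a proposed API:

```python
import sys

def cow_set(d, key, value):
    # sys.getrefcount counts its own argument and the parameter 'd',
    # so anything above 2 means another reference to 'd' exists and
    # mutating it in place would be observable -- copy instead.
    if sys.getrefcount(d) > 2:
        d = dict(d)
    d[key] = value
    return d

shared = {'a': 1}
updated = cow_set(shared, 'b', 2)
assert shared == {'a': 1}              # caller's dict untouched
assert updated == {'a': 1, 'b': 2}
```

When the function holds the only reference, the copy is skipped and the write happens in place, which is exactly the "skip the copy if it's exactly 1" optimization.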

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-11 Thread Guido van Rossum
I may have missed this (I've just skimmed the doc), but what's the
rationale for making the EC an *immutable* mapping? It's impressive that
you managed to create a faster immutable dict, but why does the use case
need one?

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-11 Thread Eric Snow
On Aug 11, 2017 16:38, "Yury Selivanov"  wrote:

Hi,

This is a new PEP to implement Execution Contexts in Python.


Nice!  I've had something like this on the back burner for a while as it
helps solve some problems with encapsulating the import state (e.g. PEP
408).

-eric


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-11 Thread Yury Selivanov
> This is exciting and I'm happy that you're addressing this problem.

Thank you!

> Some of our use cases can't be implemented using this PEP; notably, we use a 
> timing context that times how long an asynchronous function takes by 
> repeatedly pausing and resuming the timer.

Measuring the performance of coroutines is a somewhat different kind of
problem. With PEP 550 you will be able to decouple context management
from collecting performance data. That would allow you to subclass
asyncio.Task (let's call it InstrumentedTask) and implement all extra
tracing functionality on it (by overriding its _send method for
example). Then you could set a custom task factory that would use
InstrumentedTask only for a fraction of requests. That would make it
possible to collect performance metrics even in production (my 2c).
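A rough sketch of the task-factory idea. Rather than overriding Task internals, this version wraps each coroutine with timing instrumentation; all names here (`instrumented_task_factory`, `timings`) are illustrative, not from the PEP or asyncio:

```python
import asyncio
import time

timings = {}

def instrumented_task_factory(loop, coro):
    # Wrap the coroutine so its total wall-clock time is recorded
    # under the coroutine's name when it finishes.
    async def timed():
        start = time.perf_counter()
        try:
            return await coro
        finally:
            name = getattr(coro, '__name__', repr(coro))
            timings[name] = time.perf_counter() - start
    return asyncio.Task(timed(), loop=loop)

async def handler():
    await asyncio.sleep(0.01)
    return 'done'

loop = asyncio.new_event_loop()
loop.set_task_factory(instrumented_task_factory)
try:
    result = loop.run_until_complete(handler())
finally:
    loop.close()

assert result == 'done'
assert timings['handler'] > 0
```

A real InstrumentedTask subclass could instead hook task resumption to measure per-step time, and the factory could sample only a fraction of requests, as suggested above.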

Yury


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-11 Thread Jelle Zijlstra
This is exciting and I'm happy that you're addressing this problem.

We've solved a similar problem in our asynchronous programming framework,
asynq. Our solution (implemented at
https://github.com/quora/asynq/blob/master/asynq/contexts.py) is similar to
that in PEP 521: we enhance the context manager protocol with pause/resume
methods instead of using an enhanced form of thread-local state.

Some of our use cases can't be implemented using this PEP; notably, we use
a timing context that times how long an asynchronous function takes by
repeatedly pausing and resuming the timer. However, this timing context
adds significant overhead because we have to call the pause/resume methods
so often. Overall, your approach is almost certainly more performant.

2017-08-11 15:37 GMT-07:00 Yury Selivanov :

> Hi,
>
> This is a new PEP to implement Execution Contexts in Python.
>
> The PEP is in-flight to python.org, and in the meanwhile can
> be read on GitHub:
>
> https://github.com/python/peps/blob/master/pep-0550.rst
>
> (it contains a few diagrams and charts, so please read it there.)
>
> Thank you!
> Yury
>
>
> PEP: 550
> Title: Execution Context
> Version: $Revision$
> Last-Modified: $Date$
> Author: Yury Selivanov 
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 11-Aug-2017
> Python-Version: 3.7
> Post-History: 11-Aug-2017
>
>
> Abstract
> 
>
> This PEP proposes a new mechanism to manage execution state--the
> logical environment in which a function, a thread, a generator,
> or a coroutine executes.
>
> A few examples of where having a reliable state storage is required:
>
> * Context managers like decimal contexts, ``numpy.errstate``,
>   and ``warnings.catch_warnings``;
>
> * Storing request-related data such as security tokens and request
>   data in web applications;
>
> * Profiling, tracing, and logging in complex and large code bases.
>
> The usual solution for storing state is to use a Thread-local Storage
> (TLS), implemented in the standard library as ``threading.local()``.
> Unfortunately, TLS does not work for isolating state of generators or
> asynchronous code because such code shares a single thread.
>
>
> Rationale
> =
>
> Traditionally a Thread-local Storage (TLS) is used for storing the
> state.  However, the major flaw of using the TLS is that it works only
> for multi-threaded code.  It is not possible to reliably contain the
> state within a generator or a coroutine.  For example, consider
> the following generator::
>
> def calculate(precision, ...):
> with decimal.localcontext() as ctx:
> # Set the precision for decimal calculations
> # inside this block
> ctx.prec = precision
>
> yield calculate_something()
> yield calculate_something_else()
>
> Decimal context is using a TLS to store the state, and because TLS is
> not aware of generators, the state can leak.  The above code will
> not work correctly if a user iterates over the ``calculate()``
> generator with different precisions in parallel::
>
> g1 = calculate(100)
> g2 = calculate(50)
>
> items = list(zip(g1, g2))
>
> # items[0] will be a tuple of:
> #   first value from g1 calculated with 100 precision,
> #   first value from g2 calculated with 50 precision.
> #
> # items[1] will be a tuple of:
> #   second value from g1 calculated with 50 precision,
> #   second value from g2 calculated with 50 precision.
>
> An even scarier example would be using decimals to represent money
> in an async/await application: decimal calculations can suddenly
> lose precision in the middle of processing a request.  Currently,
> bugs like this are extremely hard to find and fix.
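The leak quoted above is easy to reproduce; here is a runnable illustration (editor's example, with precisions 100 and 4 chosen so the effect is visible in the output lengths):

```python
import decimal

def calculate(precision):
    with decimal.localcontext() as ctx:
        ctx.prec = precision
        yield decimal.Decimal(1) / 7
        yield decimal.Decimal(1) / 7

g1 = calculate(100)
g2 = calculate(4)
items = list(zip(g1, g2))

# g1's first value really was computed with 100 digits of precision...
assert len(str(items[0][0])) == 102     # '0.' + 100 digits
# ...but its second value silently picked up g2's precision of 4,
# because g2's suspended with-block is still the active context.
assert len(str(items[1][0])) == 6       # '0.' + 4 digits
```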
>
> Another common need for web applications is to have access to the
> current request object, or security context, or, simply, the request
> URL for logging or submitting performance tracing data::
>
> async def handle_http_request(request):
> context.current_http_request = request
>
> await ...
> # Invoke your framework code, render templates,
> # make DB queries, etc, and use the global
> # 'current_http_request' in that code.
>
> # This isn't currently possible to do reliably
> # in asyncio out of the box.
>
> These examples are just a few out of many, where a reliable way to
> store context data is absolutely needed.
>
> The inability to use TLS for asynchronous code has led to
> proliferation of ad-hoc solutions, limited to be supported only by
> code that was explicitly enabled to work with them.
>
> Current status quo is that any library, including the standard
> library, that uses a TLS, will likely not work as expected in
> asynchronous code or with generators (see [3]_ as an example issue.)
>
> Some languages that have coroutines or generators recommend
> manually passing a ``context`` object to every function, see [1]_
> describing the pattern for

Re: [Python-ideas] New PEP 550: Execution Context

2017-08-11 Thread Yury Selivanov
[duplicating my reply cc-ing python-ideas]

> Is a new EC type really needed? Cannot this be done with collections.ChainMap?

No, not really. ChainMap will have O(N) lookup performance where N
is the number of contexts you have in the chain. This will degrade
performance of lookups, which isn't acceptable for some potential
EC users like decimal/numpy/etc.
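The O(N) lookup cost is structural and easy to see (illustrative sketch; the chain stands in for nested contexts):

```python
from collections import ChainMap

# Build a chain of 1000 mappings, mimicking 1000 nested contexts.
base = {'x': 42}
ec = ChainMap(base)
for i in range(999):
    ec = ec.new_child({f'key{i}': i})

assert len(ec.maps) == 1000
# A key set in the outermost context is found only after scanning
# every mapping in the chain -- O(N) per lookup in the worst case.
assert ec['x'] == 42
# Writes only ever touch the newest mapping, so the chain cannot be
# collapsed without extra bookkeeping.
ec['x'] = 1
assert base['x'] == 42
assert ec['x'] == 1
```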

Inventing heuristics to manage the chain size is harder than making
an immutable dict (which is easy to reason about.)

Chaining contexts will also force them to reference each other, creating
cycles that GC won't be able to break.

Besides just performance considerations, with ChainMap design
of contexts it's not possible to properly isolate state changes
inside of generators or coroutines/tasks as it's done in the PEP.

All in all, I don't think that chaining can solve the problem. It will likely
lead to a more complicated solution in the end (this was my initial
approach FWIW).

Yury


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-11 Thread Antoine Rozo
Hi,

Is a new EC type really needed? Cannot this be done with
collections.ChainMap?

2017-08-12 0:37 GMT+02:00 Yury Selivanov :

> Hi,
>
> This is a new PEP to implement Execution Contexts in Python.
>
> The PEP is in-flight to python.org, and in the meanwhile can
> be read on GitHub:
>
> https://github.com/python/peps/blob/master/pep-0550.rst
>
> (it contains a few diagrams and charts, so please read it there.)
>
> Thank you!
> Yury
>
>
> PEP: 550
> Title: Execution Context
> Version: $Revision$
> Last-Modified: $Date$
> Author: Yury Selivanov 
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 11-Aug-2017
> Python-Version: 3.7
> Post-History: 11-Aug-2017
>
>
> Abstract
> 
>
> This PEP proposes a new mechanism to manage execution state--the
> logical environment in which a function, a thread, a generator,
> or a coroutine executes.
>
> A few examples of where having a reliable state storage is required:
>
> * Context managers like decimal contexts, ``numpy.errstate``,
>   and ``warnings.catch_warnings``;
>
> * Storing request-related data such as security tokens and request
>   data in web applications;
>
> * Profiling, tracing, and logging in complex and large code bases.
>
> The usual solution for storing state is to use a Thread-local Storage
> (TLS), implemented in the standard library as ``threading.local()``.
> Unfortunately, TLS does not work for isolating state of generators or
> asynchronous code because such code shares a single thread.
>
>
> Rationale
> =
>
> Traditionally a Thread-local Storage (TLS) is used for storing the
> state.  However, the major flaw of using the TLS is that it works only
> for multi-threaded code.  It is not possible to reliably contain the
> state within a generator or a coroutine.  For example, consider
> the following generator::
>
> def calculate(precision, ...):
> with decimal.localcontext() as ctx:
> # Set the precision for decimal calculations
> # inside this block
> ctx.prec = precision
>
> yield calculate_something()
> yield calculate_something_else()
>
> Decimal context is using a TLS to store the state, and because TLS is
> not aware of generators, the state can leak.  The above code will
> not work correctly if a user iterates over the ``calculate()``
> generator with different precisions in parallel::
>
> g1 = calculate(100)
> g2 = calculate(50)
>
> items = list(zip(g1, g2))
>
> # items[0] will be a tuple of:
> #   first value from g1 calculated with 100 precision,
> #   first value from g2 calculated with 50 precision.
> #
> # items[1] will be a tuple of:
> #   second value from g1 calculated with 50 precision,
> #   second value from g2 calculated with 50 precision.
>
> An even scarier example would be using decimals to represent money
> in an async/await application: decimal calculations can suddenly
> lose precision in the middle of processing a request.  Currently,
> bugs like this are extremely hard to find and fix.
>
> Another common need for web applications is to have access to the
> current request object, or security context, or, simply, the request
> URL for logging or submitting performance tracing data::
>
> async def handle_http_request(request):
> context.current_http_request = request
>
> await ...
> # Invoke your framework code, render templates,
> # make DB queries, etc, and use the global
> # 'current_http_request' in that code.
>
> # This isn't currently possible to do reliably
> # in asyncio out of the box.
>
> These examples are just a few out of many, where a reliable way to
> store context data is absolutely needed.
>
> The inability to use TLS for asynchronous code has led to
> proliferation of ad-hoc solutions, limited to be supported only by
> code that was explicitly enabled to work with them.
>
> Current status quo is that any library, including the standard
> library, that uses a TLS, will likely not work as expected in
> asynchronous code or with generators (see [3]_ as an example issue.)
>
> Some languages that have coroutines or generators recommend
> manually passing a ``context`` object to every function, see [1]_
> describing the pattern for Go.  This approach, however, has limited
> use for Python, where we have a huge ecosystem that was built to work
> with a TLS-like context.  Moreover, passing the context explicitly
> does not work at all for libraries like ``decimal`` or ``numpy``,
> which use operator overloading.
>
> .NET runtime, which has support for async/await, has a generic
> solution of this problem, called ``ExecutionContext`` (see [2]_).
> On the surface, working with it is very similar to working with a TLS,
> but the former explicitly supports asynchronous code.
>
>
> Goals
> =
>
> The goal of this PEP is to provide a more reliable alternative to
> ``threading.local()``.  It should b