Re: [Python-ideas] PEP 550 v2

2017-08-23 Thread Guido van Rossum
OK, I get it now. I really liked the analysis of existing uses in Django.
So no worries about this.

On Wed, Aug 23, 2017 at 5:36 PM, Yury Selivanov 
wrote:

> There's another "major" problem with a threading.local()-like API for
> PEP 550: the C API.
>
> threading.local() in C right now is PyThreadState_GetDict(), which
> returns a dictionary for the current thread that can be
> queried/modified with the PyDict_* functions.  For PEP 550 this would
> not work.
>
> The advantage of the current ContextKey solution is that the Python
> API and C API are essentially the same: [1]
>
> Another advantage is that ContextKey implements better caching:
> since it holds only a single value, that value can be cached directly
> (see [2] for details).
>
> [1] https://www.python.org/dev/peps/pep-0550/#new-apis
> [2] https://www.python.org/dev/peps/pep-0550/#contextkey-get-cache
>
> Yury
>



-- 
--Guido van Rossum (python.org/~guido)
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] PEP 550 v2

2017-08-23 Thread Yury Selivanov
There's another "major" problem with a threading.local()-like API for
PEP 550: the C API.

threading.local() in C right now is PyThreadState_GetDict(), which
returns a dictionary for the current thread that can be
queried/modified with the PyDict_* functions.  For PEP 550 this would
not work.

The advantage of the current ContextKey solution is that the Python
API and C API are essentially the same: [1]

Another advantage is that ContextKey implements better caching:
since it holds only a single value, that value can be cached directly
(see [2] for details).

[1] https://www.python.org/dev/peps/pep-0550/#new-apis
[2] https://www.python.org/dev/peps/pep-0550/#contextkey-get-cache

Yury


Re: [Python-ideas] PEP 550 v2

2017-08-23 Thread Nathaniel Smith
On Wed, Aug 23, 2017 at 8:41 AM, Guido van Rossum  wrote:
> If we're extending the analogy with thread-locals we should at least
> consider making each instantiation return a namespace rather than something
> holding a single value. We have
>
> log_state = threading.local()
> log_state.verbose = False
>
> def action(x):
> if log_state.verbose:
> print(x)
>
> def make_verbose():
> log_state.verbose = True
>
> It would be nice if we could upgrade this to make it PEP 550-aware so that
> only the first line needs to change:
>
> log_state = sys.AsyncLocal("log state")
> # The rest is the same

You can mostly implement this on top of the current PEP 550. Something like:

_tombstone = object()

# Assumed context: `some_lock` is a module-level threading.Lock, and
# new_context_key() is the key-creation API proposed by PEP 550.

class AsyncLocal:
    def __getattribute__(self, name):
        # if this raises AttributeError, we let it propagate
        key = object.__getattribute__(self, name)
        value = key.get()
        if value is _tombstone:
            raise AttributeError(name)
        return value

    def __setattr__(self, name, value):
        try:
            key = object.__getattribute__(self, name)
        except AttributeError:
            with some_lock:
                # double-checked locking pattern
                try:
                    key = object.__getattribute__(self, name)
                except AttributeError:
                    key = new_context_key()
                    object.__setattr__(self, name, key)
        key.set(value)

    def __delattr__(self, name):
        self.__setattr__(name, _tombstone)

    def __dir__(self):
        # filter out tombstoned values
        return [name for name in object.__dir__(self) if hasattr(self, name)]

Issues:

Minor problem: on threading.local you can use .__dict__ to get the
dict. That doesn't work here. It could be supported by returning a
mapping proxy type, or maybe it's better not to support it at all -- I
don't think it's a big issue.

Major problem: An attribute setting/getting API doesn't give any way
to solve the save/restore problem [1]. PEP 550 v3 doesn't have a
solution to this yet either, but we know we can do it by adding some
methods to context-key. Supporting this in AsyncLocal is kinda
awkward, since you can't use methods on the object -- I guess you
could have some staticmethods, like
AsyncLocal.save_state(my_async_local, name) and
AsyncLocal.restore_state(my_async_local, name, value)? In any case
this kinda spoils the sense of like "oh it's just an object with
attributes, I already know how this works".

Major problem: There are two obvious implementations. The above uses a
separate ContextKey for each entry in the dict; the other way would be
to have a single ContextKey that holds a dict. They have subtly
different semantics. Suppose you have a generator and inside it you
assign to my_async_local.a but not to my_async_local.b, then yield,
and then the caller assigns to my_async_local.b. Is this visible
inside the generator? In the ContextKey-holds-an-attribute approach,
the answer is "yes": each AsyncLocal is a bag of independent
attributes. In the ContextKey-holds-a-dict approach, the answer is
"no": each AsyncLocal is a single container holding a single piece of
(complex) state. It isn't obvious to me which of these semantics is
preferable – maybe it is if you're Dutch :-). But there's a danger
that either option leaves a bunch of people confused.
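The difference can be simulated with plain dicts standing in for PEP 550's stack of local contexts. This is a minimal sketch, not the proposed API: `ec_get`/`ec_set` and the explicit stacks are illustrative names, and the generator's context is modeled as one dict pushed on top of the caller's.

```python
def ec_get(stack, key, default=None):
    # search from the innermost (top) local context outward
    for lc in reversed(stack):
        if key in lc:
            return lc[key]
    return default

def ec_set(stack, key, value):
    # writes always go to the innermost local context
    stack[-1][key] = value

outer = {}                 # caller's local context
gen = {}                   # generator's local context, pushed while it runs
caller_stack = [outer]
gen_stack = [outer, gen]   # the generator sees the caller's context below its own

# Approach 1: one ContextKey per attribute.
ec_set(gen_stack, "a", 1)        # generator assigns my_async_local.a
ec_set(caller_stack, "b", 2)     # caller assigns my_async_local.b after the yield
print(ec_get(gen_stack, "b"))    # 2 -- the caller's 'b' IS visible in the generator

# Approach 2: a single ContextKey holding one dict (copy-then-mutate).
ec_set(gen_stack, "state", {**(ec_get(gen_stack, "state") or {}), "a": 1})
ec_set(caller_stack, "state", {**(ec_get(caller_stack, "state") or {}), "b": 2})
print(ec_get(gen_stack, "state").get("b"))   # None -- 'b' is NOT visible
```

Note that approach 2 has to copy the dict before each update, since the visible dict may live in an outer, shared local context.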

(Tangent: in the ContextKey-holds-a-dict approach, currently you have
to copy the dict before mutating it every time, b/c PEP 550 currently
doesn't provide a way to tell whether the value returned by get() came
from the top of the stack, and thus is private to you and can be
mutated in place, or somewhere deeper, and thus is shared and
shouldn't be mutated. But we should fix that anyway, and in any case
copy-then-mutate is a viable approach.)

Observation: I don't think there's any simpler way to implement
AsyncLocal other than to start with machinery like what PEP 550
already proposes, and then layer something like the above on top of
it. We could potentially hide the layers inside the interpreter and
only expose AsyncLocal, but I don't think it really simplifies the
implementation any.

Observation: I feel like many users of threading.local -- possibly the
majority -- only put a single attribute on each object anyway, so for
those users a raw ContextKey API is actually more natural and faster.
For example, looking through the core django repo, I see thread locals
in

- django.utils.timezone._active
- django.utils.translation.trans_real._active
- django.urls.base._prefixes
- django.urls.base._urlconfs
- django.core.cache._caches
- django.urls.resolvers.RegexURLResolver._local
- django.contrib.gis.geos.prototypes.threadsafe.thread_context
- django.contrib.gis.geos.prototypes.io.thread_context
- django.db.utils.ConnectionHandler._connections

Of these 9 thread-local objects, 7 of them have only a single
attribute; only the last 2 use multiple attributes. For the first 4,
that 

Re: [Python-ideas] PEP 550 v2

2017-08-23 Thread Ethan Furman

On 08/23/2017 08:41 AM, Guido van Rossum wrote:


If we're extending the analogy with thread-locals we should at least consider 
making each instantiation return a
namespace rather than something holding a single value.


+1

--
~Ethan~


Re: [Python-ideas] PEP 550 v2

2017-08-23 Thread Guido van Rossum
On Wed, Aug 23, 2017 at 2:00 AM, Nick Coghlan  wrote:

> On 21 August 2017 at 07:01, Barry  wrote:
> > I'm not clear why there is a new_context_key which seems not to be a key.
> > It seems that the object is a container for a single value.
> >
> > Key.set( value ) does not feel right.
>
> It's basically borrowed from procedural thread local APIs, which tend
> to use APIs like "tss_set(key, value)".
>
> That said, in a separate discussion, Caleb Hattingh mentioned C#'s
> AsyncLocal API, and it occurred to me that "context local" might work
> well as the name of the context access API:
>
> my_implicit_state = sys.new_context_local('my_state')
> my_implicit_state.set('spam')
>
> # Later, to access the value of my_implicit_state:
> print(my_implicit_state.get())
>
> That way, we'd have 3 clearly defined kinds of local variables:
>
> * frame locals (the regular kind)
> * thread locals (threading.local() et al)
> * context locals (PEP 550)
>
> The fact that contexts can be nested, and that a failed lookup in the
> active implicit context may then query outer namespaces in the current
> execution context, would then be directly analogous to the way name
> lookups are resolved for frame locals.


If we're extending the analogy with thread-locals we should at least
consider making each instantiation return a namespace rather than something
holding a single value. We have

log_state = threading.local()
log_state.verbose = False

def action(x):
if log_state.verbose:
print(x)

def make_verbose():
log_state.verbose = True

It would be nice if we could upgrade this to make it PEP 550-aware so that
only the first line needs to change:

log_state = sys.AsyncLocal("log state")
# The rest is the same

We might even support the alternative notation where you can provide
default values and suggest a schema, similar to threading.local:

class LogState(threading.local):
verbose = False

log_state = LogState()

(I think that for calls that construct empty instances of various types we
should just use the class name rather than some factory function. I also
think none of this should live in sys but that's separate.)
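For reference, the threading.local subclass pattern Guido mentions already works today; a quick runnable check of the current behaviour (the PEP 550-aware equivalent is hypothetical), showing that each thread starts from the class-level default:

```python
import threading

class LogState(threading.local):
    verbose = False            # class-level default seen by every new thread

log_state = LogState()
log_state.verbose = True       # main thread overrides its own copy

seen_in_thread = []

def worker():
    # a fresh thread starts from the class default, not the main thread's value
    seen_in_thread.append(log_state.verbose)

t = threading.Thread(target=worker)
t.start()
t.join()

print(log_state.verbose)   # True  -- the main thread's value
print(seen_in_thread)      # [False] -- the worker saw the default
```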

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-ideas] PEP 550 v2

2017-08-23 Thread Nick Coghlan
On 21 August 2017 at 07:01, Barry  wrote:
> I'm not clear why there is a new_context_key which seems not to be a key.
> It seems that the object is a container for a single value.
>
> Key.set( value ) does not feel right.

It's basically borrowed from procedural thread local APIs, which tend
to use APIs like "tss_set(key, value)".

That said, in a separate discussion, Caleb Hattingh mentioned C#'s
AsyncLocal API, and it occurred to me that "context local" might work
well as the name of the context access API:

my_implicit_state = sys.new_context_local('my_state')
my_implicit_state.set('spam')

# Later, to access the value of my_implicit_state:
print(my_implicit_state.get())

That way, we'd have 3 clearly defined kinds of local variables:

* frame locals (the regular kind)
* thread locals (threading.local() et al)
* context locals (PEP 550)

The fact that contexts can be nested, and that a failed lookup in the
active implicit context may then query outer namespaces in the current
execution context, would then be directly analogous to the way name
lookups are resolved for frame locals.
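The nested-lookup analogy can be sketched with collections.ChainMap, where a failed lookup in the innermost mapping falls through to the outer ones, much like frame-local name resolution. Purely illustrative, not the proposed API:

```python
from collections import ChainMap

outer_scope = {"my_state": "spam"}
inner_scope = ChainMap({}, outer_scope)   # a new inner scope, empty so far

print(inner_scope["my_state"])    # "spam" -- lookup falls through to the outer scope
inner_scope["my_state"] = "eggs"  # assignment shadows only the inner scope
print(inner_scope["my_state"])    # "eggs"
print(outer_scope["my_state"])    # "spam" -- the outer scope is untouched
```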

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-ideas] PEP 550 v2

2017-08-20 Thread Barry
I'm not clear why there is a new_context_key which seems not to be a key.
It seems that the object is a container for a single value.

Key.set( value ) does not feel right.

Container.set( value ) is fine.

Barry


> On 16 Aug 2017, at 00:55, Yury Selivanov  wrote:
> 
> Hi,
> 
> Here's the PEP 550 version 2.  Thanks to a very active and insightful
> discussion here on Python-ideas, we've discovered a number of
> problems with the first version of the PEP.  This version is a complete
> rewrite (only Abstract, Rationale, and Goals sections were not updated).
> 
> The updated PEP is live on python.org:
> https://www.python.org/dev/peps/pep-0550/
> 
> There is no reference implementation at this point, but I'm confident
> that this version of the spec will have the same extremely low
> runtime overhead as the first version.  Thanks to the new ContextItem
> design, accessing values in the context is even faster now.
> 
> Thank you!
> 
> 
> PEP: 550
> Title: Execution Context
> Version: $Revision$
> Last-Modified: $Date$
> Author: Yury Selivanov 
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 11-Aug-2017
> Python-Version: 3.7
> Post-History: 11-Aug-2017, 15-Aug-2017
> 
> 
> Abstract
> 
> 
> This PEP proposes a new mechanism to manage execution state--the
> logical environment in which a function, a thread, a generator,
> or a coroutine executes.
> 
> A few examples of where having a reliable state storage is required:
> 
> * Context managers like decimal contexts, ``numpy.errstate``,
>  and ``warnings.catch_warnings``;
> 
> * Storing request-related data such as security tokens and request
>  data in web applications, implementing i18n;
> 
> * Profiling, tracing, and logging in complex and large code bases.
> 
> The usual solution for storing state is to use a Thread-local Storage
> (TLS), implemented in the standard library as ``threading.local()``.
> Unfortunately, TLS does not work for the purpose of state isolation
> for generators or asynchronous code, because such code executes
> concurrently in a single thread.
> 
> 
> Rationale
> =
> 
> Traditionally, Thread-local Storage (TLS) is used for storing the
> state.  However, the major flaw of TLS is that it works only
> for multi-threaded code.  It is not possible to reliably contain the
> state within a generator or a coroutine.  For example, consider
> the following generator::
> 
>def calculate(precision, ...):
>with decimal.localcontext() as ctx:
># Set the precision for decimal calculations
># inside this block
>ctx.prec = precision
> 
>yield calculate_something()
>yield calculate_something_else()
> 
> The decimal context uses TLS to store its state, and because TLS is
> not aware of generators, the state can leak.  If a user iterates over
> the ``calculate()`` generator with different precisions one by one
> using a ``zip()`` built-in, the above code will not work correctly.
> For example::
> 
>g1 = calculate(precision=100)
>g2 = calculate(precision=50)
> 
>items = list(zip(g1, g2))
> 
># items[0] will be a tuple of:
>#   first value from g1 calculated with 100 precision,
>#   first value from g2 calculated with 50 precision.
>#
># items[1] will be a tuple of:
>#   second value from g1 calculated with 50 precision (!!!),
>#   second value from g2 calculated with 50 precision.
> 
> An even scarier example would be using decimals to represent money
> in an async/await application: decimal calculations can suddenly
> lose precision in the middle of processing a request.  Currently,
> bugs like this are extremely hard to find and fix.
> 
> Another common need for web applications is to have access to the
> current request object, or security context, or, simply, the request
> URL for logging or submitting performance tracing data::
> 
>async def handle_http_request(request):
>context.current_http_request = request
> 
>await ...
># Invoke your framework code, render templates,
># make DB queries, etc, and use the global
># 'current_http_request' in that code.
> 
># This isn't currently possible to do reliably
># in asyncio out of the box.
> 
> These examples are just a few out of many, where a reliable way to
> store context data is absolutely needed.
> 
> The inability to use TLS for asynchronous code has led to a
> proliferation of ad-hoc solutions, which are limited in scope and
> do not support all required use cases.
> 
> The current status quo is that any library, including the standard
> library, that uses a TLS, will likely not work as expected in
> asynchronous code or with generators (see [3]_ as an example issue.)
> 
> Some languages that have coroutines or generators recommend manually
> passing a ``context`` object to every function; see [1]_
> describing the pattern for Go.  

Re: [Python-ideas] PEP 550 v2

2017-08-19 Thread Neil Girdhar
Cool to see this on python-ideas.  I'm really looking forward to this PEP 
550 or 521.

On Wednesday, August 16, 2017 at 3:19:29 AM UTC-4, Nathaniel Smith wrote:
>
> On Tue, Aug 15, 2017 at 4:55 PM, Yury Selivanov wrote: 
> > Hi, 
> > 
> > Here's the PEP 550 version 2. 
>
> Awesome! 
>
> Some of the changes from v1 to v2 might be a bit confusing -- in 
> particular the thing where ExecutionContext is now a stack of 
> LocalContext objects instead of just being a mapping. So here's the 
> big picture as I understand it: 
>
> In discussions on the mailing list and off-line, we realized that the 
> main reason people use "thread locals" is to implement fake dynamic 
> scoping. Of course, generators/async/await mean that currently it's 
> impossible to *really* fake dynamic scoping in Python -- that's what 
> PEP 550 is trying to fix. So PEP 550 v1 essentially added "generator 
> locals" as a refinement of "thread locals". But... it turns out that 
> "generator locals" aren't enough to properly implement dynamic scoping 
> either! So the goal in PEP 550 v2 is to provide semantics strong 
> enough to *really* get this right. 
>
> I wrote up some notes on what I mean by dynamic scoping, and why 
> neither thread-locals nor generator-locals can fake it: 
>
> 
> https://github.com/njsmith/pep-550-notes/blob/master/dynamic-scope.ipynb 
>
> > Specification 
> > ============= 
> > 
> > Execution Context is a mechanism of storing and accessing data specific 
> > to a logical thread of execution.  We consider OS threads, 
> > generators, and chains of coroutines (such as ``asyncio.Task``) 
> > to be variants of a logical thread. 
> > 
> > In this specification, we will use the following terminology: 
> > 
> > * **Local Context**, or LC, is a key/value mapping that stores the 
> >   context of a logical thread. 
>
> If you're more familiar with dynamic scoping, then you can think of an 
> LC as a single dynamic scope... 
>
> > * **Execution Context**, or EC, is an OS-thread-specific dynamic 
> >   stack of Local Contexts. 
>
> ...and an EC as a stack of scopes. Looking up a ContextItem in an EC 
> proceeds by checking the first LC (innermost scope), then if it 
> doesn't find what it's looking for it checks the second LC (the 
> next-innermost scope), etc. 
>
> > ``ContextItem`` objects have the following methods and attributes: 
> > 
> > * ``.description``: read-only description; 
> > 
> > * ``.set(o)`` method: set the value to ``o`` for the context item 
> >   in the execution context. 
> > 
> > * ``.get()`` method: return the current EC value for the context item. 
> >   Context items are initialized with ``None`` when created, so 
> >   this method call never fails. 
>
> Two issues here, that both require some expansion of this API to 
> reveal a *bit* more information about the EC structure. 
>
> 1) For trio's cancel scope use case I described in my last email, I 
> actually need some way to read out all the values on the LocalContext 
> stack. (It would also be helpful if there were some fast way to check 
> the depth of the ExecutionContext stack -- or at least tell whether 
> it's 1 deep or more-than-1 deep. I know that any cancel scopes that 
> are in the bottommost LC will always be attached to the given Task, so 
> I can set up the scope->task mapping once and re-use it indefinitely. 
> OTOH for scopes that are stored in higher LCs, I have to check at 
> every yield whether they're currently in effect. And I want to 
> minimize the per-yield workload as much as possible.) 
>
> 2) For classic decimal.localcontext context managers, the idea is 
> still that you save/restore the value, so that you can nest multiple 
> context managers without having to push/pop LCs all the time. But the 
> above API is not actually sufficient to implement a proper 
> save/restore, for a subtle reason: if you do 
>
> ci.set(ci.get()) 
>
> then you just (potentially) moved the value from a lower LC up to the top 
> LC. 
>

I agree with Nathaniel that this is an issue with the current API.  I don't 
think it's a good idea to have set and get methods.  It would be much 
better to reflect the underlying ExecutionContext *stack* in the API by 
exposing a mutating *context manager* on the Context Key object instead of 
set.  For example,


my_context = sys.new_context_key('my_context')

options = my_context.get()
options.some_mutating_method()

with my_context.mutate(options):
# Do whatever you want with the mutated context
# Now, the context is reverted.

Similarly, instead of 

my_context.set('spam')

you would do

with my_context.mutate('spam'):
# Do whatever you want with the mutated context
# Now, the context is reverted.
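A toy single-slot key is enough to sketch the intended user-facing behaviour of such a mutate() context manager. ToyContextKey here is a hypothetical stand-in, not PEP 550's ContextKey, and its flat save/restore deliberately ignores the stack-level subtlety Nathaniel raised about ci.set(ci.get()):

```python
from contextlib import contextmanager

class ToyContextKey:
    # A single-slot stand-in for a context key with get()/set().
    def __init__(self, name, default=None):
        self._name = name
        self._value = default

    def get(self):
        return self._value

    def set(self, value):
        self._value = value

    @contextmanager
    def mutate(self, value):
        saved = self._value        # save the current value...
        self.set(value)
        try:
            yield value
        finally:
            self.set(saved)        # ...and restore it on exit

my_context = ToyContextKey('my_context', default='default')
with my_context.mutate('spam'):
    print(my_context.get())        # spam
print(my_context.get())            # default -- reverted automatically
```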

 

>
> Here's an example of a case where this can produce user-visible effects: 
>
>
> https://github.com/njsmith/pep-550-notes/blob/master/dynamic-scope-on-top-of-pep-550-draft-2.py
>  
>
> There are probably a bunch of options for fixing this. But basically 
> we need some API 

Re: [Python-ideas] PEP 550 v2

2017-08-19 Thread Nathaniel Smith
On Fri, Aug 18, 2017 at 6:25 PM, Ethan Furman  wrote:
> On 08/17/2017 02:40 AM, Nick Coghlan wrote:
>>
>> On 17 August 2017 at 04:38, Yury Selivanov wrote:
>
>
>>  ck.get_value() attempts to look up the value for that key in the
>> currently active execution context.
>>  If it doesn't find one, it then tries each of the execution
>> contexts in the currently active dynamic context.
>>  If it *still* doesn't find one, then it will set the default value
>> in the outermost execution context and then return that value.
>
>
> For what it's worth, I find the term DynamicContext much easier to
> understand with relation to these concepts.

I really like DynamicContext -- if you know the classic dynamic/static
terminology in language design then it works as a precise technical
description, but it also makes sense as plain non-technical English.
And it avoids the confusingly overloaded word "scope".

Apropos Guido's point about container naming, how about DynamicContext
and DynamicContextStack? That's only 3 letters longer than
ExecutionContext.

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


Re: [Python-ideas] PEP 550 v2

2017-08-18 Thread Guido van Rossum
I'm also confused by these, because they share the noun part of their name,
but their use and meaning is quite different. The PEP defines an EC as a
stack of LCs, and (apart from strings :-) it's usually not a good idea to
use the same term for a container and its items.

On Fri, Aug 18, 2017 at 6:41 PM, Ethan Furman  wrote:

> On 08/16/2017 08:43 AM, Yury Selivanov wrote:
>
> To be honest, I really like Execution Context and Local Context names.
>> I'm curious if other people are confused with them.
>>
>
> +1 confused  :/
>
> --
> ~Ethan~
>
>
>



-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-ideas] PEP 550 v2

2017-08-18 Thread Ethan Furman

On 08/16/2017 08:43 AM, Yury Selivanov wrote:


To be honest, I really like Execution Context and Local Context names.
I'm curious if other people are confused with them.


+1 confused  :/

--
~Ethan~



Re: [Python-ideas] PEP 550 v2

2017-08-18 Thread Ethan Furman

On 08/17/2017 02:40 AM, Nick Coghlan wrote:

On 17 August 2017 at 04:38, Yury Selivanov wrote:



 ck.get_value() attempts to look up the value for that key in the
currently active execution context.
 If it doesn't find one, it then tries each of the execution
contexts in the currently active dynamic context.
 If it *still* doesn't find one, then it will set the default value
in the outermost execution context and then return that value.


For what it's worth, I find the term DynamicContext much easier to understand 
with relation to these concepts.

--
~Ethan~



Re: [Python-ideas] PEP 550 v2

2017-08-18 Thread Yury Selivanov
On Fri, Aug 18, 2017 at 1:09 AM, Nick Coghlan  wrote:
> On 17 August 2017 at 01:22, Yury Selivanov  wrote:
>> On Wed, Aug 16, 2017 at 4:07 AM, Nick Coghlan  wrote:
 Coroutine Object Modifications
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

 To achieve this, a small set of modifications to the coroutine object
 is needed:

 * New ``cr_local_context`` attribute.  This attribute is readable
   and writable for Python code.
>>>
>>> For ease of introspection, it's probably worth using a common
>>> `__local_context__` attribute name across all the different types that
>>> support one, and encouraging other object implementations to do the
>>> same.
>>>
>>> This isn't like cr_await and gi_yieldfrom, where we wanted to use
>>> different names because they refer to different kinds of objects.
>>
>> We also have cr_code and gi_code, which are used for introspection
>> purposes but refer to CodeObject.
>
> Right, hence https://bugs.python.org/issue31230 :)
>
> (That suggestion is prompted by the fact that if we'd migrated gi_code
> to __code__ in 3.0, the same way we migrated func_code, then cr_code
> and ag_code would almost certainly have followed the same
> dunder-naming convention, and
> https://github.com/python/cpython/pull/3077 would never have been
> necessary)
>
>> I myself don't like the mess the C-style convention created for our
>> Python code (think of what the "dis" and "inspect" modules have to go
>> through), so I'm +0 for having "__local_context__".
>
> I'm starting to think this should be __private_context__ (to convey
> the *intent* of the attribute), rather than naming it after the type
> that it's expected to store.

I've been thinking a lot about the terminology, and I have another
variant to consider:  ExecutionContext is a stack of LogicalContexts.
Coroutines/generators will thus have a __logical_context__ attribute.
I think that the "logical" term better conveys the meaning than
"private" or "dynamic".

>
> Thinking about this particular attribute name did prompt the question
> of how we want PEP 550 to interact with the exec builtin, though, as
> well as raising some questions around a number of other code execution
> cases:
>
> 1. What is the execution context for top level code in a module?

Whatever execution context the thread importing the code currently
has, which would usually be the main thread.

> 2. What is the execution context for the import machinery in an import
> statement?
> 3. What is the execution context for the import machinery when invoked
> via importlib?

Whatever execution context invoked the import machinery, be it
"__import__()", an "import" statement, or "importlib.load_module".

> 4. What is the execution context for the import machinery when invoked
> via the C API?
> 5. What is the execution context for the import machinery when invoked
> via the runpy module?
> 6. What is the execution context for things like the timeit module,
> templating engines, etc?
> 7. What is the execution context for codecs and codec error handlers?
> 8. What is the execution context for __del__ methods and weakref callbacks?

In general, EC behaves just like TLS for all these cases, there's
literally no difference.

> 9. What is the execution context for trace hooks and other really low
> level machinery?
> 10. What is the execution context for displayhook and excepthook?

Speaking of sys.displayhook and sys.stdio -- this API is fundamentally
incompatible with PEP 550 or any possible context isolation.  These
things are essentially *global* variables in the sys module, and
there's tons of code out there that *expects* them to behave like
globals.  If a user changes displayhook they expect it to work across
all threads.

If we want displayhooks/sys.stdio to become context-aware we will
need new APIs for them with new properties/expectations.  Simply
forcing them to use the execution context would be backwards
incompatible.

PEP 550 won't try to change how displayhooks, excepthooks, trace
functions, sys.stdout etc work -- this is out of its scope.  We can't
refactor half of sys module as part of one PEP.

>
> I think a number of those (top level module code executed via the
> import system, the timeit module, templating engines) can be addressed
> by saying that the exec builtin always creates a completely fresh
> execution context by default (with no access to the parent's execution
> context), and will gain a new keyword-only parameter that allows you
> to specify an execution context to use. That way, exec'ed code will be
> independent by default, but users of exec() will be able to opt in to
> handing it like a normal function call by passing in the current
> context.

"exec" uses outer globals/locals if you don't pass them explicitly --
the code isn't isolated by default. Isolation for "exec" is opt-in:

   >>> a = 1
   >>> exec('print(a); b = 2')
   1
   >>> b
   2


Re: [Python-ideas] PEP 550 v2

2017-08-18 Thread Yury Selivanov
On Fri, Aug 18, 2017 at 2:12 AM, Stefan Behnel  wrote:
> Nathaniel Smith schrieb am 16.08.2017 um 09:18:
>> On Tue, Aug 15, 2017 at 4:55 PM, Yury Selivanov wrote:
>>> Here's the PEP 550 version 2.
>> Awesome!
>
> +1
>
>>> Backwards Compatibility
>>> ===
>>>
>>> This proposal preserves 100% backwards compatibility.
>>
>> While this is mostly true in the strict sense, in practice this PEP is
>> useless if existing thread-local users like decimal and numpy can't
>> migrate to it without breaking backcompat. So maybe this section
>> should discuss that?
>>
>> (For example, one constraint on the design is that we can't provide
>> only a pure push/pop API, even though that's what would be most
>> convenient context managers like decimal.localcontext or
>> numpy.errstate, because we also need to provide some backcompat story
>> for legacy functions like decimal.setcontext and numpy.seterr.)
>
> I agree with Nathaniel that many projects that can benefit from this
> feature will need to keep supporting older Python versions as well. In the
> case of Cython, that's Py2.6+. We already have the problem that the
> asynchronous finalisation of async generators cannot be supported in older
> Python versions ("old" as in Py3.5 and before), so we end up with a
> language feature that people can use in Py2.6, but not completely/safely.
>
> I can't say yet how difficult it will be to integrate the new
> infrastructure that this PEP proposes into a backwards compatible code
> base, but if there's something we can think of now in order to help
> projects keep supporting older Python versions in the same code base, given
> the constraints of their existing APIs and semantics - that would be great.

I think it's Cython's quest to try to backport support of all new
Python 3.x language features to be 2.6-compatible, which sometimes can
be questionable.  You can add support of PEP 550 semantics to code
that was compiled with Cython, but pure Python code won't be able to
support it.  This, in my opinion, could cause more confusion than
benefit, so for Cython I think the solution is to do nothing in this
case.

We'll (maybe) backport some functionality to contextlib2. In my
opinion, any code that uses contextlib2 in Python should work exactly
the same when it's compiled with Cython.

Yury


Re: [Python-ideas] PEP 550 v2

2017-08-18 Thread Nick Coghlan
On 18 August 2017 at 16:12, Stefan Behnel  wrote:
> Nathaniel Smith schrieb am 16.08.2017 um 09:18:
>> On Tue, Aug 15, 2017 at 4:55 PM, Yury Selivanov wrote:
>>> Backwards Compatibility
>>> ===
>>>
>>> This proposal preserves 100% backwards compatibility.
>>
>> While this is mostly true in the strict sense, in practice this PEP is
>> useless if existing thread-local users like decimal and numpy can't
>> migrate to it without breaking backcompat. So maybe this section
>> should discuss that?
>>
>> (For example, one constraint on the design is that we can't provide
>> only a pure push/pop API, even though that's what would be most
>> convenient for context managers like decimal.localcontext or
>> numpy.errstate, because we also need to provide some backcompat story
>> for legacy functions like decimal.setcontext and numpy.seterr.)
>
> I agree with Nathaniel that many projects that can benefit from this
> feature will need to keep supporting older Python versions as well. In the
> case of Cython, that's Py2.6+. We already have the problem that the
> asynchronous finalisation of async generators cannot be supported in older
> Python versions ("old" as in Py3.5 and before), so we end up with a
> language feature that people can use in Py2.6, but not completely/safely.
>
> I can't say yet how difficult it will be to integrate the new
> infrastructure that this PEP proposes into a backwards compatible code
> base, but if there's something we can think of now in order to help
> projects keep supporting older Python versions in the same code base, given
> the constraints of their existing APIs and semantics - that would be great.

One aspect of this that we're considering is to put the Python level
API in contextlib rather than in sys.

That has the pragmatic benefit that contextlib2 then becomes the
natural home for an API backport, and we should be able to get the
full *explicit* API working on older versions (even if it means
introducing an optional C extension module as a dependency to get that
part of the API working fully).

To backport the isolation of generators, we'd likely be able to
provide a decorator that explicitly isolated generators, but it
wouldn't be feasible to backport implicit isolation. The same would go
for the various other proposals for implicit isolation - when running
on older versions, the general principle would be "if you (or a
library/framework you're using) didn't explicitly isolate the
execution context, assume it's not isolated".

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-ideas] PEP 550 v2

2017-08-18 Thread Stefan Behnel
Nathaniel Smith schrieb am 16.08.2017 um 09:18:
> On Tue, Aug 15, 2017 at 4:55 PM, Yury Selivanov wrote:
>> Here's the PEP 550 version 2.
> Awesome!

+1

>> Backwards Compatibility
>> ===
>>
>> This proposal preserves 100% backwards compatibility.
> 
> While this is mostly true in the strict sense, in practice this PEP is
> useless if existing thread-local users like decimal and numpy can't
> migrate to it without breaking backcompat. So maybe this section
> should discuss that?
> 
> (For example, one constraint on the design is that we can't provide
> only a pure push/pop API, even though that's what would be most
> convenient for context managers like decimal.localcontext or
> numpy.errstate, because we also need to provide some backcompat story
> for legacy functions like decimal.setcontext and numpy.seterr.)

I agree with Nathaniel that many projects that can benefit from this
feature will need to keep supporting older Python versions as well. In the
case of Cython, that's Py2.6+. We already have the problem that the
asynchronous finalisation of async generators cannot be supported in older
Python versions ("old" as in Py3.5 and before), so we end up with a
language feature that people can use in Py2.6, but not completely/safely.

I can't say yet how difficult it will be to integrate the new
infrastructure that this PEP proposes into a backwards compatible code
base, but if there's something we can think of now in order to help
projects keep supporting older Python versions in the same code base, given
the constraints of their existing APIs and semantics - that would be great.

Stefan



Re: [Python-ideas] PEP 550 v2

2017-08-17 Thread Nick Coghlan
On 17 August 2017 at 01:22, Yury Selivanov  wrote:
> On Wed, Aug 16, 2017 at 4:07 AM, Nick Coghlan  wrote:
>>> Coroutine Object Modifications
>>> ^^
>>>
>>> To achieve this, a small set of modifications to the coroutine object
>>> is needed:
>>>
>>> * New ``cr_local_context`` attribute.  This attribute is readable
>>>   and writable for Python code.
>>
>> For ease of introspection, it's probably worth using a common
>> `__local_context__` attribute name across all the different types that
>> support one, and encouraging other object implementations to do the
>> same.
>>
>> This isn't like cr_await and gi_yieldfrom, where we wanted to use
>> different names because they refer to different kinds of objects.
>
> We also have cr_code and gi_code, which are used for introspection
> purposes but refer to CodeObject.

Right, hence https://bugs.python.org/issue31230 :)

(That suggestion is prompted by the fact that if we'd migrated gi_code
to __code__ in 3.0, the same way we migrated func_code, then cr_code
and ag_code would almost certainly have followed the same
dunder-naming convention, and
https://github.com/python/cpython/pull/3077 would never have been
necessary)

> I myself don't like the mess the C-style convention created for our
> Python code (think of what the "dis" and "inspect" modules have to go
> through), so I'm +0 for having "__local_context__".

I'm starting to think this should be __private_context__ (to convey
the *intent* of the attribute), rather than naming it after the type
that it's expected to store.

Thinking about this particular attribute name did prompt the question
of how we want PEP 550 to interact with the exec builtin, though, as
well as raising some questions around a number of other code execution
cases:

1. What is the execution context for top level code in a module?
2. What is the execution context for the import machinery in an import
statement?
3. What is the execution context for the import machinery when invoked
via importlib?
4. What is the execution context for the import machinery when invoked
via the C API?
5. What is the execution context for the import machinery when invoked
via the runpy module?
6. What is the execution context for things like the timeit module,
templating engines, etc?
7. What is the execution context for codecs and codec error handlers?
8. What is the execution context for __del__ methods and weakref callbacks?
9. What is the execution context for trace hooks and other really low
level machinery?
10. What is the execution context for displayhook and excepthook?

I think a number of those (top level module code executed via the
import system, the timeit module, templating engines) can be addressed
by saying that the exec builtin always creates a completely fresh
execution context by default (with no access to the parent's execution
context), and will gain a new keyword-only parameter that allows you
to specify an execution context to use. That way, exec'ed code will be
independent by default, but users of exec() will be able to opt in to
handing it like a normal function call by passing in the current
context. The default REPL, the code module and the IDLE shell window
would need to be updated so that they use a shared context for
evaluating the user supplied code snippets, while keeping their own
context separate.
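The proposed exec() behaviour can be illustrated with a toy model. All
names below (the ExecutionContext class, toy_exec, the execution_context
keyword) are illustrative stand-ins for this sketch, not the actual PEP
550 or CPython API:

```python
# Toy model: exec'ed code runs in a fresh, isolated context by default,
# and a keyword-only parameter opts in to sharing an existing context.

class ExecutionContext:
    """Stand-in for a PEP 550-style execution context."""
    def __init__(self, items=None):
        self.items = dict(items or {})

def toy_exec(code, *, execution_context=None):
    # Fresh, isolated context unless the caller explicitly passes one in.
    ec = execution_context if execution_context is not None else ExecutionContext()
    namespace = {"ec": ec}
    exec(code, namespace)
    return ec

# Isolated by default: the snippet's writes don't leak to the caller.
fresh = toy_exec("ec.items['user'] = 'alice'")

# Opt-in sharing, as a REPL evaluating successive snippets would do.
shared = ExecutionContext()
toy_exec("ec.items['user'] = 'bob'", execution_context=shared)
toy_exec("assert ec.items['user'] == 'bob'", execution_context=shared)
```

A REPL would hold on to one shared context across snippets while keeping
its own implementation context separate, matching the paragraph above.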

While top-level code would always run in a completely fresh context
for imports, the runpy module would expose the same setting as the
exec builtin, so the executed code would be isolated by default, but
you could opt in to using a particular execution context if you wanted
to.

Codecs and codec error handlers I think will be best handled in a way
similar to generators, where they have their own private context (so
they can't alter the caller's context), but can *read* the caller's
context (so the context can be used as a way of providing
context-dependent codec settings).

That "read-only" access model also feels like the right option for the
import machinery - regardless of whether it's accessed via the import
statement, importlib, the C API, or the runpy module, the import
machinery should be able to *read* the dynamic context, but not make
persistent changes to it.

Since they can be executed at arbitrary points in the code, it feels
to me that __del__ methods and weakref callbacks should *always* be
executed in a completely pristine execution context, with no access
whatsoever to any thread's dynamic context.

I think we should leave the execution context alone for the really low
level hooks, and simply point out that yes, these have the ability to
do weird things to the execution context, just as they have the power
to do weird things to local variables, so they need to be handled with
care.

For displayhook and excepthook, I don't have a particularly strong
intuition, so my default recommendation would be the read-only access
proposed for generators.

Re: [Python-ideas] PEP 550 v2

2017-08-17 Thread Nick Coghlan
On 17 August 2017 at 02:55, Yury Selivanov  wrote:
> And immediately after I hit "send" I realized that this is a bit more
> complicated.
>
> In order for Tasks to remember the full execution context of where
> they were created, we need a new method that would allow to run with
> *both* exec and local contexts:
>
> class Task:
>
>def __init__(self, coro):
>...
>self.local_context = sys.new_local_context()
>self.exec_context = sys.get_execution_context()
>
>def step():
>
>  sys.run_with_contexts(self.exec_context, self.local_context,
> self.coro.send)

I don't think that's entirely true, since you can nest the calls even
without a combined API:

sys.run_with_execution_context(self.exec_context,
sys.run_with_local_context, self.local_context, self.coro.send)

Offering a combined API may still make sense for usability and
efficiency reasons, but it isn't strictly necessary.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-ideas] PEP 550 v2

2017-08-17 Thread Nick Coghlan
On 17 August 2017 at 04:38, Yury Selivanov  wrote:
> On Wed, Aug 16, 2017 at 1:13 PM, Stefan Krah  wrote:
> While I'm trying to avoid using scoping terminology for PEP 550, there's
> one parallel -- as with regular Python scoping you have global variables
> and you have local variables.
>
> You can use the locals() to access to your local scope, and you can use
> globals() to access to your global scope.

To be honest, the difference between LocalContext and ExecutionContext
feels more like the difference between locals() and lexical closure
variables than it does the difference between between locals() and
globals().

It's just that where the scoping rules are a compile time thing
related to lexical closures, PEP 550 is about defining a dynamic
context.

> Similarly in PEP 550, you have your LocalContext and ExecutionContext.
> We don't want to call ExecutionContext a "Global Context" because
> it is fundamentally OS-thread-specific (contrary to Python globals).

In addition to it being different from the way the decimal module
already uses the phrase, one of the reasons I don't want to call it a
LocalContext is because doing so brings in the suggestion that it is
somehow connected to the locals() scope, and it isn't - there are
plenty of things (most notably, function calls) that will change the
active local namespace, but *won't* change the active execution
context.

> LocalContexts are created for threads, generators, coroutines and are
> really similar to local scoping.  Adding more names for local contexts
> like CoroutineLocalContext, GeneratorLocalContext won't solve anything
> either.  All in all, Local Context is what its name stands for -- it's a
> local context for your current logical scope, be it a coroutine or a
> generator.

But unlike locals() itself, it *isn't* linked to a specific frame of
execution - it's deliberately designed to be shared *between* frames.

If you don't like either of the ExecutionContext/ExecutionEnvironment
or ExecutionContext/ExecutionContextChain combinations, how would you
feel about ExecutionContext + DynamicContext?

Saying that "ck.set_value(value) sets the value corresponding to the
given context key in the currently active execution context" is still
my preferred terminology for setting values, and I think the following
would work well for reading values:

ck.get_value() attempts to look up the value for that key in the
currently active execution context.
If it doesn't find one, it then tries each of the execution
contexts in the currently active dynamic context.
If it *still* doesn't find one, then it will set the default value
in the outermost execution context and then return that value.
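The lookup rule described above can be sketched as plain Python. This is
a toy model of the semantics only (the function name and the
list-of-dicts representation are illustrative, not the PEP 550
implementation):

```python
# Resolve a key against the active execution context, then walk the
# outer contexts of the dynamic context, and finally memoize the
# default in the outermost context if the key is found nowhere.

MISSING = object()

def get_value(key, dynamic_context, default=None):
    # dynamic_context[-1] is the currently active execution context,
    # dynamic_context[0] the outermost one.
    for ec in reversed(dynamic_context):
        value = ec.get(key, MISSING)
        if value is not MISSING:
            return value
    # Not found anywhere: set the default in the outermost context.
    dynamic_context[0][key] = default
    return default

outer, inner = {"precision": 28}, {}
chain = [outer, inner]

assert get_value("precision", chain) == 28        # found in an outer EC
assert get_value("rounding", chain, "HALF_EVEN") == "HALF_EVEN"
assert outer["rounding"] == "HALF_EVEN"           # default written outermost
```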

One thing I like about that phrasing is that we'd be using the word
dynamic in exactly the same sense that dynamic scoping uses it, and
the dynamic context mechanism would become PEP 550's counterpart to
the lexical closure support in Python's normal scoping rules.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-ideas] PEP 550 v2

2017-08-17 Thread Nick Coghlan
On 17 August 2017 at 02:36, Yury Selivanov  wrote:
> Yeah, this is tricky. The main issue is indeed the confusion of what
> methods you need to call -- "get/set" or
> "get_local_state/set_local_state".
>
> On some level the problem is very similar to regular Python scoping rules:
>
> 1. we have local names
> 2. we have global names
> 3. we have the 'nonlocal' modifier
>
> IOW scoping isn't easy, and you need to be conscious of what you do.
> It's just that we are so used to these scoping rules that they have a
> low cognitive effort for us.
>
> One of the ideas that I have in mind is to add another level of
> indirection to separate "global get" from "local set/get":
>
> 1. Rename ContextItem to ContextKey (reasoning for that in parallel thread)
>
> 2. Remove ContextKey.set() method
>
> 3. Add a new ContextKey.value() -> ContextValue
>
> ck = ContextKey()
>
> with ck.value() as val:
> val.set(spam)
> yield
>
> or
>
>  val = ck.value()
>  val.set(spam)
>  try:
>   yield
>  finally:
>   val.clear()
>
> Essentially ContextValue will be the only API to set values in
> execution context. ContextKey.get() will be used to get them.
>
> Nathaniel, Nick, what do you guys think?

I think I don't want to have to try to explain to anyone what happens
if I get a context value in my current execution environment and then
send that value reference into a different execution context :)

So I'd prefer my earlier proposal of:

# Resolve key in current execution environment
ck.get_value()
# Assign to key in current execution context
ck.set_value(value)
# Assign to key in specific execution context
sys.run_with_active_context(ec, ck.set_value, value)

One suggestion I do like is Stefan's one of using "ExecutionContext"
to refer to the namespace that ck.set_value() writes to, and then
"ExecutionEnvironment" for the whole chain that ck.get_value() reads.

Similar to "generator" and "package", we'd still end up with "context"
being inherently ambiguous when used without qualification:

- PEP 550 execution context
- exception handling context (for chained exceptions)
- with statement context
- various context objects, like the decimal context

But we wouldn't have two different kinds of context within PEP 550
itself. Instead, we'd have to start disambiguating the word
environment:

- PEP 550 execution environment
- process environment (i.e. os.environ)

The analogy between process environments and execution environments
wouldn't be exact (since the key-value pairs in process environments
are copied eagerly rather than via lazily chained lookups), but once
you account for that, the parallels between an operating system level
process environment tree and a Python level execution environment tree
as proposed in PEP 550 seem like they would be helpful rather than
confusing.

> [..]
>>> * ``sys.get_execution_context()`` function.  The function returns a
>>>   copy of the current EC: an ``ExecutionContext`` instance.
>>
>> If there are enough of these functions then it might make sense to
>> stick them in their own module instead of adding more stuff to sys. I
>> guess worrying about that can wait until the API details are more firm
>> though.
>
> I'm OK with this idea -- pystate.c becomes way too crowded.
>
> Maybe we should just put this stuff in _contextlib.c and expose in the
> contextlib module.

Yeah, I'd be OK with that - if we're going to reuse the word, it makes
sense to reuse the module to expose the related machinery.

That said, if we do go that way *and* we decide to offer a
coroutine-only backport, I see an offer of contextlib2
co-maintainership in your future ;)

>>>   * If ``coro.cr_local_context`` is an empty ``LocalContext`` object
>>> that ``coro`` was created with, the interpreter will set
>>> ``coro.cr_local_context`` to ``None``.
>>
>> I like all the ideas in this section, but this specific point feels a
>> bit weird. Coroutine objects need a second hidden field somewhere to
>> keep track of whether the object they end up with is the same one they
>> were created with?
>
> Yes, I planned to have a second hidden field, as Coroutines will have
> their cr_local_context set to NULL, and that will be their empty LC.
> So a second internal field is needed to disambiguate NULL -- meaning
> an "empty context" and NULL meaning "use outside local context".
>
> I omitted this from the PEP to make it a bit easier to digest, as this
> seemed to be a low-level implementation detail.

Given that the field is writable, I think it makes more sense to just
choose a suitable default, and then rely on other code changing that
default when its not right.

For generators: set it to an empty context by default, have
contextlib.contextmanager (and similar wrapper) clear it
For coroutines: set it to None by default, have async task managers
give top level coroutines their own private context

No hidden flags, no magic values.

Re: [Python-ideas] PEP 550 v2

2017-08-16 Thread Yury Selivanov
On Wed, Aug 16, 2017 at 12:55 PM, Yury Selivanov
[..]
> And immediately after I hit "send" I realized that this is a bit more
> complicated.
>
> In order for Tasks to remember the full execution context of where
> they were created, we need a new method that would allow to run with
> *both* exec and local contexts:

Never mind, the actual implementation would be as simple as:

 class Task:

def __init__(self, coro):
...
coro.cr_local_context = sys.new_local_context()
self.exec_context = sys.get_execution_context()

def step():

  sys.run_with_execution_context(self.exec_context, self.coro.send)

No need for another "run_with_context" function.

Yury


Re: [Python-ideas] PEP 550 v2

2017-08-16 Thread Yury Selivanov
On Wed, Aug 16, 2017 at 4:12 PM, Antoine Pitrou  wrote:
>
>
> Hi,
>
>> * ``sys.get_execution_context()`` function.  The function returns a
>>   copy of the current EC: an ``ExecutionContext`` instance.
>
> Can you explain the requirement for it being a copy?

When the execution context is used to schedule a function call in a
thread, or as an asyncio callback, we want to take a snapshot of all
items in the EC. In general the recommendation will be to store
immutable data in the context (same as in the .NET EC implementation,
or whenever you have some potentially shared state).

> What do you call a copy exactly?  Does it shallow-copy the stack or
> does it deep-copy the context items?

Execution Context is conceptually a stack of Local Contexts. Each
local context is a weak key mapping.  We need a shallow copy of the
EC, which is semantically equivalent to the below snippet:

    new_lc = {}
    for lc in execution_context:
        new_lc.update(lc)
    return ExecutionContext(new_lc)

>
>> * ``uint64_t PyThreadState->unique_id``: a globally unique
>>   thread state identifier (we can add a counter to
>>   ``PyInterpreterState`` and increment it when a new thread state is
>>   created.)
>
> How does this interact with sub-interpreters? (same question for rest of
> the PEP :-))

As long as PyThreadState_Get() works with sub-interpreters, all of the
PEP machinery will work too.

>
>> * O(N) for ``sys.get_execution_context()``, where ``N`` is the
>>   total number of items in the current **execution** context.
>
> Right... but if this is a simple list copy, we are talking about an
> extremely fast O(N):
>
> >>> l = [None] * 1000
> >>> %timeit l.copy()
> 3.76 µs ± 17.5 ns per loop (mean ± std. dev. of 7 runs, 10 loops each)
>
> (what is "number of items"? number of local contexts? number of
> individual context items?)

"Number of items in the current **execution** context" =
    sum(len(local_context) for local_context in current_execution_context)

Yes, even though making a new list + merging all LCs is a relatively
fast operation, it will need to be performed on *every*
asyncio.call_soon and create_task.  The immutable stack/mappings
solution simply eliminates the problem, because you can just copy by
reference, which is fast.

The #3 approach is implementable with regular dicts + copy() too; it
will just be slower in some cases (explained below).
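Why the immutable approach makes the snapshot O(1) can be sketched with
a minimal chain of frozen mappings. The Link class and function names
are illustrative only, not the PEP's actual C-level design:

```python
# An execution context as a linked chain of frozen local contexts:
# taking a "copy" for call_soon-style APIs is just grabbing a reference
# to the head of the chain, while a dict-based EC must merge eagerly.

from types import MappingProxyType

class Link:
    __slots__ = ("mapping", "prev")
    def __init__(self, mapping, prev=None):
        self.mapping = MappingProxyType(dict(mapping))  # frozen LC
        self.prev = prev

def lookup(link, key):
    # Walk from the innermost local context outwards.
    while link is not None:
        if key in link.mapping:
            return link.mapping[key]
        link = link.prev
    raise KeyError(key)

root = Link({"request_id": 1})
child = Link({"locale": "en"}, prev=root)

snapshot = child  # O(1) "copy": just a reference, nothing is merged
assert lookup(snapshot, "request_id") == 1
assert lookup(snapshot, "locale") == "en"
```

Since every Link is frozen once created, handing the snapshot to another
task can never observe later mutations, which is the property the
call_soon use case needs.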

>
>> We believe that approach #3 enables an efficient and complete Execution
>> Context implementation, with excellent runtime performance.
>
> What about the maintenance and debugging cost, though?

Contrary to Python dicts, the implementation scope for the hamt
mapping is much smaller -- we only need get, set, and merge
operations. No split dicts, no ordering, etc. With the help of
fuzz-testing and our ref-counting test mode, I hope that we'll be able
to catch most of the bugs.

Any solution adds to the total debugging and maintenance cost, but I
believe that in this specific case, the benefits outweigh that cost:

1. Sometimes we'll need to merge many dicts in places like
asyncio.call_soon or async Task objects.

2. "set" operation might resize the dict, making it slower.

3. The "dict.copy()" optimization that the PEP mentions won't be able
to always help us, as we will likely need to often resize the dict.

>
>> Immutable mappings implemented with HAMT have O(log32N) performance
>> for both set(), get(), and merge() operations, which is essentially
>> O(1) for relatively small mappings
>
> But, for relatively small mappings, regular dicts would also be fast
> enough, right?

If all mappings are relatively small, then the answer is close to "yes".

We might want to periodically "squash" (or merge or compact) the chain
of Local Contexts, in which case merging dicts will be more expensive
than merging hamt.
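The "squash" step mentioned above can be sketched in a few lines. The
function name and the list-of-dicts chain representation are
illustrative, not part of the PEP:

```python
# Collapse a long chain of local contexts into a single mapping while
# preserving lookup semantics: inner contexts shadow outer ones.

def squash(chain):
    """chain[0] is the outermost LC, chain[-1] the innermost."""
    merged = {}
    for lc in chain:  # outer first, so inner values win on collision
        merged.update(lc)
    return [merged]

chain = [{"a": 1, "b": 2}, {"b": 3}, {"c": 4}]
assert squash(chain) == [{"a": 1, "b": 3, "c": 4}]
```

With plain dicts each squash re-copies every item, whereas HAMT-based
mappings can share structure between the inputs and the merged result.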

>
> It would be helpful for the PEP to estimate reasonable parameter sizes:
> - reasonable number of context items in a local context

I assume that the number of context items will be relatively low. It's
hard for me to imagine having more than a thousand of them.

> - reasonable number of local contexts in an execution stack

In simple multi-threaded code we will only have one local context
per execution context.  Every time you run a generator or an
asynchronous task you push a local context to the stack.

Generators will have an optimization -- they will push NULL to the
stack and it will be a NULL until a generator writes to its local
context. It's possible to imagine a degenerate case where a generator
recurses inside, say, a 'decimal context' with-block, which can
potentially create a long chain of LCs.

Long chains of LCs are not a problem in general -- once the generator
is done, it pops its LCs, thus decreasing the stack size.

Long chains of LCs might become a problem if, deep into recursion, a
generator needs to capture the execution context (say it makes an
asyncio.call_soon() call).  In which case the solution is simple -- we

Re: [Python-ideas] PEP 550 v2

2017-08-16 Thread Stefan Krah
On Wed, Aug 16, 2017 at 12:40:26PM -0400, Yury Selivanov wrote:
> On Wed, Aug 16, 2017 at 12:08 PM, Stefan Krah  wrote:
> > On Wed, Aug 16, 2017 at 11:00:43AM -0400, Yury Selivanov wrote:
> >> "Context" is an established term for what PEP 550 tries to accomplish.
> >> It's used in multiple languages and runtimes, and while researching
> >> this topic I didn't see anybody confused with the concept on
> >> StackOverflow/etc.
> >
> > For me a context is a "single thing" that is usually used to thread state
> > through functions.
> >
> > I guess I'd call "environment" what you call "context".
> 
> "environment" is also an overloaded term, and when I hear it I usually
> think about os.getenv().

Yeah, I usually think about symbol tables.  FWIW, I find this terminology
quite reasonable:

https://hackernoon.com/execution-context-in-javascript-319dd72e8e2c


The main points are ExecutionContextStack/FunctionalExecutionContext
vs. ExecutionContext/LocalContext.

Stefan Krah





Re: [Python-ideas] PEP 550 v2

2017-08-16 Thread Yury Selivanov
On Wed, Aug 16, 2017 at 12:51 PM, Yury Selivanov
 wrote:
> On Wed, Aug 16, 2017 at 5:36 AM, Nick Coghlan  wrote:
>> On 16 August 2017 at 17:18, Nathaniel Smith  wrote:
>> [Yury wrote]
> [..]
   * If ``coro.cr_local_context`` is an empty ``LocalContext`` object
 that ``coro`` was created with, the interpreter will set
 ``coro.cr_local_context`` to ``None``.
>>>
>>> I like all the ideas in this section, but this specific point feels a
>>> bit weird. Coroutine objects need a second hidden field somewhere to
>>> keep track of whether the object they end up with is the same one they
>>> were created with?
>>
>> It feels odd to me as well, and I'm wondering if we can actually
>> simplify this by saying:
>>
>> 1. Generator contexts (both sync and async) are isolated by default
>> (__local_context__ = LocalContext())
>> 2. Coroutine contexts are *not* isolated by default (__local_context__ = 
>> None)
>>
>> Running top level task coroutines in separate execution contexts then
>> becomes the responsibility of the event loop, which the PEP already
>> lists as a required change in 3rd party libraries to get this all to
>> work properly.
>
> This is an interesting twist, and I like it.
>
> This will change asyncio.Task from:
>
> class Task:
>
>def __init__(self, coro):
>...
>self.exec_context = sys.get_execution_context()
>
>def step():
>
>  sys.run_with_execution_context(self.coro.send)
>
>
> to:
>
> class Task:
>
>def __init__(self, coro):
>...
>self.local_context = sys.new_local_context()
>
>def step():
>
>  sys.run_with_local_context(self.local_context, self.coro.send)
>
> And we don't need ceval to do anything for "await", which means that
> with this approach we won't touch ceval.c at all.


And immediately after I hit "send" I realized that this is a bit more
complicated.

In order for Tasks to remember the full execution context of where
they were created, we need a new method that would allow running with
*both* exec and local contexts:

class Task:

   def __init__(self, coro):
   ...
   self.local_context = sys.new_local_context()
   self.exec_context = sys.get_execution_context()

   def step():

 sys.run_with_contexts(self.exec_context, self.local_context,
self.coro.send)

This is needed for the following PEP example to work properly:

current_request = sys.new_context_item(description='request')

async def child():
print('current request:', repr(current_request.get()))

async def handle_request(request):
current_request.set(request)
event_loop.create_task(child)

run(top_coro())

See https://www.python.org/dev/peps/pep-0550/#tasks

Yury


Re: [Python-ideas] PEP 550 v2

2017-08-16 Thread Yury Selivanov
On Wed, Aug 16, 2017 at 5:36 AM, Nick Coghlan  wrote:
> On 16 August 2017 at 17:18, Nathaniel Smith  wrote:
> [Yury wrote]
[..]
>>>   * If ``coro.cr_local_context`` is an empty ``LocalContext`` object
>>> that ``coro`` was created with, the interpreter will set
>>> ``coro.cr_local_context`` to ``None``.
>>
>> I like all the ideas in this section, but this specific point feels a
>> bit weird. Coroutine objects need a second hidden field somewhere to
>> keep track of whether the object they end up with is the same one they
>> were created with?
>
> It feels odd to me as well, and I'm wondering if we can actually
> simplify this by saying:
>
> 1. Generator contexts (both sync and async) are isolated by default
> (__local_context__ = LocalContext())
> 2. Coroutine contexts are *not* isolated by default (__local_context__ = None)
>
> Running top level task coroutines in separate execution contexts then
> becomes the responsibility of the event loop, which the PEP already
> lists as a required change in 3rd party libraries to get this all to
> work properly.

This is an interesting twist, and I like it.

This will change asyncio.Task from:

class Task:

   def __init__(self, coro):
   ...
   self.exec_context = sys.get_execution_context()

   def step():

 sys.run_with_execution_context(self.coro.send)


to:

class Task:

   def __init__(self, coro):
   ...
   self.local_context = sys.new_local_context()

   def step():

 sys.run_with_local_context(self.local_context, self.coro.send)

And we don't need ceval to do anything for "await", which means that
with this approach we won't touch ceval.c at all.

Yury


Re: [Python-ideas] PEP 550 v2

2017-08-16 Thread Yury Selivanov
On Wed, Aug 16, 2017 at 12:08 PM, Stefan Krah  wrote:
> On Wed, Aug 16, 2017 at 11:00:43AM -0400, Yury Selivanov wrote:
>> "Context" is an established term for what PEP 550 tries to accomplish.
>> It's used in multiple languages and runtimes, and while researching
>> this topic I didn't see anybody confused with the concept on
>> StackOverflow/etc.
>
> For me a context is a "single thing" that is usually used to thread state
> through functions.
>
> I guess I'd call "environment" what you call "context".

"environment" is also an overloaded term, and when I hear it I usually
think about os.getenv().

Yury


Re: [Python-ideas] PEP 550 v2

2017-08-16 Thread Yury Selivanov
On Wed, Aug 16, 2017 at 3:18 AM, Nathaniel Smith  wrote:
> On Tue, Aug 15, 2017 at 4:55 PM, Yury Selivanov  
> wrote:
>> Hi,
>>
>> Here's the PEP 550 version 2.
>
> Awesome!

Thanks!

[..]
>>
>> * **Local Context**, or LC, is a key/value mapping that stores the
>>   context of a logical thread.
>
> If you're more familiar with dynamic scoping, then you can think of an
> LC as a single dynamic scope...
>
>> * **Execution Context**, or EC, is an OS-thread-specific dynamic
>>   stack of Local Contexts.
>
> ...and an EC as a stack of scopes. Looking up a ContextItem in an EC
> proceeds by checking the first LC (innermost scope), then if it
> doesn't find what it's looking for it checks the second LC (the
> next-innermost scope), etc.

Yes. We touched upon this topic in parallel threads, so I'll just
briefly mention this here: I deliberately avoided using "scope" in PEP
550 naming, as "scoping" in Python is usually associated with
names/globals/locals/nonlocals etc.  Adding another "level" of scoping
will be very confusing for users (IMO).

>
>> ``ContextItem`` objects have the following methods and attributes:
>>
>> * ``.description``: read-only description;
>>
>> * ``.set(o)`` method: set the value to ``o`` for the context item
>>   in the execution context.
>>
>> * ``.get()`` method: return the current EC value for the context item.
>>   Context items are initialized with ``None`` when created, so
>>   this method call never fails.
>
> Two issues here, that both require some expansion of this API to
> reveal a *bit* more information about the EC structure.
>
> 1) For trio's cancel scope use case I described in the last, I
> actually need some way to read out all the values on the LocalContext
> stack. (It would also be helpful if there were some fast way to check
> the depth of the ExecutionContext stack -- or at least tell whether
> it's 1 deep or more-than-1 deep. I know that any cancel scopes that
> are in the bottommost LC will always be attached to the given Task, so
> I can set up the scope->task mapping once and re-use it indefinitely.
> OTOH for scopes that are stored in higher LCs, I have to check at
> every yield whether they're currently in effect. And I want to
> minimize the per-yield workload as much as possible.)

We can add an API for returning the full stack of values for a CI:

   ContextItem.iter_stack() -> Iterator
   # or
   ContextItem.get_stack() -> List

Because some of the LCs will be empty, what you'll get is a list with
some None values in it, like:

   [None, val1, None, None, val2]

The length of the list will tell you how deep the stack is.
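A sketch of what such an API could return under the stack-of-local-contexts model (`ContextItemSketch` and its signature are hypothetical):

```python
# Hypothetical sketch: get_stack() yields one entry per local context
# (outermost first), with None for LCs that don't set this item, so
# the length of the result equals the depth of the EC.

class ContextItemSketch:
    def __init__(self, name):
        self.name = name

    def get_stack(self, ec_stack):
        # ec_stack: list of dicts, outermost local context first.
        return [lc.get(self.name) for lc in ec_stack]

ci = ContextItemSketch("cancel_scope")
ec_stack = [{}, {"cancel_scope": "val1"}, {}, {}, {"cancel_scope": "val2"}]

assert ci.get_stack(ec_stack) == [None, "val1", None, None, "val2"]
assert len(ci.get_stack(ec_stack)) == 5  # EC depth
```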

>
> 2) For classic decimal.localcontext context managers, the idea is
> still that you save/restore the value, so that you can nest multiple
> context managers without having to push/pop LCs all the time. But the
> above API is not actually sufficient to implement a proper
> save/restore, for a subtle reason: if you do
>
> ci.set(ci.get())
>
> then you just (potentially) moved the value from a lower LC up to the top LC.
>
> Here's an example of a case where this can produce user-visible effects:
>
> https://github.com/njsmith/pep-550-notes/blob/master/dynamic-scope-on-top-of-pep-550-draft-2.py
>
> There are probably a bunch of options for fixing this. But basically
> we need some API that makes it possible to temporarily set a value in
> the top LC, and then restore that value to what it was before (either
> the previous value, or 'unset' to unshadow a value in a lower LC). One
> simple option would be to make the idiom be something like:
>
> @contextmanager
> def local_value(new_value):
> state = ci.get_local_state()
> ci.set(new_value)
> try:
> yield
> finally:
> ci.set_local_state(state)
>
> where 'state' is something like a tuple (ci in EC[-1],
> EC[-1].get(ci)). A downside with this is that it's a bit error-prone
> (very easy for an unwary user to accidentally use get/set instead of
> get_local_state/set_local_state). But I'm sure we can come up with
> something.

Yeah, this is tricky. The main issue is indeed the confusion about
which methods you need to call -- "get/set" or
"get_local_state/set_local_state".

On some level the problem is very similar to regular Python scoping rules:

1. we have local names
2. we have global names
3. we have the 'nonlocal' modifier

IOW scoping isn't easy, and you need to be conscious of what you do.
It's just that we are so used to these scoping rules that they have a
low cognitive effort for us.
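The pitfall Nathaniel describes can be reproduced with the same toy stack-of-dicts model (the helper names here are illustrative, not a proposed API):

```python
# Toy demonstration of the save/restore pitfall: set() always writes
# to the topmost local context, so ci.set(ci.get()) copies a value
# from a lower LC into the top one, shadowing later changes below it.

def ec_get(ec_stack, key, default=None):
    for lc in reversed(ec_stack):
        if key in lc:
            return lc[key]
    return default

def ec_set(ec_stack, key, value):
    ec_stack[-1][key] = value  # always writes to the topmost LC

ec = [{"prec": 28}, {}]    # outer LC sets "prec"; top LC is empty

# Naive save/restore moves the value into the top LC...
saved = ec_get(ec, "prec")
ec_set(ec, "prec", 10)
ec_set(ec, "prec", saved)  # "restore"

# ...so a later change in the outer LC is now invisible:
ec[0]["prec"] = 50
assert ec_get(ec, "prec") == 28  # the stale shadow wins, not 50
```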

One of the ideas that I have in mind is to add another level of
indirection to separate "global get" from "local set/get":

1. Rename ContextItem to ContextKey (reasoning for that in parallel thread)

2. Remove ContextKey.set() method

3. Add a new ContextKey.value() -> ContextValue

ck = ContextKey()

with ck.value() as val:
val.set(spam)
yield

or

 val = ck.value()
 val.set(spam)
 try:
  yield
 

Re: [Python-ideas] PEP 550 v2

2017-08-16 Thread Stefan Krah
On Thu, Aug 17, 2017 at 01:03:21AM +1000, Nick Coghlan wrote:
> For "ContextItem" for example, we may actually be better off calling
> it "ContextKey", and have the methods be "ck.get_value()" and
> "ck.set_value()". That would get us closer to the POSIX TSS
> terminology, and emphasises that the objects themselves are best seen
> as opaque references to a key that lets you get and set the
> corresponding value in the active execution context.

+1 for "key".  One is using a key to look up an item.


> Avoiding a naming collision with decimal.localcontext() would also be 
> desirable.
> 
> Yury, what do you think about moving the ExecutionContext name to what
> the PEP currently calls LocalContext, and renaming the current
> ExecutionContext type to ExecutionContextChain?

For me this is already a lot clearer. Otherwise I'd call it 
ExecutionEnvironment.



Stefan Krah





Re: [Python-ideas] PEP 550 v2

2017-08-16 Thread Stefan Krah
On Wed, Aug 16, 2017 at 11:00:43AM -0400, Yury Selivanov wrote:
> "Context" is an established term for what PEP 550 tries to accomplish.
> It's used in multiple languages and runtimes, and while researching
> this topic I didn't see anybody confused with the concept on
> StackOverflow/etc.

For me a context is a "single thing" that is usually used to thread state
through functions.

I guess I'd call "environment" what you call "context".


> In C:
> 
>PyContextItem * _current_ctx = PyContext_NewItem("decimal context");
>if (_current_ctx == NULL) { /* error */ }
> 
># later when you set decimal context
>PyDecContextObject *ctx;
>...
>if (PyContext_SetItem(_current_ctx, (PyObject*)ctx)) { /* error */ }
> 
># whenever you need to get the current context
>PyDecContextObject *ctx = PyContext_GetItem(_current_ctx);
>if (ctx == NULL) { /* error */ }
>if (ctx == Py_None) { /* not initialized, nothing is there */ }

Thanks! This makes it a lot clearer.


I'd probably use (stealing Nick's key suggestion):

PyEnvKey *_current_ctx_key = PyEnv_NewKey("___DECIMAL_CONTEXT__");

...

PyDecContextObject *ctx = PyEnv_GetItem(_current_ctx_key);



Stefan Krah





Re: [Python-ideas] PEP 550 v2

2017-08-16 Thread Yury Selivanov
On Wed, Aug 16, 2017 at 11:03 AM, Nick Coghlan  wrote:
> On 17 August 2017 at 00:25, Stefan Krah  wrote:
>> Perhaps it would be possible to name the data structures by their 
>> functionality.
>> E.g. if ExecutionContext is a stack, use ExecutionStack?
>>
>> Or if the dynamic scope angle should be highlighted, perhaps ExecutionScope
>> or even DynamicScope.
>>
>> This sounds like bikeshedding, but I find it difficult to have 
>> ExecutionContext,
>> ContextItem, LocalContext in addition to the actual decimal.localcontext()
>> and PyDecContext.
>>
>> For example, should PyDecContext inherit from ContextItem?  I don't fully
>> understand. :-/
>
> Agreed, I don't think we have the terminology quite right yet.
>
> For "ContextItem" for example, we may actually be better off calling
> it "ContextKey", and have the methods be "ck.get_value()" and
> "ck.set_value()". That would get us closer to the POSIX TSS
> terminology, and emphasises that the objects themselves are best seen
> as opaque references to a key that lets you get and set the
> corresponding value in the active execution context.

Given the confusion over what an "empty ExecutionContext" is and what
"ContextItem is set to None by default" means, I tend to agree that
"ContextKey" might be a better name.

A default for "ContextKey" means something that will be returned if
the lookup failed, plain and simple.
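That reading of "default" can be sketched in a few lines (`ContextKeySketch` is a hypothetical stand-in, not the proposed type):

```python
# Sketch: a ContextKey's default is simply what lookup returns when
# no local context in the EC has a value for the key.

class ContextKeySketch:
    def __init__(self, name, default=None):
        self.name = name
        self.default = default

    def get(self, ec_stack):
        for lc in reversed(ec_stack):
            if self.name in lc:
                return lc[self.name]
        return self.default  # lookup failed, plain and simple

prec = ContextKeySketch("decimal_prec", default=28)

assert prec.get([{}]) == 28                    # empty EC: default
assert prec.get([{"decimal_prec": 10}]) == 10  # explicit value wins
```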

>
> I do think we should stick with "context" rather than bringing dynamic
> scopes into the mix - while dynamic scoping *is* an accurate term for
> what we're doing at a computer science level, Python itself tends to
> reserve the term scoping for the way the compiler resolves names,
> which we're deliberately *not* touching here.

+1, I feel the same about this.

>
> Avoiding a naming collision with decimal.localcontext() would also be 
> desirable.

The ContextItem (or ContextKey) that decimal will be using will be an
implementation detail, and it must not be exposed to the public API of
the module.

>
> Yury, what do you think about moving the ExecutionContext name to what
> the PEP currently calls LocalContext, and renaming the current
> ExecutionContext type to ExecutionContextChain?

While I think that the naming issue is important, the API that will be
used most of the time is ContextItem.  That's the name in the
spotlight.

>
> The latter name then hints at the collections.ChainMap style behaviour
> of ck.get_value() lookups, without making any particular claims about
> what the internal implementation data structures actually are.
>
> The run methods could then be sys.run_with_context_chain() (to ignore
> the current context entirely and use a completely separate context
> chain) and sys.run_with_active_context() (to append a single execution
> context onto the end of the current context chain)

sys.run_with_context_chain and sys.run_with_active_context sound
*really* confusing to me.  Maybe it's because I spent too much time
thinking about the current PEP 550 naming.

To be honest, I really like the Execution Context and Local Context
names.  I'm curious whether other people are confused by them.

Yury


Re: [Python-ideas] PEP 550 v2

2017-08-16 Thread Yury Selivanov
On Wed, Aug 16, 2017 at 4:07 AM, Nick Coghlan  wrote:
> TLDR: I really like this version, and the tweaks I suggest below are
> just cosmetic.

Thanks, Nick!

> I figure if there are any major technical traps
> lurking, you'll find them as you work through updating the reference
> implementation.

FWIW I've implemented 3-5 different variations of PEP 550 (along with
HAMT) and I'm fairly confident that the data structures and
optimizations will work, so no major traps are really expected there.
The risk that we need to manage now is getting the API design "right".

>
> On 16 August 2017 at 09:55, Yury Selivanov  wrote:
>> Context Item Object
>> ---
>>
>> The ``sys.new_context_item(description)`` function creates a
>> new ``ContextItem`` object.  The ``description`` parameter is a
>> ``str``, explaining the nature of the context key for introspection
>> and debugging purposes.
>>
>> ``ContextItem`` objects have the following methods and attributes:
>>
>> * ``.description``: read-only description;
>
> It may be worth having separate "name" and "description" attributes,
> similar to __name__ and __doc__ being separate on things like
> functions. That way, error messages can just show "name", while
> debuggers and other introspection tools can include a more detailed
> description.

Initially I wanted to have a "sys.new_context_item(name)" signature,
but then I thought that some users might be confused about what "name"
actually means.  In some contexts you might say that the "name" of the
CI is the name of the variable it is bound to; IOW, for
`foo = CI(name="bar")` the name is "foo".  But some users might think
that it's "bar".

OTOH, PEP 550 doesn't have any introspection APIs at this point, and
the final version of it will have to have them.  If we add something
like "sys.get_execution_context_as_dict()", then it would be
preferable for CIs to have short name-like descriptions, as opposed to
multiline docstrings.

So in the end, I think that we should adopt a namedtuple solution, and
just make the first "ContextItem" parameter a positional-only "name":

   ContextItem(name: str, /)

>
>> Coroutine Object Modifications
>> ^^
>>
>> To achieve this, a small set of modifications to the coroutine object
>> is needed:
>>
>> * New ``cr_local_context`` attribute.  This attribute is readable
>>   and writable for Python code.
>
> For ease of introspection, it's probably worth using a common
> `__local_context__` attribute name across all the different types that
> support one, and encouraging other object implementations to do the
> same.
>
> This isn't like cr_await and gi_yieldfrom, where we wanted to use
> different names because they refer to different kinds of objects.

We also have cr_code and gi_code, which are used for introspection
purposes but both refer to the same kind of object (a code object).

I myself don't like the mess the C-style convention created for our
Python code (think of what the "dis" and "inspect" modules have to go
through), so I'm +0 for having "__local_context__".

>
>> Acknowledgments
>> ===
> [snip]
>
>> Thanks to Nick Coghlan for numerous suggestions and ideas on the
>> mailing list, and for coming up with a case that caused the complete
>> rewrite of the initial PEP version [19]_.
> [snip]
>
>> .. [19] 
>> https://mail.python.org/pipermail/python-ideas/2017-August/046780.html
>
> The threading in pipermail makes it difficult to get from your reply
> back to my original comment, so it may be better to link directly to
> the latter: 
> https://mail.python.org/pipermail/python-ideas/2017-August/046775.html
>
> And to be completely explicit about: I like your proposed approach of
> leaving it up to iterator developers to decide whether or not to run
> with a local context or not. If they don't manipulate any context
> items, it won't matter, and if they do, it's straightforward to add a
> suitable call to sys.run_in_local_context().

Fixed the link, and will update the Acknowledgments section with your
paragraph (thanks!)

Yury


Re: [Python-ideas] PEP 550 v2

2017-08-16 Thread Nick Coghlan
On 17 August 2017 at 00:25, Stefan Krah  wrote:
> Perhaps it would be possible to name the data structures by their 
> functionality.
> E.g. if ExecutionContext is a stack, use ExecutionStack?
>
> Or if the dynamic scope angle should be highlighted, perhaps ExecutionScope
> or even DynamicScope.
>
> This sounds like bikeshedding, but I find it difficult to have 
> ExecutionContext,
> ContextItem, LocalContext in addition to the actual decimal.localcontext()
> and PyDecContext.
>
> For example, should PyDecContext inherit from ContextItem?  I don't fully
> understand. :-/

Agreed, I don't think we have the terminology quite right yet.

For "ContextItem" for example, we may actually be better off calling
it "ContextKey", and have the methods be "ck.get_value()" and
"ck.set_value()". That would get us closer to the POSIX TSS
terminology, and emphasises that the objects themselves are best seen
as opaque references to a key that lets you get and set the
corresponding value in the active execution context.

I do think we should stick with "context" rather than bringing dynamic
scopes into the mix - while dynamic scoping *is* an accurate term for
what we're doing at a computer science level, Python itself tends to
reserve the term scoping for the way the compiler resolves names,
which we're deliberately *not* touching here.

Avoiding a naming collision with decimal.localcontext() would also be desirable.

Yury, what do you think about moving the ExecutionContext name to what
the PEP currently calls LocalContext, and renaming the current
ExecutionContext type to ExecutionContextChain?

The latter name then hints at the collections.ChainMap style behaviour
of ck.get_value() lookups, without making any particular claims about
what the internal implementation data structures actually are.

The run methods could then be sys.run_with_context_chain() (to ignore
the current context entirely and use a completely separate context
chain) and sys.run_with_active_context() (to append a single execution
context onto the end of the current context chain)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-ideas] PEP 550 v2

2017-08-16 Thread Yury Selivanov
On Wed, Aug 16, 2017 at 10:25 AM, Stefan Krah  wrote:
> On Wed, Aug 16, 2017 at 12:18:23AM -0700, Nathaniel Smith wrote:
>> > Here's the PEP 550 version 2.
>>
>> Awesome!
>>
>> Some of the changes from v1 to v2 might be a bit confusing -- in
>> particular the thing where ExecutionContext is now a stack of
>> LocalContext objects instead of just being a mapping. So here's the
>> big picture as I understand it:
>
> I'm still trying to digest this with very little time for it. It *is*
> slightly confusing.
>
>
> Perhaps it would be possible to name the data structures by their 
> functionality.
> E.g. if ExecutionContext is a stack, use ExecutionStack?
>
> Or if the dynamic scope angle should be highlighted, perhaps ExecutionScope
> or even DynamicScope.

I'm -1 on calling this thing a "scope" or "dynamic scope", as I think
it will be even more confusing to Python users. When I think of
"scoping" I usually think about Python name scopes -- locals, globals,
nonlocals, etc.  I'm afraid that adding another dimension to this
vocabulary won't help anyone.

"Context" is an established term for what PEP 550 tries to accomplish.
It's used in multiple languages and runtimes, and while researching
this topic I didn't see anybody confused with the concept on
StackOverflow/etc.

> This sounds like bikeshedding, but I find it difficult to have 
> ExecutionContext,
> ContextItem, LocalContext in addition to the actual decimal.localcontext()
> and PyDecContext.
>
>
> For example, should PyDecContext inherit from ContextItem?  I don't fully
> understand. :-/

No, you wouldn't be able to extend the ContextItem type.

The way for decimal is to simply do the following:

In Python:
   _current_ctx = sys.ContextItem('decimal context')

   # later when you set decimal context
   _current_ctx.set(DecimalContext)

   # whenever you need to get the current context
   dc = _current_ctx.get()

In C:

   PyContextItem * _current_ctx = PyContext_NewItem("decimal context");
   if (_current_ctx == NULL) { /* error */ }

   # later when you set decimal context
   PyDecContextObject *ctx;
   ...
   if (PyContext_SetItem(_current_ctx, (PyObject*)ctx)) { /* error */ }

   # whenever you need to get the current context
   PyDecContextObject *ctx = PyContext_GetItem(_current_ctx);
   if (ctx == NULL) { /* error */ }
   if (ctx == Py_None) { /* not initialized, nothing is there */ }

We didn't really discuss C APIs at this point, and it's very likely
that they will be adjusted, but the general idea should stay the same.

All in all, the complexity of _decimal.c will only decrease with PEP
550, while getting better support for generators/async.
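For comparison, this is roughly what the pure-Python decimal module does today with a thread-local current context (a simplified sketch; _decimal.c differs in detail) -- exactly the pattern that breaks down for generators and async code:

```python
# Simplified model of today's thread-local "current context" pattern.
import threading

_local = threading.local()

def getcontext():
    # Lazily create a per-thread context on first use.
    if not hasattr(_local, "ctx"):
        _local.ctx = {"prec": 28}
    return _local.ctx

def setcontext(ctx):
    _local.ctx = ctx

assert getcontext()["prec"] == 28   # fresh thread: default context
setcontext({"prec": 10})
assert getcontext()["prec"] == 10   # but every coroutine/generator on
                                    # this thread now shares this value
```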

Yury


Re: [Python-ideas] PEP 550 v2

2017-08-16 Thread Yury Selivanov
On Wed, Aug 16, 2017 at 2:53 AM, Jelle Zijlstra
 wrote:
[..]
>>
>> The below is an example of how context items can be used::
>>
>> my_context = sys.new_context_item(description='mylib.context')
>> my_context.set('spam')
>
>
> Minor suggestion: Could we allow something like
> `sys.set_new_context_item(description='mylib.context',
> initial_value='spam')`? That would make it easier for type checkers to infer
> the type of a ContextItem, and it would save a line of code in the common
> case.
>
> With this modification, the type of new_context_item would be
>
> @overload
> def new_context_item(*, description: str, initial_value: T) ->
> ContextItem[T]: ...
> @overload
> def new_context_item(*, description: str) -> ContextItem[Any]: ...
>
> If we only allow the second variant, type checkers would need some sort of
> special casing to figure out that after .set(), .get() will return the same
> type.

I think that trying to infer the type of a CI's values from its
default value is not the way to go:

   ci = sys.ContextItem(default=1)

Is ci an int? Likely. Can it be set to None? Maybe, for some use-cases
that might be what you want.

The correct way IMO is to extend the typing module:

ci1: typing.ContextItem[int] = sys.ContextItem(default=1)
# ci1: is an int, and can't be anything else.

ci2: typing.ContextItem[typing.Optional[int]] = sys.ContextItem(default=42)
# ci2 is 42 by default, but can be reset to None.

ci3: typing.ContextItem[typing.Union[int, str]] = sys.ContextItem(default='spam')
# ci3 can be an int or str, can't be None.

This is also forward compatible with proposals to add a
`default_factory` or `initializer` parameter to ContextItems.
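A rough sketch of what such a typed ContextItem could look like, using typing.Generic (the class below is an illustrative stand-in, not the proposed implementation):

```python
# Hedged sketch: a generic stand-in showing how annotating the key
# lets a type checker tie .get()'s return type to the declaration.

from typing import Generic, Optional, TypeVar, Union

T = TypeVar("T")

class ContextItem(Generic[T]):
    def __init__(self, default: T) -> None:
        self._value = default

    def get(self) -> T:
        return self._value

    def set(self, value: T) -> None:
        self._value = value

ci1: ContextItem[int] = ContextItem(default=1)
ci2: ContextItem[Optional[int]] = ContextItem(default=42)
ci3: ContextItem[Union[int, str]] = ContextItem(default="spam")

ci2.set(None)  # allowed: the annotation permits None

assert ci1.get() == 1
assert ci2.get() is None
assert ci3.get() == "spam"
```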

Yury


Re: [Python-ideas] PEP 550 v2

2017-08-16 Thread Stefan Krah
On Wed, Aug 16, 2017 at 12:18:23AM -0700, Nathaniel Smith wrote:
> > Here's the PEP 550 version 2.
> 
> Awesome!
> 
> Some of the changes from v1 to v2 might be a bit confusing -- in
> particular the thing where ExecutionContext is now a stack of
> LocalContext objects instead of just being a mapping. So here's the
> big picture as I understand it:

I'm still trying to digest this with very little time for it. It *is*
slightly confusing.


Perhaps it would be possible to name the data structures by their functionality.
E.g. if ExecutionContext is a stack, use ExecutionStack?

Or if the dynamic scope angle should be highlighted, perhaps ExecutionScope
or even DynamicScope.


This sounds like bikeshedding, but I find it difficult to have ExecutionContext,
ContextItem, LocalContext in addition to the actual decimal.localcontext()
and PyDecContext.


For example, should PyDecContext inherit from ContextItem?  I don't fully
understand. :-/



Stefan Krah





Re: [Python-ideas] PEP 550 v2

2017-08-16 Thread Nick Coghlan
On 16 August 2017 at 18:37, Nathaniel Smith  wrote:
> On Tue, Aug 15, 2017 at 11:53 PM, Jelle Zijlstra
>  wrote:
>> Minor suggestion: Could we allow something like
>> `sys.set_new_context_item(description='mylib.context',
>> initial_value='spam')`? That would make it easier for type checkers to infer
>> the type of a ContextItem, and it would save a line of code in the common
>> case.
>
> This is a really handy feature in general, actually! In fact all of
> asyncio's thread-locals define initial values (using a trick involving
> subclassing threading.local), and I recently added this feature to
> trio.TaskLocal as well just because it's so convenient.
>
> However, something that you realize almost immediately when trying to
> use this is that in many cases, what you actually want is an initial
> value *factory*. Like, if you write new_context_item(initial_value=[])
> then you're going to have a bad time. So, should we support something
> like new_context_item(initializer=lambda: [])?
>
> The semantics are a little bit subtle. I guess it would be something
> like: if ci.get() goes to find the value and fails at all levels, then
> we call the factory function and assign its return value to the
> *deepest* LC, EC[0]. The idea being that we're pretending that the
> value was there all along in the outermost scope, you just didn't
> notice before now.

I actually wondered about this in the context of the PEP saying that
"context items are set to None by default", as it isn't clear what
that means for the behaviour of sys.new_execution_context().

The PEP states that the latter API creates an "empty" execution
context, but the notion of a fresh EC being truly empty conflicts with
the notion of all defined context items having a default value of None.

I think your idea resolves that nicely: if context_item.get() failed
to find a suitable context entry, it would do:

    base_context = ec.local_contexts[0]
    default_value = sys.run_with_local_context(base_context, self.default_factory)
    sys.run_with_local_context(base_context, self.set, default_value)

The default setting for default_factory could then be to raise
RuntimeError complaining that the context item isn't set in the
current context.
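Nick's pseudocode amounts to: on a full lookup miss, call the factory and materialize the result in the deepest local context. A runnable sketch under the toy stack-of-dicts model (function names are illustrative):

```python
# Sketch of the default_factory semantics: if lookup fails at every
# level, call the factory and store the result in the *deepest* LC
# (EC[0]), as if the value had been there all along.

def get_with_factory(ec_stack, key, factory):
    for lc in reversed(ec_stack):
        if key in lc:
            return lc[key]
    value = factory()
    ec_stack[0][key] = value  # materialize in the outermost LC
    return value

ec = [{}, {}]
registry = get_with_factory(ec, "registry", list)
registry.append("x")

# Later lookups, even from inner LCs, see the same object:
assert get_with_factory(ec, "registry", list) is registry
assert ec[0]["registry"] == ["x"]
```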

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-ideas] PEP 550 v2

2017-08-16 Thread Nick Coghlan
On 16 August 2017 at 17:18, Nathaniel Smith  wrote:
[Yury wrote]
>> For these purposes we add a set of new APIs (they will be used in
>> later sections of this specification):
>>
>> * ``sys.new_local_context()``: create an empty ``LocalContext``
>>   object.
>>
>> * ``sys.new_execution_context()``: create an empty
>>   ``ExecutionContext`` object.
>>
>> * Both ``LocalContext`` and ``ExecutionContext`` objects are opaque
>>   to Python code, and there are no APIs to modify them.
>>
>> * ``sys.get_execution_context()`` function.  The function returns a
>>   copy of the current EC: an ``ExecutionContext`` instance.
>
> If there are enough of these functions then it might make sense to
> stick them in their own module instead of adding more stuff to sys. I
> guess worrying about that can wait until the API details are more firm
> though.

I'm actually wondering if it may be worth defining a _contextlib
module (to export the interpreter level APIs to Python code), and
making contextlib the official home of the user facing API.

That we we can use contextlib2 to at least attempt to polyfill the
coroutine parts of the proposal for 3.5+, even if the implicit
generator changes are restricted to 3.7+ .

>>   * If ``coro.cr_local_context`` is an empty ``LocalContext`` object
>> that ``coro`` was created with, the interpreter will set
>> ``coro.cr_local_context`` to ``None``.
>
> I like all the ideas in this section, but this specific point feels a
> bit weird. Coroutine objects need a second hidden field somewhere to
> keep track of whether the object they end up with is the same one they
> were created with?

It feels odd to me as well, and I'm wondering if we can actually
simplify this by saying:

1. Generator contexts (both sync and async) are isolated by default
(__local_context__ = LocalContext())
2. Coroutine contexts are *not* isolated by default (__local_context__ = None)

Running top level task coroutines in separate execution contexts then
becomes the responsibility of the event loop, which the PEP already
lists as a required change in 3rd party libraries to get this all to
work properly.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-ideas] PEP 550 v2

2017-08-16 Thread Jelle Zijlstra
2017-08-16 10:37 GMT+02:00 Nathaniel Smith :

> On Tue, Aug 15, 2017 at 11:53 PM, Jelle Zijlstra
>  wrote:
> > Minor suggestion: Could we allow something like
> > `sys.set_new_context_item(description='mylib.context',
> > initial_value='spam')`? That would make it easier for type checkers to
> infer
> > the type of a ContextItem, and it would save a line of code in the common
> > case.
>
> This is a really handy feature in general, actually! In fact all of
> asyncio's thread-locals define initial values (using a trick involving
> subclassing threading.local), and I recently added this feature to
> trio.TaskLocal as well just because it's so convenient.
>
> However, something that you realize almost immediately when trying to
> use this is that in many cases, what you actually want is an initial
> value *factory*. Like, if you write new_context_item(initial_value=[])
> then you're going to have a bad time. So, should we support something
> like new_context_item(initializer=lambda: [])?
>
> The semantics are a little bit subtle. I guess it would be something
> like: if ci.get() goes to find the value and fails at all levels, then
> we call the factory function and assign its return value to the
> *deepest* LC, EC[0]. The idea being that we're pretending that the
> value was there all along in the outermost scope, you just didn't
> notice before now.
>
> > With this modification, the type of new_context_item would be
> >
> > @overload
> > def new_context_item(*, description: str, initial_value: T) ->
> > ContextItem[T]: ...
> > @overload
> > def new_context_item(*, description: str) -> ContextItem[Any]: ...
> >
> > If we only allow the second variant, type checkers would need some sort
> of
> > special casing to figure out that after .set(), .get() will return the
> same
> > type.
>
> I'm not super familiar with PEP 484.
>
> Would using a factory function instead of an initial value break this
> type inference?
>
> If you want to automatically infer that whatever type I use to
> initialize the value is the only type it can ever have, is there a way
> for users to easily override that? Like could I write something like
>
> my_ci: ContextItem[int, str] = new_context_item(initial_value=0)
>
It would be `ContextItem[Union[int, str]]`, but yes, that should work.


> ?
>
> -n
>
> --
> Nathaniel J. Smith -- https://vorpus.org
>


Re: [Python-ideas] PEP 550 v2

2017-08-16 Thread Nathaniel Smith
On Tue, Aug 15, 2017 at 11:53 PM, Jelle Zijlstra
 wrote:
> Minor suggestion: Could we allow something like
> `sys.set_new_context_item(description='mylib.context',
> initial_value='spam')`? That would make it easier for type checkers to infer
> the type of a ContextItem, and it would save a line of code in the common
> case.

This is a really handy feature in general, actually! In fact all of
asyncio's thread-locals define initial values (using a trick involving
subclassing threading.local), and I recently added this feature to
trio.TaskLocal as well just because it's so convenient.

However, something that you realize almost immediately when trying to
use this is that in many cases, what you actually want is an initial
value *factory*. Like, if you write new_context_item(initial_value=[])
then you're going to have a bad time. So, should we support something
like new_context_item(initializer=lambda: [])?

The semantics are a little bit subtle. I guess it would be something
like: if ci.get() goes to find the value and fails at all levels, then
we call the factory function and assign its return value to the
*deepest* LC, EC[0]. The idea being that we're pretending that the
value was there all along in the outermost scope, you just didn't
notice before now.

> With this modification, the type of new_context_item would be
>
> @overload
> def new_context_item(*, description: str, initial_value: T) ->
> ContextItem[T]: ...
> @overload
> def new_context_item(*, description: str) -> ContextItem[Any]: ...
>
> If we only allow the second variant, type checkers would need some sort of
> special casing to figure out that after .set(), .get() will return the same
> type.

I'm not super familiar with PEP 484.

Would using a factory function instead of an initial value break this
type inference?

If you want to automatically infer that whatever type I use to
initialize the value is the only type it can ever have, is there a way
for users to easily override that? Like could I write something like

my_ci: ContextItem[int, str] = new_context_item(initial_value=0)

?

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


Re: [Python-ideas] PEP 550 v2

2017-08-16 Thread Nick Coghlan
TLDR: I really like this version, and the tweaks I suggest below are
just cosmetic. I figure if there are any major technical traps
lurking, you'll find them as you work through updating the reference
implementation.

On 16 August 2017 at 09:55, Yury Selivanov  wrote:
> Context Item Object
> -------------------
>
> The ``sys.new_context_item(description)`` function creates a
> new ``ContextItem`` object.  The ``description`` parameter is a
> ``str``, explaining the nature of the context key for introspection
> and debugging purposes.
>
> ``ContextItem`` objects have the following methods and attributes:
>
> * ``.description``: read-only description;

It may be worth having separate "name" and "description" attributes,
similar to __name__ and __doc__ being separate on things like
functions. That way, error messages can just show "name", while
debuggers and other introspection tools can include a more detailed
description.

> Coroutine Object Modifications
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> To achieve this, a small set of modifications to the coroutine object
> is needed:
>
> * New ``cr_local_context`` attribute.  This attribute is readable
>   and writable for Python code.

For ease of introspection, it's probably worth using a common
`__local_context__` attribute name across all the different types that
support one, and encouraging other object implementations to do the
same.

This isn't like cr_await and gi_yieldfrom, where we wanted to use
different names because they refer to different kinds of objects.

> Acknowledgments
> ===============
[snip]

> Thanks to Nick Coghlan for numerous suggestions and ideas on the
> mailing list, and for coming up with a case that caused the complete
> rewrite of the initial PEP version [19]_.
[snip]

> .. [19] https://mail.python.org/pipermail/python-ideas/2017-August/046780.html

The threading in pipermail makes it difficult to get from your reply
back to my original comment, so it may be better to link directly to
the latter: 
https://mail.python.org/pipermail/python-ideas/2017-August/046775.html

And to be completely explicit about it: I like your proposed approach
of leaving it up to iterator developers to decide whether or not to
run with a local context. If they don't manipulate any context
items, it won't matter, and if they do, it's straightforward to add a
suitable call to sys.run_in_local_context().

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-ideas] PEP 550 v2

2017-08-16 Thread Nathaniel Smith
On Tue, Aug 15, 2017 at 4:55 PM, Yury Selivanov  wrote:
> Hi,
>
> Here's the PEP 550 version 2.

Awesome!

Some of the changes from v1 to v2 might be a bit confusing -- in
particular the thing where ExecutionContext is now a stack of
LocalContext objects instead of just being a mapping. So here's the
big picture as I understand it:

In discussions on the mailing list and off-line, we realized that the
main reason people use "thread locals" is to implement fake dynamic
scoping. Of course, generators/async/await mean that currently it's
impossible to *really* fake dynamic scoping in Python -- that's what
PEP 550 is trying to fix. So PEP 550 v1 essentially added "generator
locals" as a refinement of "thread locals". But... it turns out that
"generator locals" aren't enough to properly implement dynamic scoping
either! So the goal in PEP 550 v2 is to provide semantics strong
enough to *really* get this right.

I wrote up some notes on what I mean by dynamic scoping, and why
neither thread-locals nor generator-locals can fake it:

https://github.com/njsmith/pep-550-notes/blob/master/dynamic-scope.ipynb

> Specification
> =============
>
> Execution Context is a mechanism of storing and accessing data specific
> to a logical thread of execution.  We consider OS threads,
> generators, and chains of coroutines (such as ``asyncio.Task``)
> to be variants of a logical thread.
>
> In this specification, we will use the following terminology:
>
> * **Local Context**, or LC, is a key/value mapping that stores the
>   context of a logical thread.

If you're more familiar with dynamic scoping, then you can think of an
LC as a single dynamic scope...

> * **Execution Context**, or EC, is an OS-thread-specific dynamic
>   stack of Local Contexts.

...and an EC as a stack of scopes. Looking up a ContextItem in an EC
proceeds by checking the first LC (innermost scope), then if it
doesn't find what it's looking for it checks the second LC (the
next-innermost scope), etc.
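In toy form (plain dicts as LCs; this models only the lookup rule, not the proposed API):

```python
ec = [
    {"precision": 28},   # outermost LC
    {"precision": 50},   # innermost LC shadows the outer value
]

def lookup(ec, key, default=None):
    # Walk the stack from the innermost scope outwards.
    for lc in reversed(ec):
        if key in lc:
            return lc[key]
    return default

inner = lookup(ec, "precision")   # innermost scope wins
del ec[-1]["precision"]           # unshadow the inner value
outer = lookup(ec, "precision")   # outer value visible again
```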

> ``ContextItem`` objects have the following methods and attributes:
>
> * ``.description``: read-only description;
>
> * ``.set(o)`` method: set the value to ``o`` for the context item
>   in the execution context.
>
> * ``.get()`` method: return the current EC value for the context item.
>   Context items are initialized with ``None`` when created, so
>   this method call never fails.

Two issues here, that both require some expansion of this API to
reveal a *bit* more information about the EC structure.

1) For trio's cancel scope use case I described in my last email, I
actually need some way to read out all the values on the LocalContext
stack. (It would also be helpful if there were some fast way to check
the depth of the ExecutionContext stack -- or at least tell whether
it's 1 deep or more-than-1 deep. I know that any cancel scopes that
are in the bottommost LC will always be attached to the given Task, so
I can set up the scope->task mapping once and re-use it indefinitely.
OTOH for scopes that are stored in higher LCs, I have to check at
every yield whether they're currently in effect. And I want to
minimize the per-yield workload as much as possible.)

2) For classic decimal.localcontext context managers, the idea is
still that you save/restore the value, so that you can nest multiple
context managers without having to push/pop LCs all the time. But the
above API is not actually sufficient to implement a proper
save/restore, for a subtle reason: if you do

ci.set(ci.get())

then you just (potentially) moved the value from a lower LC up to the top LC.

Here's an example of a case where this can produce user-visible effects:

https://github.com/njsmith/pep-550-notes/blob/master/dynamic-scope-on-top-of-pep-550-draft-2.py
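The promotion effect is easy to see in a dict-based toy model (`get`/`set_` here mimic only the proposed lookup and assignment rules, nothing more):

```python
ec = [{"item": "outer"}, {}]    # value lives in the outer LC only

def get(ec, key):
    for lc in reversed(ec):     # innermost LC first
        if key in lc:
            return lc[key]

def set_(ec, key, value):
    ec[-1][key] = value         # writes always target the top LC

set_(ec, "item", get(ec, "item"))   # naive "save": round-trips the value

# The value has been promoted into the top LC, so it now shadows the
# outer LC: later changes to the outer scope become invisible.
ec[0]["item"] = "changed"
```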

There are probably a bunch of options for fixing this. But basically
we need some API that makes it possible to temporarily set a value in
the top LC, and then restore that value to what it was before (either
the previous value, or 'unset' to unshadow a value in a lower LC). One
simple option would be to make the idiom be something like:

@contextmanager
def local_value(new_value):
    state = ci.get_local_state()
    ci.set(new_value)
    try:
        yield
    finally:
        ci.set_local_state(state)

where 'state' is something like a tuple (ci in EC[-1],
EC[-1].get(ci)). A downside with this is that it's a bit error-prone
(very easy for an unwary user to accidentally use get/set instead of
get_local_state/set_local_state). But I'm sure we can come up with
something.
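A sketch of that idiom over the same kind of dict-based toy EC (`get_local_state`/`set_local_state` are the hypothetical names from above, modeled here only to show the restore-to-unset behavior):

```python
from contextlib import contextmanager

ec = [{"prec": 28}, {}]    # outer LC has a value; top LC is empty

def get_local_state(key):
    # Record both presence and value in the top LC, so "unset" is
    # distinguishable from "set to None".
    top = ec[-1]
    return (key in top, top.get(key))

def set_local_state(key, state):
    present, value = state
    if present:
        ec[-1][key] = value
    else:
        ec[-1].pop(key, None)   # restore the "unset" state, unshadowing

@contextmanager
def local_value(key, new_value):
    state = get_local_state(key)
    ec[-1][key] = new_value
    try:
        yield
    finally:
        set_local_state(key, state)

with local_value("prec", 50):
    shadowed = ec[-1]["prec"]
```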

> Manual Context Management
> -------------------------
>
> Execution Context is generally managed by the Python interpreter,
> but sometimes it is desirable for the user to take control
> over it.  A few examples of when this is needed:
>
> * running a computation in ``concurrent.futures.ThreadPoolExecutor``
>   with the current EC;
>
> * reimplementing generators with iterators (more on that later);
>
> * managing contexts in 

Re: [Python-ideas] PEP 550 v2

2017-08-16 Thread Jelle Zijlstra
2017-08-16 1:55 GMT+02:00 Yury Selivanov :

> Hi,
>
> Here's the PEP 550 version 2.  Thanks to a very active and insightful
> discussion here on Python-ideas, we've discovered a number of
> problems with the first version of the PEP.  This version is a complete
> rewrite (only Abstract, Rationale, and Goals sections were not updated).
>
> The updated PEP is live on python.org:
> https://www.python.org/dev/peps/pep-0550/
>
> There is no reference implementation at this point, but I'm confident
> that this version of the spec will have the same extremely low
> runtime overhead as the first version.  Thanks to the new ContextItem
> design, accessing values in the context is even faster now.
>
> Thank you!
>
>
> PEP: 550
> Title: Execution Context
> Version: $Revision$
> Last-Modified: $Date$
> Author: Yury Selivanov 
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 11-Aug-2017
> Python-Version: 3.7
> Post-History: 11-Aug-2017, 15-Aug-2017
>
>
> Abstract
> ========
>
> This PEP proposes a new mechanism to manage execution state--the
> logical environment in which a function, a thread, a generator,
> or a coroutine executes.
>
> A few examples of where having a reliable state storage is required:
>
> * Context managers like decimal contexts, ``numpy.errstate``,
>   and ``warnings.catch_warnings``;
>
> * Storing request-related data such as security tokens and request
>   data in web applications, implementing i18n;
>
> * Profiling, tracing, and logging in complex and large code bases.
>
> The usual solution for storing state is to use Thread-local Storage
> (TLS), implemented in the standard library as ``threading.local()``.
> Unfortunately, TLS does not work for the purpose of state isolation
> for generators or asynchronous code, because such code executes
> concurrently in a single thread.
>
>
> Rationale
> =========
>
> Traditionally, Thread-local Storage (TLS) is used for storing the
> state.  However, the major flaw of TLS is that it works only
> for multi-threaded code.  It is not possible to reliably contain the
> state within a generator or a coroutine.  For example, consider
> the following generator::
>
> def calculate(precision, ...):
>     with decimal.localcontext() as ctx:
>         # Set the precision for decimal calculations
>         # inside this block
>         ctx.prec = precision
>
>         yield calculate_something()
>         yield calculate_something_else()
>
> Decimal context is using a TLS to store the state, and because TLS is
> not aware of generators, the state can leak.  If a user iterates over
> the ``calculate()`` generator with different precisions one by one
> using a ``zip()`` built-in, the above code will not work correctly.
> For example::
>
> g1 = calculate(precision=100)
> g2 = calculate(precision=50)
>
> items = list(zip(g1, g2))
>
> # items[0] will be a tuple of:
> #   first value from g1 calculated with 100 precision,
> #   first value from g2 calculated with 50 precision.
> #
> # items[1] will be a tuple of:
> #   second value from g1 calculated with 50 precision (!!!),
> #   second value from g2 calculated with 50 precision.
>
> An even scarier example would be using decimals to represent money
> in an async/await application: decimal calculations can suddenly
> lose precision in the middle of processing a request.  Currently,
> bugs like this are extremely hard to find and fix.
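The leak described above is reproducible today with the stdlib ``decimal`` module; the generator bodies below are concrete fill-ins for the abstract ``calculate_something()`` calls:

```python
import decimal

def calculate(precision):
    with decimal.localcontext() as ctx:
        ctx.prec = precision
        # Each division rounds using whatever context is active on the
        # thread *at resume time* -- not necessarily ours.
        yield decimal.Decimal(1) / decimal.Decimal(3)
        yield decimal.Decimal(1) / decimal.Decimal(3)

g1 = calculate(precision=8)
g2 = calculate(precision=4)

# zip() interleaves the generators: g1, g2, g1, g2.  When g1 resumes
# for its second value, g2's context (prec=4) is still active on the
# thread, so g1's precision silently drops from 8 to 4.
items = list(zip(g1, g2))
```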
>
> Another common need for web applications is to have access to the
> current request object, or security context, or, simply, the request
> URL for logging or submitting performance tracing data::
>
> async def handle_http_request(request):
>     context.current_http_request = request
>
>     await ...
>     # Invoke your framework code, render templates,
>     # make DB queries, etc, and use the global
>     # 'current_http_request' in that code.
>
> # This isn't currently possible to do reliably
> # in asyncio out of the box.
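The failure mode is easy to demonstrate: every task on an event loop runs in the same OS thread, so a plain ``threading.local()`` is shared between logically independent requests (the names here are illustrative, but the APIs are real stdlib):

```python
import asyncio
import threading

tls = threading.local()
seen = {}

async def handle(name):
    tls.request = name          # meant to be per-"request" state
    await asyncio.sleep(0)      # suspension point: the other task runs
    seen[name] = tls.request    # may observe the other task's value

async def main():
    await asyncio.gather(handle("a"), handle("b"))

asyncio.run(main())
```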
>
> These examples are just a few out of many, where a reliable way to
> store context data is absolutely needed.
>
> The inability to use TLS for asynchronous code has led to a
> proliferation of ad-hoc solutions, which are limited in scope and
> do not support all required use cases.
>
> The current status quo is that any library, including the standard
> library, that uses TLS will likely not work as expected in
> asynchronous code or with generators (see [3]_ for an example issue).
>
> Some languages that have coroutines or generators recommend
> manually passing a ``context`` object to every function; see [1]_,
> which describes the pattern for Go.  This approach, however, has limited
> use for Python, where we have a huge ecosystem that was built to work
> with a TLS-like context.  Moreover, passing the context explicitly
> does not work at all for libraries like ``decimal``