Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread Nathaniel Smith
Hi Yury,

This is really cool. Some notes on a first read:

1. Excellent work on optimizing dict, that seems valuable independent
of the rest of the details here.

2. The text doesn't mention async generators at all. I assume they
also have an agi_isolated_execution_context flag that can be set, to
enable @asyncontextmanager?

2a. Speaking of which I wonder if it's possible for async_generator to
emulate this flag... I don't know if this matters -- at this point the
main reason to use async_generator is for code that wants to support
PyPy. If PyPy gains native async generator support before CPython 3.7
comes out then async_generator may be entirely irrelevant before PEP
550 matters. But right now async_generator is still quite handy...

2b. BTW, the contextmanager trick is quite nice -- I actually noticed
last week that PEP 521 had a problem here, but didn't think of a
solution :-).

3. You're right that numpy is *very* performance sensitive about
accessing the context -- the errstate object is needed extremely
frequently, even on trivial operations like adding two scalars, so a
dict lookup is very noticeable. (Imagine adding a dict lookup to
float.__add__.) Right now, the errstate object gets stored in the
threadstate dict, and then there are some dubious-looking hacks
involving a global (not thread-local) counter to let us skip the
lookup entirely if we think that no errstate object has been set.
Really what we ought to be doing (currently, in a non PEP 550 world)
is storing the errstate in a __thread variable -- it'd certainly be
worth it. Adopting PEP 550 would definitely be easier if we knew that
it wasn't ruling out that level of optimization.
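For illustration, the kind of counter-based fast path described above can be sketched like this (a hedged sketch, not NumPy's actual code; the function names are hypothetical):

```python
import threading

_tls = threading.local()
_errstate_version = 0  # global, deliberately NOT thread-local

def set_errstate(state):
    # Any write bumps the global counter, disabling the fast path.
    global _errstate_version
    _errstate_version += 1
    _tls.errstate = state

def get_errstate(default=None):
    # Fast path: if no thread has ever set an errstate, skip the
    # thread-local lookup entirely. This is the "dubious" part --
    # the counter itself is shared across all threads.
    if _errstate_version == 0:
        return default
    return getattr(_tls, 'errstate', default)
```

The hot path (e.g. scalar addition) then pays only an integer comparison in the common case where no errstate was ever set.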

4. I'm worried that all of your examples use string keys. One of the
great things about threading.local objects is that each one is a new
namespace, which is a honking great idea -- here it prevents
accidental collisions between unrelated libraries. And while it's
possible to implement threading.local in terms of the threadstate dict
(that's how they work now!), it requires some extremely finicky code
to get the memory management right:

https://github.com/python/cpython/blob/dadca480c5b7c5cf425d423316cd695bc5db3023/Modules/_threadmodule.c#L558-L595

It seems like you're imagining that this API will be used directly by
user code? Is that true? ...Are you sure that's a good idea? Are we
just assuming that not many keys will be used and the keys will
generally be immortal anyway, so leaking entries is OK? Maybe this is
nit-picking, but this is hooking into the language semantics in such a
deep way that I sorta feel like it would be bad to end up with
something where we can never get garbage collection right.

The suggested index-based API for super fast C lookup also has this
problem, but that would be such a low-level API -- and not part of the
language definition -- that the right answer is probably just to
document that there's no way to unallocate indices so any given C
library should only allocate, like... 1 of them. Maybe provide an
explicit API to release an index, if we really want to get fancy.

5. Is there some performance-related reason that the API for
getting/setting isn't just sys.get_execution_context()[...] = ...? Or
even sys.execution_context[...]?

5a. Speaking of which I'm not a big fan of the None-means-delete
behavior. Not only does Python have a nice standard way to describe
all the mapping operations without such hacks, but you're actually
implementing that whole interface anyway. Why not use it?

6. Should Thread.start inherit the execution context from the spawning thread?

7. Compatibility: it does sort of break 3rd party contextmanager
implementations (contextlib2, asyncio_extras's acontextmanager, trio's
internal acontextmanager, ...). This is extremely minor though.

8. You discuss how this works for asyncio and gevent. Have you looked
at how it will interact with tornado's context handling system? Can
they use this? It's the most important extant context implementation I
can think of (aside from thread local storage itself).

9. OK, my big question, about semantics.

The PEP's design is based on the assumption that all context-local
state is scalar-like, and contexts split but never join. But there are
some cases where this isn't true, in particular for values that have
"stack-like" semantics. These are terms I just made up, but let me
give some examples. Python's sys.exc_info is one. Another I ran into
recently is for trio's cancel scopes.

So basically the background is, in trio you can wrap a context manager
around any arbitrary chunk of code and then set a timeout or
explicitly cancel that code. It's called a "cancel scope". These are
fully nestable. Full details here:
https://trio.readthedocs.io/en/latest/reference-core.html#cancellation-and-timeouts

Currently, the implementation involves keeping a stack of cancel
scopes in Task-local storage. This works fine for regular async code
because when we switch Tasks, we also switch t

[Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread Stefan Krah

Yury Selivanov wrote:

> This is a new PEP to implement Execution Contexts in Python.

The idea is of course great!


A couple of issues for decimal:

> Moreover, passing the context explicitly does not work at all for
> libraries like ``decimal`` or ``numpy``, which use operator overloading.

Instead of "with localcontext() ...", each coroutine can create a new
Context() and use its methods, without any loss of functionality.

All one loses is the inline operator syntax sugar.

I'm aware you know all this, but the entire decimal paragraph sounds a bit
as if this option did not exist.
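For reference, the method-based style looks like this (a small sketch using the stdlib decimal API):

```python
import decimal

# Each coroutine/task can own its Context and call its methods
# explicitly, never touching the thread-local current context.
ctx = decimal.Context(prec=4)

a = decimal.Decimal("1.111")
b = decimal.Decimal("2.222")

print(ctx.add(a, b))   # method form: uses ctx's precision
print(a + b)           # operator form: consults the thread-local context
```

The method form gives up only the inline operator sugar, as noted above.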



> Fast C API for packages like ``decimal`` and ``numpy``.

_decimal relies on caching the most recently used thread-local context,
which gives a speedup of about 25% for inline operators:

https://github.com/python/cpython/blob/master/Modules/_decimal/_decimal.c#L1639


Can this speed be achieved with the execution contexts? IOW, can the lookup
of an execution context be as fast as PyThreadState_GET()?



Stefan Krah



___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] Towards harmony with JavaScript?

2017-08-12 Thread Alberto Berti
> "Chris" == Chris Angelico  writes:

Chris> On Sat, Aug 12, 2017 at 6:31 AM, Alberto Berti wrote:

>> As of now, I do nothing. As I said, the goal of the tool is not to
>> shield you from JS, for this reason it's not meant for beginners (in
>> both JS or Python). You always manipulate JS objects, but it allows
>> you to be naive about all that plethora of JS idiosyncrasies (from a
>> Python POV at least) that you have to think about when you frequently
>> switch from Python to JS.

Chris> Do you "retain most of Python language semantics", or do you "always
Chris> manipulate JS objects"? As shown in a previous post, there are some
Chris> subtle and very dangerous semantic differences between the languages.
Chris> You can't have it both ways.

That's right, you can't have it both ways. That's the difficult decision
to make, because as you add more and more Python APIs to those supported,
you'll probably end up creating your "Python island in JS", where you
need to transform the objects you manipulate from/to JS in the functions
that are called by external JS code (either manually or automatically).

And on the other end, if you don't add any Pythonic API, you will end up
with ugly Python code that yes, is valid Python code, but is nothing you
would like to see.

JavaScripthon was and is an experiment to see how much of the "Pythonic
way of expressing algorithms" can be retained while adding as little
"runtime" as possible. That's the reason why it targets ES6+ JavaScript:
the "points of contact" between the two languages are much greater in
number.

As an example let's take the following simple code:

  def test():
      a = 'foo'
      d = {a: 'bar'}
      return d[a]

one can naively translate it to:

  function test() {
      var a, d;
      a = 'foo';
      d = {a: 'bar'};
      return d[a];
  }

but it returns 'bar' in Python and undefined in JS. Even though it's just
a simple case expressed in a four-line function, it's one of those things
that can slip through when coding in both languages at the same time (at
least for me). So I asked myself if it was worthwhile to have a tool
that:

* allows me to use Python syntax to write some amount of JS code. I'm
  more accustomed to Python syntax and I like it more. It's generally
  more terse and has less distractions (like variable declarations and
  line terminations);

* fixes as many of these things as possible automatically, without having
  to precisely remember that this is a "corner case" in JS that must
  be handled with care (so it reduces the "context-switching" effort);

* produces a good looking JS code that's still possible to read and
  follow without much trouble.

How many "corner cases" like this are there in JS? In my coding
experience, "thanks" to the fact that JS is much less "harmonious" than
Python (my opinion), I've encountered many of them, and there are also
many simple Python coding habits that are translatable in a simple
way. So what does the tool do in this case?

 $ pj -s -
   def test():
       a = 'foo'
       d = {a: 'bar'}
       return d[a]

 function test() {
     var a, d;
     a = "foo";
     d = {[a]: "bar"};
     return d[a];
 }

It turns out that ES6 has a special notation for what JS calls "computed
property names": keys in object literals that aren't plain strings.

Does it evaluate the way a Python developer expects when run? Let's see:

$ pj -s - -e
  def test():
      a = 'foo'
      d = {a: 'bar'}
      return d[a]
  test()

bar



Re: [Python-ideas] Towards harmony with JavaScript?

2017-08-12 Thread Alberto Berti
> "Carl" == Carl Smith  writes:

Carl> Using lambdas doesn't solve the problem. I just kept the example
Carl> short, but had I used more than one expression in each function,
Carl> you'd be back to square one. You took advantage of the brevity of
Carl> the example, but it's not realistic.

Let me elaborate more on this...

Yes, I took "advantage" of the brevity of your example, but there's
another side to it. In my JS coding I usually avoid non-trivial
anonymous functions in real applications. The reason is that if an error
happens inside an anonymous function, and maybe it was the last one in a
series of anonymous functions, the stack trace of that error will end up
with references like "in anonymous function at line xy of 'foo.js'", and
that doesn't allow me to get a first idea of what the code was doing when
the error was thrown.

That's why I don't like them and why I don't have a great opinion of
large codebases making extensive usage of them.

It also appears to me that the trend in some (relevant) parts of the JS
community is to refrain from using them when possible, in favour of a
more structured approach to coding that resembles class-based
componentization, like in React.

cheers,

Alberto



Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread Nick Coghlan
On 12 August 2017 at 08:37, Yury Selivanov  wrote:
> Hi,
>
> This is a new PEP to implement Execution Contexts in Python.
>
> The PEP is in-flight to python.org, and in the meanwhile can
> be read on GitHub:
>
> https://github.com/python/peps/blob/master/pep-0550.rst
>
> (it contains a few diagrams and charts, so please read it there.)

The fully rendered version is also up now:
https://www.python.org/dev/peps/pep-0550/

Thanks for this! The general approach looks good to me, so I just have
some questions about specifics of the API:

1. Are you sure you want to expose the CoW type to pure Python code?

The draft API looks fairly error prone to me, as I'm not sure of the
intended differences in behaviour between the following:

@contextmanager
def context(x):
    old_x = sys.get_execution_context_item('x')
    sys.set_execution_context_item('x', x)
    try:
        yield
    finally:
        sys.set_execution_context_item('x', old_x)

@contextmanager
def context(x):
    old_x = sys.get_execution_context().get('x')
    sys.get_execution_context()['x'] = x
    try:
        yield
    finally:
        sys.get_execution_context()['x'] = old_x

@contextmanager
def context(x):
    ec = sys.get_execution_context()
    old_x = ec.get('x')
    ec['x'] = x
    try:
        yield
    finally:
        ec['x'] = old_x

It seems to me that everything would be a lot safer if the *only*
Python level API was a live dynamic view that completely hid the
copy-on-write behaviour behind an "ExecutionContextProxy" type, such
that the last two examples were functionally equivalent to each other
and to the current PEP's get/set functions (rendering the latter
redundant, and allowing it to be dropped from the PEP).

If Python code wanted a snapshot of the current state, it would need
to call sys.get_execution_context().copy(), which would give it a
plain dictionary containing a shallow copy of the execution context at
that particular point in time.

If there's a genuine need to expose the raw copy-on-write machinery to
Python level code (e.g. for asyncio's benefit), then that could be
more clearly marked as "here be dragons" territory that most folks
aren't going to want to touch (e.g. "sys.get_raw_execution_context()")

2. Do we need an ag_isolated_execution_context for asynchronous
generators? (Modify this question as needed for the answer to the next
question)

3. It bothers me that *_execution_context points to an actual
execution context, while *_isolated_execution_context is a boolean.
With names that similar I'd expect them to point to the same kind of
object.

Would it work to adjust that setting to say that rather than being an
"isolated/not isolated" boolean, we instead made it a cr_back reverse
pointer to the awaiting coroutine (akin to f_back in the frame stack),
such that we had a doubly-linked list that defined the coroutine call
stacks via their cr_await and cr_back attributes?

If we did that, we'd have:

  Top-level Task: cr_back -> NULL (C) or None (Python)
  Awaited coroutine: cr_back -> coroutine that awaited this one (which
would in turn have a cr_await reference back to here)

coroutine.send()/throw() would then save and restore the execution
context around the call if cr_back was NULL/None (equivalent to
isolated==True in the current PEP), and leave it alone otherwise
(equivalent to isolated==False).

For generators, gi_back would normally be NULL/None (since we don't
typically couple regular generators to a single managing object), but
could be set appropriately by types.coroutine when the generator-based
coroutine is awaited, and by contextlib.contextmanager before starting
the underlying generator. (It may even make sense to break the naming
symmetry for that attribute, and call it something like "gi_owner",
since generators don't form a clean await-based logical call chain the
way native coroutines do).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread Nick Coghlan
On 12 August 2017 at 15:45, Yury Selivanov  wrote:
> Thanks Eric!
>
> PEP 408 -- Standard library __preview__ package?

Typo in the PEP number: PEP 406, which was an ultimately failed
attempt to get away from the reliance on process globals to manage the
import system by encapsulating the top level state as an "Import
Engine": https://www.python.org/dev/peps/pep-0406/

We still like the idea in principle (hence the Withdrawn status rather
than Rejected), but someone needs to find time to take a run at
designing a new version of it atop the cleaner PEP 451 import plugin
API (hence why the *specific* proposal in PEP 406 has been withdrawn).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-ideas] Towards harmony with JavaScript?

2017-08-12 Thread Nick Coghlan
On 12 August 2017 at 06:10, Chris Barker  wrote:
>
>> > Taking this off the list as it's no longer on topic.
>
>
> not totally -- I'm going to add my thoughts:
>
> 1) If you want a smoother transition between server-side Python and
> in-browser code, maybe you're  better off using one of the "python in the
> browser" solutions -- there are at least a few viable ones.

More experimentally, there's also toga's "web" backend (which allows
you to take an application you developed with the primary intention of
running it as a rich client application on mobile or desktop devices,
and instead publishing it as a Django web application with a
JavaScript frontend).

Essentially, the relationship we see between Python and JavaScript is
similar to the one that exists between Python and C/C++/Rust/Go/etc,
just on the side that sits between the Python code and the GUI, rather
than between the Python code and the compute & storage systems.

As such, there are various libraries and transpilers that are designed
to handle writing the JavaScript *for* you (bokeh, toga,
JavaScripthon, etc), and the emergence of WASM as a frontend
equivalent to machine code on the backend is only going to make the
similarities in those dynamics more pronounced.

In that vein, it's highly *un*likely we'd add any redundant constructs
to Python purely to make it easier for JS developers to use JS idioms
in Python instead of Pythonic ones, but JavaScript *is* one of the
languages we look at for syntactic consistency when considering
potential new additions to Python.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread Guido van Rossum
Thanks for the explanation. Can you make sure this is explained in the PEP?

On Aug 11, 2017 10:43 PM, "Yury Selivanov"  wrote:

> > On Fri, Aug 11, 2017 at 10:17 PM, Guido van Rossum wrote:
> > > I may have missed this (I've just skimmed the doc), but what's the
> rationale
> > > for making the EC an *immutable* mapping? It's impressive that you
> managed
> > > to create a faster immutable dict, but why does the use case need one?
>
> > In this proposal, you have lots and lots of semantically distinct ECs.
> > Potentially every stack frame has its own (at least in async code). So
> > instead of copying the EC every time they create a new one, they want
> > to copy it when it's written to. This is a win if writes are
> > relatively rare compared to the creation of ECs.
>
> Correct. If we decide to use HAMT, the ratio of writes/reads becomes
> less important though.
>
> Yury
>


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread Nick Coghlan
On 12 August 2017 at 17:54, Nathaniel Smith  wrote:
> ...and now that I've written that down, I sort of feel like that might
> be what you want for all the other sorts of context object too? Like,
> here's a convoluted example:
>
> def gen():
>     a = decimal.Decimal("1.111")
>     b = decimal.Decimal("2.222")
>     print(a + b)
>     yield
>     print(a + b)
>
> def caller():
>     # let's pretend this context manager exists,
>     # the actual API is more complicated
>     with decimal_context_precision(3):
>         g = gen()
>     with decimal_context_precision(2):
>         next(g)
>     with decimal_context_precision(1):
>         next(g)
>
> Currently, this will print "3.3 3", because when the generator is
> resumed it inherits the context of the resuming site. With PEP 550, it
> would print "3.33 3.33" (or maybe "3.3 3.3"? it's not totally clear
> from the text), because it inherits the context when the generator is
> created and then ignores the calling context. It's hard to get strong
> intuitions, but I feel like the current behavior is actually more
> sensible -- each time the generator gets resumed, the next bit of code
> runs in the context of whoever called next(), and the generator is
> just passively inheriting context, so ... that makes sense.
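For the record, the quoted example can be reproduced today under the current thread-local decimal context; `decimal_context_precision` is the hypothetical helper from the example, implemented here via `decimal.localcontext`:

```python
import decimal
from contextlib import contextmanager

@contextmanager
def decimal_context_precision(prec):
    # Hypothetical helper from the example, built on localcontext().
    with decimal.localcontext() as ctx:
        ctx.prec = prec
        yield

def gen():
    a = decimal.Decimal("1.111")
    b = decimal.Decimal("2.222")
    print(a + b)
    yield
    print(a + b)

def caller():
    with decimal_context_precision(3):
        g = gen()            # nothing computed yet
    with decimal_context_precision(2):
        next(g)              # prints 3.3 -- the resuming context wins
    with decimal_context_precision(1):
        for _ in g:          # resumes and finishes: prints 3
            pass
```

Running `caller()` prints "3.3" then "3", i.e. the "3.3 3" behaviour described above, since each resumption inherits the context of the call site.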

Now that you raise this point, I think it means that generators need
to retain their current context inheritance behaviour, simply for
backwards compatibility purposes. This means that the case we need to
enable is the one where the generator *doesn't* dynamically adjust its
execution context to match that of the calling function.

One way that could work (using the cr_back/gi_back convention I suggested):

- generators start with gi_back not set
- if gi_back is NULL/None, gi.send() and gi.throw() set it to the
calling frame for the duration of the synchronous call and *don't*
adjust the execution context (i.e. the inverse of coroutine behaviour)
- if gi_back is already set, then gi.send() and gi.throw() *do* save
and restore the execution context around synchronous calls in to the
generator frame

To create an autonomous generator (i.e. one that didn't dynamically
update its execution context), you'd use a decorator like:

def autonomous_generator(gf):
    @functools.wraps(gf)
    def wrapper(*args, **kwds):
        gi = gf(*args, **kwds)
        gi.gi_back = gi.gi_frame
        return gi
    return wrapper

Asynchronous generators would then work like synchronous generators:
ag_back would be NULL/None by default, and dynamically set for the
duration of each __anext__ call. If you wanted to create an autonomous
one, you'd make its back reference a circular reference to itself to
disable the implicit dynamic updates.

When I put it in those terms though, I think the
cr_back/gi_back/ag_back idea should actually be orthogonal to the
"revert_context" flag (so you can record the link back to the caller
even when maintaining an autonomous context).

Given that, you'd have the following initial states for "revert
context" (currently called "isolated context" in the PEP):

* unawaited coroutines: true (same as PEP)
* awaited coroutines: false (same as PEP)
* generators (both sync & async): false (opposite of current PEP)
* autonomous generators: true (set "gi_revert_context" or
"ag_revert_context" explicitly)

Open question: whether having "yield" inside a with statement implies
the creation of an autonomous generator (synchronous or otherwise), or
whether you'd need a decorator to get your context management right in
such cases.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-ideas] Towards harmony with JavaScript?

2017-08-12 Thread Carl Smith
Alberto,

CoffeeScript is a popular language that is widely considered to represent
JavaScript's best bits, and it only has anonymous functions, so there's a
large part of the JS community that disagrees with you there.

Browsers actually do identify anonymous functions, based on the
variable/property names that reference them, so the following function
would be identified as `square` in tracebacks:

let square = function(x) { return x * x };

In any case, passing anonymous functions to higher order functions is
commonplace in real-world JS. Chris may be right about using decorators as
a Pythonic alternative [I haven't really considered that properly to be
honest], but you can't just tell people not to do something that they see
as elegant and idiomatic.

Best -- Carl Smith


-- Carl Smith
carl.in...@gmail.com

On 12 August 2017 at 17:22, Nick Coghlan  wrote:

> On 12 August 2017 at 06:10, Chris Barker  wrote:
> >
> >> > Taking this off the list as it's no longer on topic.
> >
> >
> > not totally -- I'm going to add my thoughts:
> >
> > 1) If you want a smoother transition between server-side Python and
> > in-browser code, maybe you're  better off using one of the "python in the
> > browser" solutions -- there are at least a few viable ones.
>
> More experimentally, there's also toga's "web" backend (which allows
> you to take an application you developed with the primary intention of
> running it as a rich client application on mobile or desktop devices,
> and instead publishing it as a Django web application with a
> JavaScript frontend).
>
> Essentially, the relationship we see between Python and JavaScript is
> similar to the one that exists between Python and C/C++/Rust/Go/etc,
> just on the side that sits between the Python code and the GUI, rather
> than between the Python code and the compute & storage systems.
>
> As such, there are various libraries and transpilers that are designed
> to handle writing the JavaScript *for* you (bokeh, toga,
> JavaScripthon, etc), and the emergence of WASM as a frontend
> equivalent to machine code on the backend is only going to make the
> similarities in those dynamics more pronounced.
>
> In that vein, it's highly *un*likely we'd add any redundant constructs
> to Python purely to make it easier for JS developers to use JS idioms
> in Python instead of Pythonic ones, but JavaScript *is* one of the
> languages we look at for syntactic consistency when considering
> potential new additions to Python.
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread Yury Selivanov
Nick, Nathaniel, I'll be replying in full to your emails when I have
time to do some experiments.  Now I just want to address one point
that I think is important:

On Sat, Aug 12, 2017 at 1:09 PM, Nick Coghlan  wrote:
> On 12 August 2017 at 17:54, Nathaniel Smith  wrote:
>> ...and now that I've written that down, I sort of feel like that might
>> be what you want for all the other sorts of context object too? Like,
>> here's a convoluted example:
>>
>> def gen():
>>     a = decimal.Decimal("1.111")
>>     b = decimal.Decimal("2.222")
>>     print(a + b)
>>     yield
>>     print(a + b)
>>
>> def caller():
>>     # let's pretend this context manager exists,
>>     # the actual API is more complicated
>>     with decimal_context_precision(3):
>>         g = gen()
>>     with decimal_context_precision(2):
>>         next(g)
>>     with decimal_context_precision(1):
>>         next(g)
>>
>> Currently, this will print "3.3 3", because when the generator is
>> resumed it inherits the context of the resuming site. With PEP 550, it
>> would print "3.33 3.33" (or maybe "3.3 3.3"? it's not totally clear
>> from the text), because it inherits the context when the generator is
>> created and then ignores the calling context. It's hard to get strong
>> intuitions, but I feel like the current behavior is actually more
>> sensible -- each time the generator gets resumed, the next bit of code
>> runs in the context of whoever called next(), and the generator is
>> just passively inheriting context, so ... that makes sense.
>
> Now that you raise this point, I think it means that generators need
> to retain their current context inheritance behaviour, simply for
> backwards compatibility purposes. This means that the case we need to
> enable is the one where the generator *doesn't* dynamically adjust its
> execution context to match that of the calling function.

Nobody *intentionally* iterates a generator manually in different
decimal contexts (or any other contexts). This is an extremely
error-prone thing to do, because one refactoring of the generator --
rearranging yields -- would wreck your custom iteration/context logic.
I don't think that any real code relies on this, and I don't think we
are breaking backwards compatibility here in any way. How many users
even need this?

If someone does need this, it's possible to flip
`gi_isolated_execution_context` to `False` (as contextmanager does
now) and get this behaviour. This might be needed for frameworks like
Tornado which support coroutines via generators without 'yield from',
but I'll have to verify this.

What I'm saying here is that any sort of context leaking *into* or
*out of* a generator *while* it is iterating will likely cause only bugs
or undefined behaviour. Take a look at the precision example in the
Rationale section of the PEP.

Most of the time generators are created and iterated in the same
spot; you rarely create generator closures. One way the behaviour
could be changed, however, is to capture the execution context when
it's first iterated (as opposed to when it's instantiated), but I
don't think it makes any real difference.

Another idea: in one of my initial PEP implementations, I exposed
gen.gi_execution_context (same for coroutines) to Python as a read/write
attribute. That allowed one to

(a) get the execution context out of generator (for introspection or
other purposes);

(b) inject execution context for event loops; for instance
asyncio.Task could do that for some purpose.

Maybe this would be useful for someone who wants to mess with
generators and contexts.

[..]
>
> def autonomous_generator(gf):
>     @functools.wraps(gf)
>     def wrapper(*args, **kwds):
>         gi = gf(*args, **kwds)
>         gi.gi_back = gi.gi_frame
>         return gi
>     return wrapper

Nick, I still have to fully grasp the idea of `gi_back`, but one quick
thing: I specifically designed the PEP to avoid touching frames. The
current design only needs TLS and a little help from the
interpreter/core objects adjusting that TLS. It should be very
straightforward to implement the PEP in any interpreter (with JIT or
without) or compilers like Cython.

[..]
> Given that, you'd have the following initial states for "revert
> context" (currently called "isolated context" in the PEP):
>
> * unawaited coroutines: true (same as PEP)
> * awaited coroutines: false (same as PEP)
> * generators (both sync & async): false (opposite of current PEP)
> * autonomous generators: true (set "gi_revert_context" or
> "ag_revert_context" explicitly)

If generators do not isolate their context, then the example in the
Rationale section will not work as expected (or am I missing
something?). Fixing generators state leak was one of the main goals of
the PEP.

Yury


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread rym...@gmail.com
So, I'm hardly an expert when it comes to things like this, but there are
two things about this that don't seem right to me. (Also, I'd love to
respond inline, but that's kind of difficult from a mobile phone.)

The first is how set/get_execution_context_item take strings. Inevitably,
people are going to do things like:

CONTEXT_ITEM_NAME = 'foo-bar'
...
sys.set_execution_context_item(CONTEXT_ITEM_NAME, 'stuff')

IMO it would be nicer if there could be a key object used instead, e.g.

my_key = sys.execution_context_key('name-here-for-debugging-purposes')
sys.set_execution_context_item(my_key, 'stuff')

The advantage here would be no need for string constants and no potential
naming conflicts (the string passed to the key creator would be used just
for debugging, kind of like Thread names).
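Since Yury notes elsewhere in the thread that the proposed functions accept any hashable object as a key, the key-object idea above can be sketched with almost no machinery: a dedicated key type only needs default identity-based hashing plus a name for debugging. In the sketch below the context functions are emulated with a plain dict so it runs today; all names are illustrative, not part of the PEP.

```python
# Emulation of the PEP's proposed per-context storage with a plain dict.
_context = {}

def set_execution_context_item(key, value):
    _context[key] = value

def get_execution_context_item(key, default=None):
    return _context.get(key, default)

class ExecutionContextKey:
    """Unique key object; the name is used only for debugging, like Thread names."""
    __slots__ = ('name',)

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return '<ExecutionContextKey {!r} at {:#x}>'.format(self.name, id(self))

# Two keys with the same debug name never collide: object identity,
# not the string, is what matters for lookups.
k1 = ExecutionContextKey('name-here-for-debugging-purposes')
k2 = ExecutionContextKey('name-here-for-debugging-purposes')
set_execution_context_item(k1, 'stuff')
set_execution_context_item(k2, 'other stuff')
assert get_execution_context_item(k1) == 'stuff'
assert get_execution_context_item(k2) == 'other stuff'
```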


Second thing is this:

def context(x):
    old_x = get_execution_context_item('x')
    set_execution_context_item('x', x)
    try:
        yield
    finally:
        set_execution_context_item('x', old_x)



If this would be done frequently, a context manager would be a *lot* more
Pythonic, e.g.:

with sys.temp_change_execution_context('x', new_x):
    # ...
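The helper suggested above is easy to sketch with contextlib on top of the PEP's proposed get/set functions (emulated here with a module-level dict so the sketch runs today; the function names follow the PEP's draft API but are assumptions about its final shape).

```python
import contextlib

# Stand-in for the real per-thread execution context.
_context = {}

def get_execution_context_item(key, default=None):
    return _context.get(key, default)

def set_execution_context_item(key, value):
    if value is None:          # the PEP's None-means-delete convention
        _context.pop(key, None)
    else:
        _context[key] = value

@contextlib.contextmanager
def temp_change_execution_context(key, value):
    # Save, set, and restore the value around the with-block.
    old = get_execution_context_item(key)
    set_execution_context_item(key, value)
    try:
        yield
    finally:
        set_execution_context_item(key, old)

set_execution_context_item('x', 1)
with temp_change_execution_context('x', 2):
    assert get_execution_context_item('x') == 2
assert get_execution_context_item('x') == 1
```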

--
Ryan (ライアン)
Yoko Shimomura, ryo (supercell/EGOIST), Hiroyuki Sawano >> everyone else
http://refi64.com

On Aug 11, 2017 at 5:38 PM, Yury Selivanov wrote:

Hi,

This is a new PEP to implement Execution Contexts in Python.

The PEP is in-flight to python.org, and in the meanwhile can
be read on GitHub:

https://github.com/python/peps/blob/master/pep-0550.rst

(it contains a few diagrams and charts, so please read it there.)

Thank you!
Yury


PEP: 550
Title: Execution Context
Version: $Revision$
Last-Modified: $Date$
Author: Yury Selivanov 
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 11-Aug-2017
Python-Version: 3.7
Post-History: 11-Aug-2017


Abstract
========

This PEP proposes a new mechanism to manage execution state--the
logical environment in which a function, a thread, a generator,
or a coroutine executes.

A few examples of where having a reliable state storage is required:

* Context managers like decimal contexts, ``numpy.errstate``,
  and ``warnings.catch_warnings``;

* Storing request-related data such as security tokens and request
  data in web applications;

* Profiling, tracing, and logging in complex and large code bases.

The usual solution for storing state is to use a Thread-local Storage
(TLS), implemented in the standard library as ``threading.local()``.
Unfortunately, TLS does not work for isolating state of generators or
asynchronous code because such code shares a single thread.


Rationale
=========

Traditionally, Thread-local Storage (TLS) is used for storing the
state.  However, the major flaw of using TLS is that it works only
for multi-threaded code.  It is not possible to reliably contain the
state within a generator or a coroutine.  For example, consider
the following generator::

    def calculate(precision, ...):
        with decimal.localcontext() as ctx:
            # Set the precision for decimal calculations
            # inside this block
            ctx.prec = precision

            yield calculate_something()
            yield calculate_something_else()

The decimal context uses TLS to store its state, and because TLS is
not aware of generators, the state can leak.  The above code will
not work correctly if a user iterates over the ``calculate()``
generator with different precisions in parallel::

    g1 = calculate(100)
    g2 = calculate(50)

    items = list(zip(g1, g2))

    # items[0] will be a tuple of:
    #   first value from g1 calculated with 100 precision,
    #   first value from g2 calculated with 50 precision.
    #
    # items[1] will be a tuple of:
    #   second value from g1 calculated with 50 precision,
    #   second value from g2 calculated with 50 precision.

An even scarier example would be using decimals to represent money
in an async/await application: decimal calculations can suddenly
lose precision in the middle of processing a request.  Currently,
bugs like this are extremely hard to find and fix.
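The precision leak described above is reproducible today with a short, runnable sketch of the ``calculate()`` example; the divisions below are illustrative stand-ins for ``calculate_something()``.

```python
import decimal

def calculate(precision):
    with decimal.localcontext() as ctx:
        ctx.prec = precision
        # Each division uses the *current* thread-wide decimal context,
        # which may have been changed by another generator between yields.
        yield decimal.Decimal(1) / 3
        yield decimal.Decimal(1) / 3

g1 = calculate(100)
g2 = calculate(50)
items = list(zip(g1, g2))

# g1's first value really was computed with 100 digits of precision...
assert len(items[0][0].as_tuple().digits) == 100
# ...but its second value silently picks up g2's leaked precision of 50.
assert len(items[1][0].as_tuple().digits) == 50
```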

Another common need for web applications is to have access to the
current request object, or security context, or, simply, the request
URL for logging or submitting performance tracing data::

    async def handle_http_request(request):
        context.current_http_request = request

        await ...
        # Invoke your framework code, render templates,
        # make DB queries, etc, and use the global
        # 'current_http_request' in that code.

        # This isn't currently possible to do reliably
        # in asyncio out of the box.

These are just a few of the many examples where a reliable way to
store context data is needed.

The inability to use TLS for asynchronous code has led to a
proliferation of ad-hoc solutions, each supported only by code that
was explicitly adapted to work with it.

The current status quo is that any library, including the standard
library, that uses TLS will likely not work as expected in
asynchronous code or with generators (see [3]_ for an example issue).

Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread Yury Selivanov
Sure, will do.

Yury


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread Yury Selivanov
On Sat, Aug 12, 2017 at 2:28 PM, rym...@gmail.com  wrote:
> So, I'm hardly an expert when it comes to things like this, but there are
> two things about this that don't seem right to me. (Also, I'd love to
> respond inline, but that's kind of difficult from a mobile phone.)
>
> The first is how set/get_execution_context_item take strings. Inevitably,
> people are going to do things like:

Yes, it accepts any hashable Python object as a key.

>
> CONTEXT_ITEM_NAME = 'foo-bar'
> ...
> sys.set_execution_context_item(CONTEXT_ITEM_NAME, 'stuff')
>
> IMO it would be nicer if there could be a key object used instead, e.g.
>
> my_key = sys.execution_context_key('name-here-for-debugging-purposes')
> sys.set_execution_context_item(my_key, 'stuff')

I thought about this, and decided that this is something that can
easily be designed on top of the PEP and put into the 'contextlib'
module.

In practice, this issue can be entirely addressed in the
documentation, by asking users to prefix their keys with their
library/framework/program name.

>
> The advantage here would be no need for string constants and no potential
> naming conflicts (the string passed to the key creator would be used just
> for debugging, kind of like Thread names).
>
>
> Second thing is this:
>
> def context(x):
>     old_x = get_execution_context_item('x')
>     set_execution_context_item('x', x)
>     try:
>         yield
>     finally:
>         set_execution_context_item('x', old_x)
>
>
>
> If this would be done frequently, a context manager would be a *lot* more
> Pythonic, e.g.:
>
> with sys.temp_change_execution_context('x', new_x):
>     # ...

Yes, this is a neat idea and I think we can add such a helper to
contextlib.  I want to focus PEP 550 API on correctness, minimalism,
and performance.  Nice APIs can then be easily developed on top of it
later.

Yury


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread Yury Selivanov
Nathaniel, Nick,

I'll reply only to point 9 in this email to split this thread into
manageable sub-threads.  I'll cover the other points in later emails.

On Sat, Aug 12, 2017 at 3:54 AM, Nathaniel Smith  wrote:
> 9. OK, my big question, about semantics.

FWIW it took me a good hour to fully understand what you are doing
with "fail_after", what you want from PEP 550, and the actual
associated problems with generators :)

>
> The PEP's design is based on the assumption that all context-local
> state is scalar-like, and contexts split but never join. But there are
> some cases where this isn't true, in particular for values that have
> "stack-like" semantics. These are terms I just made up, but let me
> give some examples. Python's sys.exc_info is one. Another I ran into
> recently is for trio's cancel scopes.

As you yourself show below, it's easy to implement stacks with the
proposed EC spec. A linked list will work well enough.
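The linked-list idea can be sketched concretely: keep the cancel-scope stack as an immutable singly-linked list under one context key, so that a snapshot of the context captures the whole stack for free. The context itself is emulated with a plain dict below; the names are illustrative, not trio's actual API.

```python
# Stand-in for the execution context.
_context = {}

def push_scope(scope):
    # Each node is a (scope, tail) pair; existing nodes are never
    # mutated, so parent snapshots are unaffected by later pushes.
    _context['cancel_scopes'] = (scope, _context.get('cancel_scopes'))

def pop_scope():
    scope, tail = _context['cancel_scopes']
    _context['cancel_scopes'] = tail
    return scope

def current_scopes():
    # Walk the list from the innermost scope outward.
    node, out = _context.get('cancel_scopes'), []
    while node is not None:
        scope, node = node
        out.append(scope)
    return out

push_scope('outer')
push_scope('inner')
assert current_scopes() == ['inner', 'outer']
assert pop_scope() == 'inner'
assert pop_scope() == 'outer'
```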

>
> So basically the background is, in trio you can wrap a context manager
> around any arbitrary chunk of code and then set a timeout or
> explicitly cancel that code. It's called a "cancel scope". These are
> fully nestable. Full details here:
> https://trio.readthedocs.io/en/latest/reference-core.html#cancellation-and-timeouts
>
> Currently, the implementation involves keeping a stack of cancel
> scopes in Task-local storage. This works fine for regular async code
> because when we switch Tasks, we also switch the cancel scope stack.
> But of course it falls apart for generators/async generators:
>
> async def agen():
>     with fail_after(10):  # 10 second timeout for finishing this block
>         await some_blocking_operation()
>         yield
>         await another_blocking_operation()
>
> async def caller():
>     with fail_after(20):
>         ag = agen()
>         await ag.__anext__()
>         # now that cancel scope is on the stack, even though we're not
>         # inside the context manager! this will not end well.
>         await some_blocking_operation()  # this might get cancelled when it shouldn't
>     # even if it doesn't, we'll crash here when exiting the context manager
>     # because we try to pop a cancel scope that isn't at the top of the stack
>
> So I was thinking about whether I could implement this using PEP 550.
> It requires some cleverness, but I could switch to representing the
> stack as a singly-linked list, and then snapshot it and pass it back
> to the coroutine runner every time I yield.

Right. So the task always knows the EC at the point of "yield". It can
then get the latest timeout from it and act accordingly if that yield
did not resume in time.  This should work.

> That would fix the case
> above. But, I think there's another case that's kind of a showstopper.
>
> async def agen():
>     await some_blocking_operation()
>     yield
>
> async def caller():
>     ag = agen()  # context is captured here
>     with fail_after(10):
>         await ag.__anext__()
>
> Currently this case works correctly: the timeout is applied to the
> __anext__ call, as you'd expect. But with PEP 550, it wouldn't work:
> the generator's timeouts would all be fixed when it was instantiated,
> and we wouldn't be able to detect that the second call has a timeout
> imposed on it. So that's a pretty nasty footgun. Any time you have
> code that's supposed to have a timeout applied, but in fact has no
> timeout applied, then that's a really serious bug -- it can lead to
> hangs, trivial DoS, pagers going off, etc.

As I tried to explain in my last email, I generally don't believe that
people would do this partial iteration with timeouts or other contexts
around it.  The only use case I can come up with so far is
implementing some sort of receiver using an AG, and then "listening"
on it through "__anext__" calls.

But the case is interesting nevertheless, and maybe we can fix it
without relaxing any guarantees of the PEP.

The idea that I have is to allow linking of ExecutionContext (this is
similar in a way to what Nick proposed, but has a stricter semantics):

1. The internal ExecutionContext object will have a new "back" attribute.

2. For regular code and coroutines everything that is already in the
PEP will stay the same.

3. For generators and asynchronous generators, when a generator is
created, an empty ExecutionContext will be created for it, with its
"back" attribute pointing to the current EC.

4. The lookup function will be adjusted to check the "EC.back" if
the key is not found in the current EC.

5. The max level of "back" chain will be 1.

6. When a generator is created inside another generator, it will
inherit another generator's EC. Because contexts are immutable this
should be OK.

7. When a coroutine is created inside an EC with a "back" link, it
will merge EC and EC.back into one new EC. The merge can be done very
efficiently for HAMT mappings, which I believe we will end up using
for this anyway (an O(log32 N) operation).
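Points 1-5 above can be sketched in a few lines: a generator gets a fresh, empty EC whose "back" attribute points at the creator's EC, and lookups fall through at most one level. This is a toy model of the proposal, not its real implementation (the real EC would be immutable/copy-on-write).

```python
class EC:
    """Toy execution context with a single-level 'back' link."""

    def __init__(self, items=None, back=None):
        self.items = dict(items or {})
        self.back = back

    def lookup(self, key, default=None):
        if key in self.items:
            return self.items[key]
        if self.back is not None and key in self.back.items:
            # Max chain depth of 1: we check back.items directly
            # instead of recursing.
            return self.back.items[key]
        return default

caller_ec = EC({'timeout': 10})
gen_ec = EC(back=caller_ec)        # empty EC created for a generator

assert gen_ec.lookup('timeout') == 10    # falls through to the caller's EC
gen_ec.items['timeout'] = 5              # the generator's own change...
assert gen_ec.lookup('timeout') == 5
assert caller_ec.lookup('timeout') == 10  # ...never touches the parent
```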

An illustration of what it w

Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread Pau Freixes
Good work Yury, going for all-in-one will help keep the differences
between the async and sync worlds in Python from growing.

I really like the idea of the immutable dicts; it makes it easy to inherit
the context between tasks/threads/whatever without putting consistency at
risk if there are further key collisions.

I've just taken a look at the asyncio modifications. Correct me if I'm
wrong, but the handler strategy has a side effect: the work done to save
and restore the context will be done twice in some situations. It would
happen when the callback is in charge of executing a task step -- once by
the run-in-context method and once by the coroutine. Is that correct?

On 12/08/2017 00:38, "Yury Selivanov" wrote:

[..]

Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread Nathaniel Smith
I had an idea for an alternative API that exposes the same
functionality/semantics as the current draft, but that might have some
advantages. It would look like:

# a "context item" is an object that holds a context-sensitive value
# each call to create_context_item creates a new one
ci = sys.create_context_item()

# Set the value of this item in the current context
ci.set(value)

# Get the value of this item in the current context
value = ci.get()
value = ci.get(default)

# To support async libraries, we need some way to capture the whole context
# But an opaque token representing "all context item values" is enough
state_token = sys.current_context_state_token()
sys.set_context_state_token(state_token)
coro.cr_state_token = state_token
# etc.

The advantages are:
- Eliminates the current PEP's issues with namespace collision; every
context item is automatically distinct from all others.
- Eliminates the need for the None-means-del hack.
- Lets the interpreter hide the details of garbage collecting context values.
- Allows for more implementation flexibility. This could be
implemented directly on top of Yury's current prototype. But it could
also, for example, be implemented by storing the context values in a
flat array, where each context item is assigned an index when it's
allocated. In the current draft this is suggested as a possible
extension for particularly performance-sensitive users, but this way
we'd have the option of making everything fast without changing or
extending the API.

As precedent, this is basically the API that low-level thread-local
storage implementations use; see e.g. pthread_key_create,
pthread_getspecific, pthread_setspecific. (And the
allocate-an-index-in-a-table is the implementation that fast
thread-local storage implementations use too.)
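The proposed context-item API can be approximated today with threading.local; this sketch is only thread-aware rather than generator/coroutine-aware, so it shows the shape of the interface, not the PEP's semantics.

```python
import threading

class ContextItem:
    """Rough emulation of the proposed sys.create_context_item() object."""

    def __init__(self):
        # Each ContextItem owns its own storage, so items are
        # automatically distinct and never collide by name.
        self._local = threading.local()

    def set(self, value):
        # Set the value of this item in the current (thread's) context.
        self._local.value = value

    def get(self, default=None):
        # Get the value of this item, or a default if never set here.
        return getattr(self._local, 'value', default)

ci = ContextItem()
assert ci.get() is None       # unset in this context
ci.set(42)
assert ci.get() == 42
assert ci.get(0) == 42        # default ignored once a value exists
```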

-n

On Fri, Aug 11, 2017 at 3:37 PM, Yury Selivanov  wrote:
> [..]

Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread Yury Selivanov
Yes, I considered this idea myself, but ultimately rejected it because:

1. The current solution makes it easy to introspect things: get the
current EC and print it out.  Although the context item idea could be
extended to `sys.create_context_item('description')` to allow that.

2. What if we want to pickle the EC? If all items in it are
pickleable, it's possible to dump the EC, send it over the network,
and re-use it in some other process. It's not something I want to
consider in the PEP right now, but it's something that the current
design theoretically allows. AFAIU, a `ci = sys.create_context_item()`
context item wouldn't be possible to pickle/unpickle correctly, no?

Some more comments:

On Sat, Aug 12, 2017 at 7:35 PM, Nathaniel Smith  wrote:
[..]
> The advantages are:
> - Eliminates the current PEP's issues with namespace collision; every
> context item is automatically distinct from all others.

TBH I think that the collision issue is slightly exaggerated.

> - Eliminates the need for the None-means-del hack.

I consider the Execution Context to be an API, not a collection. It's
an important distinction; if you view it that way, deletion on None
doesn't look that esoteric.

> - Lets the interpreter hide the details of garbage collecting context values.

I'm not sure I understand how the current PEP design is bad from the
GC standpoint. Or how this proposal can be different, FWIW.

> - Allows for more implementation flexibility. This could be
> implemented directly on top of Yury's current prototype. But it could
> also, for example, be implemented by storing the context values in a
> flat array, where each context item is assigned an index when it's
> allocated.

You still want to have this optimization only for *some* keys. So I
think a separate API is still needed.

Yury


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread Nick Coghlan
On 13 August 2017 at 03:53, Yury Selivanov  wrote:
> On Sat, Aug 12, 2017 at 1:09 PM, Nick Coghlan  wrote:
>> Now that you raise this point, I think it means that generators need
>> to retain their current context inheritance behaviour, simply for
>> backwards compatibility purposes. This means that the case we need to
>> enable is the one where the generator *doesn't* dynamically adjust its
>> execution context to match that of the calling function.
>
> Nobody *intentionally* iterates a generator manually in different
> decimal contexts (or any other contexts). This is an extremely
> error-prone thing to do, because one refactoring of a generator --
> rearranging yields -- would wreck your custom iteration/context
> logic. I don't think that any real code relies on this, and I don't
> think that we are breaking backwards compatibility here in any way.
> How many users care about this?

I think this is a reasonable stance for the PEP to take, but the
hidden execution state around the "isolated or not" behaviour still
bothers me.

In some ways it reminds me of the way function parameters work: the
bound parameters are effectively a *shallow* copy of the passed
arguments, so callers can decide whether or not they want the callee
to be able to modify them based on the arguments' mutability (or lack
thereof).

The execution context proposal uses copy-on-write semantics for
runtime efficiency, but it's essentially the same shallow copy concept
applied to __next__(), send() and throw() operations (and perhaps
__anext__(), asend(), and athrow() - I haven't wrapped my head around
the implications for async generators and context managers yet).

That similarity makes me wonder whether the "isolated or not"
behaviour could be moved from the object being executed and directly
into the key/value pairs themselves based on whether or not the values
were mutable, as that's the way function calls work: if the argument
is immutable, the callee *can't* change it, while if it's mutable, the
callee can mutate it, but it still can't rebind it to refer to a
different object.

The way I'd see that working with an always-reverted copy-on-write
execution context:

1. If a parent context wants child contexts to be able to make
changes, then it should put a *mutable* object in the context (e.g. a
list or class instance)
2. If a parent context *does not* want child contexts to be able to
make changes, then it should put an *immutable* object in the context
(e.g. a tuple or number)
3. If a child context *wants* to share a context key with its parent,
then it should *mutate* it in place
4. If a child context *does not* want to share a context key with its
parent, then it should *rebind* it to a different object

That way, instead of reverted-or-not-reverted being an all-or-nothing
interpreter level decision, it can be made on a key-by-key basis by
choosing whether or not to use a mutable value.

To make that a little less abstract, consider a concrete example like
setting a "my_web_framework.request" key:

1. The step of *setting* the key will *not* be shared with the parent
context, as that modifies the underlying copy-on-write namespace, and
will hence be reverted when control is passed back to the parent
2. Any *mutation* of the request object *will* be shared, since
mutating the value doesn't have any effect on the copy-on-write
namespace

Nathaniel's example of wanting stack-like behaviour could be modeled
using tuples as values: when the child context appends to the tuple,
it will necessarily have to create a new tuple and rebind the
corresponding key, causing the changes to be invisible to the parent
context.
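The mutable-vs-immutable convention above can be sketched with ordinary dicts standing in for the copy-on-write namespace: rebinding a key is invisible to the parent, while in-place mutation of a shared object is visible to it. This is a toy model of the semantics Nick describes, not the real EC machinery.

```python
# Parent context: a tuple models stack-like (rebind-only) state, and a
# mutable dict models state the parent wants children to be able to change.
parent_ns = {'stack': (), 'request': {'user': None}}

# A shallow copy stands in for the child's copy-on-write view.
child_ns = dict(parent_ns)

child_ns['stack'] += ('scope',)      # rebinding: creates a new tuple,
                                     # so the parent never sees it
child_ns['request']['user'] = 'me'   # mutation: the object is shared,
                                     # so the parent sees the change

assert parent_ns['stack'] == ()               # rebinding was "reverted"
assert parent_ns['request']['user'] == 'me'   # mutation was shared
```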

The contextlib.contextmanager use case could then be modeled as a
*separate* method that skipped the save/revert context management step
(e.g. "send_with_shared_context", "throw_with_shared_context")

> If someone does need this, it's possible to flip
> `gi_isolated_execution_context` to `False` (as contextmanager does
> now) and get this behaviour. This might be needed for frameworks like
> Tornado which support coroutines via generators without 'yield from',
> but I'll have to verify this.

Working through this above, I think the key points that bother me
about the stateful revert-or-not setting are that whether or not
context reversion is desirable depends mainly on two things:

- the specific key in question (indicated by mutable vs immutable values)
- the intent of the code in the parent context (which could be
indicated by calling different methods)

It *doesn't* seem to be an inherent property of a given generator or
coroutine, except insofar as there's a correlation between the code
that creates generators & coroutines and the code that subsequently
invokes them.

> Another idea: in one of my initial PEP implementations, I exposed
> gen.gi_execution_context (same for coroutines) to python as read/write
> attribute. That allowed to
>
> (a) get the execution context out of generator (for introspection or
> other purposes

Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread Nathaniel Smith
On Sat, Aug 12, 2017 at 6:27 PM, Yury Selivanov  wrote:
> Yes, I considered this idea myself, but ultimately rejected it because:
>
> 1. Current solution makes it easy to introspect things. Get the
> current EC and print it out.  Although the context item idea could be
> extended to `sys.create_context_item('description')` to allow that.

My first draft actually had the description argument :-). But then I
deleted it on the grounds that there's also no way to introspect a
list of all threading.local objects, and no-one seems to be bothered
by that, so why should we bother here. Obviously it'd be trivial to
add though, yeah; I don't really care either way.

> 2. What if we want to pickle the EC? If all items in it are
> pickleable, it's possible to dump the EC, send it over the network,
> and re-use in some other process. It's not something I want to
> consider in the PEP right now, but it's something that the current
> design theoretically allows. AFAIU, `ci = sys.create_context_item()`
> context item wouldn't be possible to pickle/unpickle correctly, no?

That's true. In this API, supporting pickling would require some kind
of opt-in on the part of EC users.

But... pickling would actually need to be opt-in anyway. Remember, the
set of all EC items is a piece of global shared state; we expect new
entries to appear when random 3rd party libraries are imported. So we
have no idea what is in there or what it's being used for. Blindly
pickling the whole context will lead to bugs (when code unexpectedly
ends up with context that wasn't designed to go across processes) and
crashes (there's no guarantee that all the objects are even
pickleable).

If we do decide we want to support this in the future then we could
add a generic opt-in mechanism something like:

MY_CI = sys.create_context_item(__name__, "MY_CI", pickleable=True)

But I'm not sure that it even makes sense to have a global flag
enabling pickle. Probably it's better to have separate flags to opt in
to different libraries that might want to pickle in different
situations for different reasons: pickleable-by-dask,
pickleable-by-curio.run_in_process, ... And that's doable without any
special interpreter support. E.g. you could have
curio.Local(pickle=True) coordinate with curio.run_in_process.

> Some more comments:
>
> On Sat, Aug 12, 2017 at 7:35 PM, Nathaniel Smith  wrote:
> [..]
>> The advantages are:
>> - Eliminates the current PEP's issues with namespace collision; every
>> context item is automatically distinct from all others.
>
> TBH I think that the collision issue is slightly exaggerated.
>
>> - Eliminates the need for the None-means-del hack.
>
> I consider Execution Context to be an API, not a collection. It's an
> important distinction, If you view it that way, deletion on None is
> doesn't look that esoteric.

Deletion on None is still a special case that API users need to
remember, and it's a small footgun that you can't just take an
arbitrary Python object and round-trip it through the context.
Obviously these are both APIs and they can do anything that makes
sense, but all else being equal I prefer APIs that have fewer special
cases :-).

>> - Lets the interpreter hide the details of garbage collecting context values.
>
> I'm not sure I understand how the current PEP design is bad from the
> GC standpoint. Or how this proposal can be different, FWIW.

When the ContextItem object becomes unreachable and is collected, then
the interpreter knows that all of the values associated with it in
different contexts are also unreachable and can be collected.

I mentioned this in my email yesterday -- look at the hoops
threading.local jumps through to avoid breaking garbage collection.

This is closely related to the previous point, actually -- AFAICT the
only reason why it *really* matters that None deletes the item is that
you need to be able to delete to free the item from the dictionary,
which only matters if you want to dynamically allocate keys and then
throw them away again. In the ContextItem approach, there's no need to
manually delete the entry, you can just drop your reference to the
ContextItem and let the garbage collector take care of it.
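A minimal sketch of why keying contexts by ContextItem objects helps here (hypothetical; real interpreter support would make the WeakKeyDictionary unnecessary, but the lifetime behaviour is the same):

```python
# When the context is keyed by the ContextItem object itself, dropping
# the last reference to the key frees the stored value automatically.
import gc
import weakref

class ContextItem:
    pass

# Each "context" maps live ContextItem objects to their values.
context = weakref.WeakKeyDictionary()

ci = ContextItem()
context[ci] = "some value"
assert len(context) == 1
del ci                           # drop the only reference to the key...
gc.collect()
assert len(context) == 0         # ...and the value goes away with it
```

Compare this to string keys, where nothing ever tells the interpreter that a given key will never be looked up again.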

>> - Allows for more implementation flexibility. This could be
>> implemented directly on top of Yury's current prototype. But it could
>> also, for example, be implemented by storing the context values in a
>> flat array, where each context item is assigned an index when it's
>> allocated.
>
> You still want to have this optimization only for *some* keys. So I
> think a separate API is still needed.

Wait, why is it a requirement that some keys be slow? That seems like
a weird requirement :-).

-n

-- 
Nathaniel J. Smith -- https://vorpus.org
___
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread Kevin Conway
As far as providing a thread-local like surrogate for coroutine based
systems in Python, we had to solve this for Twisted with
https://bitbucket.org/hipchat/txlocal. Because of the way the Twisted
threadpooling works we also had to make a context system that was both
coroutine and thread safe at the same time.

We have a similar setup for asyncio but it seems we haven't open sourced
it. I'll ask around for it if this group feels that an asyncio example
would be beneficial. We implemented both of these in plain-old Python so
they should be compatible beyond CPython.

It's been over a year since I was directly involved with either of these
projects, but added memory and CPU consumption were stats we watched
closely and we found a negligible increase in both as we rolled out async
context.

On Sat, Aug 12, 2017 at 9:16 PM Nathaniel Smith  wrote:

> On Sat, Aug 12, 2017 at 6:27 PM, Yury Selivanov 
> wrote:
> > Yes, I considered this idea myself, but ultimately rejected it because:
> >
> > 1. Current solution makes it easy to introspect things. Get the
> > current EC and print it out.  Although the context item idea could be
> > extended to `sys.create_context_item('description')` to allow that.
>
> My first draft actually had the description argument :-). But then I
> deleted it on the grounds that there's also no way to introspect a
> list of all threading.local objects, and no-one seems to be bothered
> by that, so why should we bother here. Obviously it'd be trivial to
> add though, yeah; I don't really care either way.
>
> > 2. What if we want to pickle the EC? If all items in it are
> > pickleable, it's possible to dump the EC, send it over the network,
> > and re-use in some other process. It's not something I want to
> > consider in the PEP right now, but it's something that the current
> > design theoretically allows. AFAIU, `ci = sys.create_context_item()`
> > context item wouldn't be possible to pickle/unpickle correctly, no?
>
> That's true. In this API, supporting pickling would require some kind
> of opt-in on the part of EC users.
>
> But... pickling would actually need to be opt-in anyway. Remember, the
> set of all EC items is a piece of global shared state; we expect new
> entries to appear when random 3rd party libraries are imported. So we
> have no idea what is in there or what it's being used for. Blindly
> pickling the whole context will lead to bugs (when code unexpectedly
> ends up with context that wasn't designed to go across processes) and
> crashes (there's no guarantee that all the objects are even
> pickleable).
>
> If we do decide we want to support this in the future then we could
> add a generic opt-in mechanism something like:
>
> MY_CI = sys.create_context_item(__name__, "MY_CI", pickleable=True)
>
> But I'm not sure that it even makes sense to have a global flag
> enabling pickle. Probably it's better to have separate flags to opt-in
> to different libraries that might want to pickle in different
> situations for different reasons: pickleable-by-dask,
> pickleable-by-curio.run_in_process, ... And that's doable without any
> special interpreter support. E.g. you could have
> curio.Local(pickle=True) coordinate with curio.run_in_process.
>
> > Some more comments:
> >
> > On Sat, Aug 12, 2017 at 7:35 PM, Nathaniel Smith  wrote:
> > [..]
> >> The advantages are:
> >> - Eliminates the current PEP's issues with namespace collision; every
> >> context item is automatically distinct from all others.
> >
> > TBH I think that the collision issue is slightly exaggerated.
> >
> >> - Eliminates the need for the None-means-del hack.
> >
> > I consider Execution Context to be an API, not a collection. It's an
> > important distinction. If you view it that way, deletion on None
> > doesn't look that esoteric.
>
> Deletion on None is still a special case that API users need to
> remember, and it's a small footgun that you can't just take an
> arbitrary Python object and round-trip it through the context.
> Obviously these are both APIs and they can do anything that makes
> sense, but all else being equal I prefer APIs that have fewer special
> cases :-).
>
> >> - Lets the interpreter hide the details of garbage collecting context
> values.
> >
> > I'm not sure I understand how the current PEP design is bad from the
> > GC standpoint. Or how this proposal can be different, FWIW.
>
> When the ContextItem object becomes unreachable and is collected, then
> the interpreter knows that all of the values associated with it in
> different contexts are also unreachable and can be collected.
>
> I mentioned this in my email yesterday -- look at the hoops
> threading.local jumps through to avoid breaking garbage collection.
>
> This is closely related to the previous point, actually -- AFAICT the
> only reason why it *really* matters that None deletes the item is that
> you need to be able to delete to free the item from the dictionary,
> which only matters if you want to dynamically allocate keys and then
> throw them away again.

Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread Nick Coghlan
On 13 August 2017 at 11:27, Yury Selivanov  wrote:
> Yes, I considered this idea myself, but ultimately rejected it because:
>
> 1. Current solution makes it easy to introspect things. Get the
> current EC and print it out.  Although the context item idea could be
> extended to `sys.create_context_item('description')` to allow that.

I think the TLS/TSS precedent means we should seriously consider the
ContextItem + ContextStateToken approach for the core low level API.

We also have a long history of pain and quirks arising from the
locals() builtin being defined as returning a mapping even though
function locals are managed as a linear array, so if we can avoid that
for the execution context, it will likely be beneficial for both end
users (due to less quirky runtime behaviour, especially across
implementations) and language implementation developers (due to a
reduced need to make something behave like an ordinary mapping when it
really isn't).

If we decide we want a separate context introspection API (akin to
inspect.getcoroutinelocals() and inspect.getgeneratorlocals()), then
an otherwise opaque ContextStateToken would be sufficient to enable
that. Even if we don't need it for any other reason, having such an
API available would be desirable for the regression test suite.

For example, if context items are hashable, we could have the
following arrangement:

# Create new context items
sys.create_context_item(name)
# Opaque token for the current execution context
sys.get_context_token()
# Switch the current execution context to the given one
sys.set_context(context_token)
# Snapshot mapping context items to their values in given context
sys.get_context_items(context_token)

As Nathaniel suggested, getting/setting/deleting individual items in
the current context would be implemented as methods on the ContextItem
objects, allowing the return value of "get_context_items" to be a
plain dictionary, rather than a special type that directly supported
updates to the underlying context.
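To make the intended usage concrete, here is a pure-Python emulation of that low-level API. The function names come from the sketch above; the implementation details (a module-level dict, tokens as plain snapshots) are invented purely for illustration and are not what the interpreter would actually do:

```python
# Emulation of the proposed ContextItem + token API.
_current = {}

class ContextItem:
    def __init__(self, name):
        self.name = name
    def set(self, value):
        _current[self] = value
    def get(self, default=None):
        return _current.get(self, default)

def create_context_item(name):
    return ContextItem(name)

def get_context_token():
    return dict(_current)            # opaque snapshot of the current EC

def set_context(token):
    global _current
    _current = dict(token)           # switch to the given snapshot

def get_context_items(token):
    # A plain dict for introspection only; mutating it changes nothing.
    return {ci.name: value for ci, value in token.items()}

ci = create_context_item('request_id')
ci.set(123)
token = get_context_token()
ci.set(456)
set_context(token)                   # restore the earlier snapshot
assert ci.get() == 123
assert get_context_items(token) == {'request_id': 123}
```

Note how introspection falls out of get_context_items() without the context itself ever needing to behave like a mapping.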

> 2. What if we want to pickle the EC? If all items in it are
> pickleable, it's possible to dump the EC, send it over the network,
> and re-use in some other process. It's not something I want to
> consider in the PEP right now, but it's something that the current
> design theoretically allows. AFAIU, `ci = sys.create_context_item()`
> context item wouldn't be possible to pickle/unpickle correctly, no?

As Nathaniel notes, cooperative partial pickling will be possible
regardless of how the low level API works, and starting with a simpler
low level API still doesn't rule out adding features like this at a
later date.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread Yury Selivanov
[replying to list]

On Sat, Aug 12, 2017 at 10:56 PM, Nick Coghlan  wrote:
> On 13 August 2017 at 11:27, Yury Selivanov  wrote:
>> Yes, I considered this idea myself, but ultimately rejected it because:
>>
>> 1. Current solution makes it easy to introspect things. Get the
>> current EC and print it out.  Although the context item idea could be
>> extended to `sys.create_context_item('description')` to allow that.
>
> I think the TLS/TSS precedent means we should seriously consider the
> ContextItem + ContextStateToken approach for the core low level API.

I actually like the idea and am fully open to it. I'm also curious if
it's possible to adapt the flat-array/fast access ideas that Nathaniel
mentioned.

Yury


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread Nick Coghlan
On 13 August 2017 at 12:15, Nathaniel Smith  wrote:
> On Sat, Aug 12, 2017 at 6:27 PM, Yury Selivanov  
> wrote:
>> Yes, I considered this idea myself, but ultimately rejected it because:
>>
>> 1. Current solution makes it easy to introspect things. Get the
>> current EC and print it out.  Although the context item idea could be
>> extended to `sys.create_context_item('description')` to allow that.
>
> My first draft actually had the description argument :-). But then I
> deleted it on the grounds that there's also no way to introspect a
> list of all threading.local objects, and no-one seems to be bothered
> by that, so why should we bother here.

In the TLS/TSS case, we have the design constraint of wanting to use
the platform provided TLS/TSS implementation when available, and
standard C APIs generally aren't designed to support rich runtime
introspection from regular C code - instead, they expect the debugger,
compiler, and standard library to be co-developed such that the
debugger knows how to figure out where the latter two have put things
at runtime.

> Obviously it'd be trivial to
> add though, yeah; I don't really care either way.

As noted in my other email, I like the idea of making the context
dependent state introspection API clearly distinct from the core
context dependent state management API.

That way the API implementation can focus on using the most efficient
data structures for the purpose, rather than being limited to the most
efficient data structures that can readily export a Python-style
mapping interface. The latter can then be provided purely for
introspection purposes.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread Nathaniel Smith
On Sat, Aug 12, 2017 at 9:05 PM, Nick Coghlan  wrote:
> On 13 August 2017 at 12:15, Nathaniel Smith  wrote:
>> On Sat, Aug 12, 2017 at 6:27 PM, Yury Selivanov  
>> wrote:
>>> Yes, I considered this idea myself, but ultimately rejected it because:
>>>
>>> 1. Current solution makes it easy to introspect things. Get the
>>> current EC and print it out.  Although the context item idea could be
>>> extended to `sys.create_context_item('description')` to allow that.
>>
>> My first draft actually had the description argument :-). But then I
>> deleted it on the grounds that there's also no way to introspect a
>> list of all threading.local objects, and no-one seems to be bothered
>> by that, so why should we bother here.
>
> In the TLS/TSS case, we have the design constraint of wanting to use
> the platform provided TLS/TSS implementation when available, and
> standard C APIs generally aren't designed to support rich runtime
> introspection from regular C code - instead, they expect the debugger,
> compiler, and standard library to be co-developed such that the
> debugger knows how to figure out where the latter two have put things
> at runtime.

Excellent point.

>> Obviously it'd be trivial to
>> add though, yeah; I don't really care either way.
>
> As noted in my other email, I like the idea of making the context
> dependent state introspection API clearly distinct from the core
> context dependent state management API.
>
> That way the API implementation can focus on using the most efficient
> data structures for the purpose, rather than being limited to the most
> efficient data structures that can readily export a Python-style
> mapping interface. The latter can then be provided purely for
> introspection purposes.

Also an excellent point :-).

-n

-- 
Nathaniel J. Smith -- https://vorpus.org


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread Yury Selivanov
On Sat, Aug 12, 2017 at 10:56 PM, Nick Coghlan  wrote:
[..]
> As Nathaniel suggested, getting/setting/deleting individual items in
> the current context would be implemented as methods on the ContextItem
> objects, allowing the return value of "get_context_items" to be a
> plain dictionary, rather than a special type that directly supported
> updates to the underlying context.

The current PEP 550 design returns a "snapshot" of the current EC with
sys.get_execution_context().

I.e. if you do

ec = sys.get_execution_context()
ec['a'] = 'b'

# sys.get_execution_context_item('a') will return None

You did get a snapshot and you modified it -- but your modifications
are not visible anywhere. You can run a function in that modified EC
with `ec.run(function)` and that function will see that new 'a' key,
but that's it. There's no "magical" updates to the underlying context.

Yury


Re: [Python-ideas] New PEP 550: Execution Context

2017-08-12 Thread Yury Selivanov
On Sat, Aug 12, 2017 at 10:12 AM, Nick Coghlan  wrote:
[..]
>
> 1. Are you sure you want to expose the CoW type to pure Python code?

Ultimately, why not? The execution context object you get with
sys.get_execution_context() is yours to change. Any change to it won't
be propagated anywhere, unless you execute something in that context
with ExecutionContext.run or set it as the current one.

>
> The draft API looks fairly error prone to me, as I'm not sure of the
> intended differences in behaviour between the following:
>
> @contextmanager
> def context(x):
>     old_x = sys.get_execution_context_item('x')
>     sys.set_execution_context_item('x', x)
>     try:
>         yield
>     finally:
>         sys.set_execution_context_item('x', old_x)
>
> @contextmanager
> def context(x):
>     old_x = sys.get_execution_context().get('x')
>     sys.get_execution_context()['x'] = x
>     try:
>         yield
>     finally:
>         sys.get_execution_context()['x'] = old_x

This one (the second example) won't do anything.

>
> @contextmanager
> def context(x):
>     ec = sys.get_execution_context()
>     old_x = ec.get('x')
>     ec['x'] = x
>     try:
>         yield
>     finally:
>         ec['x'] = old_x

This one (the third one) won't do anything either.

You can do this:

ec = sys.get_execution_context()
ec['x'] = x
ec.run(my_function)

or `sys.set_execution_context(ec)`


>
> It seems to me that everything would be a lot safer if the *only*
> Python level API was a live dynamic view that completely hid the
> copy-on-write behaviour behind an "ExecutionContextProxy" type, such
> that the last two examples were functionally equivalent to each other
> and to the current PEP's get/set functions (rendering the latter
> redundant, and allowing it to be dropped from the PEP).

So there's no copy-on-write exposed to Python actually. What I am
thinking about, though, is that we might not need the
sys.set_execution_context() function. If you want to run something
with a modified or empty execution context, do it through
ExecutionContext.run method.

> 2. Do we need an ag_isolated_execution_context for asynchronous
> generators? (Modify this question as needed for the answer to the next
> question)

Yes, we'll need it for contextlib.asynccontextmanager at least.

>
> 3. It bothers me that *_execution_context points to an actual
> execution context, while *_isolated_execution_context is a boolean.
> With names that similar I'd expect them to point to the same kind of
> object.

I think we touched upon this in a parallel thread. But I think we can
rename "gi_isolated_execution_context" to
"gi_execution_context_isolated" or something more readable/obvious.

Yury