Re: [Python-Dev] PEP 554 v3 (new interpreters module)

2017-09-14 Thread Nick Coghlan
On 14 September 2017 at 11:44, Eric Snow  wrote:
> About Subinterpreters
> =====================
>
> Shared data
> -----------

[snip]

> To make this work, the mutable shared state will be managed by the
> Python runtime, not by any of the interpreters.  Initially we will
> support only one type of objects for shared state: the channels provided
> by ``create_channel()``.  Channels, in turn, will carefully manage
> passing objects between interpreters.

Something I think you may want to explicitly call out as *not* being
shared is the thread objects in threading.enumerate(), as the way that
works in the current implementation makes sense, but isn't
particularly obvious (what I have below comes from experimenting with
your branch at https://github.com/python/cpython/pull/1748).

Specifically, what happens is that the operating system thread
underlying the existing interpreter thread that calls interp.run()
gets borrowed as the operating system thread underlying the MainThread
object in the called interpreter. That MainThread object then gets
preserved in the interpreter's interpreter state, but the mapping to
an underlying OS thread will change freely based on who's calling into
it. From outside an interpreter, you *can't* request to run code in
subthreads directly - you'll always run your given code in the main
thread, and it will be up to that to dispatch requests to subthreads.

Beyond the thread lending that happens when you call interp.run()
(where one of your threads gets borrowed as the other interpreter's
main thread), each interpreter otherwise maintains a completely
disjoint set of thread objects that it is solely responsible for.

This also clarifies for me what it means for an interpreter to be a
"main" interpreter: it's the interpreter whose main thread actually
corresponds to the main thread of the overall operating system
process, rather than being temporarily borrowed from another
interpreter.

We're going to have to put some thought into how we want that to
interact with the signal handling logic - right now, I believe *any*
main thread will consider it its responsibility to process signals
delivered to the runtime (and embedding applications avoid the
potential problems arising from that by simply not installing the
CPython signal handlers in the first place), and we probably want to
change that condition to be "the main thread in the main interpreter".
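
For context, here is a minimal sketch of the related restriction as it
already exists at the Python level inside a single interpreter:
signal.signal() raises ValueError when it is called from any thread other
than the main one. The helper name try_install_handler is made up for
illustration, and the sketch does not involve subinterpreters at all.

```python
import signal
import threading


def try_install_handler(results):
    """Try to install a SIGINT handler from the calling thread."""
    try:
        previous = signal.signal(signal.SIGINT, signal.default_int_handler)
    except ValueError as exc:
        # CPython refuses to install handlers outside the main thread.
        results.append(f"refused: {exc}")
        return
    if previous is not None:
        # Put back whatever handler was installed before the experiment.
        signal.signal(signal.SIGINT, previous)
    results.append("installed")


# Run the attempt in a worker thread, where it is expected to be refused.
worker_results = []
worker = threading.Thread(target=try_install_handler, args=(worker_results,))
worker.start()
worker.join()
print(worker_results)
```

The open question in the paragraph above is how the equivalent check for
*processing* signals should treat a borrowed main thread in a
subinterpreter, which this single-interpreter sketch cannot show.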

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 554 v3 (new interpreters module)

2017-09-14 Thread Nick Coghlan
On 15 September 2017 at 12:04, Nathaniel Smith  wrote:
> On Thu, Sep 14, 2017 at 5:44 PM, Nick Coghlan  wrote:
>> The reason we're OK with this is that it means that only reading a new
>> message from a channel (i.e. creating a cross-interpreter view) or
>> discarding a previously read message (i.e. closing a cross-interpreter
>> view) will be synchronisation points where the receiving interpreter
>> necessarily needs to acquire the sending interpreter's GIL.
>>
>> By contrast, if we allow an actual bytes object to be shared, then
>> either every INCREF or DECREF on that bytes object becomes a
>> synchronisation point, or else we end up needing some kind of
>> secondary per-interpreter refcount where the interpreter doesn't drop
>> its shared reference to the original object in its source interpreter
>> until the internal refcount in the borrowing interpreter drops to
>> zero.
>
> Ah, that makes more sense.
>
> I am nervous that allowing arbitrary memoryviews gives a *little* more
> power than we need or want. I like that the current API can reasonably
> be emulated using subprocesses -- it opens up the door for backports,
> compatibility support on language implementations that don't support
> subinterpreters, direct benchmark comparisons between the two
> implementation strategies, etc. But if we allow arbitrary memoryviews,
> then this requires that you can take (a) an arbitrary object, not
> specified ahead of time, and (b) provide two read-write views on it in
> separate interpreters such that modifications made in one are
> immediately visible in the other. Subprocesses can do one or the other
> -- they can copy arbitrary data, and if you warn them ahead of time
> when you allocate the buffer, they can do real zero-copy shared
> memory. But the combination is really difficult.

One constraint we'd want to impose is that the memory view in the
receiving interpreter should always be read-only - while we don't
currently expose the ability to request that at the Python layer,
memoryviews *do* support the creation of read-only views at the C API
layer (which then gets reported to Python code via the "view.readonly"
attribute).

While that change alone is enough to preserve the simplex nature of
the channel, it wouldn't be enough to prevent the *sender* from
mutating the buffer contents and having that change be visible in the
recipient.

In that regard it may make sense to maintain both restrictions
initially (as you suggested below): only accept bytes on the sending
side (to prevent mutation by the sender), and expose that as a
read-only memory view on the receiving side (to allow for zero-copy
data sharing without allowing mutation by the receiver).
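
The two restrictions can be illustrated within a single interpreter. This
is only a sketch: a plain memoryview stands in for the proposed
cross-interpreter view, and memoryview.toreadonly() is a Python-level
method that was added later (Python 3.8), so it was not available when
this message was written.

```python
# Case 1: a mutable buffer exposed through a read-only view.
payload = bytearray(b"hello")
view = memoryview(payload).toreadonly()  # requires Python 3.8+

try:
    view[0] = ord("j")          # the "receiver" tries to write
    receiver_could_write = True
except TypeError:
    receiver_could_write = False

payload[0] = ord("j")           # the "sender" mutates its own buffer
seen_by_receiver = bytes(view)  # the change shows through the view

# Case 2: bytes in, memoryview out. The source is immutable, so neither
# side can change what the other one sees.
frozen = b"hello"
frozen_view = memoryview(frozen)

print(receiver_could_write, seen_by_receiver, frozen_view.readonly)
```

Case 1 shows why a read-only view alone is not enough, and case 2 shows
why accepting only bytes on the sending side closes the remaining gap.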

> It'd be one thing if this were like a key feature that gave
> subinterpreters an advantage over subprocesses, but it seems really
> unlikely to me that a library won't know ahead of time when it's
> filling in a buffer to be transferred, and if anything it seems like
> we'd rather not expose read-write shared mappings in any case. It's
> extremely non-trivial to do right [1].
>
> tl;dr: let's not rule out a useful implementation strategy based on a
> feature we don't actually need.

Yeah, the description Eric currently has in the PEP is a summary of a
much longer suggestion Yury, Neil Schemenauer and I put together while
waiting for our flights following the core dev sprint, and the full
version had some of these additional constraints on it (most notably
the "read-only in the receiving interpreter" one).

> One alternative would be your option (3) -- you can put bytes in and
> get memoryviews out, and since bytes objects are immutable it's OK.

Indeed, I think that will be a sensible starting point. However, I
genuinely want to allow for zero-copy sharing of NumPy arrays
eventually, as that's where I think this idea gets most interesting:
the potential to allow for multiple parallel read operations on a
given NumPy array *in Python* (rather than Cython or C) without
running afoul of the GIL, and without needing to mess about with the
complexities of operating system level IPC.

>>>> Handling an exception
>> That way channels can be a namespace *specifically* for passing in
>> channels, and can be reported as such on RunResult. If we decide to
>> allow arbitrary shared objects in the future, or add flag options like
>> "reraise=True" to reraise exceptions from the subinterpreter in the
>> current interpreter, we'd have that ability, rather than having the
>> entire potential keyword namespace taken up for passing shared
>> objects.
>
> Would channels be a dict, or...?

Yeah, it would be a direct replacement for the way the current draft
is proposing to use the keywords dict - it would just be a separate
dictionary instead.

It does occur to me that if we wanted to align with the way the
`runpy` module spells that concept, we'd call the option
`init_globals`, but I'm thinking it will be better to only allow
channels to be passed through directly, and requir

Re: [Python-Dev] PEP 554 v3 (new interpreters module)

2017-09-14 Thread Nathaniel Smith
On Thu, Sep 14, 2017 at 5:44 PM, Nick Coghlan  wrote:
> On 14 September 2017 at 15:27, Nathaniel Smith  wrote:
>> I don't get it. With bytes, you can either share objects or copy them and
>> the user can't tell the difference, so you can change your mind later if you
>> want.
>> But memoryviews require some kind of cross-interpreter strong
>> reference to keep the underlying buffer object alive. So if you want to
>> minimize object sharing, surely bytes are more future-proof.
>
> Not really, because the only way to ensure object separation (i.e. no
> refcounted objects accessible from multiple interpreters at once) with
> a bytes-based API would be to either:
>
> 1. Always copy (eliminating most of the low overhead communications
> benefits that subinterpreters may offer over multiple processes)
> 2. Make the bytes implementation more complicated by allowing multiple
> bytes objects to share the same underlying storage while presenting as
> distinct objects in different interpreters
> 3. Make the output on the receiving side not actually a bytes object,
> but instead a view onto memory owned by another object in a different
> interpreter (a "memory view", one might say)
>
> And yes, using memory views for this does mean defining either a
> subclass or a mediating object that not only keeps the originating
> object alive until the receiving memoryview is closed, but also
> retains a reference to the originating interpreter so that it can
> switch to it when it needs to manipulate the source object's refcount
> or call one of the buffer methods.
>
> Yury and I are fine with that, since it means that either the sender
> *or* the receiver can decide to copy the data (e.g. by calling
> bytes(obj) before sending, or bytes(view) after receiving), and in the
> meantime, the object holding the cross-interpreter view knows that it
> needs to switch interpreters (and hence acquire the sending
> interpreter's GIL) before doing anything with the source object.
>
> The reason we're OK with this is that it means that only reading a new
> message from a channel (i.e. creating a cross-interpreter view) or
> discarding a previously read message (i.e. closing a cross-interpreter
> view) will be synchronisation points where the receiving interpreter
> necessarily needs to acquire the sending interpreter's GIL.
>
> By contrast, if we allow an actual bytes object to be shared, then
> either every INCREF or DECREF on that bytes object becomes a
> synchronisation point, or else we end up needing some kind of
> secondary per-interpreter refcount where the interpreter doesn't drop
> its shared reference to the original object in its source interpreter
> until the internal refcount in the borrowing interpreter drops to
> zero.

Ah, that makes more sense.

I am nervous that allowing arbitrary memoryviews gives a *little* more
power than we need or want. I like that the current API can reasonably
be emulated using subprocesses -- it opens up the door for backports,
compatibility support on language implementations that don't support
subinterpreters, direct benchmark comparisons between the two
implementation strategies, etc. But if we allow arbitrary memoryviews,
then this requires that you can take (a) an arbitrary object, not
specified ahead of time, and (b) provide two read-write views on it in
separate interpreters such that modifications made in one are
immediately visible in the other. Subprocesses can do one or the other
-- they can copy arbitrary data, and if you warn them ahead of time
when you allocate the buffer, they can do real zero-copy shared
memory. But the combination is really difficult.

It'd be one thing if this were like a key feature that gave
subinterpreters an advantage over subprocesses, but it seems really
unlikely to me that a library won't know ahead of time when it's
filling in a buffer to be transferred, and if anything it seems like
we'd rather not expose read-write shared mappings in any case. It's
extremely non-trivial to do right [1].

tl;dr: let's not rule out a useful implementation strategy based on a
feature we don't actually need.

One alternative would be your option (3) -- you can put bytes in and
get memoryviews out, and since bytes objects are immutable it's OK.

[1] https://en.wikipedia.org/wiki/Memory_model_(programming)

>>> Handling an exception
>>> ---------------------
>> It would also be reasonable to simply not return any value/exception from
>> run() at all, or maybe just a bool for whether there was an unhandled
>> exception. Any high level API is going to be injecting code on both sides of
>> the interpreter boundary anyway, so it can do whatever exception and
>> traceback translation it wants to.
>
> So any more detailed response would *have* to come back as a channel message?
>
> That sounds like a reasonable option to me, too, especially since
> module level code doesn't have a return value as such - you can really
> only say "it raised an exception (and this was the exception it

Re: [Python-Dev] PEP 557: Data Classes

2017-09-14 Thread Nick Coghlan
On 15 September 2017 at 02:56, Mike Miller  wrote:
>
> On 2017-09-12 19:09, Nick Coghlan wrote:
>>
>> On 13 September 2017 at 02:01, Chris Barker - NOAA Federal
>>  wrote:
>>>
>>> This really does match well with the record concept in databases, and
>>> most
>>> people are familiar with that.
>>
>>
>> No, most people aren't familiar with that - they only become familiar
>> with it *after* they've learned to program and learned what a database
>> is.
>
>
> Pretty sure he was talking about programmers, and they are introduced to the
> concept early.  Structs, objects with fields, random access files,
> databases, etc.  Lay-folks are familiar with "keeping records" as you
> mention, but they are not the primary customer it seems.

Python is an incredibly common first programming language, so we need
to keep folks with *zero* knowledge of programming jargon firmly in
mind when designing new features. That isn't always the most important
consideration, but it's always *a* consideration.

And, as Stefan notes in his reply, we also need to keep *misleading*
inferences in mind when we consider repurposing existing jargon for a
new use case - what seems like an obviously intuitive connection based
on our own individual experiences with a term may turn out to be
extremely counterintuitive for someone with a different experience of
the same term.

In such cases, it can make sense to look for new *semantically
neutral* terminology as the official glossary entry and API naming
scheme, and rely on documentation to indicate that this is a
realisation of a feature that goes by other names in other contexts.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] PEP 554 v3 (new interpreters module)

2017-09-14 Thread Nick Coghlan
On 14 September 2017 at 15:27, Nathaniel Smith  wrote:
> On Sep 13, 2017 9:01 PM, "Nick Coghlan"  wrote:
>
> On 14 September 2017 at 11:44, Eric Snow 
> wrote:
>>send(obj):
>>
>>Send the object to the receiving end of the channel.  Wait until
>>the object is received.  If the channel does not support the
>>object then TypeError is raised.  Currently only bytes are
>>supported.  If the channel has been closed then EOFError is
>>raised.
>
> I still expect any form of object sharing to hinder your
> per-interpreter GIL efforts, so restricting the initial implementation
> to memoryview-only seems more future-proof to me.
>
>
> I don't get it. With bytes, you can either share objects or copy them and
> the user can't tell the difference, so you can change your mind later if you
> want.
> But memoryviews require some kind of cross-interpreter strong
> reference to keep the underlying buffer object alive. So if you want to
> minimize object sharing, surely bytes are more future-proof.

Not really, because the only way to ensure object separation (i.e. no
refcounted objects accessible from multiple interpreters at once) with
a bytes-based API would be to either:

1. Always copy (eliminating most of the low overhead communications
benefits that subinterpreters may offer over multiple processes)
2. Make the bytes implementation more complicated by allowing multiple
bytes objects to share the same underlying storage while presenting as
distinct objects in different interpreters
3. Make the output on the receiving side not actually a bytes object,
but instead a view onto memory owned by another object in a different
interpreter (a "memory view", one might say)

And yes, using memory views for this does mean defining either a
subclass or a mediating object that not only keeps the originating
object alive until the receiving memoryview is closed, but also
retains a reference to the originating interpreter so that it can
switch to it when it needs to manipulate the source object's refcount
or call one of the buffer methods.

Yury and I are fine with that, since it means that either the sender
*or* the receiver can decide to copy the data (e.g. by calling
bytes(obj) before sending, or bytes(view) after receiving), and in the
meantime, the object holding the cross-interpreter view knows that it
needs to switch interpreters (and hence acquire the sending
interpreter's GIL) before doing anything with the source object.
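
As a rough single-interpreter sketch of the two copy points mentioned
above, an ordinary memoryview is used here as a stand-in for the proposed
cross-interpreter view, and the variable names are illustrative only.

```python
source = bytearray(b"spam and eggs")

# Sender-side copy: snapshot the data before it is handed over.
snapshot_before_send = bytes(source)

# Receiver-side copy: take a view first, then copy out of it.
view = memoryview(source)
snapshot_after_receive = bytes(view)

# "Closing" the view drops its hold on the source buffer.
view.release()

# Later changes to the source do not affect either copy.
source[0:4] = b"SPAM"

try:
    view[0]                      # a released view can no longer be read
    view_still_usable = True
except ValueError:
    view_still_usable = False

print(snapshot_before_send, snapshot_after_receive, view_still_usable)
```

In the real proposal, creating and releasing the view would be the points
at which the sending interpreter's GIL is needed, which this sketch
cannot demonstrate.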

The reason we're OK with this is that it means that only reading a new
message from a channel (i.e. creating a cross-interpreter view) or
discarding a previously read message (i.e. closing a cross-interpreter
view) will be synchronisation points where the receiving interpreter
necessarily needs to acquire the sending interpreter's GIL.

By contrast, if we allow an actual bytes object to be shared, then
either every INCREF or DECREF on that bytes object becomes a
synchronisation point, or else we end up needing some kind of
secondary per-interpreter refcount where the interpreter doesn't drop
its shared reference to the original object in its source interpreter
until the internal refcount in the borrowing interpreter drops to
zero.

>> Handling an exception
>> ---------------------
> It would also be reasonable to simply not return any value/exception from
> run() at all, or maybe just a bool for whether there was an unhandled
> exception. Any high level API is going to be injecting code on both sides of
> the interpreter boundary anyway, so it can do whatever exception and
> traceback translation it wants to.

So any more detailed response would *have* to come back as a channel message?

That sounds like a reasonable option to me, too, especially since
module level code doesn't have a return value as such - you can really
only say "it raised an exception (and this was the exception it
raised)" or "it reached the end of the code without raising an
exception".

Given that, I think subprocess.run() (with check=False) is the right
API precedent here:
https://docs.python.org/3/library/subprocess.html#subprocess.run

That always returns subprocess.CompletedProcess, and then you can call
"cp.check_returncode()" to get it to raise
subprocess.CalledProcessError for non-zero return codes.
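
A short example of that precedent, using only documented subprocess
behaviour. The child command is an arbitrary stand-in that exits with
status 3.

```python
import subprocess
import sys

# check defaults to False, so a non-zero exit status does not raise here.
cp = subprocess.run([sys.executable, "-c", "import sys; sys.exit(3)"])
print(type(cp).__name__, cp.returncode)

# The caller opts in to an exception afterwards.
try:
    cp.check_returncode()
    error = None
except subprocess.CalledProcessError as exc:
    error = exc

print(error.returncode if error is not None else "no error")
```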

For interpreter.run(), we could keep the initial RunResult *really*
simple and only report back:

* source: the source code passed to run()
* shared: the keyword args passed to run() (name chosen to match
functools.partial)
* completed: completed execution without raising an exception? (True
if yes, False otherwise)

Whether or not to report more details for a raised exception, and
provide some mechanism to reraise it in the calling interpreter could
then be deferred until later.
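
A minimal sketch of what such a result object might look like. The three
field names come from the list above; everything else is an assumption
made for illustration. In particular, the run() function below just calls
exec() in the current interpreter as a stand-in, and is not the PEP's API.

```python
from typing import Any, Mapping, NamedTuple


class RunResult(NamedTuple):
    """Hypothetical result type with the three fields listed above."""
    source: str
    shared: Mapping[str, Any]
    completed: bool


def run(source_str, **shared):
    """Stand-in for Interpreter.run(): executes in *this* interpreter."""
    namespace = dict(shared)  # copy, so exec() does not modify `shared`
    try:
        exec(source_str, namespace)
    except Exception:
        completed = False     # details of the exception are not reported
    else:
        completed = True
    return RunResult(source_str, shared, completed)


ok = run("total = value + 1", value=1)
failed = run("1 / 0")
print(ok.completed, failed.completed)
```

As with subprocess.CompletedProcess, the caller inspects the result rather
than catching an exception, and richer error reporting can be layered on
later.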

The subprocess.run() comparison does make me wonder whether this might
be a more future-proof signature for Interpreter.run() though:

def run(source_str, /, *, channels=None):
    ...

That way channels can

Re: [Python-Dev] PEP 549: Instance Properties (aka: module properties)

2017-09-14 Thread Ivan Levkivskyi
On 14 September 2017 at 23:02, Ivan Levkivskyi  wrote:

> On 14 September 2017 at 22:07, Ethan Furman  wrote:
>
>> For comparison's sake, what would the above look like using __class__
>> assignment?  And what is the performance difference?
>>
>>
> FWIW I found a different solution:
>
> # file mod.py
>
> from typing_extensions import allow_forward_references
> allow_forward_references()
> from mod import Vertex, Edge  # the import is from this same module.
>
> It works both with __class__ assignment and with __getattr__
>
> --
> Ivan
>
>
Anyway, I don't think we should take this seriously. The way forward is
PEP 563: we should have a clear separation between the runtime context and
the type context. In the latter, forward references are OK, but in the
former they are quite weird.

--
Ivan


Re: [Python-Dev] PEP 549: Instance Properties (aka: module properties)

2017-09-14 Thread Ivan Levkivskyi
On 14 September 2017 at 22:07, Ethan Furman  wrote:

> For comparison's sake, what would the above look like using __class__
> assignment?  And what is the performance difference?
>
>
FWIW I found a different solution:

# file mod.py

from typing_extensions import allow_forward_references
allow_forward_references()
from mod import Vertex, Edge  # the import is from this same module.

It works both with __class__ assignment and with __getattr__

--
Ivan


Re: [Python-Dev] PEP 557: Data Classes

2017-09-14 Thread Guido van Rossum
Let's all please take a time out from the naming discussion.

On Sep 14, 2017 11:15 AM, "Stefan Krah"  wrote:

> On Thu, Sep 14, 2017 at 11:06:15AM -0700, Mike Miller wrote:
> > On 2017-09-14 10:45, Stefan Krah wrote:
> > >I'd expect something like a C struct or an ML record.
> >
> > Struct is taken, and your second example is record.
>
> *If* the name were collections.record, I'd expect collections.record to
> be something like a C struct or an ML record. I'm NOT proposing "record".
>
>
> > > from dataclass import dataclass
> > >
> > >This is more intuitive, since the PEP example also has attached methods
> > >like total_cost().  I don't think this is really common for records.
> >
> > Every class can be extended, does that mean they can't be given
> appropriate names?
>
> A class is not a record. This brief conversation already convinced me that
> "record" is a bad name for the proposed construct.
>
>
>
> Stefan Krah
>
>
>


Re: [Python-Dev] PEP 549: Instance Properties (aka: module properties)

2017-09-14 Thread Ivan Levkivskyi
On 14 September 2017 at 22:21, Ivan Levkivskyi  wrote:

> On 14 September 2017 at 22:07, Ethan Furman  wrote:
>
>>
>> For comparison's sake, what would the above look like using __class__
>> assignment?  And what is the performance difference?
>>
>>
> Actually I tried but I can't implement this without module __getattr__
> so that one can just write:
>
> from typing_extensions import allow_forward_references
> allow_forward_references('Vertex', 'Edge')
>
> Maybe I am missing something, but either it is impossible in principle or
> highly non-trivial.
>
>
Actually my version does not work either on my branch :-(
But maybe this is a problem with my implementation.
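
One possible explanation, which the thread does not confirm, is that a
module-level __getattr__ is only consulted for attribute access on the
module object, not for bare global names used inside the module. The
sketch below shows that behaviour with module __getattr__ as it
eventually shipped in Python 3.7 (PEP 562); the experimental branch
discussed here may have behaved differently. The module name "mod" and the
names looked up are illustrative.

```python
import types

# Build a throwaway module that defines a module-level __getattr__.
mod = types.ModuleType("mod")
exec(
    "def __getattr__(name):\n"
    "    return name\n",
    mod.__dict__,
)

# Attribute access on the module object falls back to __getattr__.
attribute_result = mod.Vertex

# A bare global name inside the module does not.
try:
    exec("alias = Vertex", mod.__dict__)
    bare_name_result = "resolved"
except NameError:
    bare_name_result = "NameError"

print(attribute_result, bare_name_result)
```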

--
Ivan


Re: [Python-Dev] PEP 549: Instance Properties (aka: module properties)

2017-09-14 Thread Ivan Levkivskyi
On 14 September 2017 at 22:07, Ethan Furman  wrote:

>
> For comparison's sake, what would the above look like using __class__
> assignment?  And what is the performance difference?
>
>
Actually I tried but I can't implement this without module __getattr__
so that one can just write:

from typing_extensions import allow_forward_references
allow_forward_references('Vertex', 'Edge')

Maybe I am missing something, but either it is impossible in principle or
highly non-trivial.

--
Ivan


Re: [Python-Dev] PEP 549: Instance Properties (aka: module properties)

2017-09-14 Thread Ethan Furman

On 09/14/2017 12:08 PM, Ivan Levkivskyi wrote:
> On 14 September 2017 at 01:13, Guido van Rossum wrote:
>> That last sentence is a key observation. Do we even know whether there are
>> (non-toy) things that you can do *in principle* with __class__ assignment
>> but which are too slow *in practice* to bother? And if yes, is __getattr__
>> fast enough? @property?
>
> I myself have never implemented deprecation warnings management nor lazy
> loading, so it is hard to say if __class__ assignment is fast enough. For me
> it is more a combination of three factors:
>
> * modest performance improvement
> * some people might find __getattr__ clearer than __class__ assignment
> * this would be consistent with how stubs work
>
>> IMO we're still looking for applications.
>
> How about this
>
> def allow_forward_references(*allowed):
>     caller_globals = sys._getframe().__globals__
>     def typing_getattr(name):
>         if name in allowed:
>             return name
>         raise AttributeError(...)
>     caller_globals.__getattr__ = typing_getattr
>
> from typing_extensions import allow_forward_references
> allow_forward_references('Vertex', 'Edge')
>
> T = TypeVar('T', bound=Edge)
>
> class Vertex(List[Edge]):
>     def copy(self: T) -> T:
>         ...
>
> class Edge:
>     ends: Tuple[Vertex, Vertex]
>     ...
>
> Look mum, no quotes! :-)


For comparison's sake, what would the above look like using __class__ 
assignment?  And what is the performance difference?

--
~Ethan~



Re: [Python-Dev] PEP 549: Instance Properties (aka: module properties)

2017-09-14 Thread Ivan Levkivskyi
(sorry for obvious mistakes in the example in previous e-mail)

On 14 September 2017 at 21:08, Ivan Levkivskyi  wrote:

> On 14 September 2017 at 01:13, Guido van Rossum  wrote:
>
>>
>> That last sentence is a key observation. Do we even know whether there
>> are (non-toy) things that you can do *in principle* with __class__
>> assignment but which are too slow *in practice* to bother? And if yes, is
>> __getattr__ fast enough? @property?
>>
>>
> I myself have never implemented deprecation warnings management nor lazy
> loading, so it is hard to say if __class__ assignment is fast enough. For me
> it is more a combination of three factors:
>
> * modest performance improvement
> * some people might find __getattr__ clearer than __class__ assignment
> * this would be consistent with how stubs work
>
>
>> IMO we're still looking for applications.
>>
>>
> How about this
>
> def allow_forward_references(*allowed):
>     caller_globals = sys._getframe().__globals__
>     def typing_getattr(name):
>         if name in allowed:
>             return name
>         raise AttributeError(...)
>     caller_globals.__getattr__ = typing_getattr
>
> from typing_extensions import allow_forward_references
> allow_forward_references('Vertex', 'Edge')
>
> T = TypeVar('T', bound=Edge)
>
> class Vertex(List[Edge]):
>     def copy(self: T) -> T:
>         ...
>
> class Edge:
>     ends: Tuple[Vertex, Vertex]
>     ...
>
> Look mum, no quotes! :-)
>
> --
> Ivan
>
>
>


Re: [Python-Dev] PEP 549: Instance Properties (aka: module properties)

2017-09-14 Thread Ivan Levkivskyi
On 14 September 2017 at 01:13, Guido van Rossum  wrote:

>
> That last sentence is a key observation. Do we even know whether there are
> (non-toy) things that you can do *in principle* with __class__ assignment
> but which are too slow *in practice* to bother? And if yes, is __getattr__
> fast enough? @property?
>
>
I myself have never implemented deprecation warnings management nor lazy
loading, so it is hard to say if __class__ assignment is fast enough. For me
it is more a combination of three factors:

* modest performance improvement
* some people might find __getattr__ clearer than __class__ assignment
* this would be consistent with how stubs work
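
To make the comparison concrete, here is a rough sketch of the same
deprecation shim written both ways. The names DeprecatingModule, old_name
and new_name are hypothetical, throwaway module objects are used instead
of real imports, and the module-level __getattr__ half relies on the
behaviour that later shipped in Python 3.7 (PEP 562). No performance
claim is implied.

```python
import types
import warnings


# Spelling 1: __class__ assignment to a ModuleType subclass.
class DeprecatingModule(types.ModuleType):
    def __getattr__(self, name):
        if name == "old_name":
            warnings.warn("old_name is deprecated", DeprecationWarning,
                          stacklevel=2)
            return self.new_name
        raise AttributeError(name)


via_class = types.ModuleType("via_class")
via_class.new_name = 42
via_class.__class__ = DeprecatingModule


# Spelling 2: a plain function stored as the module's __getattr__.
via_getattr = types.ModuleType("via_getattr")
via_getattr.new_name = 42


def _module_getattr(name):
    if name == "old_name":
        warnings.warn("old_name is deprecated", DeprecationWarning,
                      stacklevel=2)
        return via_getattr.new_name
    raise AttributeError(name)


via_getattr.__getattr__ = _module_getattr

# Both spellings answer the deprecated name and emit a warning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    results = (via_class.old_name, via_getattr.old_name)

print(results, len(caught))
```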


> IMO we're still looking for applications.
>
>
How about this

def allow_forward_references(*allowed):
    caller_globals = sys._getframe().__globals__
    def typing_getattr(name):
        if name in allowed:
            return name
        raise AttributeError(...)
    caller_globals.__getattr__ = typing_getattr

from typing_extensions import allow_forward_references
allow_forward_references('Vertex', 'Edge')

T = TypeVar('T', bound=Edge)

class Vertex(List[Edge]):
    def copy(self: T) -> T:
        ...

class Edge:
    ends: Tuple[Vertex, Vertex]
    ...

Look mum, no quotes! :-)

--
Ivan


Re: [Python-Dev] PEP 557: Data Classes

2017-09-14 Thread Stefan Krah
On Thu, Sep 14, 2017 at 11:06:15AM -0700, Mike Miller wrote:
> On 2017-09-14 10:45, Stefan Krah wrote:
> >I'd expect something like a C struct or an ML record.
> 
> Struct is taken, and your second example is record.

*If* the name were collections.record, I'd expect collections.record to
be something like a C struct or an ML record. I'm NOT proposing "record".


> > from dataclass import dataclass
> >
> >This is more intuitive, since the PEP example also has attached methods
> >like total_cost().  I don't think this is really common for records.
> 
> Every class can be extended, does that mean they can't be given appropriate 
> names?

A class is not a record. This brief conversation already convinced me that
"record" is a bad name for the proposed construct.



Stefan Krah





Re: [Python-Dev] PEP 557: Data Classes

2017-09-14 Thread Barry Warsaw
On Sep 14, 2017, at 09:56, Mike Miller  wrote:

> Record is the most common name for this ubiquitous concept.

Mind if we call them Eric Classes to keep it clear?  Because if its name is not 
Eric Classes, it will cause a little confusion.

g’day-bruce-ly y’rs,
-Barry





Re: [Python-Dev] PEP 557: Data Classes

2017-09-14 Thread Mike Miller


On 2017-09-14 10:45, Stefan Krah wrote:
> I'd expect something like a C struct or an ML record.

Struct is taken, and your second example is record.

>> from dataclass import dataclass
>
> This is more intuitive, since the PEP example also has attached methods
> like total_cost().  I don't think this is really common for records.

Every class can be extended, does that mean they can't be given appropriate
names?

(Not to mention dataclass is hardly intuitive for something that can have
methods added.)


-Mike


Re: [Python-Dev] PEP 557: Data Classes

2017-09-14 Thread Stefan Krah
On Thu, Sep 14, 2017 at 10:24:52AM -0700, Mike Miller wrote:
> An elegant name can make the difference between another obscure
> module thrown in the stdlib to be never seen again and one that gets
> used every day.  Which is more intuitive?
> 
> from collections import record

I'd expect something like a C struct or an ML record.


> from dataclass import dataclass

This is more intuitive, since the PEP example also has attached methods
like total_cost().  I don't think this is really common for records.
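
For readers following along, here is a minimal sketch of the kind of class under discussion, modelled on the PEP 557 inventory example. It assumes the `dataclasses` module name that eventually shipped in Python 3.7; the `from dataclass import dataclass` spelling quoted above was only a placeholder in this thread, and the sample values are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class InventoryItem:
    """Track one item in inventory (modelled on the PEP 557 example)."""
    name: str
    unit_price: float
    quantity_on_hand: int = 0

    def total_cost(self) -> float:
        # An ordinary method: the decorator generates __init__, __repr__
        # and __eq__ from the fields; everything else is written as usual.
        return self.unit_price * self.quantity_on_hand


# Illustrative values only.
item = InventoryItem("widget", unit_price=2.5, quantity_on_hand=4)
print(item)
print(item.total_cost())
```

The attached method is what makes this look less like a C struct or ML record and more like a regular class with generated boilerplate.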



Stefan Krah





Re: [Python-Dev] PEP 557: Data Classes

2017-09-14 Thread Mike Miller


On 2017-09-12 21:05, Guido van Rossum wrote:
> It's ironic that some people dislike "data classes" because these are
> regular classes, not just for data, while others are proposing alternative
> names that emphasize the data container aspect. So "data classes" splits
> the difference, by referring to both data and classes.


True that these data-classes will be a superset of a traditional record.  But, 
we already have objects and inheritance for those use cases.  The data-class is 
meant to be used primarily like a record, so why not name it that way?


Almost everything is extensible in Python; that shouldn't prevent focused names, 
should it?




> Let's bikeshed about something else.


An elegant name can make the difference between another obscure module thrown in 
the stdlib to be never seen again and one that gets used every day.  Which is 
more intuitive?


from collections import record

from dataclass import dataclass



Would the language be as nice if "object" were named "instanceclass"?  Or 
perhaps the "requests" module could have been named "httpcall."  Much of the 
reluctance to use the attrs module is about its weird naming.


Because this is a simple, potentially ubiquitous enhancement, an elegant name 
is important.  "For humans," or something, haha.


-Mike


Re: [Python-Dev] PEP 557: Data Classes

2017-09-14 Thread Mike Miller


On 2017-09-12 19:09, Nick Coghlan wrote:
> On 13 September 2017 at 02:01, Chris Barker - NOAA Federal wrote:
> > This really does match well with the record concept in databases, and most
> > people are familiar with that.
>
> No, most people aren't familiar with that - they only become familiar
> with it *after* they've learned to program and learned what a database
> is.


Pretty sure he was talking about programmers, and they are introduced to the 
concept early: structs, objects with fields, random access files, databases, 
etc.  Lay-folks are familiar with "keeping records" as you mention, but they 
are not the primary customer, it seems.


Record is the most common name for this ubiquitous concept.



> whether it's referring to the noun (wreck-ord) or the verb (ree-cord).


This can be grasped quickly from context and, given the ubiquity mentioned 
above, is not likely to be a problem in the real world.  "Am I going to 
ree-cord this class?"




> > Also, considering their uses, it might make sense to put them in the
> > collections module.
>
> Data classes are things you're likely to put *in* a collection, rather
> than really being collections themselves (they're only collections in
> the same sense that all Python classes are collections of attributes,
> and that's not the way the collections module uses the term).



Yes, a collection of attributes, not significantly different from the 
namedtuple (that began this thread) or the various dictionaries implemented 
there already.  The criteria don't appear to be very strict; should they be?


(Also, it could be put into a submodule and imported into collections to 
maintain modularity.  Where it lands isn't so important, though; it's just 
that collections is relatively likely to be imported already on medium-sized 
projects, and I think the definition fits: collections == "bags of stuff".)
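
To make the namedtuple comparison concrete, here is a rough sketch contrasting the two. It assumes the `dataclasses` module as it later shipped in Python 3.7; `PointTuple` and `PointClass` are hypothetical names invented for this illustration, not anything from the PEP.

```python
from collections import namedtuple
from dataclasses import dataclass

# Hypothetical example types, for illustration only.
PointTuple = namedtuple("PointTuple", ["x", "y"])


@dataclass
class PointClass:
    x: int
    y: int


t = PointTuple(1, 2)
c = PointClass(1, 2)

# The namedtuple behaves as a container: it has a length, supports
# indexing and unpacking, and compares equal to a plain tuple.
x, y = t
tuple_like = len(t) == 2 and t[0] == 1 and t == (1, 2)

# It is also immutable: rebinding a field raises AttributeError.
try:
    t.x = 5
    tuple_mutable = True
except AttributeError:
    tuple_mutable = False

# The data class is a regular class: fields are reached by attribute
# only, it is mutable by default, and it defines no __len__.
c.x = 10
class_is_sized = hasattr(c, "__len__")

print(tuple_like, tuple_mutable, class_is_sized)
```

Whether "a class with named attributes" counts as a collection in the sense the collections module uses is exactly the point in dispute; the sketch only shows where the two behave differently.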


Cheers,
-Mike