[Python-Dev] Re: Enhancement request for PyUnicode proxies

2020-12-28 Thread Ronald Oussoren via Python-Dev


> On 28 Dec 2020, at 03:58, Greg Ewing  wrote:
> 
> Rather than a full-blown buffer-protocol-like thing, could we
> get by with something simpler? How about just having a flag
> in the unicode object indicating that it doesn't own the
> memory that it points to?

I don’t know about the OP, but for me that wouldn’t be good enough as I’d still
have to copy the string value because of the semantics of ObjC strings.

Ronald
—

Twitter / micro.blog: @ronaldoussoren
Blog: https://blog.ronaldoussoren.net/ 

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/D4LABINWQ6ZFPLDFRP6AL2PWCBA343DI/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Enum bug?

2020-12-28 Thread Stephen J. Turnbull
Paul Bryan via Python-Dev writes:

 > Should this be considered a bug in the Enum implementation?

Probably not.  The underlying implementation of Enums is integers, and
False and True *are* the integers 0 and 1 for most purposes.  And it
propagates further.  Same example:

>>> class Foo(enum.Enum):
...  A=True
...  B=1
...  C=0
...  D=False
... 
>>> Foo.B
<Foo.A: True>
>>> 

This amusing artifact was discussed in another thread recently.
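
A minimal sketch of why the aliasing happens, reusing the same `Foo` as above. It relies only on documented behaviour: `bool` is a subclass of `int`, equal values hash alike, and `Enum` treats a later member with an equal value as an alias of the earlier one. The assertions are illustrative additions, not part of the original report.

```python
import enum

class Foo(enum.Enum):
    A = True
    B = 1
    C = 0
    D = False

# bool is a subclass of int, so these values compare and hash as equal.
assert True == 1 and hash(True) == hash(1)
assert False == 0 and hash(False) == hash(0)

# B and D therefore become aliases of A and C rather than new members.
assert Foo.B is Foo.A
assert Foo.D is Foo.C

# Iteration yields only the canonical members; by-value lookup returns them too.
assert list(Foo) == [Foo.A, Foo.C]
assert Foo(1) is Foo.A and Foo(False) is Foo.C
```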
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/PI3E3R5YWM4HGFLUQGNS327QD3AQXY4Q/


[Python-Dev] Re: Enhancement request for PyUnicode proxies

2020-12-28 Thread Phil Thompson via Python-Dev

On 28/12/2020 02:07, Inada Naoki wrote:
> On Sun, Dec 27, 2020 at 8:20 PM Ronald Oussoren via Python-Dev wrote:
>> On 26 Dec 2020, at 18:43, Guido van Rossum wrote:
>>> On Sat, Dec 26, 2020 at 3:54 AM Phil Thompson via Python-Dev wrote:
>>
>> That wouldn’t be a solution for code using the PyUnicode_* APIs of
>> course, nor Python code explicitly checking for the str type.
>>
>> In the end a new string “kind” (next to the 1, 2 and 4 byte variants)
>> where callbacks are used to provide data might be the most pragmatic.
>> That will still break code peeking directly into the PyUnicodeObject
>> struct, but anyone doing that should know that that is not a stable
>> API.




> I had a similar idea for lazy loading or lazy decoding of Unicode
> objects.
>
> But I have rejected the idea and proposed to deprecate
> PyUnicode_READY() because of the balance between merits and
> complexity:
>
> * Simplifying the Unicode object may introduce more room for
>   optimization because Unicode is the essential type for Python. Since
>   Python is a dynamic language, a huge number of str comparisons happen
>   at runtime compared with static languages like Java and Rust.
> * Third parties may forget to check PyErr_Occurred() after APIs like
>   PyUnicode_Contains or PyUnicode_Compare when the author knows all
>   operands are exact Unicode type.
>
> Additionally, if we introduce the customizable lazy str object, it's
> very easy to release GIL during basic Unicode operations. Many third
> parties may assume PyUnicode_Compare doesn't release GIL if both
> operands are Unicode objects. It will produce bugs that are hard to
> find and reproduce.


I would have no problem with the protocol stating that the GIL must not 
be released by "foreign" unicode implementations.



> So I'm +1 to make Unicode simple by removing PyUnicode_READY(), and -1
> to make Unicode complicated by adding a customizable callback for lazy
> population.
>
> Anyway, I am OK to un-deprecate PyUnicode_READY() and make it a no-op
> macro from Python 3.12 on.
> But I don't know how many third parties use it properly, because
> legacy Unicode objects are very rare already.


For me lazy population might not be enough (as I'm not sure precisely
what you mean by it). I would like my foreign unicode thing to be used
as the storage.


For example (where text() returns a unicode object with a foreign 
kind)...


some_text = an_editor.text()
more_text = another_editor.text()

if some_text == more_text:
    print("The text is the same")

...would not involve any conversions at all. The following would require 
a conversion...


if some_text == "literal text":
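
The distinction between the two comparisons can be sketched in pure Python. This is only an illustration under stated assumptions: `ForeignText` is a hypothetical class, its UTF-16 byte string stands in for toolkit-owned storage, and it is an ordinary object rather than a real `str`, so it sidesteps the C API questions this thread is actually about.

```python
class ForeignText:
    """Hypothetical proxy around text held in a foreign representation."""

    conversions = 0  # counts how often the storage is decoded to a str

    def __init__(self, storage: bytes):
        # Stand-in for toolkit-owned storage (assumed to be UTF-16-LE here).
        self._storage = storage

    def __str__(self):
        ForeignText.conversions += 1
        return self._storage.decode("utf-16-le")

    def __eq__(self, other):
        if isinstance(other, ForeignText):
            # Same foreign kind: compare the storage directly, no conversion.
            return self._storage == other._storage
        if isinstance(other, str):
            # Mixed comparison: a conversion to str is required.
            return str(self) == other
        return NotImplemented


some_text = ForeignText("literal text".encode("utf-16-le"))
more_text = ForeignText("literal text".encode("utf-16-le"))

same = some_text == more_text          # no conversion
conversions_after_same_kind = ForeignText.conversions
mixed = some_text == "literal text"    # one conversion
```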

Phil
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ZSPNNLM25FRIEK2KYN5JORIR76PZH22N/


[Python-Dev] Re: Enhancement request for PyUnicode proxies

2020-12-28 Thread Phil Thompson via Python-Dev

On 28/12/2020 11:27, Inada Naoki wrote:
> On Mon, Dec 28, 2020 at 7:22 PM Phil Thompson wrote:
>>> So I'm +1 to make Unicode simple by removing PyUnicode_READY(), and -1
>>> to make Unicode complicated by adding a customizable callback for lazy
>>> population.
>>>
>>> Anyway, I am OK to un-deprecate PyUnicode_READY() and make it a no-op
>>> macro from Python 3.12 on.
>>> But I don't know how many third parties use it properly, because
>>> legacy Unicode objects are very rare already.
>>
>> For me lazy population might not be enough (as I'm not sure precisely
>> what you mean by it). I would like my foreign unicode thing to be used
>> as the storage.
>>
>> For example (where text() returns a unicode object with a foreign
>> kind)...
>>
>> some_text = an_editor.text()
>> more_text = another_editor.text()
>>
>> if some_text == more_text:
>>     print("The text is the same")
>>
>> ...would not involve any conversions at all.
>
> So you mean a custom internal representation of an exact Unicode object?
>
> Then I am an even stronger -1, sorry.
> I cannot believe that its merits outweigh the cost of its complexity.
>
> If a third party wants to use a completely different internal
> representation, it must not be a unicode object at all.


I would have thought that an object was defined by its behaviour rather 
than by any particular implementation detail. However I completely 
understand the desire to avoid additional complexity of the 
implementation.


Phil
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/D4U7TWKNP347HG37H56EPVJHUNRET7QX/


[Python-Dev] Re: Enhancement request for PyUnicode proxies

2020-12-28 Thread Inada Naoki
On Mon, Dec 28, 2020 at 8:52 PM Phil Thompson
 wrote:
>
>
> I would have thought that an object was defined by its behaviour rather
> than by any particular implementation detail.
>

As I understand it, the policy "an object was defined by its
behavior..." doesn't mean "put an unlimited amount of implementation
behind one concrete type."
The policy means APIs shouldn't limit input to one concrete type
without a reason. In other words, duck typing and structural subtyping
are good.

For example, we can try making io.TextIOWrapper accept not only
Unicode objects (including subclasses) but any object implementing some
protocol.
We already have __index__ for integers and the buffer protocol for
bytes-like objects. Those are examples of the policy.
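
As a small illustration of the existing protocols named above (plus `__fspath__`, which works the same way), here is a sketch showing that the consuming APIs accept any object implementing the protocol rather than one concrete type. The classes `Three` and `ConfigPath` are made up for this example.

```python
import operator
import os

class Three:
    """Hypothetical integer-like object: implements only __index__."""
    def __index__(self):
        return 3

class ConfigPath:
    """Hypothetical path-like object: implements only __fspath__."""
    def __fspath__(self):
        return "settings.ini"

# Indexing and operator.index() accept anything with __index__.
third = "abcdef"[Three()]
as_int = operator.index(Three())

# os.fspath() accepts anything with __fspath__.
path = os.fspath(ConfigPath())

# The buffer protocol lets bytes() read from any bytes-like exporter.
copied = bytes(memoryview(bytearray(b"abc")))
```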

Regards,
-- 
Inada Naoki  
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/E3ZMFJDYKDCFPA4ROESPK6T4JPYQMTLU/


[Python-Dev] Re: Enhancement request for PyUnicode proxies

2020-12-28 Thread Ronald Oussoren via Python-Dev


> On 28 Dec 2020, at 14:00, Inada Naoki  wrote:
> 
> On Mon, Dec 28, 2020 at 8:52 PM Phil Thompson
>  wrote:
>> 
>> 
>> I would have thought that an object was defined by its behaviour rather
>> than by any particular implementation detail.
>> 
> 
> As I understand it, the policy "an object was defined by its
> behavior..." doesn't mean "put an unlimited amount of implementation
> behind one concrete type."
> The policy means APIs shouldn't limit input to one concrete type
> without a reason. In other words, duck typing and structural subtyping
> are good.
> 
> For example, we can try making io.TextIOWrapper accept not only
> Unicode objects (including subclasses) but any object implementing some
> protocol.
> We already have __index__ for integers and the buffer protocol for
> bytes-like objects. Those are examples of the policy.

I agree that that would be the cleanest approach, although I worry about
how long it will take until third-party code is converted to the new protocol.
That’s why I wrote earlier that adding this feature to PyUnicode_Type is the
most pragmatic solution ;-)

There are two clear options for a new protocol:

1. Add something similar to __index__ or __fspath__, but for “string-like”
   objects

2. Add an extension to the buffer protocol

In either case an ABC for string-like objects would also be nice, to be able
to opt in to the fairly common pattern of excluding strings from types that
can be iterated over, that is:

    if isinstance(value, collections.abc.Iterable) and not isinstance(value, str):
        for item in value:
            process_item(item)
    else:
        process_item(value)
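
A rough sketch of how such an ABC could be used with that pattern. Everything here is an assumption made for illustration: `StringLike` does not exist in the standard library, and `EditorText` is a made-up third-party text type that opts in through `register()`.

```python
import abc
from collections.abc import Iterable

class StringLike(abc.ABC):
    """Hypothetical ABC for string-like objects (not in the stdlib)."""

# str itself and a foreign text type both opt in by registration.
StringLike.register(str)

class EditorText:
    """Made-up foreign text type that is iterable but should act as one value."""
    def __init__(self, text):
        self._text = text
    def __iter__(self):
        return iter(self._text)
    def __str__(self):
        return self._text

StringLike.register(EditorText)

def process(value, process_item):
    # Same pattern as above, but excluding every registered string-like type.
    if isinstance(value, Iterable) and not isinstance(value, StringLike):
        for item in value:
            process_item(item)
    else:
        process_item(value)

seen = []
process(["a", "b"], seen.append)     # iterated item by item
process("cd", seen.append)           # treated as a single value
text = EditorText("ef")
process(text, seen.append)           # also treated as a single value
```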

Ronald

___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/BCN2WSLQ6YKEF6OO4E75EGYOGB6CFKXA/


[Python-Dev] Re: Enhancement request for PyUnicode proxies

2020-12-28 Thread Antoine Pitrou
On Mon, 28 Dec 2020 11:07:46 +0900
Inada Naoki  wrote:
> 
> Additionally, if we introduce the customizable lazy str object, it's
> very easy to release GIL during basic Unicode operations. Many third
> parties may assume PyUnicode_Compare doesn't release GIL if both
> operands are Unicode objects.

1) You have to prove such "many third parties" exist.  I've written my
share of C extension code and I don't remember assuming that
PyUnicode_Compare doesn't release the GIL.

2) Even if there is such third party code, it is clearly making
assumptions about undocumented implementation details. It is therefore
ok to break it in new versions of CPython.

However, I agree that having to call PyUnicode_READY() before calling
C unicode APIs is probably an obscure detail that few people remember
about.

Regards

Antoine.

___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/O6IUNWSAIL3C6BKIP7IBZXJN3P43GH67/


[Python-Dev] Re: Where is the SQLite module maintainer

2020-12-28 Thread Erlend Aasland
On 27 Dec 2020, at 22:38, Christian Heimes wrote:
> On 27/12/2020 21.20, Erlend Aasland wrote:
>> […]
>> Who can help me review code that touches the sqlite3 module code base?
>
> as far as I know we don't have an active module owner and maintainer any
> more.

What about Berker Peksag? He's at least listed as a code owner in
.github/CODEOWNERS, but I rarely see him engaged in sqlite3 matters on GitHub
and bpo. The last time I remember him discussing sqlite3 matters was seven
months ago. (Just an observation; not criticism.)

> Gerhard Häring, the original author of pysqlite, is still listed
> as expert. But he hasn't been active in many years. I haven't seen him
> in quite some time, too.

I think it's been five or so years since he last participated on bpo.

> How about you put your name in the expert index instead of him? :)

Thanks for your confidence, but I'm far from an expert :)


Erlend
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/RCD2WC2IKYIGA4USOALE3GZ455LT7U4P/


[Python-Dev] Re: Enhancement request for PyUnicode proxies

2020-12-28 Thread Inada Naoki
On Mon, Dec 28, 2020 at 10:53 PM Antoine Pitrou  wrote:
>
> On Mon, 28 Dec 2020 11:07:46 +0900
> Inada Naoki  wrote:
> >
> > Additionally, if we introduce the customizable lazy str object, it's
> > very easy to release GIL during basic Unicode operations. Many third
> > parties may assume PyUnicode_Compare doesn't release GIL if both
> > operands are Unicode objects.
>
> 1) You have to prove such "many third parties" exist.  I've written my
> share of C extension code and I don't remember assuming that
> PyUnicode_Compare doesn't release the GIL.
>

It is my fault that I said "many", but I just pointed out a possible
backward incompatibility. Why do I have to prove it?

> 2) Even if there is such third party code, it is clearly making
> assumptions about undocumented implementation details. It is therefore
> ok to break it in new versions of CPython.
>

But it should be considered carefully, because these APIs have not
released the GIL for a long time.
And this type of change does not cause just a simple crash, but very
rare undefined behavior in complex multithreaded applications.
For example, borrowed references in the caller can be changed to other
objects of the same size because memory blocks are reused.
It is very difficult to notice and reproduce.

> However, I agree that having to call PyUnicode_READY() before calling
> C unicode APIs is probably an obscure detail that few people remember
> about.

If we provide a custom callback and call it in PyUnicode_READY(), many
Unicode APIs using PyUnicode_READY() will change from predictable
behavior to "may run arbitrary code" behavior. That is an obscure
detail too.

Regards,

-- 
Inada Naoki  
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/FZCXLNKZDQXJL6FQ63GWS4DKLXVDFW2Q/


[Python-Dev] Re: Enhancement request for PyUnicode proxies

2020-12-28 Thread Antoine Pitrou
On Tue, 29 Dec 2020 02:20:45 +0900
Inada Naoki  wrote:
> On Mon, Dec 28, 2020 at 10:53 PM Antoine Pitrou  wrote:
> >
> > On Mon, 28 Dec 2020 11:07:46 +0900
> > Inada Naoki  wrote:  
> > >
> > > Additionally, if we introduce the customizable lazy str object, it's
> > > very easy to release GIL during basic Unicode operations. Many third
> > > parties may assume PyUnicode_Compare doesn't release GIL if both
> > > operands are Unicode objects.  
> >
> > 1) You have to prove such "many third parties" exist.  I've written my
> > share of C extension code and I don't remember assuming that
> > PyUnicode_Compare doesn't release the GIL.
> 
> It is my fault that I said "many", but I just pointed out a possible
> backward incompatibility. Why do I have to prove it?

Because most C extension code is far from that level of
micro-optimization, so I doubt you'll find much code that
deliberately relies on such an obscure implementation detail.

> > 2) Even if there is such third party code, it is clearly making
> > assumptions about undocumented implementation details. It is therefore
> > ok to break it in new versions of CPython.
> 
> But it should be considered carefully, because these APIs have not
> released the GIL for a long time.
> And this type of change does not cause just a simple crash, but very
> rare undefined behavior in complex multithreaded applications.
> For example, borrowed references in the caller can be changed to other
> objects of the same size because memory blocks are reused.
> It is very difficult to notice and reproduce.

Agreed, but that's a general problem with the C API (the existence of
borrowed references and the fact that most C API calls can silently
release the GIL, even as a side effect of object (de)allocation).  It's
also why it's better for most use cases to use something like Cython.

Regards

Antoine.
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/TL5BTMLGC57TSN35MRMNTYP7RRGH47N6/


[Python-Dev] Re: Enum bug?

2020-12-28 Thread Victor Stinner
IMO it's a feature, not a bug :-)

>>> import enum
>>> class Foo(enum.Enum):
...   A = 1
...   B = 1.0
...
>>> Foo(1)
<Foo.A: 1>
>>> Foo(1.0)
<Foo.A: 1>
>>> Foo.B
<Foo.A: 1>

See: 
https://docs.python.org/dev/library/enum.html#duplicating-enum-members-and-values

"However, two enum members are allowed to have the same value. Given
two members A and B with the same value (and A defined first), B is an
alias to A. By-value lookup of the value of A and B will return A.
By-name lookup of B will also return A."

You can use @unique to detect such corner cases:

>>> @enum.unique
... class Foo2(enum.Enum):
...   A = 1
...   B = 1.0
...
ValueError: duplicate values found in <enum 'Foo2'>: B -> A
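
The same checks can be written as a script rather than a REPL session. This is a sketch that uses only documented `enum` behaviour: `__members__` includes aliases, and `@enum.unique` raises `ValueError` at class-creation time. The variable names are chosen for this example.

```python
import enum

class Foo(enum.Enum):
    A = 1
    B = 1.0   # 1 == 1.0, so B becomes an alias of A

# __members__ maps every name, aliases included, to its canonical member.
aliases = [name for name, member in Foo.__members__.items()
           if member.name != name]

# @enum.unique rejects the same definition when the class is created.
message = None
try:
    @enum.unique
    class Foo2(enum.Enum):
        A = 1
        B = 1.0
except ValueError as exc:
    message = str(exc)
```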

Victor

On Mon, Dec 28, 2020 at 1:25 AM Paul Bryan via Python-Dev
 wrote:
>
> Should this be considered a bug in the Enum implementation?
>
> >>> class Foo(enum.Enum):
> ...   A = True
> ...   B = 1
> ...   C = 0
> ...   D = False
> ...
> >>> Foo.A
> <Foo.A: True>
> >>> Foo(True)
> <Foo.A: True>
> >>> Foo(1)
> <Foo.A: True>
>
>
> Seems to me like it should store and compare both type and value.
>
> Paul
> ___
> Message archived at 
> https://mail.python.org/archives/list/python-dev@python.org/message/5IJPHFRLPZE5CGYZH6IXCDH2V4ODXMTB/



-- 
Night gathers, and now my watch begins. It shall not end until my death.
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/KWB2ABESR5WRB54TVU6VEC3J4CUSR5F3/


[Python-Dev] Re: __init_subclass__ and metaclasses

2020-12-28 Thread Joao S. O. Bueno
For the record - the 3rd process that is currently un-customizable when
creating a class, i.e. things that happen in an opaque way inside
`type.__new__`, is the ABC class machinery. I could not recall it when
writing the previous e-mail.

Still - I think this would be minimally disruptive, and yet give
metaclasses back full customization power, including the ability to
address the problem brought up by Ethan.


   js
 -><-

On Fri, 25 Dec 2020 at 01:40, Joao S. O. Bueno 
wrote:

> Actually, there are a few steps that `type.__new__` performs that are not
> customizable in metaclasses.
>
> I had sometimes thought about mailing this here, or Python ideas, but could
> not come up with a "real world" use case where the customization of those
> would be meaningful.
>
> Let me see if I recall all cases - two of them are the calls to
> `__init_subclass__` and the descriptors' `__set_name__`, as you put it.
> I think there is a third behavior that can't be separated from
> `type.__new__` - but I can't remember it now.
>
> Anyway, the "thing to do" that always occurred to me about it is to add
> "soft" method slots to `type` itself - so that `type.__new__` would call
> those on the corresponding initialization phases.
>
> Since these are to be run only when classes are created, their impact
> should be negligible.
>
> In other words, have `type` implement methods like
> `__run_init_subclass__`, `__run_descriptor_setname__`,
> (and one for the other task I can't remember now). So, all metaclass code
> written up to today remains valid, and these behaviors become properly
> customizable.
>
> Adding keyword parameters to `type.__new__`, IMHO, besides being a little
> bit fishy as we are talking of arguments to change the behavior of the
> method, would themselves compete and have to be filtered out, or otherwise
> special-cased in the `__init_subclass__` method itself.
> I mean - let's suppose we add `__suppress_init_subclass__` as a named
> parameter to `type.__new__` - what would happen with this argument in
> `__init_subclass__`? Would it show up in the kwargs? Otherwise it would be
> the _only_ kwarg popped out and not passed to __init_subclass__, being an
> inconvenient exception.
>
> Having an overridable, separate, method in type to run __init_subclass__
> and __set_name__ bypasses these downsides.
>
> In time, Happy holidays everyone!
>
>js
>  -><-
>
> On Fri, 25 Dec 2020 at 00:38, Ethan Furman  wrote:
>
>> PEP 487 introduced __init_subclass__ and __set_name__, and both of those
>> were wins for the common cases of metaclass usage.
>>
>> Unfortunately, the implementation of PEP 487 with regards to
>> __init_subclass__ has made the writing of correct
>> metaclasses significantly harder, if not impossible.
>>
>> The cause is that when a metaclass calls type.__new__ to actually create
>> the class, type.__new__ calls the __init_subclass__ methods of the new
>> class' parents, passing it the newly created, but incomplete, class.
>> In code:
>>
>> ```
>> class Meta(type):
>>     #
>>     def __new__(mcls, name, bases, namespace, **kwds):
>>         # create new class, which will call __init_subclass__ and __set_name__
>>         new_class = type.__new__(mcls, name, bases, namespace, **kwds)
>>         # finish setting up class
>>         new_class.some_attr = 9
>>         return new_class
>> ```
>>
>> As you can deduce, when the parent __init_subclass__ is called with the
>> new class, `some_attr` has not been added yet -- the new class is
>> incomplete.
>>
>> For Enum, this means that __init_subclass__ doesn't have access to the
>> new Enum's members (they haven't been added yet).
>>
>> For ABC, this means that __init_subclass__ doesn't have access to
>> __abstractmethods__ (it hasn't been created yet).
>>
>> Because Enum is pure Python code I was able to work around it:
>> - remove new __init_subclass__ (if it exists)
>> - insert dummy class with a no-op __init_subclass__
>> - call type.__new__
>> - save any actual __init_subclass__
>> - add back any new __init_subclass__
>> - rewrite the new class' __bases__, removing the no-op dummy class
>> - finish creating the class
>> - call the parent __init_subclass__ with the now complete Enum class
>>
>> I have not been able to work around the problem for ABC.
>>
>> Two possible solutions I can think of:
>>
>> - pass a keyword argument to type.__new__ that suppresses the call to
>> __init_subclass__; and
>> - provide a way to invoke new class' parent's __init_subclass__ before
>> returning it
>>
>> or
>>
>> - instead of type.__new__ doing that work, have type.__init__ do it.
>>
>> Thoughts?
>>
>> --
>> ~Ethan~
>> ___
>> Message archived at
>> https://mail.python.org/archives/list/python-dev@python.org/message/ZMRRNSFSLJZDGGZ66CFCYQBINU62CDNX/
>

[Python-Dev] Re: __init_subclass__ and metaclasses

2020-12-28 Thread Guido van Rossum
Let me see if I can unpack this.

I observe that `type.__new__()` is really the C function `type_new()` in
typeobject.c, and hence I will refer to it by that name.

I understand that `type_new()` is the only way to create type objects, and
it includes a call to `__init_subclass__()`.

In the source code of `type_new()`, calling `__init_subclass__()` is the
last thing it does before returning the newly created class object, so at
this point the class object is complete, *except* that any updates made by
the caller of `type_new()` after `type_new()` returns have not been made,
of course. (For example, `new_class.some_attr = 9` from Ethan's post, or
`__abstractmethods__`, which is set by update_abstractmethods() in abc.py.)
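
The ordering described in this paragraph can be observed with a short sketch based on the `Meta` example from Ethan's post. The names `Base`, `Child` and `seen_during_init_subclass` are introduced only for this illustration.

```python
seen_during_init_subclass = []

class Meta(type):
    def __new__(mcls, name, bases, namespace, **kwds):
        # type.__new__ calls the parent's __init_subclass__ before returning.
        new_class = super().__new__(mcls, name, bases, namespace, **kwds)
        # Anything done here happens after __init_subclass__ has already run.
        new_class.some_attr = 9
        return new_class

class Base(metaclass=Meta):
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Look in the subclass's own namespace, since Base.some_attr would
        # otherwise be found through inheritance.
        seen_during_init_subclass.append("some_attr" in vars(cls))

class Child(Base):
    pass
```

While `__init_subclass__` runs, `Child` does not yet carry its own `some_attr`; it is present once `Meta.__new__` has returned.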

Now here's something that Ethan said that I don't follow:

> For Enum, this means that `__init_subclass__` doesn't have access to the
new Enum's members (they haven't been added yet)

I would presume that in an example like the following, the members *are*
set by the time `type_new()` is called. What am I missing?
```
class Color(enum.Enum):
    RED = 1
    GREEN = 2
    BLUE = 4
```
Maybe the problem is that the members are still set to their "naive"
initial values (1, 2, 4) rather than to the corresponding enum values (e.g.
`<Color.RED: 1>`)?

Without a more elaborate use case I can't give that more than a shrug. This
is how the `__init_subclass__()` protocol is designed. If it's not to your
liking, you can recommend to your users that they use something else.

Note that for ABC, if you add or change the abstraction status of some
class attributes, you can just call `update_abstractmethods()` and it will
update `__abstractmethods__` based on the new contents of the class. This
also sounds like no biggie to me.




[Python-Dev] Re: __init_subclass__ and metaclasses

2020-12-28 Thread Ethan Furman

Issue #42775:   https://bugs.python.org/issue42775
PR #23986:  https://github.com/python/cpython/pull/23986
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/EQCFWHIOM43L6IH64OFEHP7WTOC2IC6F/


[Python-Dev] Re: __init_subclass__ and metaclasses

2020-12-28 Thread Ethan Furman

On 12/28/20 9:31 PM, Guido van Rossum wrote:

> Let me see if I can unpack this.
>
> I observe that `type.__new__()` is really the C function `type_new()` in
> typeobject.c, and hence I will refer to it by that name.
>
> I understand that `type_new()` is the only way to create type objects, and
> it includes a call to `__init_subclass__()`.
>
> In the source code of `type_new()`, calling `__init_subclass__()` is the
> last thing it does before returning the newly created class object, so at
> this point the class object is complete, *except* that any updates made by
> the caller of `type_new()` after `type_new()` returns have not been made,
> of course. (For example, `new_class.some_attr = 9` from Ethan's post, or
> `__abstractmethods__`, which is set by update_abstractmethods() in abc.py.)


This is really the heart of the issue.  A major reason to write a custom metaclass is to be able to modify the returned 
class before giving it back to the user -- so even though `type_new` is complete the new class could easily not be 
complete, and calling `__init_subclass__` and `__set_name__` from `type_new` is premature.



> Now here's something that Ethan said that I don't follow:
>
>> For Enum, this means that `__init_subclass__` doesn't have access to the
>> new Enum's members (they haven't been added yet)
>
> I would presume that in an example like the following, the members *are*
> set by the time `type_new()` is called. What am I missing?
>
> ```
> class Color(enum.Enum):
>     RED = 1
>     GREEN = 2
>     BLUE = 4
> ```
> Maybe the problem is that the members are still set to their "naive"
> initial values (1, 2, 4) rather than to the corresponding enum values (e.g.
> `<Color.RED: 1>`)?


Before `type_new()` is called all the (future) members are removed from `namespace`, so they are not present when 
`__init_subclass__` is called.  Even if they were left in as `{'RED': 1, 'GREEN': 2, 'BLUE': 4}` it would not be 
possible for an `__init_subclass__` to customize the members further, record them in custom data structures, run 
validation code on them, etc.


> Without a more elaborate use case I can't give that more than a shrug. This
> is how the `__init_subclass__()` protocol is designed.


The `__init_subclass__` and `__set_name__` protocols are intended to be run before a new type is finished, but creating 
a new type has three major steps:


- `__prepare__` to get the namespace
- `__new__` to get the memory and data structures
- `__init__` for any final polishing

We can easily move the calls from `type_new()` to `type_init`.
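
To make the current ordering concrete, here is a sketch that records when each step runs in CPython as it behaves today, that is, before any change like the one proposed here. The `order` list and the class names are introduced for this example only.

```python
order = []

class Meta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwds):
        order.append("__prepare__")
        return {}

    def __new__(mcls, name, bases, namespace, **kwds):
        order.append("__new__ start")
        cls = super().__new__(mcls, name, bases, namespace, **kwds)
        order.append("__new__ end")
        return cls

    def __init__(cls, name, bases, namespace, **kwds):
        order.append("__init__")
        super().__init__(name, bases, namespace, **kwds)

class Base(metaclass=Meta):
    def __init_subclass__(cls, **kwargs):
        order.append("__init_subclass__")
        super().__init_subclass__(**kwargs)

order.clear()  # ignore the steps recorded while creating Base itself

class Child(Base):
    pass
```

After `Child` is created, `order` shows `__init_subclass__` running inside `__new__`, before the metaclass's `__init__` gets a chance to do any final polishing.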

> Note that for ABC, if you add or change the abstraction status of some
> class attributes, you can just call `update_abstractmethods()` and it will
> update `__abstractmethods__` based on the new contents of the class. This
> also sounds like no biggie to me.


The issue tracker sample code that fails (from #35815):

```
import abc

class Base(abc.ABC):
    #
    def __init_subclass__(cls, **kwargs):
        instance = cls()
        print(f"Created instance of {cls} easily: {instance}")
    #
    @abc.abstractmethod
    def do_something(self):
        pass

class Derived(Base):
    pass

And the output:

`Created instance of <class '__main__.Derived'> easily: <__main__.Derived object at 0x10a6dd6a0>`

If `Base` had been completed before the call to `__init_subclass__`, then `Derived` would have raised an error -- and it 
does raise an error with the patch I have submitted on Github.


--
~Ethan~
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/HG5SHTY76LKYQS7OY5CXH6TYMUGK7O5L/