UGH!

I thought there was a general understanding that when typing was added to Python, there would be no impact, or at least minimal impact, on people who didn't use it.  (Raises hand.)
Now we see an(other) instance of intention creep.
Rob Cliffe

On 23/04/2022 02:13, Larry Hastings wrote:


This document is a loose proto-PEP for a new "forward class" / "continue class" syntax.  Keep in mind, the formatting is a mess. If I wind up submitting it as a real PEP I'll be sure to clean it up first.


/arry

--------------------------------------


PEP XXXX: Forward declaration of classes

Overview
--------

Python currently has one statement to define a class, the `class` statement:

```Python
    class X():
        # class body goes here
        def __init__(self, key):
            self.key = key
```

This single statement declares the class, including its bases and metaclass,
and also defines the contents of the class in the "class body".

This PEP proposes an additional syntax for declaring a class which splits
this work across two statements:
* The first statement is `forward class`, which declares the class and binds
  the class object.
* The second statement is `continue class`, which defines the contents
  of the class in the "class body".

To be clear: `forward class` creates the official, actual class object.
Code that wants to take a reference to the class object may take references
to the `forward class` declared class, and interact with it as normal.
However, a class created by `forward class` can't be *instantiated*
until after the matching `continue class` statement finishes.

Defining class `X` from the previous example using this new syntax would read
as follows:

```
    forward class X()

    continue class X:
        # class body goes here
        def __init__(self, key):
            self.key = key
```

This PEP does not propose altering or removing the traditional `class` statement;
it would continue to work as before.


Rationale
---------

Python programmers have had a minor problem with classes for years: there's
no way to have early-bound circular dependencies between objects. If A
depends on B, and B depends on A, there's no linear order that allows
you to cleanly declare both.

Most of the time, the dependencies were in late-binding code, e.g. A refers to B inside a method.  So this was rarely an actual problem at runtime.  When this problem did arise, in code run at definition-time, it was usually only
a minor headache and could be easily worked around.

But the explosion of static type analysis in Python, particularly with
the `typing` module and the `mypy` tool, has made circular definition-time dependencies between classes commonplace--and much harder to solve.  Here's
one simple example:

```Python
    class A:
        value: B

    class B:
        value: A
```

An attribute of `B` is defined using a type annotation of `A`, and an
attribute of `A` is defined using a type annotation of `B`. There's
no order to these two definitions that works; either `A` isn't defined
yet, or `B` isn't defined yet.

Various workarounds and solutions have been proposed to solve this problem, including two PEPs: PEP 563 (automatic stringized annotations) and PEP 649
(delayed evaluation of annotations using functions).
But nothing so far has been both satisfying and complete; either it
is wordy and clumsy to use (manually stringizing annotations), or it
added restrictions and caused massive code breakage for runtime use of
annotations (PEP 563), or simply didn't solve every problem (PEP 649).
This proposed  `forward class` / `continue class` syntax should permit
solving *every* forward-reference and circular-reference problem faced
in Python, using an elegant and Pythonic new syntax.

As a side benefit, `forward class` and `continue class` syntax enables
rudimentary separation of "interface" from "implementation", at least for
classes.  A user seeking to "hide" the implementation details of their
code could put their class definitions in one module, and the
implementations of those classes in a different module.

This new syntax is not intended to replace the traditional `class`
declaration syntax in Python.  If this PEP were accepted, the `class`
statement would still be the preferred mechanism for creating classes
in Python; `forward class` should only be used when it confers some
specific benefit.


Syntax
------

The `forward class` statement is the same as the `class` statement,
except it doesn't end with a colon and is not followed by an indented block.
Without any base classes or metaclass, the `forward class` statement is
as follows:

```
    forward class X
```

This would declare class `X`.

If `X` needs base classes or metaclass, the corresponding `forward class` statement
would be as follows, rendered in a sort of "function prototype" manner:

```
    forward class X(*bases, metaclass=object, **kwargs)
```

The `continue class` statement is similar to a `class` statement
without any bases or metaclass.  It ends with a colon,
and is followed by the "class body":

    continue class X:
        # class body goes here
        pass

One important difference: the `X` in `continue class X:` is not a *name*,
it's an *expression*.  This code is valid:

```
    forward class X()
    snodgrass = X

    continue class snodgrass:
        # class body goes here
        pass
```

as well as this:

```
    import my_module

    continue class my_module.X:
        # class body goes here
        pass
```

Using this new syntax, the forward-reference problem illustrated
in the *Rationale* section above is now easy to solve:

```Python
    forward class B

    class A:
        value: B = None

    continue class B:
        value: A = None
```

One final note.  Why must the base and metaclass be declared
with the `forward class` statement?  The point of this new
syntax is to allow creating the real class object, permitting
users of the class to take references to it early, before it's
fully defined.  And the class could be declared with a
metaclass, and the metaclass could have a `__new__`, which
means it's responsible for creating the class object, and
this syntax would have to preserve that behavior.

(This small detail is about to complicate this proposal a great
deal!)


#### Semantics of forward-declared class objects

`forward class X` declares a class, but the class is explicitly
not yet fully defined.  It won't be fully defined and ready to
be instantiated until after the corresponding `continue class`
statement.  We'll refer to a class object in this state as
a "forward-declared class object".  How does this object behave?

As per the "consenting adults" rule, the forward-declared class
object must permit most operations.  You should be able to examine
the object, compare it to other objects, inspect some attributes
(`__name__`, `__mro__`, `__class__`), and even set attributes.

However, the user isn't permitted to instantiate a forward-declared
class object until after the corresponding `continue class X`.
We ensure this with a new dunder attribute, `__forward__`,
which if present tells the Python runtime that this is a
forward-declared class object.  The `continue class` statement
would delete this attribute from the object, after which it
could be instantiated.

(Users could work around this constraint, or even delete `__forward__`
if they so chose--again, the "consenting adults" rule applies.)

It's explicitly permissible to create a forward-declared class
object that never gets finished with a `continue class` statement.
If all you need is an object that represents the class--say,
to satisfy static type annotation use cases--a forward-declared
class object works fine.

A subsequent section will address the complexities of
how `forward class` and `continue class` interact with metaclasses.
For now, a note about forward-declared class objects declared with
a metaclass implementing `__prepare__`.  The forward-declared class
object *dict* will be the "dict-like object" returned by the
`metaclass.__prepare__()` method.  This "dict-like object" won't
be processed and discarded until after `continue class` processes
the class body and calls the appropriate methods in the metaclass.


#### Semantics of `continue class`

`continue class` may only be run on a class once.
(As Eric V. Smith pointed out in response to an early version of
this proposal, allowing multiple "continue" declarations on the
same class would lead directly to language-condoned monkey-patching.)


#### Decorators

Both the `forward class` and `continue class` statements
support decorators, and the user may use decorators with either
or both statements for the same class.  But now that we've
split the responsibilities of the `class` statement between
these two new statements, which decorator goes with which
statement becomes a novel concern.  In general, decorators
that don't examine the contents of the class, but simply
want to register the class object and its name, can decorate
the `forward class` statement.  Also, class decorators that
want to return a different object for the class should decorate
`forward class`.  But all decorators that meaningfully examine
the contents of the class should decorate the `continue class`
statement.

Unfortunately, there are some decorators that can't work properly
with either `forward class` *or* `continue class`: a decorator
that meaningfully examine the declared contents of that class, but
also return an object other than the original class passed in.
In that case, the user cannot declare this class with
`forward class`; they must declare it with the conventional `class`
statement.

#### __slots__

This leads us to an example of a decorator that, as of 3.10,
wouldn't be usable with classes declared by `forward class`.
It's the new 3.10 feature `@dataclass(slots=True)`.  When called
in this way, dataclass examines the attributes of the class it
has decorated, dynamically constructs a new class using `__slots__`,
and returns this new class.

Since this decorator meaningfully examines the class, it must
be used with `continue class`.  But, this decorator also returns
an object other the original class, which means it's inapproriate
for `continue class` and should be called with `forward class`.
What to do?

We have a separate idea to ameliorate this specific situation.
Right now, a class that uses `__slots__` *must* define them in
the class body, as that member is read before the class name is
bound (or before any descriptors are run).  But we can simply
relax that, and make processing `__slots__` lazy, so that it
isn't examined until the first time the class is *instantiated.*
This would mean `@dataclass(slots=True)` could simply return the
original class, and thus would work fine when decorating a
`continue class` statement.


#### Metaclasses

The existing semantics of metaclasses present a thorny problem
for `forward class` and `continue class`.  First, a review of
how class definition works.

Most of the mechanisms involved with defining a class happen
internally to the interpreter.  However, there are a number
of special methods (aka "dunder methods") called during class
construction that are overridable by user code.  Empirical testing
with the current version of Python (as of this writing, 3.10.4)
reveals the order in which all this work is done.

When Python executes the definition of this class:

```Python
    class Foo(BaseClass, metaclass=MetaClass):
        # class body is here
        pass
```

these events are visible in this order:

1. Python calls `MetaClass.__prepare__`.
2. Python executes the "class body" for class Foo.
3. Python calls `MetaClass.__new__`.
4. Python calls `BaseClass.__init_subclass__`.
   (If multiple base classes define `__init_subclass__`,
   they're called in MRO order.)
5. Python calls `MetaClass.__init__`.
6. The `class` statement binds the class object to the name `Foo`.

The big problem this presents for `forward class`: the
"class body" is executed before the `MetaClass.__new__`.
This is necessary because one of the parameters to `MetaClass.__new__`
is `namespace`, the "dict-like object" returned by `MetaClass.__prepare__`
and initialized by executing the class body using that object
as a sort of locals dictionary.

This creates a chicken-and-egg problem: `forward class` needs
to define the class object, but the class object is defined
by `MetaClass.__new__`, and `MetaClass.__new__` can't run until
after the class body, which we don't run until the `continue class`
statement, which must be after `forward class`.

The unfortunate but necessary solution: split `__new__` into
two new special methods on metaclasses, `__new_forward__`
and `__new_continue__`.  As a reminder, here's the prototype
for `__new__` on a metaclass:

```Python
    def __new__(metaclass, name, bases, namespace, **kwargs):
```

The two new special methods would have the following prototypes:

```Python
    def __new_forward__(metaclass, name, bases, namespace, **kwargs):

    def __new_continue__(metaclass, cls, **kwargs):
```

`__new_forward__` creates the class object.  It sets the `namespace`
member as the class dict, but in general should not examine it
contents.  (Specifically, `__new_forward__` cannot make any assumptions
about whether or not the class body has been executed yet; more on this
in a moment.)

`__new_continue__` is guaranteed to be called after the
class body has been executed.

The externally-visible parts of class construction would
run in a different order for classes constructed using
`forward class` and `continue class`.  First, the visible
interactions from the `forward class` statement:

1. Python calls `MetaClass.__prepare__`.
2. Python calls `MetaClass.__new_forward__`.
3. The `forward class` statement binds the (forward-declared)
   class object to the name `Foo`.

And here are the visible interactions from the
`continue class` statement:

1. Python executes the class body.
2. Python calls `MetaClass.__new_continue__`.
3. Python calls `BaseClass.__init_subclass__`.
   (If multiple base classes define `__init_subclass__`,
   they're called in MRO order.)
4. Python calls `MetaClass.__init__`.

It's important to note that, while `namespace` is passed
in to `__new_forward__`, it's not yet initialized with the
class body.  It's passed in here because the "dict-like object"
returned by `MetaClass.__prepare__` is used as the `__dict__`
for the forward-declared class object.

(This is also novel.  Normally the "dict-like object" is
used as the namespace for the class body, then its contents
are copied out and it is discarded.  Here it will also be
used as the `__dict__` for the forward-declared class object
until the `continue class` statement executes.)

Splitting `__new__` into two methods in this manner has several
ramifications for existing code.

First, Python still needs to support `MetaClass.__new__`
for backwards compatibility with existing code.  Therefore,
when executing the `class` statement, Python will still call
`MetaClass.__new__`.  In fact, for maximum backwards
compatibility, the order of externally-visible events
for the `class` statement should not change at all.

The default implementation of `MetaClass.__new__` will be
changed to call `__new_forward__` and `__new_continue__`.
The implementation will be similar to the following
pseudo-code:

```Python
    def __new__(metaclass, name, bases, namespace, **kwargs):
        cls = metaclass.__new_forward__(metaclass, name, bases, namespace, **kwargs)
        metaclass.__new_continue__(metaclass, cls, namespace, **kwargs)
        return cls
```

This means the order of events will be slightly different
between a class defined with the `class` statement and
a class defined with the `forward class` and `continue class`
statements.  With a `class` statement, the class body will be
run *before* `__new_forward__` is called, but with a `forward class`
statement, the class body will be run *after* `__new_forward__`
is called.  (This is why `__new_forward__` cannot know in advance
whether or not the class body has been called, and the `namespace`
has been filled in.)

User code that defines its own metaclass with its own `__new__`
must also continue to work.  But this leads us to a dangerous
boundary condition:
  * if user code defines a metaclass, and
  * that metaclass defines `__new__` but not `__new_forward__` or
    `__new_continue__`, and
  * user code then uses that metaclass in a `forward class`
    declaration, then
Python must throw a `TypeError` exception.  This situation is
unsafe: clearly the intention with the user's metaclass is to
override some behavior in `__new__`, but the `forward new` statement
will never call `__new__`.

(It's safe to use a metaclass with `forward class` if it doesn't
define `__new__`, or if it defines both `__new__` and either
`__new_forward__` or `__new_continue__`.  It's also safe to
use a metaclass with `class` if it defines either `__new_forward__`
or `__new_continue__` but not `__new__`, because the default `__new__`
will call both `__new_forward__` and `__new_continue__`.)

Going forward, best practice for metaclasses would be to only
implement `__new_forward__` and `__new_continue__`.
Code with metaclasses that wanted to simultaneously support
versions of Python with these new dunder methods *and* older
versions of Python that predate this change would likely have
to conditionally define their own `__new__`, best practices
on this approach TBD.

#### Interactions between `class`, `forward class`, and `continue class`

`class` and `forward class` both bind a name to a newly-created object.
Thus, in the same way that you can have two `class` statements that
bind and re-bind the same name:

```Python
    class C:
        pass
    class C:
        pass
```

You can execute `class` and `forward class` statements in any order
to bind and re-bind the same name:

```Python
    class C:
        pass
    forward class C
```

This works as expected; when this code executes, the previous objects
are dereferenced, and only the last definition of `C` is kept.

Executing a `continue class` statement with a class defined by the
`class` statement raises a `ValueError` exception.
Executing a `continue class` statement with a class defined by the
`forward class` statement that has already had `continue class`
executed on it raises a `ValueError` exception.

It's expected that knowledgeable users will be able to trick
Python into executing `continue class` on the same class multiple
times by interfering with "dunder" attributes.  The same tricks
may also permit users to trick Python into executing `continue class`
on a class defined by the `class` statement.  This is undefined
and unsupported behavior, but Python will not prevent it.


Final Notes
-----------

#### Alternate syntax

Instead of `forward class`, we could use `def class`.  It's not as
immediately clear, which is why this PEP prefers `forward class`.
But this alternate syntax has the advantage of not adding a new
keyword to the language.


#### forward classes and PEP 649

I suggest that `forward class` meshes nicely with PEP 649.

PEP 649 solves the forward-reference and circular-reference
problem for a lot of use cases, but not all.  So by itself
it's not quite a complete solution to the problem.

This `forward class` proposal should solve *all* the
forward-reference and circular-reference problems faced
by Python users today.  However, its use requires
backwards-incompatible code changes to user code.

By adding both PEP 649 and `forward class` to Python, we
get the best of both worlds.  PEP 649 should handle most
forward-reference and circular-reference problems, but the
user could resort to `forward class` for the stubborn edge
cases PEP 649 didn't handle.

In particular, combining this PEP with PEP 649 achieves total
coverage of the challenges cited by PEP 563's *Rationale* section:

> * forward references: when a type hint contains names that have not been >   defined yet, that definition needs to be expressed as a string literal;
> * type hints are executed at module import time, which is not
>   computationally free.

PEP 649 solves many forward reference problems, and delays the evaluation
of annotations until they are used.  This PEP solves the remaining forward
reference problems.


### A proof-of-concept using decorators

I've published a repo with a proof-of-concept of
the `forward class` / `continue class` syntax,
implemented using decorators.
It works surprisingly well, considering.
You can find the repo here:

    https://github.com/larryhastings/forward

Naturally, the syntax using this decorators-based version
can't be quite as clean.  The equivalent declaration for
`class X` using these decorators would be as follows:

```Python
    from forward import *

    @forward()
    class X():
       ...

    @continue_(X)
    class _:
       # class body goes here
       pass
```

Specifically:

* You have to make the `forward` module available somehow.  You can just copy the   `forward` directory into the directory you want to experiment in, or you can   install it locally in your Python install or venv by installing the `flit`
  package from PyPI and running `flit install -s` .
* You must import and use the two decorators from the `forward` module.
  The easiest way is with `from forward import *` .
* For the `forward class` statement, you instead decorate a conventional class   declaration with `@forward()`.  The class body should be empty, with either   a single `pass` statement or a single ellipsis `...` on a line by itself;   the ellipsis form is preferred.  You should name this class with the desired
  final name of your class.
* For the `continue class` statement, you instead decorate a conventional
  class declaration with `@continue_()`, passing in the forward-declared class   object as a parameter to the decorator.  You can use the original name of
  the class if you wish, or a throwaway name like `_` as per the example.
* You may use additional decorators with either or both of these decorators.
  However it's vital that `@forward()` and `@continue_()` are the
  *first* decorators run--that is, they should be on the *bottom* of the
  stack of decorators.

Notes and caveats on the proof-of-concept:

* The `continue_` decorator returns the original "forwarded" class object.
  This is what permits you to stack additional decorators on the class.
  (But, again, you *must* call the `continue_` decorator first--it should
  be on the bottom.)
* To use `__slots__`, you will have to declare them in the `forward` class.
* The proof-of-concept can't support classes that inherit from a class
  which defines `__init_subclass__`.
* Like the proposed syntax, this proof-of-concept doesn't support
  decorators that both examine the contents of the class *and* return
  a different object, e.g. `@dataclass(slots=True)` in Python 3.10.
* This proof-of-concept doesn't work with metaclasses that
  override either `__new__` or `__init__`, where those functions
  examine the `namespace` argument in any meaningful way.


#### tools/

There are some tools in the `tools/` directory that will (attempt to)
automatically add or remove the `@forward()` decorator to class definitions
in Python scripts.  It turns this:

```Python
    class foo(...):
        pass
```

into this:

```Python
    @forward()
    class foo(...):
        ...

    @continue_(foo)
    class _____:
        pass
```

`tools/edit_file.py` will edit one or more Python files specified on the
command-line, making the above change.  By default it will toggle the presence of `@forward`
decorators.  You can also specify explicit behavior:

`-a` adds `@forward()` decorators to `class` statements that don't have them. `-r` removes `@forward` decorators, changing back to conventional `class` statements.
`-t` requests that it "toggle" the state of `@forward()` decorators.

The parser is pretty dumb, so don't run it on anything precious. If it goofs up, sorry!

`tools/edit_tree.py` applies `edit_py.py` to all `*.py` files found anywhere under
a particular directory.

`tools/edit_stdlib.py` was an attempt to intelligently apply `edit_file.py` to the `Lib` tree of a CPython checkout.  Sadly, the experiment didn't really work out; it seemed like there were so many exceptions where the brute-force modification didn't work, either due to descriptors, metaclasses, or base classes with `__init_subclass__`, that I gave up on the time investment.  It's provided here in a non-functional
state in case anyone wants to experiment with it further.

Also, it's intentionally delicate; it only works on git checkout trees, and only with
one specific revision id:

    7b87e8af0cb8df0d76e8ab18a9b12affb4526103

#### Postscript

Thanks to Eric V. Smith and Barry Warsaw for proofreading and ideas.
Thanks in particular to Eric V. Smith for the idea about making
`__slots__` processing lazy.
Thanks to Mark Shannon for the idea of prototyping `forward class`
and `continue class` using decorators and simply copying the
attributes.


_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/CW6Z3OS42DYNMO4Y42R6O3LVTPPAA57X/
Code of Conduct: http://python.org/psf/codeofconduct/

_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/WSJGXGVOIWUAQWEJ2GAAAL3QT2BJ46K3/
Code of Conduct: http://python.org/psf/codeofconduct/

Reply via email to