[Python-Dev] Re: PEP 649: Deferred Evaluation Of Annotations Using Descriptors, round 2

Larry Hastings Thu, 15 Apr 2021 22:13:43 -0700


On 4/15/21 9:24 PM, Inada Naoki wrote:

Unlike simple function case, PEP 649 creates function object instead
of code object for __co_annotation__ of methods.
It cause this overhead.  Can we avoid creating functions for each annotation?

As the implementation of PEP 649 currently stands, there are two reasons why the compiler might pre-bind the __co_annotations__ code object to a function, instead of simply storing the code object:


 * If the annotations refer to a closure ("freevars" is nonzero), or
 * If the annotations /possibly/ refer to a class variable (the
   annotations code object contains either LOAD_NAME or LOAD_CLASSDEREF).

If the annotations refer to a closure, then the code object also needs to be bound with the "closure" tuple. If the annotations possibly refer to a class variable, then the code object also needs to be bound with the current "f_locals" dict. (Both could be true.)
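
The two conditions above can be checked on ordinary code objects from Python. This is only a rough sketch using the standard "dis" module, not the compiler's actual logic; the helper names (has_closure, has_load_name) are made up for illustration, and the opcode names are the ones mentioned in this message, which may differ in other CPython versions.

```python
import dis

# Opcode names taken from the message above; treat them as version-dependent.
CLASS_SCOPE_OPS = {"LOAD_NAME", "LOAD_CLASSDEREF"}


def has_closure(code):
    """True if the code object has free variables, so it needs a closure tuple."""
    return len(code.co_freevars) > 0


def has_load_name(code):
    """True if the code object might look a name up in a class/local namespace."""
    return any(ins.opname in CLASS_SCOPE_OPS for ins in dis.get_instructions(code))


def outer():
    captured = int

    def inner():
        return captured  # free variable: compiled as a closure reference

    return inner


def plain():
    return 1  # no names looked up at all


closure_code = outer().__code__
plain_code = plain.__code__
# Code compiled outside a function body looks names up with LOAD_NAME.
name_code = compile("int", "<example>", "eval")

for label, code in [("closure", closure_code), ("plain", plain_code), ("name", name_code)]:
    print(label, has_closure(code), has_load_name(code))
```

Only a code object for which both checks are false could be stored bare, without binding anything to it.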

Unfortunately, when generating annotations on a method, references to builtins (e.g. "int", "str") seem to generate LOAD_NAME instructions instead of LOAD_GLOBAL. This means pre-binding the function happens pretty often for methods. I believe in your benchmark it will happen every time. There's a lot of code, and a lot of runtime data structures, inside compile.c and symtable.c behind the compiler's decision about whether something is NAME vs GLOBAL vs DEREF etc, and I wasn't comfortable with seeing if I could fix it.
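
The same NAME vs GLOBAL split is visible in stock CPython, without the PEP 649 prototype. The sketch below compiles a small source string and compares how a builtin is loaded in a class body versus a function body. It does not exercise the annotation code objects discussed here; it only illustrates the general compiler behavior, and the helper names (opnames, find_code) are hypothetical.

```python
import dis
import types


def opnames(code):
    """Return the set of opcode names used directly by a code object."""
    return {ins.opname for ins in dis.get_instructions(code)}


def find_code(code, name):
    """Search nested constants for a code object with the given co_name."""
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            if const.co_name == name:
                return const
            found = find_code(const, name)
            if found is not None:
                return found
    return None


source = (
    "class C:\n"
    "    x = int\n"
    "\n"
    "def f():\n"
    "    return int\n"
)
module_code = compile(source, "<example>", "exec")
class_body = find_code(module_code, "C")
func_body = find_code(module_code, "f")

# In a class body the compiler cannot rule out a class-level binding of "int",
# so it emits LOAD_NAME; in a function body it emits LOAD_GLOBAL.
print("class body uses LOAD_NAME:", "LOAD_NAME" in opnames(class_body))
print("function uses LOAD_GLOBAL:", "LOAD_GLOBAL" in opnames(func_body))
```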

Anyway I assume it wasn't "fixable". The compiler would presumably already prefer to generate LOAD_GLOBAL vs LOAD_NAME, because LOAD_GLOBAL would be cheaper every time for a global or builtin. The fact that it already doesn't do so implies that it can't.

At the moment I have only one idea for a possible optimization, as follows. Instead of binding the function object immediately, it /might/ be cheaper to write the needed values into a tuple, then only actually bind the function object on demand (like normal).

I haven't tried this because I assumed the difference at runtime would be negligible. On one hand, you're creating a function object; on the other you're creating a tuple. Either way you're creating an object at runtime, and I assumed that bound functions weren't /that/ much more expensive than tuples. Of course I could be very wrong about that.
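
One rough way to get a feel for the difference is a microbenchmark like the sketch below. It builds the objects from Python with types.FunctionType and a tuple display, which is not the same as the bytecodes the compiler would emit, so treat any numbers as a loose hint at best. The names (template, make_function, make_tuple) are made up for the example.

```python
import timeit
import types


def template():
    return {}


code = template.__code__
globs = template.__globals__
f_locals = {}  # stand-in for the class namespace that would be captured


def make_function():
    # Eager approach: bind the code object to a function right away.
    return types.FunctionType(code, globs)


def make_tuple():
    # Deferred approach: just remember what would be needed to bind later.
    return (code, f_locals)


function_time = timeit.timeit(make_function, number=100_000)
tuple_time = timeit.timeit(make_tuple, number=100_000)
print(f"function: {function_time:.4f}s  tuple: {tuple_time:.4f}s")
```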

The other thing is, it would be a lot of work to even try the experiment. Also, it's an optimization, and I was more concerned with correctness... and getting it done and getting this discussion underway.

What follows are my specific thoughts about how to implement this optimization.

In this scenario, the logic in the compiler that generates the code object would change to something like this:


   # pseudocode: "co" is the annotations code object
   has_closure = len(co.co_freevars) != 0
   has_load_name = co.co_code contains LOAD_NAME or LOAD_CLASSDEREF bytecodes
   if not (has_closure or has_load_name):
        co_ann = co                         # nothing to bind; store the bare code object
   elif has_closure and (not has_load_name):
        co_ann = (co, freevars)
   elif (not has_closure) and has_load_name:
        co_ann = (co, f_locals)
   else:
        co_ann = (co, freevars, f_locals)
   setattr(o, "__co_annotations__", co_ann)

(The compiler would have to generate instructions creating the tuple and setting its members, then storing the resulting object on the object with the annotations.)

Sadly, we can't pre-create this "co_ann" tuple as a constant and store it in the .pyc file, because the whole point of the tuple is to contain one or more objects only created at runtime.

The code implementing __co_annotations__ in the three objects (function, class, module) would examine the object it got. If it was a code object, it would bind it; if it was a tuple, it would unpack the tuple and use the values based on their type:


   # co_ann = internal storage for __co_annotations__
   if isinstance(co_ann, FunctionType) or (co_ann is None):
        return co_ann
   co = freevars = locals = None
   if isinstance(co_ann, CodeType):
        co = co_ann
   else:
        assert isinstance(co_ann, tuple)
        assert 1 <= len(co_ann) <= 3
        for o in co_ann:
            if isinstance(o, CodeType):
                assert co is None
                co = o
            elif isinstance(o, tuple):
                assert freevars is None
                freevars = o
            elif isinstance(o, dict):
                assert locals is None
                locals = o
            else:
                raise ValueError(f"illegal value in co_annotations tuple: {o!r}")
   co_ann = make_function(co, freevars=freevars, locals=locals)
   return co_ann
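
A cut-down, runnable version of the bind-on-demand step might look like the sketch below. It is only an approximation under stated assumptions: it handles a bare code object or a (code, closure) pair, and it leaves out the f_locals case entirely because types.FunctionType offers no way to bind a locals dict. The helper name bind_lazily is hypothetical.

```python
import types


def bind_lazily(stored, globals_dict):
    """Return a function for `stored`, binding a code object only when asked."""
    if stored is None or isinstance(stored, types.FunctionType):
        return stored  # already bound, or nothing to bind
    if isinstance(stored, types.CodeType):
        code, closure = stored, None
    else:
        code, closure = stored  # assumed shape: (code object, tuple of cells)
    # Positional arguments: code, globals, name, argdefs, closure.
    return types.FunctionType(code, globals_dict, code.co_name, None, closure)


def outer():
    T = int

    def annotations():
        return {"x": T}  # T is a free variable, so a closure is required

    return annotations


def plain():
    return {"y": str}  # str is found through globals/builtins, no closure


fn = outer()
stored_pair = (fn.__code__, fn.__closure__)  # what would be kept instead of a function
rebound = bind_lazily(stored_pair, fn.__globals__)
rebound_plain = bind_lazily(plain.__code__, plain.__globals__)

closure_result = rebound()
plain_result = rebound_plain()
print(closure_result, plain_result)
```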

If you experiment with this approach, I'd be glad to answer questions about it, either here or on GitHub, etc.



Cheers,


//arry/

_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/2OOCEE6OMBQYEIJXEGFWIBE62VPIJHP5/
Code of Conduct: http://python.org/psf/codeofconduct/
