Hello,

On Thu, 17 Dec 2020 12:46:17 +1300
Greg Ewing <greg.ew...@canterbury.ac.nz> wrote:

> On 17/12/20 8:16 am, Paul Sokolovsky wrote:
> > With all the above in mind, Python3.7, in a strange twist of fate,
> > and without much ado, has acquired a new operator: the method call,
> > ".()".
> >
> > CPython3.6 and below didn't have ".()" operator, and compiled it as
> > "attr access" + "function call", but CPython3.7 got the new
> > operator, and compiles it as such (the single operator).
>
> Um, no. There is no new operator. You're reading far more into the
> bytecode change than is actually there.

So, here we would need to remember what constitutes "a good scientific
theory". It's something which explains a sufficiently wide array of
phenomena from sufficiently concise premises, predicts the effects of
phenomena, and allows all of that to be put to some "useful" purpose.
And so, the theory of the method call operator was presented.

And if we look at the actual code, then "a.b" is compiled into
LOAD_ATTR, and "a.b()" into LOAD_METHOD, *exactly* as the theory
predicts.
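For concreteness, the split is easy to see with the standard dis
module. A minimal session (the "<demo>" filename is just a placeholder;
opcode names are as in CPython 3.7-3.10, later releases rename or merge
some of them; the expected output is sketched in the comments):

```python
# Disassemble a bare attribute access vs. a method call
# (CPython 3.7-3.10 opcode names; output abbreviated in the comments).
import dis

dis.dis(compile("a.b", "<demo>", "eval"))
#   LOAD_NAME     a
#   LOAD_ATTR     b        <- plain attribute access
#   RETURN_VALUE

dis.dis(compile("a.b()", "<demo>", "eval"))
#   LOAD_NAME     a
#   LOAD_METHOD   b        <- the dedicated "method call" pattern
#   CALL_METHOD   0
#   RETURN_VALUE
```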
It also makes it clear what is missing from the middle layer (abstract
syntax) to capture the needed effect - literally, a MethodCall AST
node. It also explains the meaning (and natural effect) of "(a.b)()".
And that explanation agrees with the intuitive meaning of parens used
for grouping, that meaning being "do the stuff in parens first".
Indeed, the theory tells us that "(a.b)()" should naturally be
implemented by LOAD_ATTR followed by CALL_FUNCTION, which again agrees
with how some implementations do it.

Given all that, "unlearning" the concept of a method call operator may
be no easier than "unlearning" the concept of natural numbers (or real
numbers, or transcendental numbers, or complex numbers). Please be my
guest.

> > So, why CPython3.7+ still compiles "(a.b)()" using LOAD_METHOD.
>
> Because a.b() and (a.b)() have the same semantics, *by definition*.
> That definition has NOT changed. Because they have the same
> semantics, there is no need to generate different bytecode for them.

By *a* particular definition. The theory is above any particular
definition. It explains how "a.b" vs "a.b()" should be compiled, and
indeed, they are compiled that way. It also explains how "(a.b)()"
should be compiled, and the fact that particular implementations
"optimize" it doesn't invalidate the theory in any way. And yeah, as
soon as we, umm, adjust the definition, the theory becomes useful for
explaining what may need to be changed in the code generation.

> > But still, are there Python implementations which compile "(a.b)()"
> > faithfully, with its baseline semantic meaning? Of course there're.
> > E.g., MicroPython-derived Python implementations compile it in the
> > full accordance with the theory presented here:
>
> All that means is that Micropython is missing a potential
> optimisation. This is probably a side effect of the way its
> parser and code generator work, rather than a conscious decision.

Good nailing down! But it's the same for CPython. CPython compiles
"(a.b)()" using LOAD_METHOD not because it consciously "optimizes" it,
but simply because it's *unable* to represent the difference between
"a.b()" and "(a.b)()". More specifically:

1. The MicroPython compiler is CST (concrete syntax tree) based, so it
   sees all those parens. (And yes, it has to do silly things not to
   miss various optimizations, and it still misses a lot.)

2. The CPython compiler is AST (abstract syntax tree) based, and as the
   current ASDL definition
   (https://github.com/python/cpython/blob/master/Parser/Python.asdl)
   misses a proper MethodCall node, it conflates "a.b()" and "(a.b)()"
   together (see the snippet below).
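That conflation can be checked from Python itself with the standard ast
module, which exposes the same abstract syntax the compiler consumes. A
small self-contained check (expected output sketched in the comments):

```python
# Both spellings parse to the identical Call(func=Attribute(...)) node,
# so by the time the code generator runs, the parens are already gone.
import ast

tree1 = ast.dump(ast.parse("a.b()", mode="eval"))
tree2 = ast.dump(ast.parse("(a.b)()", mode="eval"))

print(tree1 == tree2)   # True
print(tree1)
# Expression(body=Call(func=Attribute(value=Name(id='a', ...), attr='b', ...),
#                      args=[], keywords=[]))
```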
So, the proper way to address the issue would be to add an explicit
MethodCall node. So far (for Pycopy) I don't dare to take such a step;
instead there is a new "is_parenform" attribute on the existing nodes,
recording that the corresponding node was wrapped in parens in the
surface syntax. (And I had it before anyway, as it's needed e.g. to
parse generator expressions with a recursive-descent parser.)

> Now, it's quite possible to imagine a language in which
> a.b() and (a.b)() mean different things.

Not only that! It's even possible to imagine a Python dialect where
"a.b" and "a.b()" would mean *exactly* what they look like - the first
is an attribute access, the second is a method call, no exceptions
allowed. But then the question arises how to call a callable stored in
an attribute. So tell me, Greg (the first thing which comes to your
mind, as long as it's "a)" or "b)", please), which do you like better:

a) the "(a.b)()" syntax
b) apply() being resurrected

> Does anyone remember
> Prothon? (It's a language someone was trying to design a while
> back that was similar to Python but based on prototypes
> instead of classes.)

I don't remember it. As it turned out, Google barely remembers it
either, and I had to feed it a bunch of clarifying questions and "did
you really mean that?" answers before I got to e.g.
http://wiki.c2.com/?ProthonLanguage

> A feature of Prothon was that a.b() and t = a.b; t() would do
> quite different things (one would pass a self argument and the
> other wouldn't).
>
> I considered that a bad thing. I *like* the fact that in Python
> I can use a.b to get a bound method object and call it later,
> with the same effect as if I'd called it directly.
>
> I wouldn't want that to change. Fortunately, it never will,
> because changing it now would break huge amounts of code.

Of course it never will. "It" == the current Python semantic mode.
Instead, new modes will capitalize on the newly discovered features.
One such mode, humbly named "the strict mode", was presented here on
the list recently. But that was only part 1 of the strict mode. What we
just discussed is pillar #1 of the strict mode, part 2 - that pillar
being the separation between methods and attributes.

But you'll ask: "perhaps we can separate them, but how can we *clearly*
separate them, so that there is no ambiguity?". Indeed, that is what
pillar #2 is for. It is inspired (retroactively of course, as I have
had the idea for many years) by feedback received during the
discussion of the idea of block-scoped variables. Some people
immediately responded that they want shadowing to be disallowed (e.g.
https://mail.python.org/archives/list/python-ideas@python.org/message/3IKFBQ5NZ2X7RARMMJORM4V7GSVV5IQG/).
Of course, it doesn't make any sense to disallow shadowing of
block-scoped local variables, as that brings no theoretical or
implementation benefits (and the practical benefits are addressed by
linters or opt-in warnings). But those people were absolutely right -
there are places in Python where shadowing is truly harmful. So, their
wish is granted, but in the good tradition of wish-granting, not where
and not in the way they wanted. The strict mode, part 2, pillar #2
says: "It's not allowed to shadow a method with an attribute."
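To illustrate what that rule targets, here is a sketch of the kind of
code it would reject - the Greeter class and its names are made up for
illustration, and this is not an implementation of the strict mode
itself:

```python
# Today's Python: an instance attribute silently shadows a class method.
# Under the proposed rule, the assignment marked below would be an error.

class Greeter:
    def hello(self):
        return "hello from the method"

g = Greeter()
print(g.hello())              # "hello from the method"

g.hello = lambda: "hello from the attribute"   # shadows the method;
                                               # forbidden under part 2,
                                               # pillar #2

print(g.hello())              # "hello from the attribute" - every call has
                              # to check the instance namespace first, which
                              # is exactly the lookup the rule makes avoidable
```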
And combined, parts 1 and 2 allow namespace lookups in Python to be
optimized, with part 1 dealing with module/class namespaces, and part
2 with object namespaces. What's interesting is that part 2 of the
strict mode, just like part 1, doesn't really make any "revolutionary"
changes. It just exposes, emphasizes, and makes consistent properties
the Python language already has. Ain't that cute? Do you spot any
issues, Greg?

> --
> Greg

[]

--
Best regards,
 Paul                          mailto:pmis...@gmail.com
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/PF42DP2IYYRFUJMYCJ2NDRBXU57VNFEH/
Code of Conduct: http://python.org/psf/codeofconduct/