On 12/07/2020 11:38, Larry Hastings wrote:
In that case, I'd like to make a specific pitch for "don't make '_'
special". (I'm going to spell it '_' as it seems to be easier to read
this way; ignore the quotes.)
IIUC '_' is special in two ways:
1) we permit it to be used more than once in a single pattern, and
2) if it matches, it isn't bound.
If we forgo these two exceptions, '_' can go back to behaving like
any other identifier. It becomes an idiom rather than a special case.
Drilling down on what we'd need to change:
To address 1), allow using a name multiple times in a single pattern.
PEP 622 v2 already says: [...]
If we relax it now, then we don't need '_' to be special in this way.
All in all this part seems surprisingly uncontentious.
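As a point of comparison (not part of the PEP), ordinary assignment targets already allow a repeated name, so there is precedent for what relaxing the rule would look like. A minimal sketch, with a made-up function name:

```python
def unpack_with_repeats(triple):
    # Plain unpacking accepts the same name twice; targets are assigned
    # left to right, so '_' ends up holding the second element.
    _, _, z = triple
    return _, z
```

Calling unpack_with_repeats((1, 2, 3)) gives (2, 3): '_' is just a name here, and the later binding wins.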
To address 2), bind '_' when it's used as a name in a pattern.
This adds an extra reference and an extra store. That by itself seems
harmless.
The existing implementation has optimizations here. If that's
important, we could achieve the same result with a little dataflow
analysis to optimize away the dead store. We could even special-case
optimizing away dead stores /only/ to '_' and /only/ in match/case
statements and all would be forgiven.
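To see the kind of store being discussed, one can look at what CPython emits today for an ordinary unpacking assignment to '_'. This is only an analogy using the dis module, not the match/case implementation itself, and the helper names are hypothetical.

```python
import dis

def second_item(pair):
    _, value = pair   # '_' is never read afterwards: a dead store
    return value

def store_opnames(func):
    # Collect the local-variable store instructions in the bytecode.
    # The exact opcode names vary between CPython versions, so only
    # the common "STORE_FAST" prefix is checked.
    return [
        ins.opname
        for ins in dis.get_instructions(func)
        if ins.opname.startswith("STORE_FAST")
    ]
```

On the CPython versions I know of, '_' shows up in second_item.__code__.co_varnames and the store is still emitted; that store is what a dead-store optimization would remove.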
Folks point out that I18N code frequently uses a global function named
'_'. The collision of these two uses is unfortunate, but I think it's
survivable. I certainly don't think this collision means we should
special-case this one identifier in this one context in the /language/
specification.
Consider:
* There's no installed base of I18N code using pattern matching,
because it's a new (proposed!) syntax. Therefore, any I18N code
that wants to use match/case statements will be new code, and so
can be written with this (admittedly likely!) collision in mind.
I18N code could address this in several ways, for example:
o Mandate use of an alternate name for "don't care" match
patterns in I18N code, perhaps '__' (two underscores). This
approach seems best.
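A rough sketch of the collision and of the '__' workaround, using plain unpacking as a stand-in for a pattern that binds '_'. The translation function here is a toy substitute for gettext, and all names are invented for the example.

```python
def _(message):
    # Toy stand-in for an I18N lookup such as gettext.gettext.
    return message.upper()

def describe_clobbered(point):
    _, y = point                 # rebinds '_' locally, hiding the function
    return _("y is") + f" {y}"   # fails: '_' is now the first coordinate

def describe_safe(point):
    __, y = point                # two underscores as the "don't care" name
    return _("y is") + f" {y}"   # the global '_' is still the translator
```

describe_safe((1, 2)) returns "Y IS 2", while describe_clobbered((1, 2)) raises TypeError because the local '_' is an int.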
In keeping with this change, I additionally propose removing '*_' as a
special token. '*_' would behave like any other '*identifier',
binding the name to the rest of the unpacked sequence. Alternatively,
we could keep the special token but change it to '*' so it mirrors Python
function declaration syntax. I don't have a strong opinion about this
second alternative.
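Both halves of this have an analogue in existing Python, which may help picture the proposal. The sketch below uses ordinary star-unpacking and a keyword-only function signature; the function names are illustrative only.

```python
def split_head(items):
    # In plain unpacking, '*_' is not special: '_' is bound to a list
    # holding the remaining items.
    head, *_ = items
    return head, _

def area(width, *, height):
    # A bare '*' in a function declaration binds no name at all; it only
    # marks the parameters after it as keyword-only.
    return width * height
```

split_head([1, 2, 3]) returns (1, [2, 3]), and area(2, height=3) returns 6, whereas area(2, 3) is a TypeError.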
+1 to everything
One consideration: if you /do/ use '_' multiple times in a single
pattern, and you /do/ refer to its value afterwards, what value should
it get? Consider that Python already permits multiple assignments in a
single expression:
(x:="first", x:="middle", x:="last")
After this expression is evaluated, x has been bound to the value
"last". I could live with "it keeps the rightmost". I could also
live with "the result is implementation-defined". I suspect it
doesn't matter much, because the point of the idiom is that people
don't care about the value.
I'd expect it to bind to the last one. If that's in any way problematic,
then to prevent inadvertent misuse, referencing an identifier that was
bound more than once should raise an exception. But as you say, it
doesn't really matter.