On 7/8/20 8:02 AM, Guido van Rossum wrote:
> Regarding the syntax for wildcards and OR patterns, the PEP explains
> why `_` and `|` are the best choices here: no other language surveyed
> uses anything but `_` for wildcards, and the vast majority uses `|`
> for OR patterns. A similar argument applies to class patterns.
In that case, I'd like to make a specific pitch for "don't make '_'
special". (I'm going to spell it '_' as it seems to be easier to read
this way; ignore the quotes.)
IIUC '_' is special in two ways:
1) we permit it to be used more than once in a single pattern, and
2) if it matches, it isn't bound.
If we forgo these two exceptions, '_' can go back to behaving like any
other identifier. It becomes an idiom rather than a special case.
Drilling down on what we'd need to change:
To address 1), allow using a name multiple times in a single pattern.
PEP 622 v2 already says:
    For the moment, we decided to make repeated use of names within the
    same pattern an error; we can always relax this restriction later
    without affecting backwards compatibility.
If we relax it now, then we don't need '_' to be special in this way.
All in all this part seems surprisingly uncontentious.
To address 2), bind '_' when it's used as a name in a pattern.
This adds an extra reference and an extra store. That by itself seems
harmless.
The existing implementation has optimizations here. If that's
important, we could achieve the same result with a little dataflow
analysis to optimize away the dead store. We could even special-case
optimizing away dead stores /only/ to '_' and /only/ in match/case
statements and all would be forgiven.
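To make the dead-store idea concrete, here is a toy sketch in Python. It is not CPython's compiler: the (opcode, name) tuples, the POP_TOP placeholder, and the function name are all made up for illustration, and it only handles straight-line code with no branches.

```python
def drop_dead_underscore_stores(instructions, live_at_exit=False):
    """Replace stores to '_' whose value is never read with a plain pop.

    `instructions` is a list of hypothetical (opcode, name) tuples for
    straight-line code. `live_at_exit` says whether '_' may still be
    read after the last instruction.
    """
    result = []
    live = live_at_exit
    # Walk backwards so that we know, at each store, whether a later
    # instruction reads '_' before it is overwritten again.
    for opcode, name in reversed(instructions):
        if name == "_" and opcode == "LOAD":
            live = True
        elif name == "_" and opcode == "STORE":
            if not live:
                # Dead store: the value still has to leave the stack,
                # so swap the store for a pop instead of deleting it.
                result.append(("POP_TOP", None))
                continue
            live = False
        result.append((opcode, name))
    result.reverse()
    return result
```

With this sketch, a pattern that stores to '_' twice and never reads it ends up with two pops and no stores, while a store that is followed by a load of '_' is left alone.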
Folks point out that I18N code frequently uses a global function named
'_'. The collision of these two uses is unfortunate, but I think it's
survivable. I certainly don't think this collision means we should
special-case this one identifier in this one context in the /language/
specification.
Consider:
* There's no installed base of I18N code using pattern matching,
  because it's a new (proposed!) syntax. Therefore, any I18N code
  that wants to use match/case statements will be new code, and so can
  be written with this (admittedly likely!) collision in mind. I18N
  code could address this in several ways, for example:
    o Mandate use of an alternate name for "don't care" match patterns
      in I18N code, perhaps '__' (two underscores). This approach
      seems best.
    o Use a different name for the '_' function in scopes where you're
      using match/case, e.g. 'gettext'.
    o Since most Python code lives inside functions, I18N code could
      use '_' in its match/case statements, then "del _" after the
      match statement. '_' would go back to finding the global
      function. (This wouldn't work for code at module scope for
      obvious reasons. One /could/ simply rebind '_', but I doubt
      people want to consider this approach in the first place.)
* As the PEP mentions, '_' is already a Python idiom for "I don't care
  about this value", e.g. "basename, _, extension =
  filename.partition('.')". I18N has already survived contact with
  this idiom.
* Similarly, '_' has a special meaning in the Python REPL. Admittedly,
  folks don't do a lot of I18N work in the REPL, so this isn't a
  problem in practice. I'm just re-making the previous point: I18N
  programmers already cope with other idiomatic uses of '_'.
* Static code analyzers could detect if users run afoul of this
  collision. "Warning: match/case using _ in module using _ for
  gettext" etc.
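The options above can be tried out today with tuple unpacking standing in for match/case, on the assumption that names bound by a pattern would follow the same scoping rules as ordinary assignment. The names translate, split_alt_wildcard, split_alt_function and split_then_del are hypothetical, and NullTranslations is used only so the sketch runs without a message catalog. The last function is there to check the "del _" variant: with current scoping, a name bound anywhere in a function is local for the whole function, so it seems worth testing rather than assuming.

```python
from gettext import NullTranslations

# Stand-in for the usual I18N setup: a module-level '_' that translates.
# NullTranslations().gettext returns its argument unchanged.
translate = NullTranslations().gettext
_ = translate


def split_alt_wildcard(filename):
    # Option 1: use '__' as the "don't care" name, so the global '_'
    # stays visible inside this function.
    basename, __, extension = filename.partition(".")
    return _(basename), extension


def split_alt_function(filename):
    # Option 2: '_' is the "don't care" name here, so the translation
    # function is reached through another name in this scope.
    basename, _, extension = filename.partition(".")
    return translate(basename), extension


def split_then_del(filename):
    # Option 3: bind '_', delete it, then try to call the global '_'.
    basename, _, extension = filename.partition(".")
    del _
    try:
        return _(basename), extension
    except UnboundLocalError:
        # '_' is still a local name in this function, so the lookup
        # does not fall through to the module-level '_'.
        return None
```

In current CPython the third function returns None, which suggests the "del _" approach would need some other mechanism inside functions. The first two behave as described.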
One consideration: if you /do/ use '_' multiple times in a single
pattern, and you /do/ refer to its value afterwards, what value should
it get? Consider that Python already permits multiple assignments in a
single expression:
(x:="first", x:="middle", x:="last")
After this expression is evaluated, x has been bound to the value
"last". I could live with "it keeps the rightmost". I could also live
with "the result is implementation-defined". I suspect it doesn't
matter much, because the point of the idiom is that people don't care
about the value.
In keeping with this change, I additionally propose removing '*_' as a
special token. '*_' would behave like any other '*identifier', binding
the name to the rest of the unpacked sequence. Alternatively, we could
keep the special token but change it to '*' so it mirrors Python function
declaration syntax. I don't have a strong opinion about this second
alternative.
Cheers,
//arry/
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/OF4BT5HEWPEGDHNPX26NCANJBYQLLCHT/