On 7/8/20 8:02 AM, Guido van Rossum wrote:
   Regarding the syntax for wildcards and OR patterns, the PEP explains
   why `_` and `|` are the best choices here: no other language surveyed
   uses anything but `_` for wildcards, and the vast majority uses `|`
   for OR patterns.  A similar argument applies to class patterns.


In that case, I'd like to make a specific pitch for "don't make '_' special".  (I'm going to spell it '_' as it seems to be easier to read this way; ignore the quotes.)


IIUC '_' is special in two ways:

1) we permit it to be used more than once in a single pattern, and
2) if it matches, it isn't bound.

If we forgo these two exceptions, '_' can go back to behaving like any other identifier.  It becomes an idiom rather than a special case.


Drilling down on what we'd need to change:

To address 1), allow using a name multiple times in a single pattern.

622 v2 already says:

   For the moment, we decided to make repeated use of names within the
   same pattern an error; we can always relax this restriction later
   without affecting backwards compatibility.

If we relax it now, then we don't need '_' to be special in this way.  All in all this part seems surprisingly uncontentious.


To address 2), bind '_' when it's used as a name in a pattern.

This adds an extra reference and an extra store.  That by itself seems harmless.

The existing implementation has optimizations here.  If that's important, we could achieve the same result with a little dataflow analysis to optimize away the dead store.  We could even special-case optimizing away dead stores /only/ to '_' and /only/ in match/case statements and all would be forgiven.

Folks point out that I18N code frequently uses a global function named '_'.  The collision of these two uses is unfortunate, but I think it's survivable.  I certainly don't think this collision means we should special-case this one identifier in this one context in the /language/ specification.

Consider:

 * There's no installed base of I18N code using pattern matching,
   because it's a new (proposed!) syntax.  Therefore, any I18N code
   that wants to use match/case statements will be new code, and so can
   be written with this (admittedly likely!) collision in mind.  I18N
   code could address this in several ways, for example:
     o Mandate use of an alternate name for "don't care" match patterns
       in I18N code, perhaps '__' (two underscores).  This approach
       seems best.
     o Use a different name for the '_' function in scopes where you're
       using match/case, e.g. 'gettext'.
     o At module scope, in programs where '_' was installed into
       builtins (e.g. by gettext.install()), I18N code could use '_'
       in its match/case statements, then "del _" after the match
       statement.  '_' would go back to finding the builtin function.
       (This wouldn't work inside a function: binding '_' anywhere in
       a function makes it local to the whole body, so after "del _"
       a lookup raises UnboundLocalError instead of finding the
       global.  One /could/ simply rebind '_', but I doubt people
       want to consider this approach in the first place.)
 * As the PEP mentions, '_' is already a Python idiom for "I don't care
   about this value", e.g. "basename, _, extension =
   filename.partition('.')".  I18N has already survived contact with
   this idiom.
 * Similarly, '_' has a special meaning in the Python REPL. Admittedly,
   folks don't do a lot of I18N work in the REPL, so this isn't a
   problem in practice.  I'm just re-making the previous point: I18N
   programmers already cope with other idiomatic uses of '_'.
 * Static code analyzers could detect if users run afoul of this
   collision.  "Warning: match/case using _ in module using _ for
   gettext" etc.
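
A small sketch of the collision and of the '__' workaround, using the ordinary unpacking idiom quoted above rather than match/case so it runs on any Python 3.  The global '_' here is a hypothetical stand-in for a real gettext function, and both function names are invented for the example.

```python
def _(message):
    # Hypothetical stand-in for gettext: "translate" by upper-casing.
    return message.upper()


def split_shadowing(filename):
    # Binding '_' makes it a local, which hides the global translation function.
    basename, _, extension = filename.partition(".")
    return basename, extension, callable(_)


def split_with_double_underscore(filename):
    # Using '__' as the "don't care" name leaves the global '_' reachable.
    basename, __, extension = filename.partition(".")
    return _(basename), extension


print(split_shadowing("report.txt"))
print(split_with_double_underscore("report.txt"))
```

The same reasoning would apply to '_' bound by a pattern if it stopped being special.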


One consideration: if you /do/ use '_' multiple times in a single pattern, and you /do/ refer to its value afterwards, what value should it get?  Consider that Python already permits multiple assignments in a single expression:

   (x:="first", x:="middle", x:="last")

After this expression is evaluated, x has been bound to the value "last".  I could live with "it keeps the rightmost".  I could also live with "the result is implementation-defined".  I suspect it doesn't matter much, because the point of the idiom is that people don't care about the value.
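
For reference, the existing behavior is easy to check (assignment expressions need Python 3.8 or later; the name 'values' is only there to capture the tuple):

```python
# Each ':=' rebinds x in left-to-right order while the tuple is being built.
values = (x := "first", x := "middle", x := "last")

print(values)  # the tuple keeps every intermediate value
print(x)       # x keeps only the rightmost binding
```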


In keeping with this change, I additionally propose removing '*_' as a special token.  '*_' would behave like any other '*identifier', binding the name to the unpacked sequence.  Alternatively, we could keep the special token but change it to '*' so it mirrors Python function declaration syntax.  I don't have a strong opinion about this second alternative.


Cheers,


//arry/
