On Tue, Aug 27, 2019 at 05:13:41PM -0000, stpa...@gmail.com wrote:

> The difference between `x'...'` and `x('...')`, other than visual noise,
> is the following:
> 
> - The first "x" is in its own namespace of string prefixes. The second "x"
>   exists in the global namespace of all other symbols.

Ouch! That's adding a lot of additional complexity to the language.

Python's scoping rules are usually described as LEGB:

- Local
- Enclosing (non-local)
- Global (module)
- Builtins

but that's an over-simplification, dating back to something like Python 
1.5 days. Python scope also includes:

- class bodies can be the local scope (but they don't work quite
  the same as function locals);
- parts of the body of comprehensions behave as if they were a
  separate scope.

This proposal adds a completely separate, parallel set of scoping rules 
for these string prefixes. How many layers in this parallel scope?

The simplest design is to have a single, interpreter-wide namespace for 
prefixes. Then we will have name clashes, especially since you seem to 
want to encourage single character prefixes like "v" (verbose, version) 
or "d" (date, datetime, decimal). Worse, defining a new prefix will 
affect all other modules using the same prefix.

So we need a more complex parallel scope. How much more complex?


* if I define a string prefix inside a comprehension, function or 
  class body, will that apply across the entire module or just inside 
  that comp/func/class?

* how do nested functions interact with prefixes?

* do we need a set of parallel keywords equivalent to global and 
  nonlocal for prefixes?


If different modules have different registries, then not only do we need 
to build a parallel set of scoping rules for prefixes into the 
interpreter, but we need a parallel way to import them from other 
modules, otherwise they can't be re-used.

Does "from module import x" import the regular object x from the module 
namespace, or the prefix x from the prefix-namespace? So it seems we'll 
need a parallel import system as well.

All this adds more complexity to the language, more things to be coded 
and tested and documented, more for users to learn, more for other 
implementations to re-implement, and the benefit is marginal: the 
ability to drop parentheses from some but not all function calls.


Now consider another problem: introspection, or the lack thereof.

One of the weaknesses of string prefixes is that it's hard to get help 
for them. In the REPL, we can easily get help on any class or function:

    help(function)

and that's really, really great. We can use the inspect module or dir() 
to introspect functions, classes and instances, but we can't do the same 
for string prefixes.

What's the difference between r-strings and u-strings? help() is no help 
(pun intended), since help sees only the string instance, not the syntax 
you used to create it. All of these will give precisely the same output:

    help(str())
    help('')
    help(u'')
    help(r"")

etc. This is a real weakness of the prefix system, and will apply 
equally to custom prefixes. It is *super easy* to introspect a class or 
function like Version; it is *really hard* to do the same for a prefix.
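You can check this in any Python today: once the literal has been evaluated, the prefix leaves no trace on the resulting object, so there is nothing for help() or type() to distinguish.

```python
# The prefix vanishes at evaluation time: all of these are plain str
# instances, indistinguishable to help(), type(), or the inspect module.
a = ''
b = u''
c = r""
print(type(a) is type(b) is type(c) is str)   # True

# Only the b prefix differs, because it produces a different type entirely.
print(type(b"") is bytes)                     # True
```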

You want this separate namespace for prefixes so that you can have a v 
prefix without "polluting" the module namespace with a v function (or 
class). But v doesn't write itself! You still have to write a function 
or class, although you might give it a better name and then register it 
with the single-letter prefix:

    @register_prefix('v')
    class Version:
        ...

(say). This still leaves Version lying around in your global namespace, 
unless you explicitly delete it:

    del Version

but you probably won't want to do that, since Version will probably be 
useful for those who want to create Version objects from expressions or 
variables, not just string literals.

So the "pollution" isn't really pollution at all, at least not if you 
use reasonable names, and the main justification for parallel namespaces 
seems much weaker.

Let me put it another way: parallel namespaces is not a feature of this 
proposal. It is a point against it.


> - Python style discourages too short variable names, especially in libraries,
>   because they have increased chance of clashing with other symbols, and
>   generally may be hard to understand. At the same time, short names for
>   string prefixes could be perfectly fine: there won't be too many of them
>   anyways.

That's an interesting position for the proponent of a new feature to 
take. "Don't worry about this being confusing, because hardly anyone 
will use it."


>   The standard prefixes "b", "r", "u", "f" are all short, and nobody
>   gets confused about them.

Plenty of people get confused about raw strings.

There are only four, plus uppercase variants and combinations, and they 
are standard across the entire language. If there were dozens of them, 
coming from lots of different modules and third-party libraries, with 
lots of conflicts ('v' for version in foolib, but 'v' for verbose in 
barlib), the situation would be very different.

We can't extrapolate from four built-in prefixes being manageable to the 
conclusion that dozens of clashing user-defined prefixes will be 
manageable too.


> - Barrier of entry. Today you can write `from re import compile as x` and then
>   write `x('...')` to denote a regular expression (if you don't mind having
>   `x` as a global variable). But this is not the way people usually write code.

I doubt that is true. "from module import foo as bar" is a standard, 
commonly used Python language feature:

https://stackoverflow.com/questions/22245711/from-import-or-import-as-for-modules

in particular this answer here:

https://stackoverflow.com/a/29010729

Besides, we don't design the language for the least knowledgeable, most 
ignorant, copy-and-paste coders.
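For what it's worth, the aliasing trick from the quoted text needs nothing new in the language, and the result is a fully introspectable pattern object:

```python
# Works in any Python today: alias re.compile to a short name and
# "prefix" your pattern strings with an ordinary call.
from re import compile as x

pattern = x(r'\d+')               # a compiled pattern object
print(pattern.findall('a1b22'))   # ['1', '22']
print(type(pattern).__name__)     # help(pattern) works too
```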


> People
>   write the code the way they are taught from examples, and the examples 
>   don't speak about regular expression objects. The examples only show
>   regular expressions-as-strings

That's simply wrong. The *very first* example of a regular expression here:

https://scotch.io/tutorials/an-introduction-to-regex-in-python

uses the compile function.

More examples talking about regex objects:

https://docs.python.org/3/library/re.html#re.compile

https://pymotw.com/2/re/#compiling-expressions

https://docs.python.org/3/howto/regex.html#compiling-regular-expressions

https://stackoverflow.com/questions/20386207/what-does-pythons-re-compile-do

These weren't hard to find. You don't have to dig deep into obscure 
parts of the WWW to find people talking about regex objects. I think you 
underestimate the knowledge of the average Python programmer.


-- 
Steven
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/NS45GVLBWCGINYQYXZMYPCOGSDFAQC7K/
Code of Conduct: http://python.org/psf/codeofconduct/