[Python-ideas] Re: Custom string prefixes

stpasha Tue, 27 Aug 2019 01:24:07 -0700

Thank you, Steven, for taking the time to write such an elaborate rebuttal.
If I understand the heart of your argument correctly, you're concerned that
the prefixed strings may add confusion to the code. That nobody knows
what `l'abc'` or `czt'xxx'` could possibly mean, while at the same time
`v'1.0'` could mean many things, whereas `v'cal-{a}'` would mean nothing
at all...


These are all valid concerns. The string (or number) prefixes add new power
to the language, and with new power comes new responsibility. While the
syntax can be used to enhance readability of the code, it can also be abused
to make the code more obscure. However, Python does not attempt to be
an idiot-proof language. "We are all consenting adults" is one of its guiding
principles. The fact that a feature can potentially be misused shouldn't deter
us from adding it, if the benefits are significant.

And the benefits in terms of readability can be significant. Consider the
existing Python prefixes: `r'...'` is purely for readability, it adds no extra
functionality; `f'...'` has neat compiler support, but even if it didn't (and
most Python users don't actually realize f-strings get preprocessed by the
compiler) it would still enhance readability compared to `str.format()`. It's
nice to be able to write a complex number as `5 + 3j` instead of
`complex(5, 3)`. And so on.
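
To make the comparison concrete, here is a small sketch in today's Python that
pairs each existing literal form with its longer spelling. The variable names
are only for illustration.

```python
# Each literal form below is equal to its longer spelling.

# Raw string: same value, fewer backslashes to read.
raw = r'\d+\.\d+'
escaped = '\\d+\\.\\d+'
assert raw == escaped

# f-string versus str.format().
name = 'world'
greeting = f'hello, {name}!'
assert greeting == 'hello, {}!'.format(name)

# Imaginary-number suffix versus the complex() constructor.
z = 5 + 3j
assert z == complex(5, 3)
```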

> What's v() do? Verbose string?
> Oh, you intended a version string did you? If only you had written 
> version instead of v I might not have guessed wrong. What were 
> you saying about preferring readability and clarity over brevity?

You're correct that, devoid of context, `v"smth..."` is not very meaningful. The
"v" prefix could mean "version", or "verbose", or "volatile", or "vectorized",
or "velociraptor", or whatever. Luckily, code almost always exists
within a specific context. It solves a particular problem, and works within a
particular domain, and makes perfect sense for people working within that
domain.

This isn't much different from, say, the `np.` prefix, which means "numpy" in the
domain of numerical computations, NP-completeness for some mathematicians,
and "no problem" for regular users. 

From a practical perspective, the meaning of each particular symbol will come
from the way that it was created or imported. For example, if your script says
`from packaging.version import v` then "v" is a version. If, on the other hand,
it says `from zoo import velociraptor as v`, then it's an altogether different
beast.

> In other words, I got all of the meaning from the string part, not the 
> prefix. The prefix on its own, I would have guessed completely wrong.

Exactly. You look at string "1.10a" and you know it must be a version string,
because you're a human, you're smart. The compiler is not a human, it has no
idea. To the Python interpreter it's just a PyUnicode object of length 5. It's
meaningless. But when you combine this string with a prefix into a single
object, it gains power. It can have methods or special behaviors. It can have
a type, different from `str`, that can be inspected when passing this object to
another function.

Think of `v"1.10a"` as making a "typed string" (even though it may end up not
being a string at all). By writing `v"1.10a"` I convey the intent for this to
be a version string.
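
As a rough sketch of what such a "typed string" buys you, here is how it can be
approximated today with an ordinary call, `v("1.10a")`. The class `v`, its
parsing rule, and its ordering are all assumptions made up for this
illustration; they are not an existing library API and not part of the
proposal itself.

```python
import re
from functools import total_ordering


@total_ordering
class v:
    """Hypothetical stand-in for the object a v"..." prefix might create."""

    # Assumed format for this sketch: MAJOR.MINOR plus optional letters.
    _pattern = re.compile(r'(\d+)\.(\d+)([a-z]*)')

    def __init__(self, text):
        m = self._pattern.fullmatch(text)
        if m is None:
            raise ValueError(f'not a version string: {text!r}')
        self.text = text
        self.major = int(m.group(1))
        self.minor = int(m.group(2))
        self.tag = m.group(3)

    def _key(self):
        # A release without a tag sorts after a tagged pre-release.
        return (self.major, self.minor, self.tag == '', self.tag)

    def __eq__(self, other):
        if not isinstance(other, v):
            return NotImplemented
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, v):
            return NotImplemented
        return self._key() < other._key()

    def __hash__(self):
        return hash(self._key())

    def __repr__(self):
        return f'v({self.text!r})'


# Plain strings compare character by character, so "1.10a" sorts before "1.9".
assert '1.10a' < '1.9'

# The typed object compares numerically and carries its own type.
assert v('1.10a') > v('1.9')
assert not isinstance(v('1.10a'), str)
```

A function receiving such an object can check `isinstance(x, v)` instead of
guessing from the contents of a string.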

> for rather insignificant gains, the saving of two parentheses. 

Two bytes doesn't sound like a lot. I mean, it is quite little on the grand
scale of things. However, I don't think the simple byte count is a proper
measure here. There could be benefits to readability even if the byte
difference were zero or negative.

I believe a good way to think about this is the following: if the feature were
already implemented, would people want to use it, and would it improve the
readability of their code? I speculate that the answer to both of these
questions is yes, at least for some people.

As a practical example, consider the function `pandas.read_csv()`. The
documentation for its `sep` parameter says "In addition, separators longer than
1 character and different from ``'\s+'`` will be interpreted as regular
expressions ...". In this case they wanted the `sep` parameter to handle both
simple separators and regular expression separators. However, as there is no
syntax to create a "regular expression string", they ended up with this dubious
heuristic based on the length of the string... Ideally, they should have said
that `sep` could be either a string or a regexp object, but the barrier to write
    from re import compile as rx
    rx('...')

is just impossibly high for a typical user. Not to mention that such code
**would** actually be harder to read, because I'd be inventing my own notation
for a function that is commonly known under a different name.
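
For illustration, here is a minimal sketch of the "string or regexp object"
design, dispatching on the type of the argument rather than on its length.
`split_fields` is a hypothetical helper written for this message; it is not
pandas code and says nothing about how `read_csv` is implemented.

```python
import re


def split_fields(line, sep=','):
    """Split `line` on `sep`, which is a plain string or a compiled regex."""
    if isinstance(sep, re.Pattern):
        # The caller explicitly asked for regular-expression semantics.
        return sep.split(line)
    # Any plain string, of any length, is taken literally.
    return line.split(sep)


# A two-character separator stays literal; no length heuristic is involved.
assert split_fields('a||b||c', '||') == ['a', 'b', 'c']

# Regex behavior is opted into by passing a compiled pattern.
assert split_fields('a  b \t c', re.compile(r'\s+')) == ['a', 'b', 'c']
```

With a regex prefix, the second call could carry the same intent without the
explicit `re.compile(...)`.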

Another pet peeve of mine is datetime literals. Or, rather, their absence. I
often see, again in pandas, how people create columns of strings
["2010-05-01", "2010-05-02", ...], and then call `pandas.to_datetime()`. It
would have been more straightforward if there were a standard syntax for
denoting datetime constants, allowing us to create a column of datetime type
directly.
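
The closest approximation available today is a short helper function. In the
sketch below, `d` is a hypothetical name chosen for this message, standing in
for what a `d"..."` literal might produce; it uses only the standard library
rather than pandas.

```python
from datetime import date, timedelta


def d(text):
    """Hypothetical stand-in for a d"..." date literal (ISO format only)."""
    return date.fromisoformat(text)


# The list holds date objects from the start; no separate parsing pass.
column = [d('2010-05-01'), d('2010-05-02')]

assert all(isinstance(x, date) for x in column)
assert column[1] - column[0] == timedelta(days=1)
```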
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/52EUBRUHNT5URR4EE65XMWMKCRUTOQOR/