On Tue, Aug 27, 2019 at 08:22:22AM -0000, stpa...@gmail.com wrote:

> The string (or number) prefixes add new power to the language

I don't think they do. It's just syntactic sugar for a function call.
There's nothing that czt'...' will do that czt('...') can't already do.

If you have a proposal that allows custom string prefixes to do
something that a function call cannot do, I've missed it.

> If a certain feature can potentially be misused shouldn't deter us
> from adding it, if the benefits are significant.

Very true, but so far I see nothing in this proposal that suggests that
the benefits are more significant than avoiding having to type a pair of
parentheses. Every benefit I have seen applies equally to the function
call version, but without the added complexity to the language of
allowing custom string prefixes.

> And the benefits in terms of readability can be significant.

I don't think they will be. I think they will encourage cryptic
one-character function names disguised as prefixes:

    v'...' instead of Version(...)
    x'...' instead of re.compile(...)

to take two examples from your proposal. At least this is somewhat
better:

    sql'...'

but that leaves the ambiguity of not knowing whether that's a chained
function call s(q(l(...))) or a single sql(...).

I believe it will also encourage inefficient and cryptic string parsing
instead of clearer use of separate arguments. Your earlier example:

    frac'123/4567'

The Fraction constructor already accepts such strings, and it is
occasionally handy for parsing user input. But using it to parse string
literals gives slow, inefficient code for little or no benefit:

    [steve@ando cpython]$ ./python -m timeit -s 'from fractions import Fraction' 'Fraction(123, 4567)'
    20000 loops, best of 5: 18.9 usec per loop
    [steve@ando cpython]$ ./python -m timeit -s 'from fractions import Fraction' 'Fraction("123/4567")'
    5000 loops, best of 5: 52.9 usec per loop

Unless you can suggest a way to parse arbitrary strings in arbitrary
ways at compile-time, these custom string prefixes are probably doomed
to be slow and inefficient.

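The same comparison can be reproduced from inside a script with the
timeit module, without a CPython checkout. This is only a minimal
sketch: the loop count is an arbitrary assumption, and the absolute
numbers will vary with the machine and the Python version.

```python
import timeit
from fractions import Fraction

# Both spellings construct the same value.
assert Fraction(123, 4567) == Fraction("123/4567")

# Arbitrary loop count, chosen only to keep the run short.
LOOPS = 20_000
namespace = {"Fraction": Fraction}

# Integer arguments: no string parsing involved.
t_ints = timeit.timeit("Fraction(123, 4567)", globals=namespace, number=LOOPS)

# String argument: the text has to be parsed on every single call.
t_str = timeit.timeit('Fraction("123/4567")', globals=namespace, number=LOOPS)

print(f"two ints: {t_ints:.4f}s   string: {t_str:.4f}s   ({LOOPS} loops each)")
```

Whatever the exact figures, the string form does strictly more work per
call, because the parse cannot be hoisted to compile time.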
The best thing I can say about this is that at least frac'123/4567'
would probably be easy to understand, since the / syntax for fractions
is familiar to most people from school. But the same cannot be said for
other custom prefixes:

    cf'[0; 37, 7, 1, 2, 5]'

Perhaps you can guess the meaning of that cf-string. Perhaps you can't.
A hint might point you in the right direction:

    assert cf'[0; 37, 7, 1, 2, 5]' == Fraction(123, 4567)

(By the way, the semi-colon is meaningful and not a typo.)

To the degree that custom string prefixes will encourage cryptic one and
two letter names, I think that this will hurt readability and clarity of
code. But if the reader has the domain knowledge to recognise what "cf"
stands for, this may be no worse than (say) "re" (regular expression).

In conventional code, we might call the cf function like this:

    cf([0, 37, 7, 1, 2, 5])  # Single list argument.
    cf(0, 37, 7, 1, 2, 5)    # *args version.

Either way works for me. But it is your argument that replacing the
parentheses with quote marks is "more readable":

    cf([0, 37, 7, 1, 2, 5])
    cf'[0; 37, 7, 1, 2, 5]'

not just a little bit more readable, but enough to make up for the
inefficiency of having to write your own parser, deal with errors,
compile a string literal, parse it at runtime, and only then call the
actual cf constructor and return a cf object.

Even if I accepted your claim that swapping (...) for '...' was more
readable, I am skeptical that the additional work and runtime
inefficiency would be worth the supposed benefit.

I don't wish to say that parsing strings to extract information is
always an anti-pattern:

http://cyrille.martraire.com/2010/01/the-string-obsession-anti-pattern/

after all we often need to process data coming from config files or
other user-input, where we have no choice but to accept a string. But
parsing string *literals* usually is an anti-pattern, especially when
there is a trivial transformation from the string to the constructor
arguments, e.g.

    123/4567 --> Fraction(123, 4567).

[...]

> Exactly. You look at string "1.10a" and you know it must be a version
> string, because you're a human, you're smart. The compiler is not a
> human, it has no idea. To the Python interpreter it's just a PyUnicode
> object of length 5. It's meaningless. But when you combine this string
> with a prefix into a single object, it gains power. It can have methods
> or special behaviors. It can have a type, different from `str`, that
> can be inspected when passing this object to another function.

Everything you say there applies to ordinary function call syntax too:

    Version('1.10a')

can have methods, special behaviours, a type different from str, etc.
Not one of those benefits comes from *custom string prefixes*. They all
come from the use of a custom type.

In fact, we can be more explicit and clear with the constructor:

    Version(major=1, minor=10, stage='a')

There is nothing magic about this v-string prefix. You still have to
write a Version class with a version-string parser. The compiler can't
help you, because it has no knowledge of the format of version strings.
All the compiler can do is pass the string '1.10a' to the function v().

[...]

> > for rather insignificant gains, the saving of two parentheses.
>
> Two bytes doesn't sound like a lot. I mean, it is quite little on the
> grand scale of things. However, I don't think the simple byte-count is
> a proper measure here. There could be benefits to readability even if
> it was 0 or negative byte difference.

"There could be..." lots of things, but the onus is on you to prove that
there actually *are* such benefits.

> I believe a good way to think about this is the following: if the
> feature was already implemented, would people want to use it, and would
> it improve readability of their code?

I answered that in my previous post. I would prefer an explicit, clear,
self-documenting function call Version() over a terse, unclear syntax
that looks like a string but isn't.

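To make the point about the Version class concrete, here is a minimal
sketch of the work that has to be written by hand whichever spelling is
used. The field layout, the parse() method name, the regular expression
and the short alias v are all assumptions for illustration only; real
version formats are more varied than this.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    """Hypothetical version type with explicit, self-documenting fields."""
    major: int
    minor: int
    stage: str = ""

    @classmethod
    def parse(cls, text):
        # Assumed format: digits, a dot, digits, optional lowercase stage.
        match = re.fullmatch(r"(\d+)\.(\d+)([a-z]*)", text)
        if match is None:
            raise ValueError(f"not a version string: {text!r}")
        return cls(int(match.group(1)), int(match.group(2)), match.group(3))


# A v'...' prefix could do no more than forward the string to a callable
# like this one (hypothetical alias, ordinary function call syntax).
v = Version.parse

# The custom type, not the spelling of the call, provides the behaviour.
assert v("1.10a") == Version(major=1, minor=10, stage="a")
assert isinstance(v("1.10a"), Version)
```

Under these assumptions the parser is ordinary user code that runs at
call time, so the prefix form would save nothing except the two
parentheses.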
I don't think that v'1.10a' is clearer or more readable than
Version('1.10a'). It is *shorter*, but that's it.

The bottom line is, so long as this proposal is for nothing more than
mere syntactic sugar allowing you to drop the parentheses from certain
function calls (those that take a single string argument), the benefit
is tiny, and the added complexity and opportunity for abuse and
confusion are large.

> As a practical example, consider function `pandas.read_csv()`. The
> documentation for its `sep` parameter says "In addition, separators
> longer than 1 character and different from ``'\s+'`` will be
> interpreted as regular expressions ...". In this case they wanted the
> `sep` parameter to handle both simple separators, and the regular
> expression separators. However, as there is no syntax to create a
> "regular expression string", they ended up with this dubious heuristic
> based on the length of the string...

I can't help pandas' poor API, and I doubt that your proposal would have
prevented it either.

> Ideally, they should have said that `sep` could be either
> a string or a regexp-object, but the barrier to write
>
>     from re import compile as rx
>     rx('...')
>
> is just impossibly high for a typical user.

Think about what you are saying about the sophisticated data scientists
who are typical pandas users:

- they can write "import pandas"
- but not "import re" or "from re import compile as rx"
- they will be able to import your rx'...' string prefix from wherever
  it comes from (perhaps "from re import rx"?)
- and are capable of writing regular expressions using your custom
  rx'...' syntax
- but adding parentheses is beyond them: rx('...').

I cannot take this argument about sophisticated regex-users who are
defeated by function call syntax seriously.

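For what it's worth, an API that accepts "either a string or a
regexp-object" needs no new syntax at all. The sketch below uses a
hypothetical helper, split_fields(), purely for illustration; it is not
how pandas implements `sep`.

```python
import re


def split_fields(line, sep=","):
    """Split a line on a literal string or on a compiled regex.

    Hypothetical helper: the type of `sep`, not its length, decides
    how it is interpreted.
    """
    if isinstance(sep, re.Pattern):
        return sep.split(line)
    return line.split(sep)


# Plain string separator, taken literally even if longer than one character.
assert split_fields("a::b::c", "::") == ["a", "b", "c"]

# Compiled pattern, created with an ordinary function call.
assert split_fields("a  b   c", re.compile(r"\s+")) == ["a", "b", "c"]
```

The caller states the intent by passing either a str or the result of
re.compile(...), so no heuristic on the separator's length is needed.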
-- 
Steven
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/KOIHHFVDRWNMY3GSU6XE3GNF4SSQVOP6/
Code of Conduct: http://python.org/psf/codeofconduct/