[Python-ideas] Re: Custom string prefixes

2019-09-03 Thread Andrew Barnert via Python-ideas
On Sep 3, 2019, at 06:17, Rhodri James  wrote:
> 
>> On 03/09/2019 13:31, Chris Angelico wrote:
>>> On Tue, Sep 3, 2019 at 10:27 PM Rhodri James  wrote:
>>> 
 On 31/08/2019 12:31, Chris Angelico wrote:
 We call it a string, but a bytes object has as much in common with
 bytearray and with a list of integers as it does with a text string.
>>> 
>>> You say that as if text strings aren't sequences of bytes.  Complicated
>>> and restricted sequences, I grant you, but no more so than a packet for
>>> a given network protocol.
>>> 
>> A text string is a sequence of characters. By "byte", I really mean
>> "octet", but Python prefers to say "byte".
> 
> And a character is a byte or sequence of bytes. (Odd-sized bytes are pretty 
> much history now, so for non-pedantic usages "byte" is good enough.)

Forget about bytes vs. octets; this still isn’t a useful perspective.

A character is a grapheme cluster, a sequence of one or more code points. A 
code point is an integer between 0 and 0x10FFFF (roughly 1.1 million). A string is a flattened sequence 
of grapheme clusters—that is, a sequence of code points. (Python ignores the 
cluster part, pretending code points are characters, at the cost of requiring 
every application to handle normalization manually. Which is normally a good 
tradeoff, but it does mean that you can’t even say whether two sequences of 
code points are the same string without calling a function.)
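(A quick interpreter session makes the point; 'é' here is just one example of a character with more than one code-point spelling:)

>>> import unicodedata
>>> s1 = "\u00e9"        # 'é' as a single precomposed code point
>>> s2 = "e\u0301"       # 'e' followed by COMBINING ACUTE ACCENT
>>> s1 == s2             # the same character to a reader, different code points
False
>>> unicodedata.normalize("NFC", s1) == unicodedata.normalize("NFC", s2)
True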

Meanwhile, there are multiple ways to store those code points as bytes. Python 
does whatever it wants under the covers, hiding it from the user. Obviously 
there is _some_ array of bytes somewhere in memory that represents the 
characters of the string in some way (I say “obviously”, but that isn’t always 
true in Swift, and frequently isn’t true in Haskell…), but you don’t have 
access to that. If you want a sequence of bytes, you have to ask for a sequence 
in some specific representation, like UTF-8 or UTF-16-BE or Shift-JIS, which it 
creates for you on the fly (albeit cached in a few special cases).
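(For example — substituting Latin-1 for Shift-JIS, since 'é' has no Shift-JIS encoding — the same five code points give three different byte sequences depending on which representation you ask for:)

>>> s = "héllo"
>>> s.encode("utf-8")
b'h\xc3\xa9llo'
>>> s.encode("utf-16-be")
b'\x00h\x00\xe9\x00l\x00l\x00o'
>>> s.encode("latin-1")
b'h\xe9llo'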

So, from your system programmer’s perspective, in what useful sense is a 
character, or a string, a sequence of bytes?

And this is all still ignoring the fact that in Python, all values are “boxed” 
in an opaque structure that you can’t access from within the language, and even 
from the C API of CPython the box structure isn’t part of the API, so even 
something simpler like, say, an int isn’t usefully a sequence of 30-bit digits 
from the system programmer’s perspective, it’s an opaque handle that you can 
pass to functions to _obtain_ a sequence of 30-bit digits. (In the case of 
strings, you have to first pass the opaque handle to one function to see what 
format to ask for, then pass it to another to obtain a sequence of 1, 2, or 
4-byte integers representing the code points in native-endian ASCII, UCS2, or 
UCS4. Which normally you don’t do—you ask for a UTF-8 string or a UTF-32 string 
that may get constructed on the fly—but if you really do want the actual 
storage, this is the way to get it.)
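(You can get a rough view of those 1-, 2-, and 4-byte kinds from pure Python, at least on CPython, without touching the C API; a sketch only, since the exact byte counts are an implementation detail that varies by version:)

import sys

# The three sizes grow by roughly 1000, 2000 and 4000 bytes over the fixed
# object header, reflecting the 1-, 2- and 4-byte storage kinds.
print(sys.getsizeof("a" * 1000))            # all code points < 128
print(sys.getsizeof("\u0394" * 1000))       # BMP code points need 2 bytes each
print(sys.getsizeof("\U0001F600" * 1000))   # astral code points need 4 bytes each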

And most of this is not peculiar to Python. In Swift, a string is a sequence of 
grapheme clusters. In Java, it’s a sequence of UTF-16 code units. In Go, it’s a 
sequence of UTF-8 code units. In Haskell, it’s a lazy linked list of code 
points. And so on. In some of those cases, a character does happen to be 
represented as a string of bytes within a larger representation, but even when 
it is, that still doesn’t mean you can usefully access it that way.

Of course a text file on disk is a sequence of bytes, and (if you know the 
encoding and normalization) you could operate directly on those. But you don’t; 
you pass the byte strings to a function that decodes them (and then sometimes 
to a second function that normalizes them into a canonical form) and then use 
your language’s string functions on the result. In fact, you probably don’t 
even do that; you let the file object buffer the byte strings however it wants 
to and just hand you decoded text objects, so you don’t even know which byte 
substrings exist in memory at any given time. (Languages with powerful 
optimizers or macro systems like Haskell or Rust might actually do that by 
translating all your string-function calls into calls directly on the stream of 
bytes, but from your perspective that’s entirely under the covers, and you’re 
doing the same thing you do in Python.)
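(Concretely, the two styles look like this; "example.txt" is just a placeholder name:)

# Doing it by hand: read raw bytes, then decode them yourself.
with open("example.txt", "rb") as f:
    text = f.read().decode("utf-8")

# The usual way: let the file object buffer and decode; you never see the bytes.
with open("example.txt", encoding="utf-8") as f:
    text = f.read()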



[Python-ideas] Re: Custom string prefixes

2019-09-03 Thread Chris Angelico
On Wed, Sep 4, 2019 at 12:43 AM Rhodri James  wrote:
>
> On 03/09/2019 15:27, Chris Angelico wrote:
> > On Tue, Sep 3, 2019 at 11:19 PM Rhodri James  wrote:
> >>
> >> On 03/09/2019 13:31, Chris Angelico wrote:
> >>> On Tue, Sep 3, 2019 at 10:27 PM Rhodri James  wrote:
> 
>  On 31/08/2019 12:31, Chris Angelico wrote:
> > We call it a string, but a bytes object has as much in common with
> > bytearray and with a list of integers as it does with a text string.
> 
>  You say that as if text strings aren't sequences of bytes.  Complicated
>  and restricted sequences, I grant you, but no more so than a packet for
>  a given network protocol.
> 
> >>>
> >>> A text string is a sequence of characters. By "byte", I really mean
> >>> "octet", but Python prefers to say "byte".
> >>
> >> And a character is a byte or sequence of bytes. (Odd-sized bytes are
> >> pretty much history now, so for non-pedantic usages "byte" is good 
> >> enough.)
> >>
> >
> > But a character is not an octet.
>
> I get that you're distinguishing between the thing and its
> representation, but I'm coming at this as an embedded systems engineer.
> For me, it's turtles^Woctets all the way down.
>

Is an integer also a sequence of bytes? A float? A list? At some
level, everything's just stored as bytes in memory, but since there
are many possible representations of the same information, it's best
not to say that a character "is" a byte, but that it "can be stored
in" some number of bytes.

In Python, subscripting a text string gives you another text string.
Subscripting a list of integers gives you an integer. Subscripting a
bytearray gives you an integer. And (as of Python 3.0) subscripting a
bytestring also gives you an integer. Whether that's right or wrong
(maybe subscripting a bytestring should have been defined as yielding
a length-1 bytestring), subscripting a text string does not give an
integer, and subscripting a bytestring does not give a character.
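A quick session shows the asymmetry:

>>> "abc"[0]
'a'
>>> b"abc"[0]
97
>>> bytearray(b"abc")[0]
97
>>> [97, 98, 99][0]
97
>>> b"abc"[0:1]    # slicing, unlike indexing, does preserve the type
b'a'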

ChrisA


[Python-ideas] Re: Custom string prefixes

2019-09-03 Thread Rhodri James

On 03/09/2019 15:27, Chris Angelico wrote:

On Tue, Sep 3, 2019 at 11:19 PM Rhodri James  wrote:


On 03/09/2019 13:31, Chris Angelico wrote:

On Tue, Sep 3, 2019 at 10:27 PM Rhodri James  wrote:


On 31/08/2019 12:31, Chris Angelico wrote:

We call it a string, but a bytes object has as much in common with
bytearray and with a list of integers as it does with a text string.


You say that as if text strings aren't sequences of bytes.  Complicated
and restricted sequences, I grant you, but no more so than a packet for
a given network protocol.



A text string is a sequence of characters. By "byte", I really mean
"octet", but Python prefers to say "byte".


And a character is a byte or sequence of bytes. (Odd-sized bytes are
pretty much history now, so for non-pedantic usages "byte" is good enough.)



But a character is not an octet.


I get that you're distinguishing between the thing and its 
representation, but I'm coming at this as an embedded systems engineer. 
For me, it's turtles^Woctets all the way down.


--
Rhodri James *-* Kynesim Ltd


[Python-ideas] Re: Custom string prefixes

2019-09-03 Thread Chris Angelico
On Tue, Sep 3, 2019 at 11:19 PM Rhodri James  wrote:
>
> On 03/09/2019 13:31, Chris Angelico wrote:
> > On Tue, Sep 3, 2019 at 10:27 PM Rhodri James  wrote:
> >>
> >> On 31/08/2019 12:31, Chris Angelico wrote:
> >>> We call it a string, but a bytes object has as much in common with
> >>> bytearray and with a list of integers as it does with a text string.
> >>
> >> You say that as if text strings aren't sequences of bytes.  Complicated
> >> and restricted sequences, I grant you, but no more so than a packet for
> >> a given network protocol.
> >>
> >
> > A text string is a sequence of characters. By "byte", I really mean
> > "octet", but Python prefers to say "byte".
>
> And a character is a byte or sequence of bytes. (Odd-sized bytes are
> pretty much history now, so for non-pedantic usages "byte" is good enough.)
>

But a character is not an octet.

ChrisA


[Python-ideas] Re: Custom string prefixes

2019-09-03 Thread Rhodri James

On 03/09/2019 13:31, Chris Angelico wrote:

On Tue, Sep 3, 2019 at 10:27 PM Rhodri James  wrote:


On 31/08/2019 12:31, Chris Angelico wrote:

We call it a string, but a bytes object has as much in common with
bytearray and with a list of integers as it does with a text string.


You say that as if text strings aren't sequences of bytes.  Complicated
and restricted sequences, I grant you, but no more so than a packet for
a given network protocol.



A text string is a sequence of characters. By "byte", I really mean
"octet", but Python prefers to say "byte".


And a character is a byte or sequence of bytes. (Odd-sized bytes are 
pretty much history now, so for non-pedantic usages "byte" is good enough.)



--
Rhodri James *-* Kynesim Ltd


[Python-ideas] Re: Custom string prefixes

2019-09-03 Thread Chris Angelico
On Tue, Sep 3, 2019 at 10:27 PM Rhodri James  wrote:
>
> On 31/08/2019 12:31, Chris Angelico wrote:
> > We call it a string, but a bytes object has as much in common with
> > bytearray and with a list of integers as it does with a text string.
>
> You say that as if text strings aren't sequences of bytes.  Complicated
> and restricted sequences, I grant you, but no more so than a packet for
> a given network protocol.
>

A text string is a sequence of characters. By "byte", I really mean
"octet", but Python prefers to say "byte".

ChrisA


[Python-ideas] Re: Custom string prefixes

2019-09-03 Thread Rhodri James

On 31/08/2019 12:31, Chris Angelico wrote:

We call it a string, but a bytes object has as much in common with
bytearray and with a list of integers as it does with a text string.


You say that as if text strings aren't sequences of bytes.  Complicated 
and restricted sequences, I grant you, but no more so than a packet for 
a given network protocol.


--
Rhodri James *-* Kynesim Ltd


[Python-ideas] Re: Custom string prefixes

2019-09-02 Thread Chris Angelico
On Mon, Sep 2, 2019 at 9:56 PM Steven D'Aprano  wrote:
>
> On Sun, Sep 01, 2019 at 12:24:24PM +1000, Chris Angelico wrote:
>
> > Older versions of Python had text and bytes be the same things.
>
> Whether a string object is *text* is a semantic question, and
> independent of what data format you use. 'Hello world!' is text, whether
> you are using Python 1.5 or Python 3.8. '\x01\x06\x13\0' is not text,
> whether you are using Python 1.5 or Python 3.8.

Okay, so "string" and "text" are completely different concepts. Hold
that thought.

> > That
> > means that, for backward compatibility, they have some common methods.
> > But does that really mean that bytes can be uppercased?
>
> I'm curious what you think that b'chris angelico'.upper() is doing, if
> it is not uppercasing the byte-string b'chris angelico'. Is it a mere
> accident that the result happens to be b'CHRIS ANGELICO'?
>
> Unicode strings are sequences of code-points, abstract integers between
> 0 and 1114111 inclusive. When you uppercase the Unicode string 'chris
> angelico', you're transforming the sequence of integers:
>
> U+0063,0068,0072,0069,0073,0020,0061,006e,0067,0065,006c,0069,0063,006f
>
> to this sequence of integers:
>
> U+0043,0048,0052,0049,0053,0020,0041,004e,0047,0045,004c,0049,0043,004f
>
> If you are prepared to call that "uppercasing", you should be prepared
> to do the same for the byte-string equivalent.
>
> (For the avoidance of doubt: this is independent of the encoding used to
> store those code points in memory or on disk. Encodings have nothing to
> do with this.)

No, they're not decoded. What happens is an *assumption* that certain
bytes represent uppercaseable characters, and others do not. I
specifically chose my example such that the corresponding code points
both represented letters, and that the uppercased versions of each
land inside the first 256 Unicode codepoints; yet uppercasing the
bytestring changes one and not the other. Is it uppercasing the number
0x61 to create the number 0x41? No, it's assuming that it means "a"
and uppercasing it to "A".
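Putting the two operations from that example side by side:

>>> b"\xe7\x61".upper()                              # only the ASCII byte 0x61 changes
b'\xe7A'
>>> b"\xe7\x61".decode("latin-1").upper().encode("latin-1")
b'\xc7A'                                             # 0xe7 ('ç') is uppercased too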

> The formal definition of a string is a sequence of symbols from an
> alphabet. That is precisely what bytes objects are: the alphabet in this
> case is the 8-bit numbers 0 to 255 inclusive, which for usefulness,
> convenience and backwards compatibility can be optionally interpreted as
> the 7-bit ASCII character set plus another 128 abstract "characters".
>
>
> > > I said they were *strings*. Strings are not necessarily text, although
> > > they often are. Formally, a string is a finite sequence of symbols that
> > > are chosen from a set called an alphabet. See:
> > >
> > > https://en.wikipedia.org/wiki/String_%28computer_science%29
> >
> > A finite sequence of symbols... you mean like a list of integers
> > within the range [0, 255]? Nothing in that formal definition says that
> > a "string" of anything other than characters should be meaningfully
> > treated as text.
>
> Sure. If your bytes don't represent text, then methods like upper()
> probably won't do anything meaningful. It's still a string though.

I specifically said a *list* of integers. Like what you'd get if you
call list() on a bytestring. There's nothing in the formal definition
you gave that precludes this from being considered a string, yet it is
somehow, by your own words, fundamentally different.

> > > > I don't think it's necessary to be too adamant about "must be some
> > > > sort of thing-we-call-string" here. Let practicality rule, since
> > > > purity has already waved a white flag at us.
> > >
> > > It is because of *practicality* that we should prefer that things that
> > > look similar should be similar. Code is read far more often that it is
> > > written, and if you read two pieces of code that look similar, we should
> > > strongly prefer that they should actually be similar.
> >
> > And you have yet to prove that this similarity is actually a thing.
>
> I'm not sure the onus is on me to prove this. "Status quo wins a
> stalemate." And surely the onus is on those proposing the new syntax to
> demonstrate that it will be fine to use string delimiters as function
> calls.

Actually it is, because YOU are the one who said that quoted strings
should be restricted to "string-like" things. Would a Path literal be
sufficiently string-like to be blessed with double quotes? A regex
literal? An IP header, represented as a bytestring? What's a string
and what's not? Why are you trying to draw a line?

> You could make a good start by finding other languages, reasonably
> conventional languages with syntax based on the Algol or C tradition,
> that use quotes '' or "" to return arbitrary types.

I gave an example wherein a list/array is represented as
";foo;bar;quux" - does that count? (VX-REXX, if you're curious.)

> Anyway, the bottom line is this:
>
> I have no objection to using prefixed quotes to represent Unicode
> strings, or byte strings, or Andrew's hypothetical UTF-16 strings, or
> 

[Python-ideas] Re: Custom string prefixes

2019-09-02 Thread Steven D'Aprano
On Sun, Sep 01, 2019 at 12:24:24PM +1000, Chris Angelico wrote:

> Older versions of Python had text and bytes be the same things.

Whether a string object is *text* is a semantic question, and 
independent of what data format you use. 'Hello world!' is text, whether 
you are using Python 1.5 or Python 3.8. '\x01\x06\x13\0' is not text, 
whether you are using Python 1.5 or Python 3.8.


> That
> means that, for backward compatibility, they have some common methods.
> But does that really mean that bytes can be uppercased?

I'm curious what you think that b'chris angelico'.upper() is doing, if 
it is not uppercasing the byte-string b'chris angelico'. Is it a mere 
accident that the result happens to be b'CHRIS ANGELICO'?

Unicode strings are sequences of code-points, abstract integers between 
0 and 1114111 inclusive. When you uppercase the Unicode string 'chris 
angelico', you're transforming the sequence of integers:

U+0063,0068,0072,0069,0073,0020,0061,006e,0067,0065,006c,0069,0063,006f

to this sequence of integers:

U+0043,0048,0052,0049,0053,0020,0041,004e,0047,0045,004c,0049,0043,004f

If you are prepared to call that "uppercasing", you should be prepared 
to do the same for the byte-string equivalent.

(For the avoidance of doubt: this is independent of the encoding used to 
store those code points in memory or on disk. Encodings have nothing to 
do with this.)
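(The same transformation, spelled out in an interpreter; shortened to 'chris' to keep the lines readable:)

>>> [f"U+{ord(c):04x}" for c in "chris"]
['U+0063', 'U+0068', 'U+0072', 'U+0069', 'U+0073']
>>> [f"U+{ord(c):04x}" for c in "chris".upper()]
['U+0043', 'U+0048', 'U+0052', 'U+0049', 'U+0053']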


[...]
> Or is it that
> we allow bytes to be treated as ASCII-encoded text, which is then
> uppercased, and then returned to being bytes?

I'm fairly confident that bytes methods aren't implemented by decoding 
to Unicode, applying the method, then re-encoding back to bytes. But 
even if they were, that's just an implementation detail.

Imagine a float method that internally converted the float to a pair of 
integers (numerator/denominator), operated on that fraction, and then 
re-converted back to a float. I'm sure you wouldn't want to say that 
this proves that floats aren't numbers.

The same applies to byte-strings. In the unlikely case that byte methods 
delegate to str methods, that doesn't mean byte-strings aren't strings. 
It just means that two sorts of strings can share a single 
implementation for their methods. Code reuse for the win!


[...]
> > py> b"\xe7\x61".upper()
> > b'\xe7A'
> >
> > Whether it is *meaningful* to do so is another question. But the same
> > applies to str.upper: just because you can call the method doesn't mean
> > that the result will be semantically valid.
> 
> So what did you actually do here? You took some bytes that represent
> an integer, 

For the sake of the argument I'll accept that *this particular* byte 
string represents an integer rather than a series of mixed binary data 
and ASCII text, or text in some unusual encoding, or pixels in an image, 
or any of a million other things it could represent.

That's absolutely fine: if it doesn't make sense to call .upper() on 
your bytes, then don't call .upper() on them. Precisely as you wouldn't 
call .upper() on a str object, if it didn't make sense to do so.


> and you called a method on it that makes no sense
> whatsoever, because now it represents a different integer.

The same applies to Unicode strings too. Any Unicode string method that 
transforms the input returns something that represents a different 
sequence of code-points, hence a different sequence of integers.

Shall we agree that neither bytes nor Unicode are strings? No, I don't 
think so either :-)


> If I were to decode that string to text
> and THEN uppercase it, it might give a quite different result:

Sure. If you perform *any* transformation on the data first, it might 
give a different result on uppercasing:

- if you reverse the bytes, uppercasing gives a different result;

- if you replace b'a' with b'e', uppercasing gives a different result

etc. And exactly the same observation applies to str objects:

- if you reverse the characters, uppercasing gives a different result;

- if you replace 'a' with 'e', uppercasing gives a different result.


> And if you choose some other encoding than Latin-1, you might get
> different results again.

Sure. The bytes methods like .upper() etc are predicated on the 
assumption that your bytes represent ASCII text. If your bytes represent 
something else, then calling the .upper() method may not be meaningful 
or useful.

In other words... if your bytes string came from an ASCII text file, 
it's probably safe to uppercase it. If your bytes string came from a 
JPEG, then uppercasing them will probably make a mess of the image, if 
not corrupt the file. So don't do that :-)

Analogy: ints support the unary minus operator. But if your int 
represents a mass, then negating it isn't meaningful. There's no such 
thing as -5 kg. Should we conclude from this that the int type in Python 
doesn't represent a number, and that the support of numeric operators 
and methods is merely for backwards compatibility? I think not.

The formal definition of 

[Python-ideas] Re: Custom string prefixes

2019-09-02 Thread Ivan Levkivskyi
On Mon, 2 Sep 2019 at 07:04, Pasha Stetsenko  wrote:

> > Don't say that this proposal won't be abused. Every one of the OP's
> > motivating examples is an abuse of the syntax, returning non-strings
> > from something that looks like a string.
>
> If you strongly believe that if something looks like a string it ought to
> quack like a string too, then we can consider 2 potential remedies:
>
> 1. Change the delimiter, for example use curly braces: `re{abc}`. This
> would still be parseable, since currently an id cannot be followed by a set
> or a dict. (The forward-slash, on the other hand, will be ambiguous).
>
> 2. We can also leave the quotation marks as delimiters. Once this feature
> is implemented, the IDEs will update their parsers, and will be emitting a
> token of "user-defined literal" type. Simply setting the color for this
> token to something different than your preferred color for strings will
> make it visually clear that those tokens aren't strings. Hence, no
> possibility for confusion.
>

Just to add my 2 cents: there are always two sides in each language
proposal: more flexibility/usability, and more language complexity.
These need to be compared and the comparison is hard because it is often
subjective. FWIW, I think in this case the added complexity outweighs
the benefits. I think only the very widely used literals (like numbers)
deserve their own syntax. For everything else it is fine to have a few extra
keystrokes.

--
Ivan


[Python-ideas] Re: Custom string prefixes

2019-09-02 Thread Pasha Stetsenko
> Don't say that this proposal won't be abused. Every one of the OP's 
> motivating examples is an abuse of the syntax, returning non-strings 
> from something that looks like a string.

If you strongly believe that if something looks like a string it ought to quack 
like a string too, then we can consider 2 potential remedies:

1. Change the delimiter, for example use curly braces: `re{abc}`. This would 
still be parseable, since currently an id cannot be followed by a set or a 
dict. (The forward-slash, on the other hand, will be ambiguous).

2. We can also leave the quotation marks as delimiters. Once this feature is 
implemented, the IDEs will update their parsers, and will be emitting a token 
of "user-defined literal" type. Simply setting the color for this token to 
something different than your preferred color for strings will make it visually 
clear that those tokens aren't strings. Hence, no possibility for confusion.


[Python-ideas] Re: Custom string prefixes

2019-08-31 Thread Chris Angelico
On Sun, Sep 1, 2019 at 10:47 AM Steven D'Aprano  wrote:
>
> On Sat, Aug 31, 2019 at 09:31:15PM +1000, Chris Angelico wrote:
> > On Sat, Aug 31, 2019 at 8:44 PM Steven D'Aprano  wrote:
> > > > So b"abc" should not be allowed?
> > >
> > > In what way are byte-STRINGS not strings? Unicode-strings and
> > > byte-strings share a significant fraction of their APIs, and are so
> > > similar that back in Python 2.2 the devs thought it was a good idea to
> > > try automagically coercing from one to the other.
> > >
> > > I was careful to write *string* rather than *str*. Sorry if that wasn't
> > > clear enough.
> > >
> >
> > We call it a string, but a bytes object has as much in common with
> > bytearray and with a list of integers as it does with a text string.
>
> I don't think that's true.

Older versions of Python had text and bytes be the same things. That
means that, for backward compatibility, they have some common methods.
But does that really mean that bytes can be uppercased? Or is it that
we allow bytes to be treated as ASCII-encoded text, which is then
uppercased, and then returned to being bytes?

> py> b'abc'.upper()
> b'ABC'
>
> py> [1, 2, 3].upper()
> Traceback (most recent call last):
>   File "", line 1, in 
> AttributeError: 'list' object has no attribute 'upper'
>
> In Python2, byte-strings and Unicode strings were both subclasses of
> type basestring. Although we have moved away from that shared base class
> in Python3, it does demonstrate that conceptually bytes and str are
> closely related to each other.

Or does it actually demonstrate that Python 3 maintains backward
compatibility with Python 2?

> > Is the contents of a MIDI file a "string"? I would say no, it's not -
> > but it can *contain* strings, eg for metadata and lyrics.
> > You can't upper-case the
> > variable-length-integer b"\xe7\x61" any more than you can upper-case
> > the integer 13281.
>
> Of course you can.
>
> py> b"\xe7\x61".upper()
> b'\xe7A'
>
> Whether it is *meaningful* to do so is another question. But the same
> applies to str.upper: just because you can call the method doesn't mean
> that the result will be semantically valid.

So what did you actually do here? You took some bytes that represent
an integer, and you called a method on it that makes no sense
whatsoever, because now it represents a different integer. There's no
sense in which your new bytes object represents an "upper-cased
version of" the integer 13281. If I were to decode that string to text
and THEN uppercase it, it might give a quite different result:

>>> b"\xe7\x61".decode("Latin-1").upper().encode("Latin-1")
b'\xc7A'

And if you choose some other encoding than Latin-1, you might get
different results again. I put it to you that bytes.upper() exists
more for backward compatibility with Python 2 than because a bytes
object is, in some way, uppercaseable.

> source = "def spam():\n\tpass\n"
> source = source.upper()  # no longer valid Python source code.

But it started out as text, and it is now uppercase text. When you do
that with bytes, you have to first layer in "this is actually encoded
text", and you are then able to destroy that.

> > Bytes and text have a long relationship, and as such, there are
> > special similarities. That doesn't mean that bytes ARE text,
>
> I didn't say that bytes are (human-readable) text. Although they can be:
> not every application needs Unicode strings, ASCII strings are still
> special, and there are still applications where one has to mix binary
> and ASCII text data.
>
> I said they were *strings*. Strings are not necessarily text, although
> they often are. Formally, a string is a finite sequence of symbols that
> are chosen from a set called an alphabet. See:
>
> https://en.wikipedia.org/wiki/String_%28computer_science%29

A finite sequence of symbols... you mean like a list of integers
within the range [0, 255]? Nothing in that formal definition says that
a "string" of anything other than characters should be meaningfully
treated as text.

> > I don't think it's necessary to be too adamant about "must be some
> > sort of thing-we-call-string" here. Let practicality rule, since
> > purity has already waved a white flag at us.
>
> It is because of *practicality* that we should prefer that things that
> look similar should be similar. Code is read far more often than it is
> written, and if you read two pieces of code that look similar, we should
> strongly prefer that they should actually be similar.

And you have yet to prove that this similarity is actually a thing.

> Would you be happy with a Pythonesque language that used prefixed
> strings as the delimiter for arbitrary data types?
>
> mylist = L"1, 2, None, {}, L"", 99.5"
>
> mydict = D"key: value, None: L"", "abc": "xyz""
>
> myset = S"1, 2, None"

At some point it's meaningless to call it a "Pythonesque" language,
but I've worked with plenty of languages that simply do not have data
types this rich, and so everything is 

[Python-ideas] Re: Custom string prefixes

2019-08-31 Thread Steven D'Aprano
On Thu, Aug 29, 2019 at 11:19:58PM +0100, Rob Cliffe via Python-ideas wrote:

> Just curious:  Is there any reason not to make decimal.Decimal a 
> built-in type?

Yes: it is big and complex, with a big complex API that is overkill for 
the major motivating use-case for a built-in decimal type.

There might be a strong case for adding a fixed-precision decimal type, 
and leaving out the complex parts of the Decimal API: no variable 
precision, just a single rounding mode, no contexts, no traps. If you 
need the full API, use the decimal module; if you just need something 
like builtin floats, but in base 10, use the built-in decimal.
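(To be clear, you can approximate that "simple" decimal today by pinning one context; a sketch using the existing decimal module, not a proposal for the new type's API:)

from decimal import Context, Decimal, ROUND_HALF_EVEN

ctx = Context(prec=16, rounding=ROUND_HALF_EVEN)   # one precision, one rounding mode
print(ctx.add(Decimal("0.1"), Decimal("0.2")))     # 0.3 -- exact in base 10
print(0.1 + 0.2)                                   # 0.30000000000000004 with binary floats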

There have been at least two proposals. Neither has got as far as a 
PEP. If I recall correctly, the first suggested using Decimal64:

https://en.wikipedia.org/wiki/Decimal64_floating-point_format

the second suggested Decimal128:

https://en.wikipedia.org/wiki/Decimal128_floating-point_format


-- 
Steven


[Python-ideas] Re: Custom string prefixes

2019-08-31 Thread Steven D'Aprano
On Sat, Aug 31, 2019 at 09:31:15PM +1000, Chris Angelico wrote:
> On Sat, Aug 31, 2019 at 8:44 PM Steven D'Aprano  wrote:
> > > So b"abc" should not be allowed?
> >
> > In what way are byte-STRINGS not strings? Unicode-strings and
> > byte-strings share a significant fraction of their APIs, and are so
> > similar that back in Python 2.2 the devs thought it was a good idea to
> > try automagically coercing from one to the other.
> >
> > I was careful to write *string* rather than *str*. Sorry if that wasn't
> > clear enough.
> >
> 
> We call it a string, but a bytes object has as much in common with
> bytearray and with a list of integers as it does with a text string.

I don't think that's true.

py> b'abc'.upper()
b'ABC'

py> [1, 2, 3].upper()
Traceback (most recent call last):
  File "", line 1, in 
AttributeError: 'list' object has no attribute 'upper'

Shall I beat this dead horse some more by listing the other 33 methods 
that byte-strings share with Unicode-strings but not lists?

Compared to just two methods shared by all three of bytes, str and list, 
(namely count() and index()), and *zero* methods shared by bytes and 
list but not str.
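You can check the overlap yourself (the exact size of the first set depends on the Python version, which is why I'm not quoting its output):

py> methods = lambda t: {m for m in dir(t) if not m.startswith('_')}
py> sorted(methods(bytes) & methods(str) & methods(list))
['count', 'index']
py> methods(bytes) & (methods(list) - methods(str))
set()
py> len(methods(bytes) & methods(str) - methods(list))   # a few dozen, depending on version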

In Python2, byte-strings and Unicode strings were both subclasses of 
type basestring. Although we have moved away from that shared base class 
in Python3, it does demonstrate that conceptually bytes and str are 
closely related to each other.


> Is the contents of a MIDI file a "string"? I would say no, it's not -
> but it can *contain* strings, eg for metadata and lyrics.

Don't confuse *human-readable native language strings* with generic 
strings. "Hello world!" is a string, but so are '\x02^xs\0' and 
b'DEADBEEF'.


> You can't upper-case the
> variable-length-integer b"\xe7\x61" any more than you can upper-case
> the integer 13281.

Of course you can.

py> b"\xe7\x61".upper()
b'\xe7A'

Whether it is *meaningful* to do so is another question. But the same 
applies to str.upper: just because you can call the method doesn't mean 
that the result will be semantically valid.

source = "def spam():\n\tpass\n"
source = source.upper()  # no longer valid Python source code.


> Those common methods are mostly built on the
> assumption that the string contains ASCII text.

As they often do. If they don't, then don't call the text methods which 
don't make sense in context.

Just as there are cases where text methods don't make sense on Unicode 
strings. You wouldn't want to call .casefold() on a password, or 
.lstrip() on a line of Python source code.


[...]
> Bytes and text have a long relationship, and as such, there are
> special similarities. That doesn't mean that bytes ARE text, 

I didn't say that bytes are (human-readable) text. Although they can be: 
not every application needs Unicode strings, ASCII strings are still 
special, and there are still applications where one has to mix binary 
and ASCII text data.

I said they were *strings*. Strings are not necessarily text, although 
they often are. Formally, a string is a finite sequence of symbols that 
are chosen from a set called an alphabet. See:

https://en.wikipedia.org/wiki/String_%28computer_science%29



> I don't think it's necessary to be too adamant about "must be some
> sort of thing-we-call-string" here. Let practicality rule, since
> purity has already waved a white flag at us.

It is because of *practicality* that we should prefer that things that 
look similar should be similar. Code is read far more often than it is 
written, and if you read two pieces of code that look similar, we should 
strongly prefer that they should actually be similar.

Would you be happy with a Pythonesque language that used prefixed 
strings as the delimiter for arbitrary data types?

mylist = L"1, 2, None, {}, L"", 99.5"

mydict = D"key: value, None: L"", "abc": "xyz""

myset = S"1, 2, None"


That's what this proposal wants: string syntax that can return arbitrary 
data types.

How about using quotes for function calls?

assert chr"9" == "\t"

assert ord"9" == 57

That's what this proposal wants: string syntax for a subset of function 
calls.

Don't say that this proposal won't be abused. Every one of the OP's 
motivating examples is an abuse of the syntax, returning non-strings 
from something that looks like a string.



-- 
Steven


[Python-ideas] Re: Custom string prefixes

2019-08-31 Thread Chris Angelico
On Sat, Aug 31, 2019 at 8:44 PM Steven D'Aprano  wrote:
> > So b"abc" should not be allowed?
>
> In what way are byte-STRINGS not strings? Unicode-strings and
> byte-strings share a significant fraction of their APIs, and are so
> similar that back in Python 2.2 the devs thought it was a good idea to
> try automagically coercing from one to the other.
>
> I was careful to write *string* rather than *str*. Sorry if that wasn't
> clear enough.
>

We call it a string, but a bytes object has as much in common with
bytearray and with a list of integers as it does with a text string.
Is the contents of a MIDI file a "string"? I would say no, it's not -
but it can *contain* strings, eg for metadata and lyrics. The MIDI
file representation of an integer might be stored in a byte-string,
but the common API between text strings and byte strings is going to
be mostly irrelevant here. You can't upper-case the
variable-length-integer b"\xe7\x61" any more than you can upper-case
the integer 13281. Those common methods are mostly built on the
assumption that the string contains ASCII text.
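(For the record, that's a MIDI variable-length quantity: seven value bits per byte, with the high bit set on every byte except the last. A few lines decode it, and none of the str-ish methods are any help here:)

>>> value = 0
>>> for byte in b"\xe7\x61":
...     value = (value << 7) | (byte & 0x7F)
...
>>> value
13281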

There are a few string-like functions that truly can be used with
completely binary data, and which actually do make a lot more sense on
a byte string than on, say, a list of integers. Notably, finding a
particular byte sequence can be done without knowing what the bytes
actually mean (and similarly bytes.split(), which does the same sort
of search), and you can strip off trailing b"\0" without needing to
give much meaning to the content. But I cannot recollect *ever* using
these methods on any bytes object that wasn't storing some form of
encoded text.

Bytes and text have a long relationship, and as such, there are
special similarities. That doesn't mean that bytes ARE text, any more
than a compiled regex is text just because it's traditional to
describe a regex in a textual form. Path objects also blur the "is
this text?" line, since you can divide a Path by a string to
concatenate them, and there are ways of smuggling arbitrary bytes
through them.

I don't think it's necessary to be too adamant about "must be some
sort of thing-we-call-string" here. Let practicality rule, since
purity has already waved a white flag at us.

ChrisA


[Python-ideas] Re: Custom string prefixes

2019-08-31 Thread Steven D'Aprano
On Thu, Aug 29, 2019 at 02:10:21PM -0700, Andrew Barnert wrote:

[...]
> And most of the string affixes people have suggested are for 
> string-ish things.

I don't think that's correct. Looking back at the original post in this 
thread, here are the motivating examples:

[quote]

There are quite a few situations where this can be used:
- Fraction literals: `frac'123/4567'`
- Decimals: `dec'5.34'`
- Date/time constants: `t'2019-08-26'`
- SQL expressions: `sql'SELECT * FROM tbl WHERE a=?'.bind(a=...)`
- Regular expressions: `rx'[a-zA-Z]+'`
- Version strings: `v'1.13.0a'`
- etc.

[/quote]

By my count, that's zero out of six string-ish things. There may have 
been other proposals, but I haven't trawled through the entire thread to 
find them.
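For comparison, the spellings we have today for those examples are ordinary constructor calls, none of which returns anything string-ish (the SQL and version cases have no single stdlib equivalent, so I've left them out):

from fractions import Fraction
from decimal import Decimal
import datetime
import re

Fraction("123/4567")                        # frac'123/4567'
Decimal("5.34")                             # dec'5.34'
datetime.date.fromisoformat("2019-08-26")   # t'2019-08-26'
re.compile("[a-zA-Z]+")                     # rx'[a-zA-Z]+'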


> I’m not sure what a “version string” is, but I 
> might design that as an actual subclass of str that adds extractor 
> methods and overrides comparison.

A version object is a record with fields, most of which are numeric. 
For an existing example, see sys.version_info which is a kind of named 
tuple, not a string.
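For example (the exact numbers obviously depend on which interpreter you run this on; this is a 3.8 session):

py> import sys
py> sys.version_info.major, sys.version_info.minor   # numeric fields, not text
(3, 8)
py> sys.version_info >= (3, 6)                       # compares like a tuple of numbers
True
py> ".".join(map(str, sys.version_info[:3]))         # the string form is derived from it
'3.8.0'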

The version *string* is just a nice human-readable representation. It 
doesn't make sense to implement string methods on a Version object. Why 
would you offer expandtabs(), find(), splitlines(), translate(), 
isspace(), capitalise(), etc methods? Or * and + (repetition and 
concatenation) operators? I cannot think of a single string 
method/operator that a Version object should implement.


> A compiled regex isn’t literally a 
> string, but neither is a bytes; it’s still clearly _similar_ to a 
> string, in important ways. 

It isn't clear to me how a compiled regex object is "similar" to a 
string. The set of methods offered by both regexes and strings is pretty 
small; by my generous count it is just two methods:

- str.split and SRE_Pattern.split;

- str.replace and SRE_Pattern.sub

neither of which use the same API or have the same semantics. Compiled 
regex objects don't offer string methods like translate, isdigits, 
upper, encode, etc. I would say that they are clearly *not* strings.
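Compare their split methods, which don't even agree on what the argument means:

py> "a1b22c".split("2")
['a1b', '', 'c']
py> import re
py> re.compile(r"\d+").split("a1b22c")
['a', 'b', 'c']
py> re.compile(r"\d").sub("#", "a1b22c")    # and sub vs replace
'a#b##c'
py> "a1b22c".replace("2", "#")
'a1b##c'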


[...]
> And versions of the proposal that allow delimiters other than quotes 
> so you can write things like regex/a.*b/, well, I’d need to see a 
> specific proposal to be sure, but that seems even less objectionable 
> in this regard. That looks like nothing else in Python, but it looks 
> like a regex in awk or sed or perl, so I’d probably read it as a regex 
> object.

Why do you need the "regex" prefix? Assuming the parser and the human 
reader can cope with using / as both a delimiter and an operator (which 
isn't a given!) /.../ for a regex object seems fine to me.

I suspect that this is going to be ambiguous though:

target = regex/a*b/ +x

could be:

target = ((regex / a) * b) / ( unary-plus x)

or 

target = (regex object) + x

so maybe we do need a prefix.


> > Let me suggest some design principles that should hold for languages 
> > with more-or-less "conventional" syntax. Languages like APL or Forth 
> > excluded.
> > 
> > - anything using ' or " quotation marks as delimiters (with or without 
> >  affixes) ought to return a string, and nothing but a string;
> 
> So b"abc" should not be allowed?

In what way are byte-STRINGS not strings? Unicode-strings and 
byte-strings share a significant fraction of their APIs, and are so 
similar that back in Python 2.2 the devs thought it was a good idea to 
try automagically coercing from one to the other.

I was careful to write *string* rather than *str*. Sorry if that wasn't 
clear enough.


> Let’s say I created a native-UTF16-string type to deal with some 
> horrible Windows or Java stuff. Why would this principle of yours 
> suggest that I shouldn’t be allowed to use u16"" just like b""?

It is a utf16 STRING so making it look like a STRING is perfectly fine.


[...]
> > - as a strong preference, anything using quotation marks as delimiters
> >  ought to be processed at compile-time (f-strings are a conspicuous 
> >  exception to that principle);
> 
> I don’t see why you should even want to _know_ whether it’s true, much 
> less have a strong preference.

Because I care about performance, at least a bit. Because I don't want 
to write code that is unnecessarily slow, for some definition of 
"unnecessary". Because I want to be able to reason (at least in broad 
terms) about the cost of certain operations.

Because I want to be able to reason about the semantics of my code.

Why do I write 1234 instead of int("1234")? The second is longer, but it 
is more explicit and it is self-documenting: the reader knows that its 
an int because it says so right there in the code, even if they come 
from Javascript where 1234 is an IEEE-754 float.

Assuming the builtin int() hasn't be shadowed.

But it's also wastefully slow.

If we are genuinely indifferent to the difference, then we should be 
equally indifferent to a proposal to replace the LOAD_CONST byte-code 
for ints as follows:

dis("1234")  # 

[Python-ideas] Re: Custom string prefixes

2019-08-30 Thread Paul Moore
On Thu, 29 Aug 2019 at 22:12, Andrew Barnert via Python-ideas
 wrote:
> As I’ve said before, I believe that anything that doesn’t have a builtin type 
> does not deserve builtin syntax. And I don’t understand why that isn’t a 
> near-ubiquitous viewpoint. But it’s not just you; at least three people (all 
> of whom dislike the whole concept of custom affixes) seem at least in 
> principle open to the idea of adding builtin affixes for types that don’t 
> exist. Which makes me think it’s almost certainly not that you’re all crazy, 
> but that I’m missing something important. Can you explain it to me?

In my case, it's me that had missed something - namely the whole of this point.

I can imagine having builtin syntax for a stdlib type (like Decimal,
Fraction, or regex), but I agree that it gives the stdlib special
privileges which I'm uncomfortable with. I definitely agree that built-in
syntax for 3rd-party types is unacceptable.

That quite probably contradicts some of my earlier statements - just
assume I was wrong previously, I'm not going to bother going back over
what I said and correcting my comments :-) I remain of the opinion
that the benefits of user-defined literals would be sufficiently
marginal that they wouldn't justify the cost, though.

Paul


[Python-ideas] Re: Custom string prefixes

2019-08-29 Thread Andrew Barnert via Python-ideas
On Aug 29, 2019, at 16:58, Greg Ewing  wrote:
> 
> Steven D'Aprano wrote:
>> I don't think that stpa...@gmail.com means that the user literally assigns 
>> to locals() themselves. I read his proposal as having the compiler 
>> automatically mangle the names in some way, similar to name mangling inside 
>> classes.
> 
> Yes, but at some point you have to define a function to handle
> your string prefix. If it's at the module level then it's no
> problem, because you can do something like
> 
>   globals()["~f"] = lambda: ...

What happens if you do this, and then include "~f" in __all__, and then import 
* from that module?

I personally would rather have my prefixes or suffixes available in every 
module that imports them, without needing to manually register them each time. 
Not a huge deal, and if nobody else agrees, fine. But if I could __all__ it, I 
could get what I want anyway. :)



[Python-ideas] Re: Custom string prefixes

2019-08-29 Thread Greg Ewing

Steven D'Aprano wrote:
I don't think that stpa...@gmail.com means that the user literally 
assigns to locals() themselves. I read his proposal as having the 
compiler automatically mangle the names in some way, similar to name 
mangling inside classes.


Yes, but at some point you have to define a function to handle
your string prefix. If it's at the module level then it's no
problem, because you can do something like

   globals()["~f"] = lambda: ...

But you can't do that for locals. So mangling to something
unspellable would effectively preclude having string prefixes
local to a function.

--
Greg


[Python-ideas] Re: Custom string prefixes

2019-08-29 Thread Greg Ewing

Rhodri James wrote:
Suppose that we did have some funky mechanism to get the compiler to 
create objects at compile time


It doesn't necessarily have to be at compile time. It can be at run
time, as long as it only happens once.

So we use "start_date" somewhere, and mutate it because the start date 
for some purpose was different.  Then we use it somewhere else, and it's 
not the start date we thought it was.  This is essentially the mutable 
default argument gotcha, just writ globally.


I don't think this is as much of a problem as it seems. We often
assign things to globals that are intended to be treated as constants,
with the understanding that it's our responsibility to refrain from
mutating them.

--
Greg


[Python-ideas] Re: Custom string prefixes

2019-08-29 Thread Eric V. Smith




One way to handle this particular case would be to do it as a variant
of f-string that doesn't join its arguments, but passes the list to
some other function. Just replace the final step BUILD_STRING step
with BUILD_LIST, then call the function. There'd need to be some way
to recognize which sections were in the literal and which came from
interpolations (one option is to simply include empty strings where
necessary such that it always starts with a literal and then
alternates), but otherwise, the "sql" manager could do all the
escaping it wants. However, this wouldn't be enough to truly
parameterize a query; it would only do escaping into the string
itself.

Another option would be to have a single variant of f-string that,
instead of creating a string, creates a "string with formatted
values". That would then be a single object that can be passed around
as normal, and if conn.execute() received such a string, it could do
the proper parameterization.
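(For what it's worth, the standard library can already separate the literal sections from the interpolation fields at run time; a rough sketch of that half of the idea using string.Formatter, not of any proposed syntax or of PEP 501's actual machinery:)

import string

def split_template(template):
    """Return (literal_text, field_name) pairs for a {}-style template, so a
    consumer such as a SQL driver could treat the two kinds of section differently."""
    return [(literal, field)
            for literal, field, _spec, _conv in string.Formatter().parse(template)]

print(split_template("SELECT * FROM tbl WHERE a={a} AND b={b}"))
# [('SELECT * FROM tbl WHERE a=', 'a'), (' AND b=', 'b')]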


See PEP 501: https://www.python.org/dev/peps/pep-0501/

Eric


[Python-ideas] Re: Custom string prefixes

2019-08-29 Thread Rob Cliffe via Python-ideas

On 29/08/2019 22:10:21, Andrew Barnert via Python-ideas wrote:


As I’ve said before, I believe that anything that doesn’t have a builtin type 
does not deserve builtin syntax. And I don’t understand why that isn’t a 
near-ubiquitous viewpoint.

+1 (maybe that means I'm missing something).
Just curious:  Is there any reason not to make decimal.Decimal a 
built-in type?  It's tried and tested.  There are situations where 
floats are appropriate, and others where Decimals are appropriate (I'm 
currently using it myself); conceptually I see them as on an equal 
footing.  If it were built-in, there would be good reason to accept 
1.23d meaning a Decimal literal (distinct from a float literal), whether 
or not (any part of) the OP's proposal was adopted.

Rob Cliffe

  But it’s not just you; at least three people (all of whom dislike the whole 
concept of custom affixes) seem at least in principle open to the idea of 
adding builtin affixes for types that don’t exist. Which makes me think it’s 
almost certainly not that you’re all crazy, but that I’m missing something 
important. Can you explain it to me?


[Python-ideas] Re: Custom string prefixes

2019-08-29 Thread Andrew Barnert via Python-ideas
On Aug 29, 2019, at 07:52, Steven D'Aprano  wrote:
> 
>> On Thu, Aug 29, 2019 at 05:30:39AM -0700, Andrew Barnert wrote:
>>> On Aug 29, 2019, at 04:58, Steven D'Aprano  wrote:
>>> 
>>> - quote marks are also used for function calls, but only a limited 
>>> subset of function calls (those which take a single string literal 
>>> argument).
>> 
>> This is a disingenuous argument.
>> 
>> When you read spam.eggs, of course you know that that means to call 
>> the __getattr__('eggs') method on spam. But do you actually read it as 
>> a special method calling syntax that’s restricted to taking a single 
>> string that must be an identifier as an argument
> 
> You make a good point about abstractions, but you are missing the 
> critical point that spam.eggs *doesn't look like a string*. Things that 
> look similar should be similar; things which are different should not 
> look similar.

Which is exactly why you’d read 1.23dec or 1.23f as a number, because it looks 
like a number and also acts like a number, rather than as a function call that 
takes the string '1.23', even if you know that’s how it’s implemented.

And most of the string affixes people have suggested are for string-ish things. 
I’m not sure what a “version string” is, but I might design that as an actual 
subclass of str that adds extractor methods and overrides comparison. A 
compiled regex isn’t literally a string, but neither is a bytes; it’s still 
clearly _similar_ to a string, in important ways. And so is a path, or a URL 
(although I don’t know what you’d use the url prefix for in Python, given that 
we don’t have a string-ish type like ObjC’s NSURL to return and I don’t think 
we need one, but presumably whoever wrote the url affix would be someone who 
disagreed and packaged the prefix with such a class).

And versions of the proposal that allow delimiters other than quotes so you can 
write things like regex/a.*b/, well, I’d need to see a specific proposal to be 
sure, but that seems even less objectionable in this regard. That looks like 
nothing else in Python, but it looks like a regex in awk or sed or perl, so I’d 
probably read it as a regex object.

> I acknowledge your point (and the OP's) that many things in Python are 
> ultimately implemented as function calls. But none of those things look 
> like strings:
> 
> - The argument to the import statement looks like an identifier 
>  (since it is an identifier, not an arbitrary string);
> 
> - The argument to __getattr__ etc looks like an identifier
>  (since it is an identifier, not an arbitrary string);
> 
> - The argument to __getitem__ is an arbitrary expression, not just
>  a string.

The arguments to the dec and f affix handlers look like numeric literals, not 
arbitrary strings.

The arguments to path and version are… probably string literal representations 
(with the quotes and all), not arbitrary strings. Although that does depend on 
the details of the specific proposal: if _any_ of your killer uses needs 
uncooked strings, then either you come up with something overcomplicated like 
C++, where you can register three different kinds of affixes, or you just always 
pass uncooked strings (because it’s trivial to cook on demand but impossible to 
de-cook).

And the arguments to regex may be some _other_ kind of restricted special 
string that… I don’t think anyone has tried to define yet, but you can vaguely 
imagine what it would have to be like, and it certainly won’t be any arbitrary 
string.

> Let me suggest some design principles that should hold for languages 
> with more-or-less "conventional" syntax. Languages like APL or Forth 
> excluded.
> 
> - anything using ' or " quotation marks as delimiters (with or without 
>  affixes) ought to return a string, and nothing but a string;

So b"abc" should not be allowed?

Let’s say I created a native-UTF16-string type to deal with some horrible 
Windows or Java stuff. Why would this principle of yours suggest that I 
shouldn’t be allowed to use u16"" just like b""?

This is a design guideline for affixes, custom or otherwise. Which could be 
useful as a filter on the list of proposed uses, to see if any good ones remain 
(and if no string affix uses remain, then of course the proposal is either 
useless or should be restricted to just numbers or whatever), but it can’t be 
an argument against all affixes, or against custom affixes, or anything else 
generic like that.

> - as a strong preference, anything using quotation marks as delimiters
>  ought to be processed at compile-time (f-strings are a conspicuous 
>  exception to that principle);

I don’t see why you should even want to _know_ whether it’s true, much less 
have a strong preference.

Here are things you probably really do care about: (a) they act like strings, 
(b) they act like constants, (c) if there are potential issues parsing them, 
you see those issues as soon as possible, (d) working with them is more than 
fast enough. Compile time is neither necessary nor sufficient for any of those.

[Python-ideas] Re: Custom string prefixes

2019-08-29 Thread Andrew Barnert via Python-ideas
> On Aug 29, 2019, at 06:40, Rhodri James  wrote:
> 
> However, it sounds like what you really want is something I've often really 
> wanted too -- a way to get the compiler to pre-create "constant" objects for 
> me.

People often say they want this, but does anyone actually ever have a good 
reason for it?

I was taken in by the lure of this idea myself—all those wasted frozenset 
constructor calls! (This was before the peephole optimizer understood 
frozensets.) Of course I hadn’t even bothered to construct the frozensets from 
tuples instead of lists, which should have been a hint that I was in premature 
optimization mode, and should have been the first thing I tried before going 
off the deep end. But hacking bytecode is fun, so I sat down and wrote a 
bytecode processor that let me replace any expression with a LOAD_CONST, much 
as the builtin optimizer does for things like simple arithmetic. It’s easy to 
hook it up to a decorator to call on a function, or to an import hook to call 
at module compile time. And then, finally, it’s time to benchmark and discover 
that it makes no difference. Stripping things down to something trivial enough 
to be tested… aha, I really was saving 13us, it’s just that 13us is not 
measurable in code that takes seconds to run.

Maybe someone has a real use case where it matters. But I’ve never seen one. I 
tried to find good nails for my shiny new hammer and never found one, and 
eventually just stopped maintaining it. And then I revived it when I wrote my 
decimal literal hack (the predecessor to the more general user literal hack I 
linked earlier in the thread) back during the 2015 iteration of this 
discussion, but again couldn’t come up with a plausible example where those 
2.3d pseudo-literals were measurably affecting performance and needed 
constifying; I don’t think I even bothered mentioning it in that thread.

Also, even if you find a problem, it‘s almost always easy to work around today. 
If the constant is constructed inside a loop, just manually lift it out of the 
loop. If it’s in a function body, this is effectively the same problem as 
global or builtin lookups being too slow inside a function body, and can be 
solved the same way, with a keyword parameter with a default value. And if the 
Python community thinks that _sin=sin is good enough for the uncommon problem 
of lookups significantly affecting performance, surely 
_vals=frozenset((1,2,3)) is also good enough for that far more uncommon 
problem, and therefore _limit=1e1000dec would also be good enough for the new 
but probably even more uncommon one.

(Also, notice that the param default can be used with mutable values, it’s just 
up to you to make sure you don’t accidentally mutate them; an invisible 
compiler optimization couldn’t do that, at least not without something like 
Victor Stinner’s FAT guards.)
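
(For anyone who hasn’t run into it, a minimal sketch of the default-value 
trick I mean; the names here are made up:)

from math import sin

# The global is bound once, at function-definition time, so each call does
# a fast local lookup instead of a global/builtin lookup.
def wave(xs, _sin=sin):
    return [_sin(x) for x in xs]

# The same trick "constifies" a container: it is built once, at def time.
# Just don't mutate the default value.
def is_valid(x, _vals=frozenset((1, 2, 3))):
    return x in _vals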

For what it’s worth, I actually found my @constify decorator more readable than 
the param default, especially for global functions—but not nearly enough so 
that it’s worth using a hacky, CPython-specific module that I have to maintain 
across Python versions (and byteplay to byteplay3 to bytecode) and that nobody 
else is using. Or to propose for a builtin (or stdlib but magic) feature.

What this all comes down to is that, despite my initial impression, I really 
don’t care whether Python thinks 1.23d is a constant value or not; I only care 
whether the human reader thinks it is one.

Think about it this way: do you know off the top of your head whether (1, 
(2,3)) gets optimized to a const the same way (1,2) does in CPython? Has it 
ever occurred to you to check before I asked? And this is actually something 
that changed relatively recently. Why would someone who doesn’t even think 
about when tuples are constified want to talk about how to force Python to 
constify other types? Because even years of Python experience hasn’t cured us 
of premature-optimization-itis.
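
(If you do want to check on your own CPython, the compiled constants will 
tell you; a quick way to look:)

import dis

def f():
    return (1, (2, 3))

print(f.__code__.co_consts)  # does the nested tuple show up as one constant?
dis.dis(f)                   # or look for a single LOAD_CONST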


___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/L53N6XRIQ7CL43B2R2ZVB3IIFHNK5XD2/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Custom string prefixes

2019-08-29 Thread Pasha Stetsenko
> There's no such thing, though, any more than there's such a thing as a
> "raw string". There are only two types of string in Python - text and
> bytes. You can't behave differently based on whether you were given a
> triple-quoted, raw, or other string literal.

A simple implementation could be something like:

@register_literal_prefix("sql")
class SqlLiteral(str):
    pass

class Connection:
    ...
    def execute(self, stmt):
        if isinstance(stmt, SqlLiteral):
            # proceed as usual
            ...
        else:
            raise TypeError("Expected sql'' string")
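
And then the check in execute() is just an isinstance test. For example
(hypothetical usage, assuming the proposed sql'' literal produces a
SqlLiteral):

stmt = SqlLiteral("SELECT * FROM people WHERE id=?")  # what sql'...' would produce
conn.execute(stmt)        # accepted
conn.execute("SELECT 1")  # raises TypeError, not created from an sql literal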
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/6D2E2GFBGKELBZB23PMT4OMEDZIJWFJD/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Custom string prefixes

2019-08-29 Thread Richard Damon
On 8/29/19 11:14 AM, Chris Angelico wrote:
> On Fri, Aug 30, 2019 at 3:51 AM Pasha Stetsenko  wrote:
>> My understanding is that for a sql prefix the most valuable part is to be 
>> able
>> to know that it was created from a literal. No other magic, definitely not
>> auto-executing. Then it would be legal to write
>>
>> result = conn.execute(sql"SELECT * FROM people WHERE id=?",
>>   user_id)
>>
>> but not
>>
>> result = conn.execute(f"SELECT * FROM people WHERE id={user_id}")
>>
>> In order to achieve this, the `execute()` method only has to look at
>> the type of its argument, and throw an error if it's a plain string.
> There's no such thing, though, any more than there's such a thing as a
> "raw string". There are only two types of string in Python - text and
> bytes. You can't behave differently based on whether you were given a
> triple-quoted, raw, or other string literal.

But isn't the idea of the sql" (or other) prefix that the 'plain
string' is put through a special function that processes it, and that
function could return an object of some other type, so it could detect
the difference?

>
>> Perhaps with some more imagination we can make
>>
>> result = conn.execute(sql"SELECT * FROM people WHERE id={user_id}")
>>
>> work too, but in this case the `sql"..."` token would only create an
>> `UnpreparedStatement` object, which expects a variable named "user_id",
>> and then the `conn.execute()` method would pass locals()/globals() into
>> the `.prepare()` method of that statement, binding those values to
>> the placeholders. Crucially, the `.prepare()` method shouldn't modify the
>> object, but return a new PreparedStatement, which then gets executed
>> by the `conn.execute()`.
> One way to handle this particular case would be to do it as a variant
> of f-string that doesn't join its arguments, but passes the list to
> some other function. Just replace the final step BUILD_STRING step
> with BUILD_LIST, then call the function. There'd need to be some way
> to recognize which sections were in the literal and which came from
> interpolations (one option is to simply include empty strings where
> necessary such that it always starts with a literal and then
> alternates), but otherwise, the "sql" manager could do all the
> escaping it wants. However, this wouldn't be enough to truly
> parameterize a query; it would only do escaping into the string
> itself.
>
> Another option would be to have a single variant of f-string that,
> instead of creating a string, creates a "string with formatted
> values". That would then be a single object that can be passed around
> as normal, and if conn.execute() received such a string, it could do
> the proper parameterization.
>
> Not sure either of them would be worth the hassle, though.
>
> ChrisA

-- 
Richard Damon
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/JNVP4DU6S3NXQ3MAXOF6XXY3E6VGKVSL/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Custom string prefixes

2019-08-29 Thread Chris Angelico
On Fri, Aug 30, 2019 at 3:51 AM Pasha Stetsenko  wrote:
>
> My understanding is that for a sql prefix the most valuable part is to be able
> to know that it was created from a literal. No other magic, definitely not
> auto-executing. Then it would be legal to write
>
> result = conn.execute(sql"SELECT * FROM people WHERE id=?",
>   user_id)
>
> but not
>
> result = conn.execute(f"SELECT * FROM people WHERE id={user_id}")
>
> In order to achieve this, the `execute()` method only has to look at
> the type of its argument, and throw an error if it's a plain string.

There's no such thing, though, any more than there's such a thing as a
"raw string". There are only two types of string in Python - text and
bytes. You can't behave differently based on whether you were given a
triple-quoted, raw, or other string literal.

> Perhaps with some more imagination we can make
>
> result = conn.execute(sql"SELECT * FROM people WHERE id={user_id}")
>
> work too, but in this case the `sql"..."` token would only create an
> `UnpreparedStatement` object, which expects a variable named "user_id",
> and then the `conn.execute()` method would pass locals()/globals() into
> the `.prepare()` method of that statement, binding those values to
> the placeholders. Crucially, the `.prepare()` method shouldn't modify the
> object, but return a new PreparedStatement, which then gets executed
> by the `conn.execute()`.

One way to handle this particular case would be to do it as a variant
of f-string that doesn't join its arguments, but passes the list to
some other function. Just replace the final step BUILD_STRING step
with BUILD_LIST, then call the function. There'd need to be some way
to recognize which sections were in the literal and which came from
interpolations (one option is to simply include empty strings where
necessary such that it always starts with a literal and then
alternates), but otherwise, the "sql" manager could do all the
escaping it wants. However, this wouldn't be enough to truly
parameterize a query; it would only do escaping into the string
itself.
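
(Very roughly, the receiving function could look something like this; purely
illustrative, none of this machinery exists today, and escape_sql_value is a
made-up stand-in for real escaping:)

def escape_sql_value(value):
    # stand-in for real, dialect-aware escaping
    return "'" + str(value).replace("'", "''") + "'"

def sql_handler(parts):
    # parts alternates literal text / interpolated value, always starting
    # with literal text, e.g. ["SELECT * FROM people WHERE id=", user_id, ""]
    out = []
    for i, part in enumerate(parts):
        out.append(part if i % 2 == 0 else escape_sql_value(part))
    return "".join(out)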

Another option would be to have a single variant of f-string that,
instead of creating a string, creates a "string with formatted
values". That would then be a single object that can be passed around
as normal, and if conn.execute() received such a string, it could do
the proper parameterization.

Not sure either of them would be worth the hassle, though.

ChrisA
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/6LEZYLI6KJ2WXWZM2C6PVD3STD5LF2QU/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Custom string prefixes

2019-08-29 Thread Pasha Stetsenko
> How does one get a value into locals()["re~"]?

You're right, I didn't think about that. I agree with Steven's
interpretation that the user is not expected to modify locals
herself, still the immutable nature of locals presents a
considerable challenge.

So I'm thinking that perhaps we could change that to
`globals()["re~"]`, where globals are in fact mutable and 
can even be modified by the user. This would make it so
that affixes can only be declared at a module level, similar
to how `from library import *` is not allowed in a function
either.

This is probably a saner approach anyways -- if affixes 
could mean different things in different functions, that
could be quite confusing...
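
For example, a module could register an affix like this (using the
hypothetical name mangling quoted earlier in the thread):

import re

# You can't write  re~ = re.compile  as an assignment, but the module
# dict accepts any string key:
globals()["re~"] = re.compile

# and then  re'a|b|c'  in this module would compile to
#   globals()["re~"]("a|b|c")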
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/XXMC3HQVBTKF5X7ROG4IBZYTH66KZPLB/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Custom string prefixes

2019-08-29 Thread Pasha Stetsenko
My understanding is that for a sql prefix the most valuable part is to be able
to know that it was created from a literal. No other magic, definitely not 
auto-executing. Then it would be legal to write

result = conn.execute(sql"SELECT * FROM people WHERE id=?",
  user_id)

but not

result = conn.execute(f"SELECT * FROM people WHERE id={user_id}")

In order to achieve this, the `execute()` method only has to look at
the type of its argument, and throw an error if it's a plain string.

Perhaps with some more imagination we can make

result = conn.execute(sql"SELECT * FROM people WHERE id={user_id}")

work too, but in this case the `sql"..."` token would only create an 
`UnpreparedStatement` object, which expects a variable named "user_id",
and then the `conn.execute()` method would pass locals()/globals() into
the `.prepare()` method of that statement, binding those values to
the placeholders. Crucially, the `.prepare()` method shouldn't modify the
object, but return a new PreparedStatement, which then gets executed
by the `conn.execute()`.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/Y4ISQCWYFNC5DNGUQYRXY5IZMOYUAYVP/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Custom string prefixes

2019-08-29 Thread Richard Damon
On 8/26/19 4:03 PM, stpa...@gmail.com wrote:
> In Python strings are allowed to have a number of special prefixes:
>
> b'', r'', u'', f'' 
> + their combinations.
>
> The proposal is to allow arbitrary (or letter-only) user-defined prefixes as 
> well.
> Essentially, a string prefix would serve as a decorator for a string, 
> allowing the
> user to impose a special semantics of their choosing.
>
> There are quite a few situations where this can be used:
> - Fraction literals: `frac'123/4567'`
> - Decimals: `dec'5.34'`
> - Date/time constants: `t'2019-08-26'`
> - SQL expressions: `sql'SELECT * FROM tbl WHERE a=?'.bind(a=...)`
> - Regular expressions: `rx'[a-zA-Z]+'`
> - Version strings: `v'1.13.0a'`
> - etc.
>
> This proposal has been already discussed before, in 2013:
> https://mail.python.org/archives/list/python-ideas@python.org/thread/M3OLUURUGORLUEGOJHFWEAQQXDMDYXLA/
>
> The opinions were divided whether this is a useful addition. The opponents
> mainly argued that as this only "saves a couple of keystrokes", there is no
> need to overcomplicate the language. It seems to me that now, 6 years later, 
> that argument can be dismissed by the fact that we had, in fact, added new
> prefix "f" to the language. Note how the "format strings" would fall squarely
> within this framework if they were not added by now.
>
> In addition, I believe that "saving a few keystroked" is a worthy goal if it 
> adds
> considerable clarity to the expression. Readability counts. Compare:
>
> v"1.13.0a"
> v("1.13.0a")
>
> To me, the former expression is far easier to read. Parentheses, especially as
> they become deeply nested, are not easy on the eyes. But, even more 
> importantly,
> the first expression much better conveys the *intent* of a version string. It 
> has
> a feeling of an immutable object. In the second case the string is passed to 
> the
> constructor, but the string has no meaning of its own. As such, the second
> expression feels artificial. Consider this: if the feature already existed, 
> how *would*
> you prefer to write your code?
>
> The prefixes would also help when writing functions that accept different 
> types
> of their argument. For example:
>
> collection.select("abc")   # find items with name 'abc'
> collection.select(rx"[abc]+")  # find items that match regular expression
>
> I'm not discussing possible implementation of this feature just yet, we can 
> get to
> that point later when there is a general understanding that this is worth 
> considering.

I have seen a lot of discussion on this, but a few points I thought of
haven't been brought up. One solution to all of these would be to have
these done as suffixes.

Python currently has a number of existing prefixes to strings that are
valid, and it might catch some people when they want to use a
combination that is currently a valid prefix. (It has been brought up
that this converts an invalid prefix from an immediately diagnosable
syntax error to a run time error.)

This also means that it becomes very hard to decide to add a new prefix
as that would now have a defined meaning.

A second issue is that currently some of the prefixes (like r) change
how the string literal is parsed. This means that the existing prefixes
aren't just a slightly special case of the general rules, but need to be
treated very differently, or perhaps the prefix somehow needs to
indicate which standard prefix to use to parse the string. Some of your
examples could benefit from sometimes being able to use r' and sometimes
not, so being able to say both r'string're and 'string're could be useful.

-- 
Richard Damon
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/VXIDEEE3225UIKWJOROCJVIESXBJIS2O/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Custom string prefixes

2019-08-29 Thread Paul Moore
On Thu, 29 Aug 2019 at 15:54, Steven D'Aprano  wrote:

> Let me suggest some design principles that should hold for languages
> with more-or-less "conventional" syntax. Languages like APL or Forth
> excluded.

This will degenerate into nitpicking very fast, so let me just say
that I understand the general idea that you're trying to express here.
I don't entirely agree with it, though, and I think there are some
fairly common violations of your suggestion below that make your
arguments less persuasive than maybe you'd like.

> - anything using ' or " quotation marks as delimiters (with or without
>   affixes) ought to return a string, and nothing but a string;

In C, Java and C++, 'x' is an integer (char).
In SQL (some dialects, at least) TIMESTAMP'2019-08-22 11:32:12' is a
TIMESTAMP value.
In Python, b'123' is a bytes object (which maybe you're willing to
classify as "a string", but the line blurs quite fast).

Paul
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/XPL2VXD55GL7VCM7TO36MI4ZAECEJFUS/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Custom string prefixes

2019-08-29 Thread Steven D'Aprano
On Thu, Aug 29, 2019 at 05:30:39AM -0700, Andrew Barnert wrote:
> On Aug 29, 2019, at 04:58, Steven D'Aprano  wrote:
> > 
> > - quote marks are also used for function calls, but only a limited 
> > subset of function calls (those which take a single string literal 
> > argument).
> 
> This is a disingenuous argument.
> 
> When you read spam.eggs, of course you know that that means to call 
> the __getattr__('eggs') method on spam. But do you actually read it as 
> a special method calling syntax that’s restricted to taking a single 
> string that must be an identifier as an argument

You make a good point about abstractions, but you are missing the 
critical point that spam.eggs *doesn't look like a string*. Things that 
look similar should be similar; things which are different should not 
look similar.

I acknowledge your point (and the OP's) that many things in Python are 
ultimately implemented as function calls. But none of those things look 
like strings:

- The argument to the import statement looks like an identifier 
  (since it is an identifier, not an arbitrary string);

- The argument to __getattr__ etc looks like an identifier
  (since it is an identifier, not an arbitrary string);

- The argument to __getitem__ is an arbitrary expression, not just
  a string.

All three are well understood to involve runtime lookups: modules must 
be searched for and potentially compiled, object superclass inheritance 
hierarchies must be searched; items or keys in a list or dict must be 
looked up. None of them suggest a constant literal in the same way that 
"" string delimiters do.

The large majority of languages follow similar principles, allowing for 
usually minor syntactic differences. Some syntactic conventions are very 
weak, and languages can and do differ greatly. But some are very, very 
strong, e.g.:

123.4567 is nearly always a numeric float of some kind, rather 
than (say) multiplying two ints;

' and/or " are nearly always used for delimiting strings.

Even languages like Forth, which have radically different syntax to 
mainstream languages, sort-of follows that convention of associating
quote marks with strings.

." outputs the following character string, terminating at 
the next " character.

i.e. ." foo" in Forth would be more or less equivalent to print("foo") 
in Python.

Let me suggest some design principles that should hold for languages 
with more-or-less "conventional" syntax. Languages like APL or Forth 
excluded.

- anything using ' or " quotation marks as delimiters (with or without 
  affixes) ought to return a string, and nothing but a string;

- as a strong preference, anything using quotation marks as delimiters
  ought to be processed at compile-time (f-strings are a conspicuous 
  exception to that principle);

- using affixes for numeric types seems like a fine idea, and languages
  like Julia that offer a wide-range of builtin numeric types show 
  that this works fine; in Python2 we used to have native ints and 
  longints that took a L suffix so there's precedent there.


[...]
> And the same goes for regex"a.*b" or 1.23f as well. Of course you’ll 
> know that under the covers that means something like calling 
> __whatever_registry__['regex'] with the argument "a.*b", but you’re 
> going to think of it as a regex object 

No I'm not. I'm going to think of it as a *string*, because it looks 
like a string.

Particularly given the OP's preference for single-letter prefixes.

1.23f doesn't look like a string, it looks like a number. I have no 
objection to that in principle, although of course there is a question 
whether float32 is important enough to justify either builtin syntax or 
custom, user-defined syntax.



-- 
Steven
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/RQQFV5AJCVJHYSYUVM2UQ2HQOLU6KBMV/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Custom string prefixes

2019-08-29 Thread Rhodri James

On 29/08/2019 14:40, Rhodri James wrote:
> Pace Stephen's point

My apologies, it was Steven's point.

--
Rhodri James *-* Kynesim Ltd
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/H4PWNBVMQWNTD25X7L3DW36FCP4R5Y2L/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Custom string prefixes

2019-08-29 Thread Steven D'Aprano
On Thu, Aug 29, 2019 at 08:17:39PM +1200, Greg Ewing wrote:
> stpa...@gmail.com wrote:
> 
> >re'a|b|c'  --becomes-->  (locals()["re~"])("a|b|c")
> >2.3f   --becomes-->  (locals()["~f"])("2.3")
> 
> How does one get a value into locals()["re~"]?

I don't think that stpa...@gmail.com means that the user literally 
assigns to locals() themselves. I read his proposal as having the 
compiler automatical mangle the names in some way, similar to name 
mangling inside classes.

The transformation from prefix re to mangled name 're~' is easy, the 
compiler could surely handle that, but I'm not sure how the other side 
of it will work. How does one register that re.compile (say) is to be 
aliased as the prefix 're'? I'm fairly sure we don't want to allow ~ in 
identifiers:

# not this
re~ = re.compile

I'm still not convinced that we need this parallel namespace idea, even 
in a watered down version as name-mangling. Why not just have the prefix 
X call name X for any valid name X (apart from the builtin prefixes)? I 
still am not convinced that is a good idea, but at least the complexity 
is significantly reduced.


P.S. stpa...@gmail.com if you're reading this, it would be nice if you 
signed your emails with a name, so we don't have to refer to you by your 
email address or as "the OP".

-- 
Steven
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/W45RVSWEBM22EXZQ4DGE5KP7WNIRPWCG/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Custom string prefixes

2019-08-29 Thread Steven D'Aprano
On Thu, Aug 29, 2019 at 09:58:35PM +1000, Steven D'Aprano wrote:

> Since Python is limited to ASCII syntax, we only have a small number of 
> symbols suitable for delimiters. With such a small number available, 

Oops, I had an interrupted thought there.

With such a small number available, there is bound to be some 
duplication, but it tends to be fairly consistent across the majority of 
conventional programming languages.



-- 
Steven
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/5BDUSFIRL2CC47R73HFIUEX2EX2K77N2/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Custom string prefixes

2019-08-29 Thread Rhodri James

On 29/08/2019 00:24, Andrew Barnert wrote:
> On Aug 27, 2019, at 10:21, Rhodri James  wrote:
>> You make the point yourself: this is something we already understand from 
>> dealing with complex numbers in other circumstances.  That is not true of 
>> generic single-character string prefixes.
> 
> It certainly is true for 1.23f.


I would contend that (and anyway 1.23f is redundant; 1.23 is already a 
float literal).  But anyway I said "generic single-character string 
prefixes", because that's what the original proposal was.  You seem to 
be going off on creating literal syntax for standard library types 
(which, for the record, I think is a good idea and deserves its own 
thread), but that's not what the OP seems to be going for.


--
Rhodri James *-* Kynesim Ltd
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/326USPRMZIYO4WBCEWV4HJETQTTIVKMY/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Custom string prefixes

2019-08-29 Thread Paul Moore
On Thu, 29 Aug 2019 at 14:21, Andrew Barnert  wrote:
> You can’t avoid tradeoffs by trying to come up with a rule that makes 
> language decisions automatically. (If you could, why would this list even 
> exist?) The closest thing you can get to that is the vague and 
> self-contradictory and facetious but still useful Zen.

Sorry, I wasn't trying to imply that you could. Just that choosing to
implement some, but not all, possible literal affixes on a case by
case basis was a valid language design option, and one that is taken
in many cases. Your statement

> Think about it this way; assuming f and frac and dec and re and sql and so on 
> are useful, our options are:
>
> 1) people don’t get a useful feature
> 2) we add user-defined affixes
> 3) we add all of these as builtin affixes
>
> While #3 theoretically isn’t impossible, it’s wildly implausible, and 
> probably a bad idea to boot, so the realistic choice is between 1 and 2.

seemed to imply that you thought it was an "all or nothing" choice. My
apologies if I misunderstood your point.

Paul
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/3KOQHR5TSNVLCVLOZNGXWWSRW5UHYWLX/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Custom string prefixes

2019-08-29 Thread Rhodri James

On 28/08/2019 23:01, stpa...@gmail.com wrote:

you have something that looks like a kind of string czt'...'
but is really a function call that might return absolutely
anything at all;


This is kinda the whole point. I understand, of course, how the
idea of a string-that-is-not-a-string may sound blasphemous,
however I invite you to look at this from a different perspective.


I don't think it's blasphemous.  I think it's misleading, and that's far 
worse.



Today's date is 2019-08-28. The date is a moment in time, or
perhaps a point in the calendar, but it is certainly not a string.
How do we write this date in Python? As
`datetime("2019-08-28")`. We are forced to put the date into
a string and pass that string into a function to create an actual
datetime object.


Pace Stephen's point that this is not in fact how datetime works, this 
has the major advantage of being readable.  My thought processes on 
coming across that in code would go something like; "OK, we have a 
function call.  Judging from the name its something to do with dates and 
times, so the result is going to be some date/time thing.  Oh, I 
remember seeing "from datetime import datetime" at the top, so I know 
where to look it up if it becomes important.  Fine.  Moving on."



With this proposal the code would look something like
`dt"2019-08-28"`. You're right, it's not a string anymore. But
it *should not* have been a string to begin with, we only used
a string there because Python didn't offer us any other way.
Now with prefixed strings the justice is finally done: we are
able to express the notion of a date directly.


Here my thoughts would be more like; "OK, this is some kind of special 
string.  I wonder what "dt" means.  I wonder where I look it up.  The 
string looks kind of like a date in ISO order, bear that in mind.  Maybe 
"dt" is "date/time"."  Followed a few lines later by "wait, why are we 
calling methods on that string that don't look like string methods? 
WTF?  Maybe "dt" means "delirium tremens".  Abort!  Abort!"


Obviously I've played this up a bit, but the point remains that even if 
I do work out that "dt" is actually a secret function call, I have to go 
back and fix my understanding of the code that I've already read.  This 
significantly increases the chance that my understanding will be wrong. 
This is a Bad Thing.



And the fact that it may still use strings under the hood to
achieve the desired result is really an implementation detail,
that may even change at some point in the future.


If all that dt"string" gives us is a run-time call to dt("string"), it's 
a complete non-starter as far as I'm concerned.  It's adding confusion 
for no real gain.  However, it sounds like what you really want is 
something I've often really wanted too -- a way to get the compiler to 
pre-create "constant" objects for me.  The trouble is that after 
thinking about it for a bit, it almost always turns out that I don't 
want that after all.


Suppose that we did have some funky mechanism to get the compiler to 
create objects at compile time so we don't have the run-time creation 
cost to contend with.  For the sake of argument, let's make it


  start_date = $datetime(2019,8,28)

(I know this syntax would be laughed out of court, but like I said, for 
the sake of argument...)


So we use "start_date" somewhere, and mutate it because the start date 
for some purpose was different.  Then we use it somewhere else, and it's 
not the start date we thought it was.  This is essentially the mutable 
default argument gotcha, just writ globally.


The obvious cure for that would be to have our compile-time created 
objects be immutable.  Leaving aside questions like how we do that, and 
whether contained containers are immutable, and so on, we still have the 
problem that we don't actually want an immutable object most of the 
time.  I find that almost invariably I need to use the constant as a 
starting point, but tweak it somehow.  Perhaps like in the example 
above, the start date is different for a particular purpose.  In that 
case I need to copy the immutable object to a mutable version, so I have 
all the object creation shenanigans to go through anyway, and that 
saving I thought I had has gone away.


I'm afraid these custom string prefixes won't achieve what I think you 
want to achieve, and they will make code less readable in the process.



the OP still hasn't responded to my question about the ambiguity
of the proposal (is czt'...' one three-letter prefix, or three
one-letter prefixes?)


Sorry, I thought this part was obvious. It's a single three-letter prefix.


So how do you distinguish the custom prefix "br" from a raw byte string? 
 Existing syntax allows prefixes to stack, so there's inherent 
ambiguity in multi-character prefixes.


--
Rhodri James *-* Kynesim Ltd
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to 

[Python-ideas] Re: Custom string prefixes

2019-08-29 Thread Andrew Barnert via Python-ideas
On Aug 29, 2019, at 00:58, Paul Moore  wrote:
> 
> If you
> assume everything should be handled by general mechanisms, you end up
> at the Lisp/Haskell end of the spectrum. If you decide that the
> language defines the limits, you are at the C end.

And if you don’t make either assumption, but instead judge each case on its own 
merits, you end up with a language which is better than languages at either 
extreme.

There are plenty of cases where Python generalizes beyond most languages (how 
many languages use the same feature for async functions and sequence iteration? 
or get metaclasses for free by having only one “kind” and then defining both 
construction and class definitions as type calls?), and plenty where it doesn’t 
generalize as much as most languages, and its best features are found all 
across that spectrum.

You can’t avoid tradeoffs by trying to come up with a rule that makes language 
decisions automatically. (If you could, why would this list even exist?) The 
closest thing you can get to that is the vague and self-contradictory and 
facetious but still useful Zen.

If you really did try to zealously pick one side or the other, always avoiding 
general solutions whenever a hardcoded solution is simpler no matter what, the 
best-case scenario would be something like Go, where a big ecosystem of codegen 
tools defeats your attempt to be zealous and makes your language actually 
usable despite your own efforts until soon you start using those tools even in 
the stdlib.

Also, I’m not sure the spectrum is nearly as well defined as you imply in the 
first place. It’s hard to find a large C project that doesn’t use the hell out 
of preprocessor macros to effectively create custom syntax for things like 
error handling and looping over collections (not to mention M4 macros to 
autoconf the code so it’s actually portable instead of just theoretically 
portable), and meanwhile Haskell’s syntax is chock full of special-purpose 
features you couldn’t build yourself (would anyone even use the language 
without, say, do blocks?).

___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/DIOIGVP4EA5GID4DFZGJT2HPMDLNBA7Y/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Custom string prefixes

2019-08-29 Thread Andrew Barnert via Python-ideas
On Aug 29, 2019, at 04:58, Steven D'Aprano  wrote:
> 
> - quote marks are also used for function calls, but only a limited 
> subset of function calls (those which take a single string literal 
> argument).

This is a disingenuous argument.

When you read spam.eggs, of course you know that that means to call the 
__getattr__('eggs') method on spam. But do you actually read it as a special 
method calling syntax that’s restricted to taking a single string that must be 
an identifier as an argument, or do you read it as accessing the eggs member? 
Of course you read it as member access, not as a special restricted calling 
syntax (except in rare cases—e.g., you’re debugging a 
__getattribute__), because to do otherwise would be willfully obtuse, 
and would actively impede your understanding of the code. And the same goes for 
lots of other cases, like [1:7].

And the same goes for regex"a.*b" or 1.23f as well. Of course you’ll know that 
under the covers that means something like calling 
__whatever_registry__['regex'] with the argument "a.*b", but you’re going to 
think of it as a regex object or a float object, not as a special restricted 
calling syntax, unless you want to actively impede your understanding of the 
code.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/JGDXZSDXGHFAHSPIS5MCKDDWJJ2WVOV2/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Custom string prefixes

2019-08-29 Thread Steven D'Aprano
On Wed, Aug 28, 2019 at 10:01:25PM -, stpa...@gmail.com wrote:

> > you have something that looks like a kind of string czt'...' 
> > but is really a function call that might return absolutely 
> > anything at all;
> 
> This is kinda the whole point.

Yes, I understand that. And that's one of the reasons why I think that 
this is a bad idea.

Since Python is limited to ASCII syntax, we only have a small number of 
symbols suitable for delimiters. With such a small number available, 

- parentheses () are used for grouping and function calls;
- square brackets [] are used for lists and subscripting;
- curly brackets {} are used for dicts and sets;
- quote marks are used for bytes and strings;

And with your proposal:

- quote marks are also used for function calls, but only a limited 
subset of function calls (those which take a single string literal 
argument).

Across a large majority of languages, it is traditional and common to 
use round brackets for grouping and function calls, and square and curly 
brackets for collections. There are a handful of languages, like 
Mathematica, which use [] for function calls.






> I understand, of course, how the 
> idea of a string-that-is-not-a-string may sound blasphemous,

Its not a matter of blasphemy. It's a matter of readability and 
clarity.


> however I invite you to look at this from a different perspective.
> 
> Today's date is 2019-08-28. The date is a moment in time, or 
> perhaps a point in the calendar, but it is certainly not a string.
> How do we write this date in Python? As 
> `datetime("2019-08-28")`. We are forced to put the date into
> a string and pass that string into a function to create an actual
> datetime object.

We are "forced" to write that, are we? Have you ever tried it?


py> from datetime import datetime
py> datetime("2019-08-28")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: an integer is required (got type str)


> With this proposal the code would look something like 
> `dt"2019-08-28"`. You're right, it's not a string anymore. But
> it *should not* have been a string to begin with, we only used
> a string there because Python didn't offer us any other way.

py> datetime(2019, 8, 28)
datetime.datetime(2019, 8, 28, 0, 0)


It is difficult to take your argument seriously when so much of it rests 
on things which aren't true.


-- 
Steven
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/RFAZPMCGCPO4JOHLBHLTE5KNCA5RP6LN/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Custom string prefixes

2019-08-29 Thread Greg Ewing

stpa...@gmail.com wrote:
> re'a|b|c'  --becomes-->  (locals()["re~"])("a|b|c")
> 2.3f   --becomes-->  (locals()["~f"])("2.3")

How does one get a value into locals()["re~"]?

--
Greg
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/SSUP6MFT2XU2BOZKIT4TBGBEMIPQHZW2/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Custom string prefixes

2019-08-29 Thread Paul Moore
On Thu, 29 Aug 2019 at 01:18, Andrew Barnert  wrote:
> > Also, it's worth noting that the benefits of *user-defined* literals
> > are *not* the same as the benefits of things like 0.2f, or 3.14d, or
> > even re/^hello.*/. Those things may well be useful. But the benefit
> > you gain from *user-defined* literals is that of letting the end user
> > make the design decisions, rather than the language designer. And
> > that's a subtly different thing.
>
> That’s a good point, but I think you’re missing something big here.
>
> Think about it this way; assuming f and frac and dec and re and sql and so on 
> are useful, our options are:
>
> 1) people don’t get a useful feature
> 2) we add user-defined affixes
> 3) we add all of these as builtin affixes
>
> While #3 theoretically isn’t impossible, it’s wildly implausible, and 
> probably a bad idea to boot, so the realistic choice is between 1 and 2.

That's a completely different point. Built in affixes are defined by
the language, user defined affixes are defined by the user
(obviously!) That includes all aspects of design - both how a given
affix works, and whether it's justified to have an affix at all for a
given use case. The argument is identical to that of user-defined
operators vs built in operators. If you can use this argument to
justify user-defined affixes, it applies equally to user-defined
operators, which is something that has been asked for far more often,
with much more widespread precedents in other languages, and been
rejected every time.

Regarding your cases #1, #2, and #3, this is the fundamental point of
language design - you have to choose whether a feature is worthwhile
(in the face of people saying "well *I* would find it useful"), and
whether to provide a general mechanism or make a judgement on which
(if any) use cases warrant a special-case language builtin. If you
assume everything should be handled by general mechanisms, you end up
at the Lisp/Haskell end of the spectrum. If you decide that the
language defines the limits, you are at the C end. Traditionally,
Python has been a lot closer to the "language defined" end of the
scale than the "general mechanisms" end. You can argue whether that's
good or bad, or even whether things should change because people have
different expectations nowadays, but it's a fairly pervasive design
principle, and should be treated as such.

This actually goes back to the OP's point:

> we can get to that point later when there is a general understanding that 
> this is worth considering

The biggest roadblock to a "general understanding that this is worth
considering" is precisely that Python has traditionally avoided
(over-) general mechanisms for things like this. The obvious other
example, as I mentioned above, being user defined operators. I've been
very careful *not* to use the term "Pythonic" here, as it's too easy
for that to be a way of just saying "my opinion is more correct than
yours" without a real justification, but the real stumbling block for
proposals like this tends to be far less about the technical issues,
and far *more* about "does this fit into the philosophy of Python as a
language, that has made it as successful as it is?" My instinct is
that it doesn't fit well with Python's general philosophy.

Paul
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/AIB4VA56WQ2Z26GD37ITJHD64OQVVDYT/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Custom string prefixes

2019-08-28 Thread Andrew Barnert via Python-ideas
On Aug 28, 2019, at 12:45, stpa...@gmail.com wrote:
> 
> In the thread from
> 2013 where this issue was discussed, many people wanted `sql"..."`
> literal to be available as literal and nothing else.

Since this specific use has come up a few times—and a similar feature in other 
languages—can you summarize exactly what people want from this one?

IIRC, DB-API 2.0 doesn’t have any notion of compiled statements, or bound 
statements, just this:

Connection.execute(statement: str, *args) -> Cursor

So the only thing I can think of is that sql"…" is a shortcut for that. Maybe:

curs = sql"SELECT lastname FROM person WHERE firstname={firstname}"

… which would do the equivalent of:

curs = conn.execute("SELECT lastname FROM person WHERE firstname=?", 
firstname)

… except that it knows whether your particular database library uses ? or %s or 
whatever for SQL params.

I can see how that could be useful, but I’m not sure how it could be easily 
implemented.

First, it has to know where to find your connection object. Maybe the library 
that exposes the prefix requires you to put the connection in a global (or 
threadlocal or contextvar) with a specific name, or manages a pool of 
connections that it stores in its own module or something? But that seems 
simultaneously too magical and too restrictive. 

And then it has to do f-string-style evaluation of the brace contents, in your 
scope, to get the args to pass along. Which I’d assume means that prefix 
handlers need to get passed locals and globals, so the sql prefix handler can 
eval each braced expression? (Even that wouldn’t be as good as f-strings, but 
it might be good enough here?)
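
(Something like this, maybe; an entirely hypothetical handler signature, just 
to make the question concrete:)

import re

def sql_prefix(text, globals_, locals_):
    # The handler gets the raw literal text plus the caller's namespaces,
    # evaluates each {...} section itself, and returns a (query, params)
    # pair using ?-style placeholders.
    params = []
    def repl(match):
        params.append(eval(match.group(1), globals_, locals_))
        return "?"
    query = re.sub(r"\{([^{}]+)\}", repl, text)
    return query, params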

Even with all that, I‘m pretty sure I’d never use it. I’m often willing to 
bring magic into my database API, but only if I get a lot more magic (an 
expression-builder library, a full-blown ORM, that thing that I forget the name 
of that translates generators into SQL queries quasi-LINQ-style, etc.). But 
maybe there are lots of people who do want just this much magic and no more. Is 
this roughly what people are asking for?

If so, is that eval magic needed for any other examples you’ve seen besides 
sql? It’s definitely not needed for regexes, paths, really-raw strings, or any 
of the numeric examples, but if it is needed for more than one good example, 
it’s probably still worth looking at whether it’s feasible.
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/TIDUG5ZWIX2ATV7QZMYULQCFPERF3LMI/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Custom string prefixes

2019-08-28 Thread Andrew Barnert via Python-ideas
On Aug 28, 2019, at 01:05, Paul Moore  wrote:
> 
> On Wed, 28 Aug 2019 at 05:04, Andrew Barnert via Python-ideas
>  wrote:
>> What matters here is not whether things like the OP’s czt'abc' or my 1.23f 
>> or 1.23d are literals to the compiler, but whether they’re readable ways to 
>> enter constant values to the human reader.
>> 
>> If so, they’re useful. Period.
>> 
>> Now, it’s possible that even though they’re useful, the feature is still not 
>> worth adding because of Chris’s issue that it can be abused, or because 
>> there’s an unavoidable performance cost that makes it a bad idea to rely on 
>> them, or because they’re not useful in _enough_ code to be worth the effort, 
>> or whatever. Those are questions worth discussing. But arguing about whether 
>> they meet (one of the three definitions of) “literal” is not relevant.
> 
> Extended (I'm avoiding the term "custom" for now) literals like 0.2f,
> 3.14D, re/^hello.*/ or qw{a b c} have a fairly solid track record in
> other languages, and I think in general have proved both useful and
> straightforward in those languages. And even in Python, constructs
> like f-strings and complex numbers are examples of such things.
> However, I know of almost no examples of other languages that have
> added *user-definable* literal types (with the notable exception of
> C++, and I don't believe I've seen use of that feature in user code -
> which is not to say that it's not used). That to me says that there
> are complexities in extending the question to user-defined literals
> that we need to be careful of.

Agreed 100%. That’s why I think we need a more concrete proposal, that includes 
at least some thought on implementation, before we can go any farther, as I 
said in my first reply.

The OP wanted to get some feeling of whether at least some people might find 
some version of this useful before going further. I think we’ve got that now 
(the fact that not 100% of the responders agree doesn’t change that), so we 
need to get more detailed now.

My own proposal was just to answer the charge that any design will inherently 
be impossible or magical or complicated by giving a design that is none of 
those. It shouldn’t be taken as any more than that. If there are good use cases 
for prefixes, prefixes plus suffixes, etc., then my proposal can’t get you 
there, so let’s wait for the OP’s more detailed proposal.

> Some specific
> questions which would need to be dealt with:
> 
> 1. What is valid in the "literal" part of the construct (this is the
> p"C:\" question)?

I think this pretty much has to be either (a) exactly what’s valid in the 
equivalent literals today, or (b) something equally simple to describe, and 
parse, even if it’s different (like really-raw strings, or perlesque regex with 
delimiters other than quotes, or whatever).

Either way, I think you want to use the same rule for all affixed literals, not 
allow a choice of different ones like C++ does.

> 2. How do definitions of literal syntax get brought into scope in time
> for the parser to act on them (this is about "import xyz_literal"
> making xyz"a string" valid but leaving abc"a string" as a syntax
> error)?

I don’t know that this is actually necessary. If `abc"a string"` raises an 
error at execution time rather than compile time, yes, that’s different from 
how most syntax errors work today, but is it really unacceptable? (Notice that 
in the most typical case, the error still gets raised from importing the module 
or from the top level of the script—but that’s just the most typical case, not 
all cases—you could get those errors from, say, calling a method, which you 
don’t normally expect.)

There’s clearly a trade off here, because the only other alternative (at least 
that I’ve thought of or seen from anyone else; I’d love to be wrong) is that 
what you've imported and/or registered affects how later imports work (and 
doesn’t that mean some kind of registry hash needs to get encoded in .pyc files 
or something too?). While that is normal for people who use import hooks, most 
people don’t use import hooks most of the time, and I suspect that weirdness 
would be more off-putting than the late errors.

Another big one: How do custom prefixes interact with builtin string prefixes? 
For suffixes, there’s no problem suffixing, say, a b-string, but for prefixes, 
there is. If this is going to be allowed, there are multiple ways it could be 
designed, but someone has to pick one and specify it.

(Actually, for suffixes, there _is_ a similar issue: is `1.2jd` a `d` suffix on 
the literal `1.2j`, or a `jd` suffix on `1.2`? I think the former, because it’s 
a trivially simple rule that doesn’t need to touch any of the rest of the 
grammar. Plus, not only is it likely never to matter, but in the rare cases 
where it does matter, I think it’s the rule you’d want. For example, if I 
created my own ComplexDecimal class and wanted to use a suffix for it, why 
would I want to define both `d` and `jd` instead of just `d`?)

[Python-ideas] Re: Custom string prefixes

2019-08-28 Thread Andrew Barnert via Python-ideas
On Aug 27, 2019, at 10:21, Rhodri James  wrote:
> 
> You make the point yourself: this is something we already understand from 
> dealing with complex numbers in other circumstances.  That is not true of 
> generic single-character string prefixes.

It certainly is true for 1.23f.

And, while 1.23d for a decimal or 1/3F for a Fraction may not be identical to 
any other context, it’s a close-enough analogy that it’s immediately familiar. 
Although I might actually prefer 1.23dec or 1/3frac or something more explicit 
in those cases. (Fortunately, there’s nothing in the design stopping me from 
doing that.)

As for string prefixes, I don’t think those should usually, or maybe even ever, 
be single-character. People have given examples like sql"…" (I’m still not sure 
exactly what that does, but it’s apparently used in other languages for 
something?) and regex"…" and path"…" (which are a lot more obvious). I’m not 
sure if they actually are useful, which is why my proposal didn’t have them; 
I’m waiting on the OP to give more complete examples, cite similar uses from 
other languages, etc. But I doubt the problem you’re talking about, that they’d 
all be unfamiliar cryptic one-letter things, is likely to arise.


[Python-ideas] Re: Custom string prefixes

2019-08-28 Thread stpasha
> all of which hugely outweighs the gain of being able to avoid a pair 
> of parentheses.
 
Thank you for summarizing the main objections so succinctly;
otherwise it becomes too easy to get lost in the discussion. Let me
try to answer them as best I can:


> you have something that looks like a kind of string czt'...' 
> but is really a function call that might return absolutely 
> anything at all;

This is kinda the whole point. I understand, of course, how the
idea of a string-that-is-not-a-string may sound blasphemous;
however, I invite you to look at this from a different perspective.

Today's date is 2019-08-28. The date is a moment in time, or
perhaps a point in the calendar, but it is certainly not a string.
How do we write this date in Python? As
`datetime.fromisoformat("2019-08-28")`. We are forced to put the
date into a string and pass that string into a function to create
an actual datetime object.

With this proposal the code would look something like
`dt"2019-08-28"`. You're right, it's not a string anymore. But
it *should not* have been a string to begin with; we only used
a string there because Python didn't offer us any other way.
Now, with prefixed strings, justice is finally done: we are
able to express the notion of a date directly.

And the fact that it may still use strings under the hood to
achieve the desired result is really an implementation detail,
that may even change at some point in the future.
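
For illustration only (the helper name `dt` and the use of
`datetime.fromisoformat` are stand-ins I'm inventing here, not
part of the proposal), this is roughly what the prefix would be
sugar for today:

from datetime import datetime

def dt(text):
    # today's spelling of "a date written as text"
    return datetime.fromisoformat(text)

when = dt("2019-08-28")
print(when.year, when.month, when.day)   # 2019 8 28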


> you have a redundant special case for calling functions that
> take a single argument, but only if that argument is a string
> literal;

There are many things in Python that are in fact function calls
in disguise. Decorators? function calls. Imports? function calls.
Class definition? function call. Getters/setters? function calls.
Attribute access? function calls. Even a function call is a 
function call via `__call__()`. I may be oversimplifying a bit, but
the point is that just because something can be written as a
function call doesn't mean it's the most natural way of doing it.

Besides, there are use cases (such as `sql'...'`) where people
do actually want to have a function that is constrained to string
literals only.

Having said that, prefixed strings (and suffixed numbers) are not
*exactly* equivalent to function calls. The points of difference
are:
- prefixes/suffixes are namespaced separately from regular
  variable names.
- their results can be automatically memoized, bringing them
  closer to builtin literals.


> you encourage people to write cryptic single-character 
> functions, like v(), x(), instead of meaningful expressions
> like Version() and re.compile();

Which is why I suggested putting them in a separate
namespace. You're right that a function `v()` is cryptic and
should be avoided. But a prefix `v"..."` is neither a function
nor a variable, so it's OK for it to be short. The existing string
prefixes are all short, after all.


> you encourage people to defer parsing that could be efficiently 
> done in your head at edit time into slow and likely inefficient
> string parsing done at runtime;

I don't encourage any such thing; it's just that most often there is
no other way. For example, consider the regular expression `[0-9]+`.
I can "parse it in my head" to understand that it means a
sequence of digits, but how exactly am I supposed to convey
this understanding to Python?

Or perhaps I can parse "2019-08-28" in my head, and write in
Python `datetime(year=2019, month=8, day=28)`. However, such a
form would greatly reduce the readability of the code from a human's
perspective. And human readability matters more than computer
readability, for now.

In fact, purely from the efficiency perspective, the prefixed strings
can potentially have better performance because they are
auto-memoized, while `datetime.fromisoformat("2019-08-28")` needs
to re-parse its input string every time (or add its own internal
memoization, but even that would be less efficient because it
doesn't know the input is a literal string).
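
As a rough model of that memoization argument (with
`functools.lru_cache` standing in for whatever the interpreter
would actually do, and `dt` again being just an illustrative name):

from datetime import datetime
from functools import lru_cache

@lru_cache(maxsize=None)
def dt(text):
    # the text is parsed only on the first call
    return datetime.fromisoformat(text)

for _ in range(3):
    d = dt("2019-08-28")   # cached after the first parse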


> the OP still hasn't responded to my question about the ambiguity
> of the proposal (is czt'...' one three-letter prefix, or three 
> one-letter prefixes?)

Sorry, I thought this part was obvious. It's a single three-letter prefix.


[Python-ideas] Re: Custom string prefixes

2019-08-28 Thread stpasha
> A really good example here is the p"C:\" question. Is the 
> proposal that the "string part" of the literal is just a normal 
> string? If so, then how do you address this genuine issue 
> that not all paths are valid? What about backslash-escapes 
> (p"C:\temp")? Is the string a raw string or not? If the 
> proposal is that the path-literal code can define how the 
> string is parsed, then how does that work?

I don't usually work with Windows, but I can see how this could
be a pain point for Windows users. They need both backslashes
and quotation marks in their paths.

As nobody has suggested yet how to deal with the problem,
I'd like to give it a try. Behold:

p{C:\}

The part within the curly braces is considered a "really-raw"
string. "Really-raw" means that every character is interpreted
exactly as it looks; there are no escape characters. Internal braces
will be allowed too, provided that they are properly nested:

p{C:\"Program Files"\{hello}\}

If you **need** to have unmatched braces in the string, your last
hope is the triple-braced literal:

p{{{Letter Ж looks like }|{... }}}

The curly braces can only be used with a string prefix (or suffix?).
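
A rough sketch of the brace-matching rule, written as an ordinary
Python function (purely illustrative; the name `scan_really_raw`
is made up):

def scan_really_raw(src, start):
    # The literal runs from the opening brace to the brace that
    # balances it; every character in between is taken verbatim.
    assert src[start] == "{"
    depth = 0
    for i in range(start, len(src)):
        if src[i] == "{":
            depth += 1
        elif src[i] == "}":
            depth -= 1
            if depth == 0:
                return src[start + 1:i]
    raise SyntaxError("unterminated really-raw literal")

print(scan_really_raw(r'p{C:\"Program Files"\{hello}\}', 1))
# C:\"Program Files"\{hello}\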

And while we're at it, why not allow chained literals:

re{(\w+)}{"\1"}
frac{1}{17}


[Python-ideas] Re: Custom string prefixes

2019-08-28 Thread stpasha
> In addition, there is the question of how user-defined literals would
> get turned into constants within the code.

So, I'm just brainstorming here, but how about the following
approach:

- Whenever a compiler sees `abc"def"`, it creates a constant of
  the type `ud_literal` with fields `.prefix="abc"`, `.content="def"`.

- When it compiles a function, then instead of a `LOAD_CONST n`
  op it would emit a `LOAD_UD_CONST n` op.

- This new op first checks whether its argument is a "ud_literal",
  and if so calls the `.resolve()` method on that argument. The
  method should call the prefix with the content, producing an
  object that the LOAD_UD_CONST op stores back in the
  `co_consts` storage of the function. It is a TypeError for the
  resolve method to return another ud_literal.

- Subsequent calls to the LOAD_UD_CONST op will see that
  the argument is no longer a ud-literal, and will return it as-is.

This system would allow each constant to be evaluated only
once and then memoized, and it would compute only those
constants that are actually used.
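
A runnable toy model of this scheme (all names are hypothetical;
a real implementation would live inside the interpreter, not in
Python code):

from datetime import datetime

class ud_literal:
    def __init__(self, prefix, content):
        self.prefix = prefix
        self.content = content

    def resolve(self, registry):
        value = registry[self.prefix](self.content)
        if isinstance(value, ud_literal):
            raise TypeError("prefix handler returned another ud_literal")
        return value

def load_ud_const(co_consts, n, registry):
    # Models LOAD_UD_CONST: resolve on first use, store the result
    # back, and behave like a plain constant load afterwards.
    if isinstance(co_consts[n], ud_literal):
        co_consts[n] = co_consts[n].resolve(registry)
    return co_consts[n]

registry = {"dt": datetime.fromisoformat}
consts = [ud_literal("dt", "2019-08-28")]   # what the compiler would emit
print(load_ud_const(consts, 0, registry))   # parsed and memoized here
print(load_ud_const(consts, 0, registry))   # returned as-is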


[Python-ideas] Re: Custom string prefixes

2019-08-28 Thread stpasha
> Ouch! That's adding a lot of additional complexity to the language. 
> ...
> This proposal adds a completely separate, parallel set of scoping rules 
> for these string prefixes. How many layers in this parallel scope?

Right, having a parallel set of scopes sounds like WAY too much work.
Which is why I didn't want to start my proposal with a particular 
implementation -- I simply don't have enough experience for that.
Still, we can brainstorm possible approaches, and come up with 
something that is feasible.

For example, how about this: prefixes/suffixes "live" in the same local
scope as normal variables; however, in order to separate them from
normal variables, their names get mangled into something that is
not a valid variable name. Thus,

re'a|b|c'  --becomes-->  (locals()["re~"])("a|b|c")
2.3f   --becomes-->  (locals()["~f"])("2.3")

Assuming that most people don't create variable names that start
or end with `~`, the impact on existing code should be minimal (we
could use an even more rare character there, say `\0`).
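
At module level, where `locals()` is `globals()`, the mangling idea
can be illustrated with ordinary code (`re.compile` is only a
stand-in for whatever callable would define the prefix):

import re

globals()["re~"] = re.compile        # registering a hypothetical re prefix

# re'a|b|c' would then desugar to roughly:
pattern = globals()["re~"]("a|b|c")
print(pattern.fullmatch("b"))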

The current string prefixes would be special-cased by the compiler to
behave exactly as they behave right now.

Also, a prefix such as `czt""` is always just a single prefix; there is
no need to treat it as three single-character prefixes.

> One of the weaknesses of string prefixes is that it's hard to get help 
> for them. ...
> What's the difference between r-strings and u-strings? help() is no help

Well, it's just another problem to overcome. I know in Python one can get
help on keywords and even operators by saying `help('class')` or `help('+')`.
We could extend this to allow `help('foo""')` to give the help for the
prefix "foo".

Specifically, if the argument to `help` is a string, and that string is
not a registered topic, then check whether the string is of the form
`<prefix>""` or `<prefix>''` or `""<suffix>` or `''<suffix>`, and invoke
the help for the corresponding prefix / suffix.

This will even solve the problem with the help for existing affixes `b""`,
`f""`, `0j`, etc.
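
A hedged sketch of that help() extension; the registry and the
helper name are invented for the example:

PREFIX_DOCS = {"f": "formatted string literal",
               "foo": "example custom prefix"}

def affix_help_name(topic):
    # Recognise topics like 'foo""' (prefix) or '""d' (suffix) and
    # return the affix name, or None for any other form.
    for quotes in ('""', "''"):
        if topic.endswith(quotes) and topic[:-2].isidentifier():
            return topic[:-2]
        if topic.startswith(quotes) and topic[2:].isidentifier():
            return topic[2:]
    return None

print(PREFIX_DOCS.get(affix_help_name('foo""')))   # example custom prefix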

>  you probably won't want to do that, since Version will probably be 
> useful for those who want to create Version objects from expressions or 
> variables, not just string literals.

For the Version class you're right. But use cases vary. In the thread
from 2013 where this issue was discussed, many people wanted an
`sql"..."` literal to be available as a literal and nothing else.
Presumably, if you wanted to construct a query dynamically, there could
be a separate function `sql_unsafe()` taking a plain string as an
argument.


> So the "pollution" isn't really pollution at all, at least not if you 
> use reasonable names, and the main justification for parallel namespaces 
> seems much weaker.

The pollution argument is that, on the one hand, we want to use short
names such as "v" for prefixes/suffixes, while on the other hand we
don't want them to be "regular" variable names because of the
possibility of name clashes. It's perfectly fine to have a short
character for a prefix and at the same time a longer name for a
function. It's like having both the `unicode()` function and the
`u"..."` prefix, or the way most command-line utilities offer short
single-character options alongside longer full-name options.

> That's an interesting position for the proponent of a new feature to 
> take. "Don't worry about this being confusing, because hardly anyone 
> will use it."

I'm sorry if I expressed myself ambiguously. What I meant to say is that
the set of different prefixes within a single program will likely be small.


> We can't extrapolate from four built-in prefixes being manageable to 
> concluding that dozens of clashing user-defined prefixes will be too.

That's a valid point. Though we can't extrapolate that they will be
unmanageable either; there's just not enough data. But we could look
at other languages that have more suffixes, say C or C++.

Ultimately, this can be a self-regulating feature: if having too many
suffixes/prefixes makes one's code unreadable, then simply stop using
them and go back to regular function calls.


[Python-ideas] Re: Custom string prefixes

2019-08-28 Thread Andrew Barnert via Python-ideas
On Aug 28, 2019, at 00:40, Chris Angelico  wrote:
> 
> On Wed, Aug 28, 2019 at 2:40 PM Andrew Barnert  wrote:
>>> People can be trusted with powerful features that can introduce
>>> complexity. There's just not a lot of point introducing a low-value
>>> feature that adds a lot of complexity.
>> 
>> But it really doesn’t add a lot of complexity.
>> 
>> If you’re not convinced that really-raw string processing is doable, drop 
>> that.
>> 
>> Since the OP hasn’t given a detailed version of his grammar, just take mine: 
>> a literal token immediately followed by one or more identifier characters 
>> (that couldn’t have been munched by the literal) is a user-suffix literal. 
>> This is compiled into code that looks up the suffix in a central registry 
>> and calls it with the token’s text. That’s all there is to it.
>> 
> 
> What is a "literal token", what is an "identifier character", 

Literals and identifier characters are already defined today, so I don’t need 
new definitions for them. 

The existing tokens are already implemented in the tokenizer and in the 
tokenize module, which is why I was able to slap together multiple variations 
on a proof of concept 4 years ago in a few minutes as a token-stream-processing 
import hook. 

My import hook version is a hack, of course, but it serves as a counter to your 
argument that there’s no simple thing that could work by being a dead simple 
thing that does work. And there’s no reason to believe a real version wouldn’t 
be at least as simple.

> and how
> does this apply to your example of having digits, a decimal point, and
> then a suffix


We add a `suffixedfloatnumber` production defined as `floatnumber identifier`. 
So `2.34` parses as a `floatnumber` the same as always. That `d` can’t be part 
of a `floatnumber`, but it can be the start of an `identifier`, and those two 
nodes together can make up a `suffixedfloatnumber`. No need for any new 
lookahead or other context. And for the concrete implementation in CPython, it 
should be obvious that the suffix can be pushed down into the tokenizer, at 
which point the parse becomes trivial.
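
As a toy illustration of that rule (the regex below is a made-up
stand-in for the real grammar, and it also shows the `1.2jd`
resolution discussed earlier in the thread):

import re

# literal = digits, optional fraction, optional imaginary j;
# suffix = identifier characters
SUFFIXED = re.compile(r'(?P<literal>\d+(?:\.\d+)?[jJ]?)(?P<suffix>[A-Za-z_]\w*)')

def split_suffixed(text):
    m = SUFFIXED.fullmatch(text)
    return (m.group('literal'), m.group('suffix')) if m else None

print(split_suffixed('2.34d'))   # ('2.34', 'd')
print(split_suffixed('1.2jd'))   # ('1.2j', 'd')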

If you’re asking how my hacky version works, you could just read the code, 
which is simpler than an explanation, but here goes (from memory, because I’m 
on my phone): To the existing tokenizer, `d` isn’t a delimiter character, so it 
tries to match the whole `2.34d`. That doesn’t match anything. But `2.34` does 
match something, etc., so ultimately it emits two tokens, `floatnumber('2.34'), 
error('d')`. My import hook reads the stream of tokens. When it sees a 
`floatnumber` followed by an `error`, it checks whether the error body could be 
an identifier token. If so, it replaces those two tokens in the stream with… I 
forget, but probably I just hand-parsed the lookup and call and emitted the 
tokens for that.

I can’t _guarantee_ that the real version would be simpler until I try it. And 
I don’t want to hijack the OP’s thread and replace his proposal (which does 
give me what I want) with mine (which doesn’t give him what he wants), unless 
he abandons the idea of attempting to implement his version. But I’m pretty 
confident it would be as simple as it sounds, which is even simpler than the 
hacky version (which, again, is dead simple and works today).

And most variations on the idea you could design would be just as simple. Maybe 
the OP will perversely design one that isn’t. If so, it’s his job to show that 
it can be implemented. And if he gives up, then I’ll argue for something that I 
can implement simply. But I don’t think that’s even going to come up.

> What if you want to have a string, and what if you want
> to have that string contain backslashes or quotes? If you want to say
> that this doesn't add complexity, give us some SIMPLE rules that
> explain this.

Well, that works exactly the same way a string does today (including the 
optional r prefix). The closing quote can now be followed by a string of 
identifier characters, but everything up to there is exactly the same as today. 
So, it doesn’t add any complexity, because it uses the same rules as today.

I did suggest, as a throwaway add-on to the OP’s proposal, that you could 
instead do raw strings or even really-raw (the string ends at the first 
matching quote; backslashes mean nothing). I don’t know if he wants either of 
those, but if he does, raw string literals are already defined in the grammar 
and implemented in the tokenizer, and really-raw is an even simpler grammar 
(identical to the existing grammar except that instead of `longstringchar | 
stringescapeseq` there’s a node matching any character other than the closing 
quote, and the same for `shortstringitem`).

> And make absolutely sure that the rules are identical for EVERY
> possible custom prefix/suffix,

Well, in my version, since the rule for suffixedstringliteral is just 
`stringliteral identifier`, of course it’s the same for every possible suffix; 
there’s no conceivable way it could be different.

If the OP wants to 

[Python-ideas] Re: Custom string prefixes

2019-08-28 Thread Konstantin Schukraft

On Wed, Aug 28, 2019 at 04:02:26PM +0100, Paul Moore wrote:

On Wed, 28 Aug 2019 at 15:55, Mike Miller  wrote:



On 2019-08-28 01:05, Paul Moore wrote:
> However, I know of almost no examples of other languages that have
> added *user-definable* literal types (with the notable exception of

Believe there is such a feature in modern JavaScript:

https://developers.google.com/web/updates/2015/01/ES6-Template-Strings#tagged_templates


Interesting - thanks for the pointer!


Elixir has something it calls sigils. It seems to be basically the
map-to-function variant:

https://elixir-lang.org/getting-started/sigils.html

Konstantin




[Python-ideas] Re: Custom string prefixes

2019-08-28 Thread Paul Moore
On Wed, 28 Aug 2019 at 15:55, Mike Miller  wrote:
>
>
> On 2019-08-28 01:05, Paul Moore wrote:
> > However, I know of almost no examples of other languages that have
> > added *user-definable* literal types (with the notable exception of
>
> Believe there is such a feature in modern JavaScript:
>
> https://developers.google.com/web/updates/2015/01/ES6-Template-Strings#tagged_templates

Interesting - thanks for the pointer!
Paul


[Python-ideas] Re: Custom string prefixes

2019-08-28 Thread Mike Miller



On 2019-08-28 01:05, Paul Moore wrote:

However, I know of almost no examples of other languages that have
added *user-definable* literal types (with the notable exception of


Believe there is such a feature in modern JavaScript:

https://developers.google.com/web/updates/2015/01/ES6-Template-Strings#tagged_templates

-Mike


[Python-ideas] Re: Custom string prefixes

2019-08-28 Thread Chris Angelico
On Wed, Aug 28, 2019 at 10:50 PM Rhodri James  wrote:
>
> On 28/08/2019 02:38, stpa...@gmail.com wrote:
> > Thanks, Andrew, you're able to explain this much better than I do.
> > Just wanted to add that Python *already* has ways to grossly abuse
> > its syntax and create unreadable code. For example, I can write
> >
> >  >>> о = 3
> >  >>> o = 5
> >  >>> ο = 6
> >  >>> (о, o, ο)
> >  (3, 5, 6)
>
> OK, I'll bite: how?  If you were using "thing.o" I would believe you
> were doing something unhelpful with properties, but just "o"?
>

'\u043e' CYRILLIC SMALL LETTER O
'o' LATIN SMALL LETTER O
'\u03bf' GREEK SMALL LETTER OMICRON

Virtually indistinguishable in most fonts, but distinct characters.
It's the same thing you can do with "I" and "l" in many fonts, or "rn"
and "m" in some, but taken to a more untypable level.
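
A quick way to check, for anyone curious:

import unicodedata

for cp in (0x043E, 0x006F, 0x03BF):
    print(hex(cp), unicodedata.name(chr(cp)))
# 0x43e CYRILLIC SMALL LETTER O
# 0x6f LATIN SMALL LETTER O
# 0x3bf GREEK SMALL LETTER OMICRON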

ChrisA


[Python-ideas] Re: Custom string prefixes

2019-08-28 Thread Paul Moore
On Wed, 28 Aug 2019 at 13:49, Rhodri James  wrote:

> OK, I'll bite: how?  If you were using "thing.o" I would believe you
> were doing something unhelpful with properties, but just "o"?

Presumably Unicode variables with confusable characters?

Paul


[Python-ideas] Re: Custom string prefixes

2019-08-28 Thread Paul Moore
On Wed, 28 Aug 2019 at 13:15, Anders Hovmöller  wrote:
>
> > On 28 Aug 2019, at 14:09, Piotr Duda  wrote:

> > There is a much simpler solution: just make `abc"whatever"` syntactic
> > sugar for `string_literal_abc(r"whatever", closure)`, where closure is
> > an object that allows read-only access to variables at the call site.
>
> So to use abc"foo" we must import string_literal_abc? Seems pretty confusing 
> to me!

The only sane proposal that I can see (assuming that no-one is
proposing to drop the principle that Python shouldn't have mutable
syntax) is to modify the definition

stringliteral   ::=  [stringprefix](shortstring | longstring)
stringprefix::=  "r" | "u" | "R" | "U" | "f" | "F"
 | "fr" | "Fr" | "fR" | "FR" | "rf" | "rF" | "Rf" | "RF"

to expand the definition of stringprefix to allow any
identifier-like token (precise details to be confirmed). Then, if it's
one of the values enumerated above (you'd also need some provison for
special-casing bytes literals, which are in a different syntax rule),
work as at present. For any other identifier-like token, you'd define

TOKEN(shortstring|longstring)

as being equivalent to

TOKEN(r(shortstring|longstring))

I.e., treat the string as a raw string, and TOKEN as a function name,
and compile to a function call of the named function with the raw
string as argument.
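
For concreteness, a rough sketch of what that desugaring would mean
at runtime (PureWindowsPath is only a stand-in for a user-defined
handler named "path"):

from pathlib import PureWindowsPath as path

#   path"C:\temp"   would compile to a call with the raw string:
p = path(r"C:\temp")
print(p.drive, p.name)   # C: temp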

That's a well-defined proposal, although whether it's what people want
is a different question. Potential issues:

1. It makes a whole class of typos that are currently syntax errors
into runtime errors - fru"foo\and {bar}" is now a function call rather
than a syntax error (it was never a raw Unicode f-string, even though
someone might think it was and be glad to be corrected by the current
syntax error...)
2. It begs the question of whether people want raw-string semantics -
whilst it's the most flexible option, it does mean that literals
wanting to allow escape sequences would need to implement it
themselves.
3. It does nothing for the edge case that a trailing \ isn't allowed -
p"C:\" wouldn't be a valid Path literal.

There are of course other possible proposals, but we'd need more than
broad statements to make sense of them (specifically, either "exactly
*what* new syntax are you suggesting we allow?", or "how are you
proposing to allow users to alter Python syntax on demand?")

Paul


[Python-ideas] Re: Custom string prefixes

2019-08-28 Thread Rhodri James

On 28/08/2019 02:38, stpa...@gmail.com wrote:

Thanks, Andrew, you're able to explain this much better than I do.
Just wanted to add that Python *already* has ways to grossly abuse
its syntax and create unreadable code. For example, I can write

 >>> о = 3
 >>> o = 5
 >>> ο = 6
 >>> (о, o, ο)
 (3, 5, 6)


OK, I'll bite: how?  If you were using "thing.o" I would believe you 
were doing something unhelpful with properties, but just "o"?


--
Rhodri James *-* Kynesim Ltd


[Python-ideas] Re: Custom string prefixes

2019-08-28 Thread Anders Hovmöller


> On 28 Aug 2019, at 14:09, Piotr Duda  wrote:
> 
> On Wed, 28 Aug 2019 at 13:18, Steven D'Aprano  wrote:
>> 
>>> On Tue, Aug 27, 2019 at 05:13:41PM -, stpa...@gmail.com wrote:
>>> 
>>> The difference between `x'...'` and `x('...')`, other than visual noise, is 
>>> the
>>> following:
>>> 
>>> - The first "x" is in its own namespace of string prefixes. The second "x"
>>>  exists in the global namespace of all other symbols.
>> 
>> Ouch! That's adding a lot of additional complexity to the language.
> 
> There is a much simpler solution: just make `abc"whatever"` syntactic
> sugar for `string_literal_abc(r"whatever", closure)`, where closure is
> an object that allows read-only access to variables at the call site.

So to use abc"foo" we must import string_literal_abc? Seems pretty confusing to 
me!


[Python-ideas] Re: Custom string prefixes

2019-08-28 Thread Piotr Duda
On Wed, 28 Aug 2019 at 13:18, Steven D'Aprano  wrote:
>
> On Tue, Aug 27, 2019 at 05:13:41PM -, stpa...@gmail.com wrote:
>
> > The difference between `x'...'` and `x('...')`, other than visual noise, is 
> > the
> > following:
> >
> > - The first "x" is in its own namespace of string prefixes. The second "x"
> >   exists in the global namespace of all other symbols.
>
> Ouch! That's adding a lot of additional complexity to the language.

There is a much simpler solution: just make `abc"whatever"` syntactic
sugar for `string_literal_abc(r"whatever", closure)`, where closure is
an object that allows read-only access to variables at the call site.


[Python-ideas] Re: Custom string prefixes

2019-08-28 Thread Steven D'Aprano
On Tue, Aug 27, 2019 at 05:13:41PM -, stpa...@gmail.com wrote:

> The difference between `x'...'` and `x('...')`, other than visual noise, is 
> the
> following:
> 
> - The first "x" is in its own namespace of string prefixes. The second "x"
>   exists in the global namespace of all other symbols.

Ouch! That's adding a lot of additional complexity to the language.

Python's scoping rules are usually described as LEGB:

- Local
- Enclosing (non-local)
- Global (module)
- Builtins

but that's an over-simplification, dating back to something like Python 
1.5 days. Python scope also includes:

- class bodies can be the local scope (but they don't work quite
  the same as function locals);
- parts of the body of comprehensions behave as if they were a
  separate scope.

This proposal adds a completely separate, parallel set of scoping rules 
for these string prefixes. How many layers in this parallel scope?

The simplest design is to have a single, interpreter-wide namespace for 
prefixes. Then we will have name clashes, especially since you seem to 
want to encourage single-character prefixes like "v" (verbose, version) 
or "d" (date, datetime, decimal). Worse, defining a new prefix will 
affect all other modules using the same prefix.

So we need a more complex parallel scope. How much more complex?


* if I define a string prefix inside a comprehension, function or 
  class body, will that apply across the entire module or just inside 
  that comp/func/class?

* how do nested functions interact with prefixes?

* do we need a set of parallel keywords equivalent to global and 
  nonlocal for prefixes?


If different modules have different registries, then not only do we need 
to build a parallel set of scoping rules for prefixes into the 
interpreter, but we need a parallel way to import them from other 
modules, otherwise they can't be re-used.

Does "from module import x" import the regular object x from the module 
namespace, or the prefix x from the prefix-namespace? So it seems we'll 
need a parallel import system as well.

All this adds more complexity to the language, more things to be coded 
and tested and documented, more for users to learn, more for other 
implementations to re-implement, and the benefit is marginal: the 
ability to drop parentheses from some but not all function calls.


Now consider another problem: introspection, or the lack thereof.

One of the weaknesses of string prefixes is that it's hard to get help 
for them. In the REPL, we can easily get help on any class or function:

help(function)

and that's really, really great. We can use the inspect module or dir() 
to introspect functions, classes and instances, but we can't do the same 
for string prefixes.

What's the difference between r-strings and u-strings? help() is no help 
(pun intended), since help sees only the string instance, not the syntax 
you used to create it. All of these will give precisely the same output:

help(str())
help('')
help(u'')
help(r"")

etc. This is a real weakness of the prefix system, and will apply 
equally to custom prefixes. It is *super easy* to introspect a class or 
function like Version; it is *really hard* to do the same for a prefix.

You want this separate namespace for prefixes so that you can have a v 
prefix without "polluting" the module namespace with a v function (or 
class). But v doesn't write itself! You still have to write a function 
or class, although you might give it a better name and then register it 
with the single-letter prefix:

@register_prefix('v')
class Version:
    ...

(say). This still leaves Version lying around in your global namespace, 
unless you explicitly delete it:

del Version

but you probably won't want to do that, since Version will probably be 
useful for those who want to create Version objects from expressions or 
variables, not just string literals.
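
For what it's worth, a minimal sketch of such a register_prefix 
decorator (entirely hypothetical) makes the point concrete: the class 
really is still sitting in the namespace after registration.

_PREFIX_REGISTRY = {}

def register_prefix(name):
    def deco(obj):
        _PREFIX_REGISTRY[name] = obj
        return obj                 # hand the class back unchanged
    return deco

@register_prefix('v')
class Version:
    def __init__(self, text):
        self.parts = tuple(int(p) for p in text.split('.'))

# v"1.2.3" would desugar to roughly:
print(_PREFIX_REGISTRY['v']("1.2.3").parts)   # (1, 2, 3)
print(Version)                                # still in the module namespace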

So the "pollution" isn't really pollution at all, at least not if you 
use reasonable names, and the main justification for parallel namespaces 
seems much weaker.

Let me put it another way: parallel namespaces is not a feature of this 
proposal. It is a point against it.


> - Python style discourages too short variable names, especially in libraries,
>   because they have increased chance of clashing with other symbols, and
>   generally may be hard to understand. At the same time, short names for
>   string prefixes could be perfectly fine: there won't be too many of them
>   anyways.

That's an interesting position for the proponent of a new feature to 
take. "Don't worry about this being confusing, because hardly anyone 
will use it."


>   The standard prefixes "b", "r", "u", "f" are all short, and nobody
>   gets confused about them.

Plenty of people get confused about raw strings.

There's only four, plus uppercase and combinations, and they are 
standard across the entire language. If there were dozens of them, 
coming from lots of different modules 

[Python-ideas] Re: Custom string prefixes

2019-08-28 Thread Rhodri James

On 27/08/2019 18:07, Andrew Barnert via Python-ideas wrote:

On Aug 27, 2019, at 08:52, Steven D'Aprano  wrote:

On Tue, Aug 27, 2019 at 05:24:19AM -0700, Andrew Barnert via Python-ideas wrote:

There is a possibility in between the two extremes of “useless” and
“complete monster”: the prefix accepts exactly one token, but can
parse that token however it wants.

How is that different from passing a string argument to a function or
class constructor that can parse that token however it wants?

x'...'

x('...')

Unless there is some significant difference between the two, what does
this proposal give us?

Before I get into this, let me ask you a question. What does the j suffix give 
us? You can write complex numbers without it just fine:

 c = complex
 c(1, 2)

And you can even write a j function trivially:

 def j(x): return complex(0, x)
 1 + j(2)

But would anyone ever write that when they can write it like this:

 1 + 2j

I don’t think so. What does the j suffix give us? The two extra keystrokes are 
trivial. The visual noise of the parens is a bigger deal. The real issue is 
that this matches the way we conceptually think of complex numbers, and the way 
we write them in other contexts. (Well, the way electrical engineers write 
them; most of the rest of us use i rather than j… but still, having to use j 
instead of i is less of an impediment to reading 1+2j than having to use 
function syntax like 1+i(2).


You make the point yourself: this is something we already understand 
from dealing with complex numbers in other circumstances.  That is not 
true of generic single-character string prefixes.


--
Rhodri James *-* Kynesim Ltd


[Python-ideas] Re: Custom string prefixes

2019-08-28 Thread Paul Moore
On Wed, 28 Aug 2019 at 05:04, Andrew Barnert via Python-ideas
 wrote:
> What matters here is not whether things like the OP’s czt'abc' or my 1.23f or 
> 1.23d are literals to the compiler, but whether they’re readable ways to 
> enter constant values to the human reader.
>
> If so, they’re useful. Period.
>
> Now, it’s possible that even though they’re useful, the feature is still not 
> worth adding because of Chris’s issue that it can be abused, or because 
> there’s an unavoidable performance cost that makes it a bad idea to rely on 
> them, or because they’re not useful in _enough_ code to be worth the effort, 
> or whatever. Those are questions worth discussing. But arguing about whether 
> they meet (one of the three definitions of) “literal” is not relevant.

Extended (I'm avoiding the term "custom" for now) literals like 0.2f,
3.14D, re/^hello.*/ or qw{a b c} have a fairly solid track record in
other languages, and I think in general have proved both useful and
straightforward in those languages. And even in Python, constructs
like f-strings and complex numbers are examples of such things.
However, I know of almost no examples of other languages that have
added *user-definable* literal types (with the notable exception of
C++, and I don't believe I've seen use of that feature in user code -
which is not to say that it's not used). That to me says that there
are complexities in extending the question to user-defined literals
that we need to be careful of.

In my view, the issue isn't abuse of the feature, or performance, or
limited value. It's the very basic problem that it's *really hard* to
define and implement such a feature in a way that everyone is happy
with - particularly in a language like Python which doesn't have a
user-exposed "compile source to binary" step (I tried very hard to
cover myself against nitpicking there - I'm sure I failed, but please,
don't get sidetracked, you know what I mean here :-)). Some specific
questions which would need to be dealt with:

1. What is valid in the "literal" part of the construct (this is the
p"C:\" question)?
2. How do definitions of literal syntax get brought into scope in time
for the parser to act on them (this is about "import xyz_literal"
making xyz"a string" valid but leaving abc"a string" as a syntax
error)?

These questions also fundamentally affect other tools like IDEs,
linters, code formatters, etc.

In addition, there is the question of how user-defined literals would
get turned into constants within the code. In common with list
expressions, tuples, etc, user-defined literals would need to be
handled as translating into runtime instructions for constructing the
value (i.e., a function call). But people typically don't expect
values that take the form of a literal like this to be "just" syntax
sugar for a function call. So there's an education issue here. Code
will get errors at runtime that the users might have expected to
happen at compile time, or in the linter.

It's not that these questions can't be answered. Obviously they can,
as you produced a proof of concept implementation. But the design
trade-offs that one person might make are deeply unsatisfactory to
someone else, and there's no "obviously right" answer (at least not
yet, as no-one Dutch has explained what's obvious ;-))

Also, it's worth noting that the benefits of *user-defined* literals
are *not* the same as the benefits of things like 0.2f, or 3.14d, or
even re/^hello.*/. Those things may well be useful. But the benefit
you gain from *user-defined* literals is that of letting the end user
make the design decisions, rather than the language designer. And
that's a subtly different thing.

So, to summarise, the real problem with user defined literal proposals
is that the benefit they give hasn't yet proven sufficient to push
anyone to properly address all of the design-time details. We keep
having high-level "would this be useful" debates, but never really
focus on the key question, of what, in precise detail, is the "this"
that we're talking about - so people are continually making arguments
based on how they conceive such a feature might work. A really good
example here is the p"C:\" question. Is the proposal that the "string
part" of the literal is just a normal string? If so, then how do you
address this genuine issue that not all paths are valid? What about
backslash-escapes (p"C:\temp")? Is the string a raw string or not? If
the proposal is that the path-literal code can define how the string
is parsed, then *how does that work*?

The OP even made this point explicitly:

> I'm not discussing possible implementation of this feature just yet, we can 
> get to
> that point later when there is a general understanding that this is worth 
> considering.

I don't think we *can* agree on much without the implementation
details (well, other than "yes, it's worth discussing, but only if
someone proposes a properly specified design" ;-))

Paul

[Python-ideas] Re: Custom string prefixes

2019-08-28 Thread Chris Angelico
On Wed, Aug 28, 2019 at 2:40 PM Andrew Barnert  wrote:
> > People can be trusted with powerful features that can introduce
> > complexity. There's just not a lot of point introducing a low-value
> > feature that adds a lot of complexity.
>
> But it really doesn’t add a lot of complexity.
>
> If you’re not convinced that really-raw string processing is doable, drop 
> that.
>
> Since the OP hasn’t given a detailed version of his grammar, just take mine: 
> a literal token immediately followed by one or more identifier characters 
> (that couldn’t have been munched by the literal) is a user-suffix literal. 
> This is compiled into code that looks up the suffix in a central registry and 
> calls it with the token’s text. That’s all there is to it.
>

What is a "literal token", what is an "identifier character", and how
does this apply to your example of having digits, a decimal point, and
then a suffix? What if you want to have a string, and what if you want
to have that string contain backslashes or quotes? If you want to say
that this doesn't add complexity, give us some SIMPLE rules that
explain this.

And make absolutely sure that the rules are identical for EVERY
possible custom prefix/suffix, because otherwise you're opening up the
problem of custom prefixes changing the parser again.

> Compare that to adding Decimal (and Fraction, as you said last time) literals 
> when the types aren’t even builtin. That’s more complexity, for less benefit. 
> So why is it better?
>

Actually no, it's a lot less complexity, because it's all baked into
the language. You don't have to have the affix registry to figure out
how to parse a script into AST. The definition of a "literal" is given
by the tokenizer, and for instance, "-1+2j" is not a literal. How is
this going to impact your registry? The distinction doesn't matter to
Decimal or Fraction, because you can perform operations on them at
compile time and retain the results, so "-1.23d" would syntactically
be unary negation on the literal Decimal("1.23"), and -4/5f would be
unary negation on the integer 4 and division between that and
Fraction(5). But does that work with your proposed registry? What is a
"literal token", and would it need to include these kinds of things?
What if some registered types need to include them and some don't?

ChrisA


[Python-ideas] Re: Custom string prefixes

2019-08-27 Thread Andrew Barnert via Python-ideas
> On Aug 27, 2019, at 18:19, Chris Angelico  wrote:
> 
>>> On Wed, Aug 28, 2019 at 10:52 AM Andrew Barnert  wrote:
>>> 
>>> On Aug 27, 2019, at 14:41, Chris Angelico  wrote:
>>> All the examples about Windows paths fall into one of two problematic boxes:
>>> 
>>> 1) Proposals that allow an arbitrary prefix to redefine the entire
>>> parser - basically impossible for anything sane
>>> 
>>> 2) Proposals that do not allow the prefix to redefine the parser, and
>>> are utterly useless, because the rest of the string still has to be
>>> valid.
>> 
>> 3) Proposals that do not allow the prefix to redefine the parser for the 
>> entire program, but do allow it to manually parse anything the tokenizer can 
>> recognize as a single (literal) token.
>> 
>> As I said, I haven’t tried to implement this example as I have with the 
>> other examples, so I can’t promise that it’s doable (with the current 
>> tokenizer, or with a reasonable change to it). But if it is doable, it’s 
>> neither insane nor useless. (And evenif it’s not doable, that’s just two 
>> examples that affixes can’t solve—Windows paths and general “super-raw 
>> strings”. They still solve all of the other examples.)
> 
> So what is the definition of "a single literal token" when you're
> creating a path-string? You want this to be valid:
> 
> x = path"C:\"
> 
> For this to work, the path prefix has to redefine the way the parser
> finds the end of the token, does it not?

I’m not sure (maybe about 60% at best), but I think last time I checked this, 
the tokenizer actually hits the error without munching the rest of the file.

If I’m wrong, then you would need to add a “really raw string literal” builtin 
that any affixes that want really raw string literals could use, but that’s all 
you’d have to do.

And I really don’t think it’s worth getting this in-depth into just one of the 
possible uses that I just tossed off as an aside, especially without actually 
sitting down and testing anything. 

>> Look at the plethora of suffixes C has for number and character literals. 
>> Look at how many things people still can’t do with them that they want to.
> 
> I don't know how many there are. The only ones I can think of are "f"
> for single-precision float, and the long and unsigned suffixes on
> integers.

Off the top of my head, there are also long long integers, and long doubles, and 
wide and three Unicode suffixes for char. Those probably aren’t all of them. 
And your compiler probably has extensions for “legacy” suffixes and nonstandard 
types like int128 or decimal64 and so on.

> Python doesn't have these because very few programs need to
> care about whether a float is single-precision or double-precision, or
> how large an int is.

Right, but the issue isn’t which ones, but how many. C doesn’t have decimals or 
fractions, and other things like datetime objects have been suggested in this 
thread, and even more in the two earlier threads. If there are too many useful 
kinds of constants, there are too many to make them all builtins.

>> Do you think Python users are incapable of the kind of restraint and taste 
>> shown by C++ users, and therefore we can’t trust Python users with a tool 
>> that might possibly (but we aren’t sure) if abused badly enough make code 
>> harder to visually parse?
> 
> People can be trusted with powerful features that can introduce
> complexity. There's just not a lot of point introducing a low-value
> feature that adds a lot of complexity.

But it really doesn’t add a lot of complexity.

If you’re not convinced that really-raw string processing is doable, drop that.

Since the OP hasn’t given a detailed version of his grammar, just take mine: a 
literal token immediately followed by one or more identifier characters (that 
couldn’t have been munched by the literal) is a user-suffix literal. This is 
compiled into code that looks up the suffix in a central registry and calls it 
with the token’s text. That’s all there is to it.

Compare that to adding Decimal (and Fraction, as you said last time) literals when 
the types aren’t even builtin. That’s more complexity, for less benefit. So why 
is it better?



[Python-ideas] Re: Custom string prefixes

2019-08-27 Thread Andrew Barnert via Python-ideas
On Aug 27, 2019, at 18:59, Steven D'Aprano  wrote:
> 
> On Tue, Aug 27, 2019 at 10:07:41AM -0700, Andrew Barnert wrote:
> 
>>> How is that different from passing a string argument to a function or 
>>> class constructor that can parse that token however it wants?
>>> 
>>>   x'...'
>>> 
>>>   x('...')
>>> 
>>> Unless there is some significant difference between the two, what does 
>>> this proposal give us?
>> 
> 
>> Before I get into this, let me ask you a question. What does the j 
>> suffix give us?
> 
> I'm going to answer that question, but before I answer it, I'm going to 
> object that this analogy is a poor one. This proposal is *in no way* a 
> proposal for a new compile-time literal.

Yes, you’re the same person who got hung up on the fact that these affixes 
don’t really give us “literals” back in either 2013 or 2016, and I don’t want 
to rehash that argument. I could point out that nobody cares that -1 isn’t 
really a literal, and almost nobody cares that the CPython optimizer 
special-cases its way around that, and the whole issue with Python having three 
different definitions of “literal” that don’t coincide, and so on, but we 
already had this conversation and I don’t think anyone but the two of us cared.

What matters here is not whether things like the OP’s czt'abc' or my 1.23f or 
1.23d are literals to the compiler, but whether they’re readable ways to enter 
constant values to the human reader. 

If so, they’re useful. Period.

Now, it’s possible that even though they’re useful, the feature is still not 
worth adding because of Chris’s issue that it can be abused, or because there’s 
an unavoidable performance cost that makes it a bad idea to rely on them, or 
because they’re not useful in _enough_ code to be worth the effort, or 
whatever. Those are questions worth discussing. But arguing about whether they 
meet (one of the three definitions of) “literal” is not relevant.

> This proposal is for mere syntactic sugar allowing us to drop the 
> parentheses from a tiny subset of function calls, those which take a 
> single string argument.

And to drop the quotes as well. And to avoid polluting the global namespace 
with otherwise-unused one-character function names.

Can you honestly tell me that you see no significant readability difference 
between these examples:

vec = [1.23f, 2.5f, 1.11f]
vec = [f('1.23'), f('2.5'), f('1.11')]

I think anyone would agree that the former is a lot more readable. Sure, you 
have to learn what the f suffix means, but once you do, it means all of the 
dozens of constants in the module are more readable. (And of course most people 
reading this code will probably be people who are used to 3D code and already 
_expect_ that format, since that’s how you write it in C, in shaders, etc.)
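
For example, assuming NumPy is available, the function-call spelling 
above is what an f suffix would have to sugar over (np.float32 parses 
the literal text directly):

import numpy as np

def f(text):
    # stand-in for whatever a registered 'f' suffix would call
    return np.float32(text)

vec = [f('1.23'), f('2.5'), f('1.11')]
print(vec[0].dtype)   # float32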

> And even then, only when the argument is a 
> string literal:
> 
>czt'abc'  # Okay.
> 
>s = 'abc'
>czt's'  # Oops, wrong, doesn't work.

Sure, just like you can’t apply an r or f prefix to a string expression.

> But, to answer your question, what does the j suffix give us?
> 
> Damn little. Unless there is a large community of Scipy and Numpy users 
> who need complex literals, I suspect that complex literals are one of 
> the least used features in Python.
> 
> I do a lot of maths in Python, and aside from experimentation in the 
> interactive interpreter, I think I can safely say that I have used 
> complex literals exactly zero times in code.

I don’t think your experience here is typical. I can’t think of a good way to 
search GitHub python repos for uses of j, but a hacky search immediately turned 
up this numpy issue: https://github.com/numpy/numpy/issues/13179:

> A fast way to get the inverse of angle, i.e., exp(1j * a) = cos(a) + 1j * 
> sin(a). Note that for large angle arrays, exp(1j*a) needlessly triples memory 
> use…

That doesn’t prove that people actually call it with `1j * a` instead of 
`complex(0, a)`, but it does seem likely.


>> You can write complex numbers without it just fine:
> [...]
> 
> Indeed. And if we didn't already have complex literals, would we accept 
> a proposal to add them now? I doubt it.

I’m not sure. I assume you’d be against it, but I suspect that most of the 
people who use it today would be for it.

But if we had custom affixes, I think everyone would be happy with “just define 
a custom j suffix”. Would anyone really argue that they need the performance 
benefit or compile-time handling? How often do you evaluate zillions of 
constants in the middle of a tight loop? And what other argument would there be 
for adding it to the grammar and the compiler and forcing every project to use 
it?

Which is exactly what I think of the Decimal and Fraction suffixes, contrary to 
what Chris says. There will be a small number of projects than get a lot of 
readability benefit, but every other project gains nothing, so why add it as a 
builtin for every project?

And I don’t see why float32 is any different from Decimal 

[Python-ideas] Re: Custom string prefixes

2019-08-27 Thread Steven D'Aprano
On Tue, Aug 27, 2019 at 10:07:41AM -0700, Andrew Barnert wrote:

> > How is that different from passing a string argument to a function or 
> > class constructor that can parse that token however it wants?
> > 
> >x'...'
> > 
> >x('...')
> > 
> > Unless there is some significant difference between the two, what does 
> > this proposal give us?
> 

> Before I get into this, let me ask you a question. What does the j 
> suffix give us?

I'm going to answer that question, but before I answer it, I'm going to 
object that this analogy is a poor one. This proposal is *in no way* a 
proposal for a new compile-time literal.

If it were, it might be interesting: I would be very interested to hear 
more about literals for a Decimal type, say, or regular expressions. But 
this proposal doesn't offer that.

This proposal is for mere syntactic sugar allowing us to drop the 
parentheses from a tiny subset of function calls, those which take a 
single string argument. And even then, only when the argument is a 
string literal:

czt'abc'  # Okay.

s = 'abc'
czt's'  # Oops, wrong, doesn't work.


But, to answer your question, what does the j suffix give us?

Damn little. Unless there is a large community of Scipy and Numpy users 
who need complex literals, I suspect that complex literals are one of 
the least used features in Python.

I do a lot of maths in Python, and aside from experimentation in the 
interactive interpreter, I think I can safely say that I have used 
complex literals exactly zero times in code.


> You can write complex numbers without it just fine:
[...]

Indeed. And if we didn't already have complex literals, would we accept 
a proposal to add them now? I doubt it. But if you think we would, how 
about a proposal to add quaternions?

q = 3 + 4i + 2j - 7k


> But would anyone ever write that when they can write it like this:
> 
> 1 + 2j

Given that complex literals are already a thing, of course you are 
correct that if I ever needed a complex literal, I would use the literal 
syntax.

But that's the point: it is *literal syntax* handled by the compiler at 
compile time, not syntactic sugar for a runtime function call that has 
to inefficiently parse a string.

Because it is built-in to the language, we don't have to do this:

def c(astring):
    assert isinstance(astring, str)
    # Parse the string at runtime
    real, imag = ...
    return complex(real, imag)

z = c"1.23 + 4.56j"

(I'm aware that the complex constructor actually does parse strings 
already, so in *this specific* example we don't have to write our own 
parser. But that doesn't apply in the general case.)
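
For reference, that built-in parsing looks like this (a small runnable check; 
the parser accepts the compact literal form, but not one with internal spaces):

    assert complex('1.23+4.56j') == complex(1.23, 4.56)
    # complex('1.23 + 4.56j') raises ValueError: spaces around the '+'
    # are not accepted.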

That is nothing like complex literals:

py> from dis import dis
py> dis(compile('1+2j', '', 'eval'))
  1           0 LOAD_CONST               2 ((1+2j))
              3 RETURN_VALUE


# Hypothetical byte-code generated from custom string prefix 
py> dis(compile("c'1+2j'", '', 'eval'))
  1           0 LOAD_NAME                0 (c)
              3 LOAD_CONST               0 ('1+2j')
              6 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
              9 RETURN_VALUE

Note that in the first case, we generate a complex literal at compile 
time; in the second case, we generate a *string* literal at compile 
time, which must be parsed at runtime.

This is not a rhetorical question: if we didn't have complex literals, 
why would you write your complex number as a string, deferring parsing 
it until runtime, when you could parse it in your head at edit-time and 
call the constructor directly?

z = complex(1.23, 4.56)  # Assuming there was no literal syntax.


> I don’t think so. What does the j suffix give us? The two extra 
> keystrokes are trivial. The visual noise of the parens is a bigger 
> deal.

I don't think it is. I think the big deals in this proposal are:

- you have something that looks like a kind of string czt'...' 
  but is really a function call that might return absolutely 
  anything at all;

- you have a redundant special case for calling functions that
  take a single argument, but only if that argument is a string
  literal;

- you encourage people to write cryptic single-character 
  functions, like v(), x(), instead of meaningful expressions
  like Version() and re.compile();

- you encourage people to defer parsing that could be efficiently 
  done in your head at edit time into slow and likely inefficient
  string parsing done at runtime;

- the OP still hasn't responded to my question about the ambiguity
  of the proposal (is czt'...' one three-letter prefix, or three 
  one-letter prefixes?)

all of which *hugely* outweighs the gain of being able to avoid a pair 
of parentheses.


[...]

> And the exact same thing is true in 3D or CUDA code that uses a lot of 
> float32 values. [...] I actually have to go through a string for 
> implementation reasons (because otherwise Python would force me to go 
> through a float64 and distort the values)

Indeed, but this proposal doesn't help 

[Python-ideas] Re: Custom string prefixes

2019-08-27 Thread stpasha
Thanks, Andrew, you're able to explain this much better than I do.
Just wanted to add that Python *already* has ways to grossly abuse
its syntax and create unreadable code. For example, I can write

>>> о = 3
>>> o = 5
>>> ο = 6
>>> (о, o, ο)
(3, 5, 6)

But just because some feature CAN get abused, doesn't mean it 
ACTUALLY gets abused in practice. People want to write nice, readable
code, because they will ultimately be the ones to support it.


[Python-ideas] Re: Custom string prefixes

2019-08-27 Thread Chris Angelico
On Wed, Aug 28, 2019 at 10:52 AM Andrew Barnert  wrote:
>
> On Aug 27, 2019, at 14:41, Chris Angelico  wrote:
> > All the examples about Windows paths fall into one of two problematic boxes:
> >
> > 1) Proposals that allow an arbitrary prefix to redefine the entire
> > parser - basically impossible for anything sane
> >
> > 2) Proposals that do not allow the prefix to redefine the parser, and
> > are utterly useless, because the rest of the string still has to be
> > valid.
>
> 3) Proposals that do not allow the prefix to redefine the parser for the 
> entire program, but do allow it to manually parse anything the tokenizer can 
> recognize as a single (literal) token.
>
> As I said, I haven’t tried to implement this example as I have with the other 
> examples, so I can’t promise that it’s doable (with the current tokenizer, or 
> with a reasonable change to it). But if it is doable, it’s neither insane nor 
> useless. (And evenif it’s not doable, that’s just two examples that affixes 
> can’t solve—Windows paths and general “super-raw strings”. They still solve 
> all of the other examples.)
>

So what is the definition of "a single literal token" when you're
creating a path-string? You want this to be valid:

x = path"C:\"

For this to work, the path prefix has to redefine the way the parser
finds the end of the token, does it not? Otherwise, you still have the
same problems you already do - backslashes have to be escaped. That's
why I say that, without being able to redefine the parser, this is
completely useless, as a "path string" might as well just be a
"string".

Which way is it?

> > That line of argument is valid for anything that is specifically
> > defined by the language.
>
> Yes, and? “Literal token” is specifically defined by the language. “Literal 
> token with attached tag” will also be specifically defined by the language. 
> The only thing open to customization is what that token gets compiled to.
>

I don't understand. Are you saying that the prefix is not going to be
able to change how backslashes are handled, or that it is? If you keep
the tokenizer exactly the same and just add a token in front of it,
then things like path"C:\" will be considered to be incomplete and
will continue to consume source code until the next quote (or throw
SyntaxError for EOL inside string literal). Or is your idea of
"literal token" something other than that?

If a "literal token" is simply a string literal, then how is this
actually helping anything? What do you achieve?

> Look at the plethora of suffixes C has for number and character literals. 
> Look at how many things people still can’t do with them that they want to.

I don't know how many there are. The only ones I can think of are "f"
for single-precision float, and the long and unsigned suffixes on
integers. Python doesn't have these because very few programs need to
care about whether a float is single-precision or double-precision, or
how large an int is.

> Look at the way user literals work in C++. While technically you can argue 
> that they are “syntax customization”, in practice the customization is highly 
> constrained. Is it _impossible_ to use that feature to write code that can’t 
> be parsed by a human reader? I don’t know if I could prove that it’s 
> impossible. However, I do know that it’s not easy. And that none of the 
> examples, or real-life uses, that I’ve seen have done so.
>

I also have not yet seen any good examples of user literals in C++.

> Do you think Python users are incapable of the kind of restraint and taste 
> shown by C++ users, and therefore we can’t trust Python users with a tool 
> that might possibly (but we aren’t sure) if abused badly enough make code 
> harder to visually parse?
>

People can be trusted with powerful features that can introduce
complexity. There's just not a lot of point introducing a low-value
feature that adds a lot of complexity.

ChrisA


[Python-ideas] Re: Custom string prefixes

2019-08-27 Thread Andrew Barnert via Python-ideas
On Aug 27, 2019, at 14:41, Chris Angelico  wrote:
> 
>> On Wed, Aug 28, 2019 at 6:03 AM Andrew Barnert  wrote:
>> 
>>> On Tuesday, August 27, 2019, 11:12:51 AM PDT, Chris Angelico 
>>>  wrote:
>>> If your conclusion here were "and that's why Python needs a proper
>>> syntax for Decimal literals", then I would be inclined to agree with
>>> you - a Decimal literal would be lossless (as it can entirely encode
>>> whatever was in the source file), and you could then create the
>>> float32 values from those.
>> 
>> I think builtin Decimal literals are a non-starter. The type isn't even 
>> builtin.
>> 
> 
> Not sure that's a total blocker, but in any case, I'm not arguing for
> that - I'm just saying that everything up to that point in your
> argument would be better served by a Decimal literal than by any
> notion of "custom literals".

No, it really couldn’t. A builtin Decimal literal would arguably serve the 
Decimal use case better (but I’m not even sure about that one; see below), but 
it doesn’t serve the float32 case that you’re responding to.

>> But they're not. You didn't even attempt to answer the comparison with 
>> complex that you quoted. The problem that `j` solves is not that there's no 
>> way to create complex values losslessly out of floats, but that there's no 
>> way to create them _readably_, in a way that's consistent with the way you 
>> read and write them in every other context. Which is exactly the problem 
>> that `f` solves. Adding a Decimal literal would not help that at all—letting 
>> me write `f(1.23d)` instead of `f('1.23')` does not let me write `1.23f`.
>> 
> TBH I don't quite understand the problem. Is it only an issue with
> negative zero? If so, maybe you should say so, because in every other
> way, building a complex out of a float added to an imaginary is
> perfectly lossless.

Negative zero is an irrelevant side issue that Serhiy brought up. It means j is 
not quite perfect—and yet j is still perfectly usable despite that. Ignore 
negative zero.

The problem that j solves is dead simple: 1 + 2j is more readable than 
complex(1, 2). And it matches what you write and read in other contexts besides 
Python. That’s the only problem j solves. But it’s a problem worth solving, at 
least for code that uses a lot of complex numbers. Without it, even if you 
wanted to pollute the namespace with a single-letter global so you could write 
c(1, 2) or 1 + j(2), it _still_ wouldn’t be nearly as readable or as familiar. 
That’s why we have j. There is literally no other benefit, and yet it’s enough.

And the problem that f solves would be exactly the same: 1.23f is more readable 
than calling float32, and it matches what you read and write in other contexts 
besides Python (like, say, C or shader code). Even if you wanted to pollute the 
namespace with a single-letter global f, it still wouldn’t be as readable or as 
familiar. That’s why we should have f. There is literally no other benefit, but 
I think it’s enough benefit, for enough programs, that we should be allowed to 
do it. Just like j.

Unlike j, however, I don’t think it’s useful in enough programs that it should 
be builtin. And I think the same is probably true for Decimal. And for most of 
the other examples that have come up in this thread. Which is why I think we’d 
be better served with something akin to C++ allowing you to explicitly register 
affixes for your specific program, than something like C with its 
too-big-to-remember-but-still-not-enough-for-many-uses zoo of builtin affixes.

>> Also, as the OP has pointed out repeatedly and nobody has yet answered, if I 
>> want to write `f(1.23d)` or `f('1.23')`, I have to pollute the global 
>> namespace with a function named `f` (a very commonly-used name); if I want 
>> to write `1.23f`, I don't, since the converter gets stored in some 
>> out-of-the-way place like `__user_literals_registry__['f']` rather than `f`. 
>> That seems like a serious benefit to me.
>> 
> Maybe. But far worse is that you have a very confusing situation that
> this registered value could be different in different programs. 

Sure, and the global f could also be different in different programs—or even in 
different modules in the same program. So what?

1.23f would always have the same meaning everywhere, it’s just that the meaning 
is something like __user_literals__['f']('1.23') instead of 
globals()['f']('1.23').
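
A rough runtime-only emulation of that lookup, just to make the namespace point 
concrete; the registry dict and the lit() helper below are illustrative 
stand-ins, not a proposed API:

    from decimal import Decimal
    from fractions import Fraction

    # Hypothetical registry: suffix handlers live here instead of in the
    # module's global namespace, so no one-letter globals are needed.
    _user_literals = {
        'j': lambda s: complex(0.0, float(s)),
        'd': Decimal,
        'F': Fraction,
    }

    def lit(text, suffix):
        # Stand-in for whatever the compiler would emit for e.g. 1.23d.
        return _user_literals[suffix](text)

    assert 1 + lit('2', 'j') == 1 + 2j
    assert lit('1.23', 'd') == Decimal('1.23')
    assert lit('1/3', 'F') == Fraction(1, 3)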

Yes, of course that is something new to be learned, if you’re looking at a 
program that does a lot of 3D math, or a lot of decimal math, or a lot of 
Windows path stuff, or whatever, people are likely to have used this feature so 
you’ll need to know how to look up the f or d or whatever. But that really 
isn’t a huge hardship, and I think the benefits outweigh the cost. 

> In
> contrast, f(1.23d) would have the same meaning everywhere: call a
> function 'f' with one parameter, the Decimal value 1.23. Allowing
> language syntax to vary between programs is a mess that needs a 

[Python-ideas] Re: Custom string prefixes

2019-08-27 Thread Chris Angelico
On Wed, Aug 28, 2019 at 6:03 AM Andrew Barnert  wrote:
>
> On Tuesday, August 27, 2019, 11:12:51 AM PDT, Chris Angelico 
>  wrote:
> > If your conclusion here were "and that's why Python needs a proper
> > syntax for Decimal literals", then I would be inclined to agree with
> > you - a Decimal literal would be lossless (as it can entirely encode
> > whatever was in the source file), and you could then create the
> > float32 values from those.
>
> I think builtin Decimal literals are a non-starter. The type isn't even 
> builtin.
>

Not sure that's a total blocker, but in any case, I'm not arguing for
that - I'm just saying that everything up to that point in your
argument would be better served by a Decimal literal than by any
notion of "custom literals".

> But they're not. You didn't even attempt to answer the comparison with 
> complex that you quoted. The problem that `j` solves is not that there's no 
> way to create complex values losslessly out of floats, but that there's no 
> way to create them _readably_, in a way that's consistent with the way you 
> read and write them in every other context. Which is exactly the problem that 
> `f` solves. Adding a Decimal literal would not help that at all—letting me 
> write `f(1.23d)` instead of `f('1.23')` does not let me write `1.23f`.
>

TBH I don't quite understand the problem. Is it only an issue with
negative zero? If so, maybe you should say so, because in every other
way, building a complex out of a float added to an imaginary is
perfectly lossless.

> Also, I think you're the one who brought up performance earlier? `%timeit 
> np.float32('1.23')` is 671ns, while `%timeit np.float32(d)` with a 
> pre-constructed `Decimal(1.23)` is 2.56us on my laptop, so adding a Decimal 
> literal instead of custom literals actually encourages _slower_ code, not 
> faster.
>

No, I didn't say that. I have no idea why numpy would take longer to
work with a Decimal than a string, and that's the sort of thing that
could easily change from one version to another. But the main argument
here is about readability, not performance.

> Also, as the OP has pointed out repeatedly and nobody has yet answered, if I 
> want to write `f(1.23d)` or `f('1.23')`, I have to pollute the global 
> namespace with a function named `f` (a very commonly-used name); if I want to 
> write `1.23f`, I don't, since the converter gets stored in some 
> out-of-the-way place like `__user_literals_registry__['f']` rather than `f`. 
> That seems like a serious benefit to me.
>

Maybe. But far worse is that you have a very confusing situation that
this registered value could be different in different programs. In
contrast, f(1.23d) would have the same meaning everywhere: call a
function 'f' with one parameter, the Decimal value 1.23. Allowing
language syntax to vary between programs is a mess that needs a LOT
more justification than anything I've seen so far.

> > But you haven't made the case for generic string prefixes or any sort
> > of "arbitrary literal" that would let you import something that
> > registers something to make your float32 literals.
>
> Sure I did; you just cut off the rest of the email that had other cases.

Which said basically the same as the parts I quoted.

> And ignored most of what you quoted about the float32 case.

What did I ignore?

> And ignored the previous emails by both me and the OP that had other cases. 
> Or can you explain to me how a builtin Decimal literal could solve the 
> problem of Windows paths?

All the examples about Windows paths fall into one of two problematic boxes:

1) Proposals that allow an arbitrary prefix to redefine the entire
parser - basically impossible for anything sane

2) Proposals that do not allow the prefix to redefine the parser, and
are utterly useless, because the rest of the string still has to be
valid.

So no, you still haven't made a case for arbitrary literals.

> Here's a few more: Numeric types that can't be losslessly converted to and 
> from Decimal, like Fraction.

If you want to push for Fraction literals as well, then sure. But
that's still very very different from *arbitrary literal types*.

> Something more similar to complex (e.g., `quat = 1.0x + 0.0y + 0.1z + 1.0w`). 
> What would Decimal literals do for me there?
>

Quaternions are sufficiently niche that it should be possible to
represent them with multiplication.

quat = 1.0 + 0.0*i + 0.1*j + 1.0*k

With appropriate objects i, j, k, it should be possible to craft
something that implements quaternion arithmetic using this syntax.
Yes, it's not quite as easy as 4+3j is, but it's also far FAR rarer.
(And remember, even regular complex numbers are more advanced than a
lot of languages have syntactic support for.)
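
A minimal sketch of what "appropriate objects i, j, k" could look like; it 
implements only enough operator support for that construction syntax, not a 
full quaternion multiplication table:

    class Quaternion:
        """Toy quaternion: only what the w + x*i + y*j + z*k syntax needs."""

        def __init__(self, w=0.0, x=0.0, y=0.0, z=0.0):
            self.w, self.x, self.y, self.z = w, x, y, z

        def __add__(self, other):
            if isinstance(other, (int, float)):
                other = Quaternion(other)
            return Quaternion(self.w + other.w, self.x + other.x,
                              self.y + other.y, self.z + other.z)

        __radd__ = __add__

        def __rmul__(self, scalar):
            # scalar * unit is all the construction syntax requires
            return Quaternion(scalar * self.w, scalar * self.x,
                              scalar * self.y, scalar * self.z)

        def __repr__(self):
            return f"({self.w} + {self.x}i + {self.y}j + {self.z}k)"

    i, j, k = Quaternion(x=1.0), Quaternion(y=1.0), Quaternion(z=1.0)

    quat = 1.0 + 0.0*i + 0.1*j + 1.0*k
    print(quat)   # (1.0 + 0.0i + 0.1j + 1.0k)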

> I think your reluctance and the OP's excitement here both come from the same 
> source: Any feature that gives you a more convenient way to write and read 
> something is good, because it lets you write things in a way that's 
> consistent with your 

[Python-ideas] Re: Custom string prefixes

2019-08-27 Thread stpasha
> But you haven't made the case for generic string prefixes or any sort
> of "arbitrary literal" that would let you import something that
> registers something to make your float32 literals.

The case can be made as follows: different people use different parts
of the Python language. Andrew would love to see the support for
decimals, fractions and float32s (possibly float16s too, and maybe
even posit numbers). Myself, I miss datetime and regular expression
literals. Other people on the 2013 thread argued at length in favor of
supporting sql-literals, which would allow them to be used in a much
safer manner. Then there are those who want to write complex 
numbers in a natural fashion, but they already got their wish granted.

In short, the needs vary, and not all of the functionality belongs to the
python standard library either.


[Python-ideas] Re: Custom string prefixes

2019-08-27 Thread Andrew Barnert via Python-ideas
 On Tuesday, August 27, 2019, 11:12:51 AM PDT, Chris Angelico 
 wrote:
 
 >On Wed, Aug 28, 2019 at 3:10 AM Andrew Barnert via Python-ideas
> wrote:
>> Before I get into this, let me ask you a question. What does the j suffix 
>> give us? You can write complex numbers without it just fine:
>>
>>    c = complex
>>    c(1, 2)
>>
>> And you can even write a j function trivially:
>>
>>    def j(x): return complex(0, x)
>>    1 + j(2)
>>
>> But would anyone ever write that when they can write it like this:
>>
>>    1 + 2j
>>
>> I don’t think so. What does the j suffix give us? The two extra keystrokes 
>> are trivial. The visual noise of the parens is a bigger deal. The real issue 
>> is that this matches the way we conceptually think of complex numbers, and 
>> the way we write them in other contexts. (Well, the way electrical engineers 
>> write them; most of the rest of us use i rather than j… but still, having to 
>> use j instead of i is less of an impediment to reading 1+2j than having to 
>> use function syntax like 1+i(2).
>>
>> And the exact same thing is true in 3D or CUDA code that uses a lot of 
>> float32 values. Or code that uses a lot of Decimal values. In those cases, I 
>> actually have to go through a string for implementation reasons (because 
>> otherwise Python would force me to go through a float64 and distort the 
>> values), but conceptually; there are no strings involved when I write this:
>>
>>    array([f('0.2'), f('0.3'), f('0.1')])
>>
>> … and it would be a lot more readable if I could write it the same way I do 
>> in other programming languages:
>>
>>    array([0.2f, 0.3f, 0.1f])
>>
>> Again, it’s not about saving 4 keystrokes per number, and the visual noise 
>> of the parens is an issue but not the main one (and quotes are barely any 
>> noise by comparison); it’s the fact that these numeric values look like 
>> numeric values instead of looking like strings
>>
> If your conclusion here were "and that's why Python needs a proper
> syntax for Decimal literals", then I would be inclined to agree with
> you - a Decimal literal would be lossless (as it can entirely encode
> whatever was in the source file), and you could then create the
> float32 values from those.

I think builtin Decimal literals are a non-starter. The type isn't even 
builtin. You surely wouldn't want to incur the cost of importing it to every 
Python session. And implementing some kind of lazy import mechanism in the 
middle of the json module is one thing, but in the middle of the compiler? So 
how _could_ you implement them? (While we're at it, what would that do to 
MicroPython and… one of the browser Pythons, I forget which… that have 100% 
syntax compatibility with Python but leave out much of the stdlib, including 
decimal? Sure, nobody ever promised they could do that, but it's a happy 
accident that they could, and do we want to break that capriciously?)

Maybe you could come up with some kind of DecimalLiteral object that doesn't 
actually act like a number, but can be converted to all of the different 
numeric types as needed (so, e.g., if you add or radd one to a `float` it 
converts to a `float`, etc.). That works great in languages like Swift and 
Haskell, but I don't think there's a feasible design for a dynamically-typed 
language.

So, even if Decimal literals really were the only thing we needed, a way to 
register Decimal literals may be the best way to do that.

But they're not. You didn't even attempt to answer the comparison with complex 
that you quoted. The problem that `j` solves is not that there's no way to 
create complex values losslessly out of floats, but that there's no way to 
create them _readably_, in a way that's consistent with the way you read and 
write them in every other context. Which is exactly the problem that `f` 
solves. Adding a Decimal literal would not help that at all—letting me write 
`f(1.23d)` instead of `f('1.23')` does not let me write `1.23f`.

Also, I think you're the one who brought up performance earlier? `%timeit 
np.float32('1.23')` is 671ns, while `%timeit np.float32(d)` with a 
pre-constructed `Decimal(1.23)` is 2.56us on my laptop, so adding a Decimal 
literal instead of custom literals actually encourages _slower_ code, not 
faster.

Also, as the OP has pointed out repeatedly and nobody has yet answered, if I 
want to write `f(1.23d)` or `f('1.23')`, I have to pollute the global namespace 
with a function named `f` (a very commonly-used name); if I want to write 
`1.23f`, I don't, since the converter gets stored in some out-of-the-way place 
like `__user_literals_registry__['f']` rather than `f`. That seems like a 
serious benefit to me.

> But you haven't made the case for generic string prefixes or any sort
> of "arbitrary literal" that would let you import something that
> registers something to make your float32 literals.

Sure I did; you just cut off the rest of the email that had other cases. And 
ignored most of what you quoted about the 

[Python-ideas] Re: Custom string prefixes

2019-08-27 Thread Andrew Barnert via Python-ideas
 On Tuesday, August 27, 2019, 11:42:23 AM PDT, Serhiy Storchaka 
 wrote:
 
 > 27.08.19 20:07, Andrew Barnert via Python-ideas writes:
>> Before I get into this, let me ask you a question. What does the j suffix 
>> give us? You can write complex numbers without it just fine:
>> 
>>      c = complex
>>      c(1, 2)
>> 
>> And you can even write a j function trivially:
>> 
>>      def j(x): return complex(0, x)
>>      1 + j(2)
>> 
>> But would anyone ever write that when they can write it like this:
>> 
>>      1 + 2j
>
> And it has its limitation. How would you write complex(-0.0, 1.0)?

And yet, despite that limitation, many people find it useful, and use it on a 
daily basis. Are you suggesting that Python would be better off without the `j` 
suffix because of that problem?


[Python-ideas] Re: Custom string prefixes

2019-08-27 Thread Chris Angelico
On Wed, Aug 28, 2019 at 3:10 AM Andrew Barnert via Python-ideas
 wrote:
> Before I get into this, let me ask you a question. What does the j suffix 
> give us? You can write complex numbers without it just fine:
>
> c = complex
> c(1, 2)
>
> And you can even write a j function trivially:
>
> def j(x): return complex(0, x)
> 1 + j(2)
>
> But would anyone ever write that when they can write it like this:
>
> 1 + 2j
>
> I don’t think so. What does the j suffix give us? The two extra keystrokes 
> are trivial. The visual noise of the parens is a bigger deal. The real issue 
> is that this matches the way we conceptually think of complex numbers, and 
> the way we write them in other contexts. (Well, the way electrical engineers 
> write them; most of the rest of us use i rather than j… but still, having to 
> use j instead of i is less of an impediment to reading 1+2j than having to 
> use function syntax like 1+i(2).
>
> And the exact same thing is true in 3D or CUDA code that uses a lot of 
> float32 values. Or code that uses a lot of Decimal values. In those cases, I 
> actually have to go through a string for implementation reasons (because 
> otherwise Python would force me to go through a float64 and distort the 
> values), but conceptually; there are no strings involved when I write this:
>
> array([f('0.2'), f('0.3'), f('0.1')])
>
> … and it would be a lot more readable if I could write it the same way I do 
> in other programming languages:
>
> array([0.2f, 0.3f, 0.1f])
>
> Again, it’s not about saving 4 keystrokes per number, and the visual noise of 
> the parens is an issue but not the main one (and quotes are barely any noise 
> by comparison); it’s the fact that these numeric values look like numeric 
> values instead of looking like strings
>

If your conclusion here were "and that's why Python needs a proper
syntax for Decimal literals", then I would be inclined to agree with
you - a Decimal literal would be lossless (as it can entirely encode
whatever was in the source file), and you could then create the
float32 values from those.

But you haven't made the case for generic string prefixes or any sort
of "arbitrary literal" that would let you import something that
registers something to make your float32 literals.

ChrisA


[Python-ideas] Re: Custom string prefixes

2019-08-27 Thread Andrew Barnert via Python-ideas
On Aug 27, 2019, at 08:36, Steven D'Aprano  wrote:
> 
> I don't wish to say that parsing strings to extract information is 
> always an anti-pattern:
> 
> http://cyrille.martraire.com/2010/01/the-string-obsession-anti-pattern/
> 
> after all we often need to process data coming from config files or 
> other user-input, where we have no choice but to accept a string.
> 
> But parsing string *literals* usually is an anti-pattern, especially 
> when there is a trivial transformation from the string to the 
> constructor arguments, e.g. 123/4567 --> Fraction(123, 4567).

But there are plenty of cases where parsing string literals is the current 
usual practice. Decimal is obvious, as well as most other non-native numeric 
types. Path objects even more so. Pandas users seem to always build their 
datetime objects out of MMDDTHHMMSS strings. And so on. 
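
To make "current usual practice" concrete, all of the following are idiomatic 
today, and each one parses a string literal at runtime (the values are just 
placeholders):

    from decimal import Decimal
    from pathlib import PurePosixPath
    from datetime import datetime

    price = Decimal('19.99')
    logdir = PurePosixPath('/var/log/myapp')
    when = datetime.fromisoformat('2019-08-27T08:36:00')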

So the status quo doesn’t mean nobody parses string literals, it means people 
_explicitly_ parse string literals. And the proposed change doesn’t mean more 
string literal parsing, it means making some of the existing, uneliminable uses 
less visually prominent and more readable. (And, relevant to the blog you 
linked, it seems to make it _less_ likely, not more, that you’d bind the string 
rather than the value to a name, or pass it around and parse it repeatedly, or 
the other bad practices they were talking about.)

I’ll admit there are some cases where I might sacrifice performance for 
convenience if we had this feature. For example, F1/3 (or 1/3F with suffixes) 
would have to mean at least Fraction(1) / 3, if not Fraction('1') / 3, or even 
that plus an extra LOAD_ATTR. That is clearly going to be more expensive than 
F(1, 3) meaning Fraction(1, 3), but I’d still do it at the REPL, and likely in 
real code as well. But I don’t think that choice would make my code worse 
(because when setup costs matter, I _wouldn’t_ make that choice), so I don’t 
see that as a problem.



[Python-ideas] Re: Custom string prefixes

2019-08-27 Thread stpasha
> Unless there is some significant difference between the two, what does 
> this proposal give us?

The difference between `x'...'` and `x('...')`, other than visual noise, is the
following:

- The first "x" is in its own namespace of string prefixes. The second "x"
  exists in the global namespace of all other symbols.

- Python style discourages too short variable names, especially in libraries,
  because they have increased chance of clashing with other symbols, and
  generally may be hard to understand. At the same time, short names for
  string prefixes could be perfectly fine: there won't be too many of them
  anyways. The standard prefixes "b", "r", "u", "f" are all short, and nobody
  gets confused about them.

- Barrier of entry. Today you can write `from re import compile as x` and then
  write `x('...')` to denote a regular expression (if you don't mind having `x`
  as a global variable). But this is not the way people usually write code. People
  write the code the way they are taught from examples, and the examples 
  don't speak about regular expression objects. The examples only show
  regular expressions-as-strings, so many python users don't even realize
  that regular expressions can be objects.

  Now, if the string prefixes were available, library authors would think "Do we
  want to export such functionality for the benefit of our users?" And if they
  answer yes, then they'll showcase this in the documentation and examples,
  and the user will see that their code has become cleaner and more 
  understandable.


[Python-ideas] Re: Custom string prefixes

2019-08-27 Thread Andrew Barnert via Python-ideas
On Aug 27, 2019, at 08:52, Steven D'Aprano  wrote:
> 
>> On Tue, Aug 27, 2019 at 05:24:19AM -0700, Andrew Barnert via Python-ideas 
>> wrote:
>> 
>> There is a possibility in between the two extremes of “useless” and 
>> “complete monster”: the prefix accepts exactly one token, but can 
>> parse that token however it wants.
> 
> How is that different from passing a string argument to a function or 
> class constructor that can parse that token however it wants?
> 
>x'...'
> 
>x('...')
> 
> Unless there is some significant difference between the two, what does 
> this proposal give us?

Before I get into this, let me ask you a question. What does the j suffix give 
us? You can write complex numbers without it just fine:

c = complex
c(1, 2)

And you can even write a j function trivially:

def j(x): return complex(0, x)
1 + j(2)

But would anyone ever write that when they can write it like this:

1 + 2j

I don’t think so. What does the j suffix give us? The two extra keystrokes are 
trivial. The visual noise of the parens is a bigger deal. The real issue is 
that this matches the way we conceptually think of complex numbers, and the way 
we write them in other contexts. (Well, the way electrical engineers write 
them; most of the rest of us use i rather than j… but still, having to use j 
instead of i is less of an impediment to reading 1+2j than having to use 
function syntax like 1+i(2).)

And the exact same thing is true in 3D or CUDA code that uses a lot of float32 
values. Or code that uses a lot of Decimal values. In those cases, I actually 
have to go through a string for implementation reasons (because otherwise 
Python would force me to go through a float64 and distort the values), but 
conceptually; there are no strings involved when I write this:

array([f('0.2'), f('0.3'), f('0.1')])

… and it would be a lot more readable if I could write it the same way I do in 
other programming languages:

array([0.2f, 0.3f, 0.1f])

Again, it’s not about saving 4 keystrokes per number, and the visual noise of 
the parens is an issue but not the main one (and quotes are barely any noise by 
comparison); it’s the fact that these numeric values look like numeric values 
instead of looking like strings

The fact that they look the same as the same values in other contexts like a 
C++ program or a GLSL shader is a pretty large added bonus. But I don’t think 
that’s essential to the value here. If you forced me to use prefixes instead of 
suffixes (I don’t think there’s any good reason for that, but who knows how the 
winds of bikeshedding may blow), I’d still prefer f2.3 to f('2.3'), because it 
still looks like a number, as it should.

I know this is doable, because I’ve written an import hook that does it, plus I 
have a decade of experience with another popular language (C++) that has 
essentially the same feature.

What about the performance cost of these values not being constants? A 
decorator that finds np.float32 calls on constants and promotes them to 
constants by hacking the bytecode is pretty trivial to write, or you can load 
the whole array in one go from a bytes constant and put the readable version in 
a comment, or whatever. But anything that’s slow enough to be worth optimizing 
is doing a huge matmul or pushing zillions of values back and forth to the GPU 
or something else that swamps the setup cost, even if the setup cost involves a 
few dozen string parses, so it never matters. At least not for me.

—-

For a completely different example—but one that I’ve also already given earlier 
in this thread, so I won’t belabor it too much:

path'C:\'

bs"this\ space won’t have a backslash before it, also \e[22; is an escape 
sequence and of course \a is still a bell because I’m using the rules from 
C/JS/etc."

bs37"this\ space has a backslash before it without raising a warning or an 
error even in Python 3.15 because I’ve implemented the 3.7 rules"

… and so on.

Some of these _could_ be done with a raw string and a (maybe slightly more 
complicated) function call, but at least the first one is impossible to do that 
way.

Unlike the numeric suffixes, this one I haven’t actually implemented a hacky 
version of, and I don’t know of any other languages that have an identical 
feature, so I can’t promise it’s feasible, but it seems like it should be.


[Python-ideas] Re: Custom string prefixes

2019-08-27 Thread Steven D'Aprano
On Tue, Aug 27, 2019 at 05:24:19AM -0700, Andrew Barnert via Python-ideas wrote:

> There is a possibility in between the two extremes of “useless” and 
> “complete monster”: the prefix accepts exactly one token, but can 
> parse that token however it wants.

How is that different from passing a string argument to a function or 
class constructor that can parse that token however it wants?

x'...'

x('...')

Unless there is some significant difference between the two, what does 
this proposal give us?


-- 
Steven


[Python-ideas] Re: Custom string prefixes

2019-08-27 Thread Steven D'Aprano
On Tue, Aug 27, 2019 at 08:22:22AM -, stpa...@gmail.com wrote:

> The string (or number) prefixes add new power to the language

I don't think they do. It's just syntactic sugar for a function call. 
There's nothing that czt'...' will do that czt('...') can't already do.

If you have a proposal that allows custom string prefixes to do 
something that a function call cannot do, I've missed it.


> If a certain feature can potentially be misused shouldn't deter us
> from adding it, if the benefits are significant.

Very true, but so far I see nothing in this proposal that suggests that 
the benefits are more significant than avoiding having to type a pair of 
parentheses. Every benefit I have seen applies equally to the function 
call version, but without the added complexity to the language of 
allowing custom string prefixes.


> And the benefits in terms of readability can be significant.

I don't think they will be. I think they will encourage cryptic 
one-character function names disguised as prefixes:

v'...' instead of Version(...)
x'...' instead of re.compile(...)

to take two examples from your proposal. At least this is somewhat 
better:

sql'...'

but that leaves the ambiguity of not knowing whether that's a chained 
function call s(q(l(...))) or a single sql(...).

I believe it will also encourage inefficient and cryptic string parsing 
instead of more clear use of seperate arguments. Your earlier example:

frac'123/4567'

The Fraction constructor already accepts such strings, and it is 
occasionally handy for parsing user-input. But using it to parse string 
literals gives slow, inefficient code for little or no benefit:

[steve@ando cpython]$ ./python -m timeit -s 'from fractions import 
Fraction' 'Fraction(123, 4567)'
2 loops, best of 5: 18.9 usec per loop

[steve@ando cpython]$ ./python -m timeit -s 'from fractions import 
Fraction' 'Fraction("123/4567")'
5000 loops, best of 5: 52.9 usec per loop


Unless you can suggest a way to parse arbitrary strings in arbitrary 
ways at compile-time, these custom string prefixes are probably doomed 
to be slow and inefficient.

The best thing I can say about this is that at least frac'123/4567' 
would probably be easy to understand, since the / syntax for fractions 
is familiar to most people from school. But the same cannot be said for 
other custom prefixes:

cf'[0; 37, 7, 1, 2, 5]'

Perhaps you can guess the meaning of that cf-string. Perhaps you can't. 
A hint might point you in the right direction:

assert cf'[0; 37, 7, 1, 2, 5]' == Fraction(123, 4567)

(By the way, the semi-colon is meaningful and not a typo.)
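
For readers who don't recognise the notation, a minimal sketch of what such a 
cf() helper might compute for the list form (the prefix version would wrap the 
same logic in a string parser):

    from fractions import Fraction

    def cf(terms):
        # Evaluate a continued fraction [a0; a1, a2, ...] from the inside out.
        result = Fraction(terms[-1])
        for a in reversed(terms[:-1]):
            result = a + 1 / result
        return result

    assert cf([0, 37, 7, 1, 2, 5]) == Fraction(123, 4567)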

To the degree that custom string prefixes will encourage cryptic one and 
two letter names, I think that this will hurt readability and clarity of 
code. But if the reader has the domain knowledge to recognise what "cf" 
stands for, this may be no worse than (say) "re" (regular expression).

In conventional code, we might call the cf function like this:

cf([0, 37, 7, 1, 2, 5])  # Single list argument.
cf(0, 37, 7, 1, 2, 5)# *args version.

Either way works for me. But it is your argument that replacing the 
parentheses with quote marks is "more readable":

cf([0, 37, 7, 1, 2, 5])
cf'[0; 37, 7, 1, 2, 5]'

not just a little bit more readable, but enough to make up for the 
inefficiency of having to write your own parser, deal with errors, 
compile a string literal, parse it at runtime, and only then call the 
actual cf constructor and return a cf object.

Even if I accepted your claim that swapping (...) for '...' was more 
readable, I am skeptical that the additional work and runtime 
inefficiency would be worth the supposed benefit.


I don't wish to say that parsing strings to extract information is 
always an anti-pattern:

http://cyrille.martraire.com/2010/01/the-string-obsession-anti-pattern/

after all we often need to process data coming from config files or 
other user-input, where we have no choice but to accept a string.

But parsing string *literals* usually is an anti-pattern, especially 
when there is a trivial transformation from the string to the 
constructor arguments, e.g. 123/4567 --> Fraction(123, 4567).


[...]
> Exactly. You look at string "1.10a" and you know it must be a version string,
> because you're a human, you're smart. The compiler is not a human, it has no
> idea. To the Python interpreter it's just a PyUnicode object of length 5. It's
> meaningless. But when you combine this string with a prefix into a single
> object, it gains power. It can have methods or special behaviors. It can have
> a type, different from `str`, that can be inspected when passing this object 
> to
> another function.

Everything you say there applies to ordinary function call syntax too:

Version('1.10a')

can have methods, special behaviours, a type different from str, etc. 
Not one of those benefits comes from *custom string prefixes*. They all 
come from the use of a 

[Python-ideas] Re: Custom string prefixes

2019-08-27 Thread Andrew Barnert via Python-ideas
On Aug 27, 2019, at 01:42, Chris Angelico  wrote:
> 
> Will these "custom
> prefixes" be able to define anything syntactically? If not, why not
> just use a function call? And if they can, then you have created an
> absolute monster, where a v-string in one context can have completely
> different syntactic influence on what follows it than a v-string in
> another context.

There is a possibility in between the two extremes of “useless” and “complete 
monster”: the prefix accepts exactly one token, but can parse that token 
however it wants.

That’s pretty close to what C++ does, and pretty close to the way my hacky 
proof of concept last time around worked, and I don’t think that only works 
because those are suffix-only designs. 

(That being said, if you do allow “really raw” string literals as input to the 
user prefixes/suffixes to handle the path'C:\' case, then it’s possible to 
invent cases that would tokenize differently with and without the feature—in 
fact. I just did—and therefore it _might_ be possible to invent cases that 
parse validly but differently, in which case the monster is lurking after all. 
Someone might want to look more carefully at the C++ rules for that?)


[Python-ideas] Re: Custom string prefixes

2019-08-27 Thread Andrew Barnert via Python-ideas
On Aug 26, 2019, at 23:43, Serhiy Storchaka  wrote:
> 
> 27.08.19 06:38, Andrew Barnert via Python-ideas writes:
>>  * JSON (register the stdlib or simplejson or ujson),
> 
> What is the JSON suffix for?

I think you’d mainly want it in combination with percent-, html-, or 
uu-equals-decoding, which makes it a potential stress test of the “multiple 
affixes” or “affixes with modifiers” idea. Which I think is important, because 
I like what the OP came up with for that idea, so I want to push it beyond just 
the “regex with flags” example to see if it breaks.

Maybe URL, which often has the same html and percent encoding issues, would be 
a better example? I personally don’t need to decode URLs that often in Python 
(unlike in, say, ObjC, where there’s a smart URL class that you use in place of 
strings all over the place), but maybe others do?

> JSON is virtually a subset of Python except that it uses true, false and 
> null instead of True, False and None.

Is it _virtually_ a subset, or literally so, modulo those three values? I don’t 
know off the top of my head. Look at all the trouble caused by Crockford just 
assuming that the syntax he’d defined was a strict subset of JS when actually 
it isn’t quite.

Actually, now that I think of it, I do know. Python has allow_nan on by 
default, so you’d need to also `from math import nan as NaN` and `from math 
import inf as Infinity`. But is that it? I’m not sure.

And of course if you’ve done this:

jdec = json.JSONDecoder(parse_float=Decimal)
__register_prefix__(jdec.decode, 'j')

… then even j'1.1' and 1.1 are no longer the same values.
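
The decoder half of that example is runnable today; only the registration call 
is hypothetical:

    import json
    from decimal import Decimal

    jdec = json.JSONDecoder(parse_float=Decimal)
    assert jdec.decode('1.1') == Decimal('1.1')
    assert Decimal('1.1') != 1.1   # not the same value as the float literal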

 Not to mention what you get if you registered Pandas’s JSON reader instead of 
the stdlib’s.


[Python-ideas] Re: Custom string prefixes

2019-08-27 Thread Chris Angelico
On Tue, Aug 27, 2019 at 6:25 PM  wrote:
> You're correct that, devoid of context, `v"smth..."` is not very meaningful. 
> The
> "v" suffix could mean "version", or "verbose", or "volatile", or "vectorized",
> or "velociraptor", or whatever. Luckily, the code is almost always exists
> within a specific context. It solves a particular problem, and works within a
> particular domain, and makes perfect sense for people working within that
> domain.
>
> This isn't much different than, say, `np.` suffix, which means "numpy" in the
> domain of numerical computations, NP-completeness for some mathematicians,
> and "no problem" for regular users.

Syntactically, the "np." prefix (not suffix fwiw) actually means "look
up the np object, then locate an attribute called <the name that follows the
dot>". That's true of every prefix you could ever get, and they're
always looked up at run time; the attribute name always follows the
exact same syntactic rules no matter what the prefix is. Literals, on
the other hand, are part of syntax - a different string type prefix
can change the way the entire file gets parsed. Will these "custom
prefixes" be able to define anything syntactically? If not, why not
just use a function call? And if they can, then you have created an
absolute monster, where a v-string in one context can have completely
different syntactic influence on what follows it than a v-string in
another context. At least with attribute lookups, you can parse a file
without knowing what "np" actually means, and even examine things at
run-time.

ChrisA


[Python-ideas] Re: Custom string prefixes

2019-08-27 Thread stpasha
Thank you, Steven, for taking the time to write such an elaborate rebuttal.
If I understand the heart of your argument correctly, you're concerned that
the prefixed strings may add confusion to the code. That nobody knows
what `l'abc'` or `czt'xxx'` could possibly mean, while at the same time
`v'1.0'` could mean many things, whereas `v'cal-{a}'` would mean nothing
at all...

These are all valid concerns. The string (or number) prefixes add new power
to the language, and with new power comes new responsibility. While the
syntax can be used to enhance readability of the code, it can also be abused
to make the code more obscure. However, Python does not attempt to be
an idiot-proof language. "We are all consenting adults" is one of its guiding
principles. The fact that a certain feature can potentially be misused shouldn't deter us
from adding it, if the benefits are significant.

And the benefits in terms of readability can be significant. Consider the
existing Python prefixes: `r'...'` is purely for readability, it adds no extra
functionality; `f'...'` has neat compiler support, but even if it didn't (and
most Python users don't actually realize f-strings get preprocessed by the
compiler) it would still enhance readability compared to `str.format()`. It's
nice to be able to write a complex number as `5 + 3j` instead of
`complex(5, 3)`. And so on.

> What's v() do? Verbose string?
> Oh, you intended a version string did you? If only you had written 
> version instead of v I might not have guessed wrong. What were 
> you saying about preferring readability and clarity over brevity?

You're correct that, devoid of context, `v"smth..."` is not very meaningful. The
"v" suffix could mean "version", or "verbose", or "volatile", or "vectorized",
or "velociraptor", or whatever. Luckily, the code is almost always exists
within a specific context. It solves a particular problem, and works within a
particular domain, and makes perfect sense for people working within that
domain.

This isn't much different than, say, `np.` suffix, which means "numpy" in the
domain of numerical computations, NP-completeness for some mathematicians,
and "no problem" for regular users. 

From a practical perspective, the meaning of each particular symbol will come
from the way that it was created or imported. For example, if you script says
`from packaging.version import v` then "v" is a version. If, on the other hand,
it says `from zoo import velociraptor as v`, then it's an altogether different 
beast.

> In other words, I got all of the meaning from the string part, not the 
> prefix. The prefix on its own, I would have guessed completely wrong.

Exactly. You look at string "1.10a" and you know it must be a version string,
because you're a human, you're smart. The compiler is not a human, it has no
idea. To the Python interpreter it's just a PyUnicode object of length 5. It's
meaningless. But when you combine this string with a prefix into a single
object, it gains power. It can have methods or special behaviors. It can have
a type, different from `str`, that can be inspected when passing this object to
another function.

Think of `v"1.10a"` as making a "typed string" (even though it may end up not
being a string at all). By writing `v"1.10a"` I convey the intent for this to 
be a
version string.

> for rather insignificant gains, the saving of two parentheses. 

Two bytes doesn't sound like a lot. I mean, it is quite little on the grand 
scale
of things. However, I don't think the simple byte-count is a proper measure
here. There could be benefits to readability even if it was 0 or negative byte
difference.

I believe a good way to think about this is the following: if the feature was 
already implemented, would people want to use it, and would it improve
readability of their code? I speculate that the answer is yes to both of these
questions, at least for some people.

As a practical example, consider function `pandas.read_csv()`. The documentation
for its `sep` parameter says "In addition, separators longer than 1 character 
and
different from ``'\s+'`` will be interpreted as regular expressions ...". In 
this case
they wanted the `sep` parameter to handle both simple separators, and the
regular expression separators. However, as there is no syntax to create a 
"regular expression string", they ended up with this dubious heuristic based on
the length of the string... Ideally, they should have said that `sep` could be 
either
a string or a regexp-object, but the barrier to write 

from re import compile as rx
rx('...')

is just impossibly high for a typical user. Not to mention that such code 
**would**
be actually harder to read, because I'd be inventing my own notation for a 
function that is commonly known under a different name.
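
For the record, here is what that rx spelling buys you today; the pattern and 
sample input are chosen purely for illustration:

    from re import compile as rx

    sep = rx(r'\s*\|\s*')     # a pattern object, not a plain string
    assert sep.split('a | b|c') == ['a', 'b', 'c']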

Another pet peeve of mine is datetime literals. Or, rather, their absence. I
often see,
again in pandas, how people create columns of strings ["2010-05-01", 
"2010-05-02", 
...], and then call `parse_datetime()`. 

[Python-ideas] Re: Custom string prefixes

2019-08-27 Thread Serhiy Storchaka

27.08.19 06:38, Andrew Barnert via Python-ideas writes:

  * JSON (register the stdlib or simplejson or ujson),


What is the JSON suffix for? JSON is virtually a subset of Python except 
that it uses true, false and null instead of True, False and None. 
If you set these three variables you can embed JSON syntax in pure Python.
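
That observation is easy to check: with those three names bound, a small JSON 
document evaluates to the same thing json.loads() produces.

    import json

    true, false, null = True, False, None

    doc = '{"enabled": true, "retries": null, "debug": false}'
    assert eval(doc) == json.loads(doc)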



[Python-ideas] Re: Custom string prefixes

2019-08-26 Thread Andrew Barnert via Python-ideas
> On Aug 26, 2019, at 18:41, stpa...@gmail.com wrote:
> 
> Thanks, Andrew, for your feedback. I didn't even think about string 
> **suffixes**, but
> clearly they can be implemented together with the prefixes for additional 
> flexibility.

What about _instead of_ rather than _together with_? Half of Steven’s 
objections are related to the ambiguity (to a human, even if not to the parser) 
of user prefixes in the (potential) presence of the builtin prefixes. None of 
those even arise with suffixes. Anyway, maybe you already have good answers 
for all of those objections, but if not…

Also, there’s at least one mainstream language (C++) that allows user suffixes 
and has literal syntax otherwise somewhat like Python’s, and the proposals for 
other languages like Rust generally seem to be trying to do “like C++ 
but minus all the usual C++ over-complexity”. Are there actual examples of 
languages with user prefixes?

The only different designs I know of rely on the static type of the evaluation 
context. (For example, in Swift, you can just statically type `23 : km` or 
`"abc]*" : regex`, or even just pass the literal to a function that’s declared 
or inferred to take a regex if that happens to be readable in your use case, so 
there’s no need for a suffix syntax.) Which is neat, but obviously not 
applicable to Python. 

> And your idea that ` ` is conceptually no different 
> than
> ` ` is absolutely insightful.

Well, back in 2015 I probably just stole the idea from C++. :)

Another question this raises, which I just remembered: the word “literal” has 
three overlapping but distinct meanings in Python. Which one do we actually 
mean here? In particular, are container displays “literals”? For that matter, 
is -2 even a literal?

Also, from what I remember, either in 2013 or in 2015, the discussion got 
side-tracked over people not liking the word “literal” to mean “something 
that’s actually the result of a runtime function call”. That may be less of a 
problem after f-strings (which are called literals in the PEP; not sure about 
the language reference), but last time around, bringing up the fact that “-2” 
is actually a function call didn’t sway anyone. So, maybe I shouldn’t be using 
the word “literal” this time, and I really hope it doesn’t ruin your proposal…

> Speaking of string suffixes, flags on regular expressions immediately come to 
> mind.
> For example `rx"(abc)"ig` could create a regular expression that performs 
> global 
> case-insensitive search.

That’s an interesting idea. And that’s something you can’t do with a 
single-affix design; you need prefixes and suffixes, unless you have some kind 
of separator for chaining, or only allow single characters.
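
For illustration only, here is a small sketch of what a combined
prefix-plus-suffix handler for regexes could look like if `rx"(abc)"ig` were
desugared to a call like `rx("(abc)", suffix="ig")`. The name `rx`, the
`suffix` keyword, and the flag mapping are all assumptions invented for this
sketch; note that Python's re module has no "global" flag, so 'g' is simply
ignored here:

    import re

    _FLAGS = {"i": re.IGNORECASE, "m": re.MULTILINE, "s": re.DOTALL, "x": re.VERBOSE}

    def rx(pattern, suffix=""):
        # Fold each recognized suffix letter into a re flag; unknown letters
        # (such as 'g') are ignored in this sketch.
        flags = 0
        for ch in suffix:
            flags |= _FLAGS.get(ch, 0)
        return re.compile(pattern, flags)

    # What the compiler might emit for rx"(abc)"ig:
    print(bool(rx("(abc)", suffix="ig").match("ABC")))   # True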

>> I don’t think you can fairly discuss this idea without getting at least a
>> _little_ bit into the implementation details.
> 
> Right. So, the first question to answer is what the compiler should do when 
> it sees
> a prefixed (suffixed) string? That is, what byte-code should be emitted when 
> the
> compiler sees `lambda: a"bcd"e` ?
> 
> In one approach, we'd want this expression to be evaluated at compile time, 
> similar
> to how f-strings work. However, how would the compiler know what prefix "a" 
> means
> exactly? There has to be some kind of directive to tell the compiler that. 
> For example,
> imagine the compiler sees near the top of the file
> 
>#pragma from mymodule import a
> 
> It would then import the symbol `a`, call `a("bcd", suffix="e")`. This would 
> return an
> AST tree that will be plugged in place of the original string.
> 
> This solution allows maximum efficiency, but seems inflexible and deeply 
> invasive.
> 
> Another approach would defer the construction of objects to run time. Though
> not as efficient, it would allow loading prefixes at run-time. In this case
> `a"bcd"e` can
> be interpreted by the compiler as if it was
> 
>a("bcd", suffix="e")
> 
> where symbol `a` is to be looked up in the local/global scope.

My hack works basically like this. The compiler just converts it to a function 
call, which is looked up normally. I think that’s the right tack here. IIRC, my 
hack translates a D suffix into a call to something like `_user_literal_D`, 
which solves the problem of accidental pollution of the namespace. But this 
does mean that any code that wants to use the D suffix has to `from 
decimal_literals import *`, or `2.3D` raises a NameError about nothing named 
`_user_literal_D`. (Either that, or someone has to inject it into builtins…) I’m 
not sure whether that’s user-friendly enough.

Anyway, I think your registry idea makes more sense. Then `2.3D` effectively 
just means `__user_literals__['D']('2.3')`, and there’s no namespace pollution 
at all.
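
A pure-Python sketch of that registry idea, with no compiler support; the
names `__user_literals__` and `register()` are assumptions made up for
illustration:

    import decimal

    __user_literals__ = {}

    def register(tag):
        # Decorator that records a handler under its one-letter tag.
        def deco(fn):
            __user_literals__[tag] = fn
            return fn
        return deco

    @register("D")
    def _decimal_literal(text):
        return decimal.Decimal(text)

    # What the compiler would effectively emit for the source text 2.3D:
    value = __user_literals__["D"]("2.3")
    print(repr(value))   # Decimal('2.3')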

> For this approach to work, we'd create a
> new opcode, so that `a"bcd"e` would become
> 
>0 LOAD_CONST 1 ('a', 'bcd', 'e')
>2 STR_RESOLVE_TAG   0
> 
> where `STR_RESOLVE_TAG` would effectively call 

[Python-ideas] Re: Custom string prefixes

2019-08-26 Thread Steven D'Aprano
On Mon, Aug 26, 2019 at 11:03:38PM -, stpa...@gmail.com wrote:
> In Python strings are allowed to have a number of special prefixes:
> 
> b'', r'', u'', f'' 
> + their combinations.
> 
> The proposal is to allow arbitrary (or letter-only) user-defined prefixes as 
> well.
> Essentially, a string prefix would serve as a decorator for a string, 
> allowing the
> user to impose a special semantics of their choosing.
> 
> There are quite a few situations where this can be used:
> - Fraction literals: `frac'123/4567'`

Current string prefixes are allowed in combinations. Does the same apply 
to your custom prefixes?

If yes, then they are ambiguous: how could the reader tell whether the 
string prefix frac'...' is a f- r- a- c-string combination, a fra- 
c-string combination, a fr- ac-string combination, or a f- rac- string 
combination?

If no, then it will confuse and frustrate users who wonder why they can 
combine built-in prefixes like fr'...' but not their own prefixes.

What kind of object is a frac-string? You might think it is obvious that 
it is a "frac" (Fraction? Something else?) but how about a czt-string?

As a reader, at least I know that czt('...') is a function call that 
could return anything at all. That is standard across hundreds of 
programming languages. But as a string prefix, it looks like a kind of 
string, but could be anything at all. Imagine trying to reason about 
Python syntax:

1. u'...' is a unicode string, evaluating to a str.
2. r'...' is a raw string, evaluating to a str.
3. f'...' is a f-string, evaluating to a str.
4. b'...' is a byte-string, evaluating to a bytes object, which
   is not a str object but is still conceptually a kind of string.

5. Therefore z'...' is what kind of string, evaluating to what 
   kind of object?


Things that look similar should be similar. This string prefix idea 
means that things that look similar can be radically different. It looks 
like a string, but may not be anything like a string.

The same applies to function call syntax, of course, but as I mentioned 
above, function call syntax is standard across hundreds of languages and 
readers don't expect that the result of an arbitrary function call is 
necessarily the same as its first argument(s). We don't expect that 
foo('abcde') will return a string, even if we're a little unclear about 
what foo() actually does.

u- (unicode) strings, r- (raw) strings, and even b- (byte) strings are 
all kinds of *string*. We know just by looking at them that they 
evaluate to a str or bytes object. Even f-strings, which are syntax for 
executable code, are at least guaranteed to evaluate to a str object. But 
these arbitrary string prefixes could return anything.


> This proposal has been already discussed before, in 2013:
>
> https://mail.python.org/archives/list/python-ideas@python.org/thread/M3OLUURUGORLUEGOJHFWEAQQXDMDYXLA/
> 
> The opinions were divided whether this is a useful addition. The opponents
> mainly argued that as this only "saves a couple of keystrokes", there is no
> need to overcomplicate the language.

Indeed. czt'...' saves only two characters from czt('...').


> It seems to me that now, 6 years later, 
> that argument can be dismissed by the fact that we had, in fact, added new
> prefix "f" to the language.

I don't see how that follows. The existence of one new prefix adds 
*this* much new complexity:

[holds forefinger and thumb about a millimeter apart]

for significant gains. Trying to write your own f-string-equivalent 
function would be quite difficult, but being in the language, not only is it 
faster and more efficient than a function call, it also only needs to be 
written once.

But adding a new way of writing single-argument function calls with a 
string argument:

czt'...' is equivalent to czt('...')

adds *this* much complexity to the language:

[holds forefingers of each hand about shoulder-width apart]

for rather insignificant gains, the saving of two parentheses. You still 
have to write the czt() function, it will have to parse the string 
itself, you will have no support from the compiler, and anyone needing 
this czt() will either have to re-invent the wheel or hope that somebody 
publishes it on PyPI with a suitable licence.


> Note how the "format strings" would fall squarely
> within this framework if they were not added by now.
>
> In addition, I believe that "saving a few keystrokes" is a worthy goal if it 
> adds
> considerable clarity to the expression. Readability counts. Compare:
> 
> v"1.13.0a"
> v("1.13.0a")

What's v() do? Verbose string?

 
> To me, the former expression is far easier to read. Parentheses, especially as
> they become deeply nested, are not easy on the eyes. But, even more 
> importantly,
> the first expression much better conveys the *intent* of a version string. 

Oh, you intended a version string did you? If only you had written 
``version`` instead of ``v`` I might not have guessed wrong. What 

[Python-ideas] Re: Custom string prefixes

2019-08-26 Thread MRAB

On 2019-08-27 00:03, stpa...@gmail.com wrote:

In Python strings are allowed to have a number of special prefixes:

 b'', r'', u'', f''
 + their combinations.

The proposal is to allow arbitrary (or letter-only) user-defined prefixes as 
well.
Essentially, a string prefix would serve as a decorator for a string, allowing 
the
user to impose a special semantics of their choosing.

There are quite a few situations where this can be used:
- Fraction literals: `frac'123/4567'`
- Decimals: `dec'5.34'`
- Date/time constants: `t'2019-08-26'`
- SQL expressions: `sql'SELECT * FROM tbl WHERE a=?'.bind(a=...)`
- Regular expressions: `rx'[a-zA-Z]+'`
- Version strings: `v'1.13.0a'`
- etc.

This proposal has been already discussed before, in 2013:
https://mail.python.org/archives/list/python-ideas@python.org/thread/M3OLUURUGORLUEGOJHFWEAQQXDMDYXLA/

The opinions were divided whether this is a useful addition. The opponents
mainly argued that as this only "saves a couple of keystrokes", there is no
need to overcomplicate the language. It seems to me that now, 6 years later,
that argument can be dismissed by the fact that we had, in fact, added new
prefix "f" to the language. Note how the "format strings" would fall squarely
within this framework if they were not added by now.

In addition, I believe that "saving a few keystrokes" is a worthy goal if it 
adds
considerable clarity to the expression. Readability counts. Compare:

 v"1.13.0a"
 v("1.13.0a")

To me, the former expression is far easier to read. Parentheses, especially as
they become deeply nested, are not easy on the eyes. But, even more importantly,
the first expression much better conveys the *intent* of a version string. It 
has
a feeling of an immutable object. In the second case the string is passed to the
constructor, but the string has no meaning of its own. As such, the second
expression feels artificial. Consider this: if the feature already existed, how 
*would*
you prefer to write your code?

The prefixes would also help when writing functions that accept different types
of their argument. For example:

 collection.select("abc")   # find items with name 'abc'
 collection.select(rx"[abc]+")  # find items that match regular expression

I'm not discussing possible implementation of this feature just yet; we can get
to
that point later when there is a general understanding that this is worth 
considering.


At what point would backslashes be handled?


[Python-ideas] Re: Custom string prefixes

2019-08-26 Thread Andrew Barnert via Python-ideas
On Aug 26, 2019, at 16:03, stpa...@gmail.com wrote:
> 
> In Python strings are allowed to have a number of special prefixes:
> 
>b'', r'', u'', f'' 
>+ their combinations.
> 
> The proposal is to allow arbitrary (or letter-only) user-defined prefixes as 
> well.
> Essentially, a string prefix would serve as a decorator for a string, 
> allowing the
> user to impose a special semantics of their choosing.

I don’t think you can fairly discuss this idea without getting at least a 
_little_ bit into the implementation details.

How does your code specify a new prefix? How does the tokenizer know which 
prefixes are active? What code does the compiler emit for a prefixed string? 
The answers to those questions will determine which potential prefixes are 
useful.

In particular, you mention that f-strings “would fall squarely within this 
framework”, but it’s actually pretty hard to imagine an implementation that 
would have actually allowed for f-strings. They essentially need to recursively 
call the compiler on the elements inside braces, and then inline the resulting 
expressions into the containing scope.
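
To see why, consider what a purely runtime prefix handler could do instead: at
best it could grope around in the caller's frame, which is fragile and nowhere
near what real f-strings support. A rough sketch under those assumptions (the
helper name `fake_f` is made up; `sys._getframe` is a CPython implementation
detail):

    import sys

    def fake_f(template):
        # Look up names from the caller's scope and substitute them with
        # str.format -- no arbitrary expressions, unlike real f-strings.
        frame = sys._getframe(1)
        return template.format(**{**frame.f_globals, **frame.f_locals})

    name = "world"
    print(fake_f("hello {name}!"))   # hello world!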

> In addition, I believe that "saving a few keystrokes" is a worthy goal if it 
> adds
> considerable clarity to the expression. Readability counts. Compare:
> 
>v"1.13.0a"
>v("1.13.0a")
> 
> To me, the former expression is far easier to read. Parentheses, especially as
> they become deeply nested, are not easy on the eyes. But, even more 
> importantly,
> the first expression much better conveys the *intent* of a version string. It 
> has
> a feeling of an immutable object. In the second case the string is passed to 
> the
> constructor, but the string has no meaning of its own. As such, the second
> expression feels artificial. Consider this: if the feature already existed, 
> how *would*
> you prefer to write your code?

Neither. I’d prefer this:

2.3D # decimal.Decimal('2.3')
1/3F # 1/fractions.Fraction('3')

After all, why would I want to put the number in a string when it’s not a 
string, but a number? This looks a lot like C’s `2.3f` that gives me 2.3 as a 
float rather than a double, and it works like it too, so there’s no surprise. 
And C++ already proves that such a thing can be widely usable; it’s been part 
of that language for three versions, since 2011.
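
Spelled out with the stdlib constructors those suffixes would abbreviate:

    from decimal import Decimal
    from fractions import Fraction

    d = Decimal("2.3")      # what 2.3D would denote
    f = 1 / Fraction("3")   # what 1/3F would denote
    print(d, f)             # 2.3 1/3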

Also this:

p'C:\'

That can’t be handled by just using a “native
Path” prefix together with the existing raw prefix, because even in raw string 
literals you can’t end with a backslash.
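
A quick check of that claim; the backslash rule is part of the string-literal
grammar, so the error shows up at compile time:

    # Even a raw string literal cannot end in an odd number of backslashes,
    # so r'C:\' fails to parse.
    try:
        compile(r"r'C:\'", "<demo>", "eval")
    except SyntaxError as exc:
        print("SyntaxError:", exc.msg)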

And this is another place where talking about implementation matters.

At first glance it might seem like arbitrary-literal affixes would be a lot 
more difficult than string-literal-only affixes, but in fact, as soon as you 
try to implement it, you realize that you get the exact same set of issues, no 
more. See https://github.com/abarnert/userliteralhack for a proof of concept I 
wrote back in 2015. (Not that we’d want to actually implement them the way I 
did, just demonstrating that it can be done, and doesn’t cause ambiguity.) I’ve 
got a couple older PoCs up there as well if you want to play around more, 
including one that only allows string literals (so you can see that it’s 
actually no easier, and solves no ambiguity problems). I can’t remember if I 
did one that does prefixes instead of suffixes, but I don’t _think_ that raises 
any new issues, except for the one about interacting with the existing prefixes.

And it might seem like having some affixes get the totally raw token while 
others get a cooked string is too complicated, but C++ actually lives with a 
3-way distinction between raw token, cooked string, and fully parsed value. (Why 
would you ever want the last one? So your units-and-quantities library can 
define a _km suffix so 2_km is a km, 2.3_km is a km, 2.3f_km is a 
km, and maybe even 2.3dec_km is a km.) I’m not sure we need 
this last distinction, but the first one might be worth copying, so that Path 
and other literals can work, but things like version can interact nicely with 
plain string literals, and r, and b if that’s appropriate, and most of all f, 
by just accepting a cooked string.


[Python-ideas] Re: Custom string prefixes

2019-08-26 Thread Robert Vanden Eynde
On another subject, we could also have a language change stating that these
two lines are equivalent:

something"hello"
something("hello")

So that any callable in the context can be used as a prefix?

On Tue, Aug 27, 2019, 01:11  wrote:

> In Python strings are allowed to have a number of special prefixes:
>
> b'', r'', u'', f''
> + their combinations.
>
> The proposal is to allow arbitrary (or letter-only) user-defined prefixes
> as well.
> Essentially, a string prefix would serve as a decorator for a string,
> allowing the
> user to impose a special semantics of their choosing.
>
> There are quite a few situations where this can be used:
> - Fraction literals: `frac'123/4567'`
> - Decimals: `dec'5.34'`
> - Date/time constants: `t'2019-08-26'`
> - SQL expressions: `sql'SELECT * FROM tbl WHERE a=?'.bind(a=...)`
> - Regular expressions: `rx'[a-zA-Z]+'`
> - Version strings: `v'1.13.0a'`
> - etc.
>
> This proposal has been already discussed before, in 2013:
>
> https://mail.python.org/archives/list/python-ideas@python.org/thread/M3OLUURUGORLUEGOJHFWEAQQXDMDYXLA/
>
> The opinions were divided whether this is a useful addition. The opponents
> mainly argued that as this only "saves a couple of keystrokes", there is no
> need to overcomplicate the language. It seems to me that now, 6 years
> later,
> that argument can be dismissed by the fact that we had, in fact, added new
> prefix "f" to the language. Note how the "format strings" would fall
> squarely
> within this framework if they were not added by now.
>
> In addition, I believe that "saving a few keystroked" is a worthy goal if
> it adds
> considerable clarity to the expression. Readability counts. Compare:
>
> v"1.13.0a"
> v("1.13.0a")
>
> To me, the former expression is far easier to read. Parentheses,
> especially as
> they become deeply nested, are not easy on the eyes. But, even more
> importantly,
> the first expression much better conveys the *intent* of a version string.
> It has
> a feeling of an immutable object. In the second case the string is passed
> to the
> constructor, but the string has no meaning of its own. As such, the second
> expression feels artificial. Consider this: if the feature already
> existed, how *would*
> you prefer to write your code?
>
> The prefixes would also help when writing functions that accept different
> types
> of their argument. For example:
>
> collection.select("abc")   # find items with name 'abc'
> collection.select(rx"[abc]+")  # find items that match regular
> expression
>
> I'm not discussing possible implementation of this feature just yet, we
> can get to
> that point later when there is a general understanding that this is worth
> considering.
> ___
> Python-ideas mailing list -- python-ideas@python.org
> To unsubscribe send an email to python-ideas-le...@python.org
> https://mail.python.org/mailman3/lists/python-ideas.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-ideas@python.org/message/3Z2YTIGJLSYMKKIGRSFK2DTDIXXVDGEK/
> Code of Conduct: http://python.org/psf/codeofconduct/
>