[Python-ideas] Re: An interface for `mro` methods similar to the interface for `iter`, `getattr`, etc...

2023-03-24 Thread Steven D'Aprano
Hi Samuel,

Classes already have a public `mro()` method that wraps their dunder 
attribute `__mro__`.

Non-classes (instances) don't have a `__mro__`.

> Additionally, the class named `object` should have a method named
> `__mro__` and any class which inherits from `object` may override
> `__mro__` or inherit the default `object.__mro__`.

Guido's time machine strikes again. This has existed since Python 2.2 
or thereabouts, so more than twenty years ago:

>>> object.__mro__
(<class 'object'>,)

>>> object.mro()
[<class 'object'>]


-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/ZYGLKVXQKWJX3KHNZGIJRGRXJBJBGIP4/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Ampersand operator for strings

2023-03-06 Thread Steven D'Aprano
On Mon, Mar 06, 2023 at 10:33:26AM +0100, Marc-Andre Lemburg wrote:

> def join_words(list_of_words):
>     return ' '.join([x.strip() for x in list_of_words])

That's not Rob's suggestion either.

Rob's suggestion is an operator which concatenates two strings with 
exactly one space between them, without otherwise stripping leading or 
trailing whitespace from the result.

Examples:

a = "\nHeading:"
b = "Result\n\n"
a & b

would give "\nHeading: Result\n\n"

s = "my hovercraft\n"
t = "is full of eels\n"
s & t

would give "my hovercraft is full of eels\n"

I find the concept very easy to understand: "concat with exactly one 
space between the operands".  But I must admit I'm struggling to think 
of cases where I would use it.

I like the look of the & operator for concatenation, so I want to like 
this proposal. But I think I will need to see real world code to 
understand when it would be useful.
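As a rough sketch of those semantics (a hypothetical `str` subclass, 
not part of any actual proposal's implementation):

```python
class S(str):
    """Toy string type with Rob's proposed & operator."""
    def __and__(self, other):
        # Exactly one space at the join: the left operand's trailing
        # whitespace and the right operand's leading whitespace collapse
        # into a single space; outer whitespace is untouched.
        return S(self.rstrip() + " " + str(other).lstrip())

assert S("\nHeading:") & "Result\n\n" == "\nHeading: Result\n\n"
assert S("my hovercraft\n") & "is full of eels\n" == "my hovercraft is full of eels\n"
```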


-- 
Steve
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/2OZXSXETHQLWJQYAM2S3SJVACPEIPDSZ/


[Python-ideas] Re: Ampersand operator for strings

2023-03-05 Thread Steven D'Aprano
On Sun, Mar 05, 2023 at 10:49:12PM -0500, David Mertz, Ph.D. wrote:

> Is it really that much longer to write `f"{s1} {s2}"` when you want that?

That's not the same as Rob's proposal.


-- 
Steve
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/CV5I7EFO7LHFN6QNRHZC475MS46TROLA/


[Python-ideas] Re: Combinations of keywords

2023-02-21 Thread Steven D'Aprano
Hello and welcome!

Trying to squeeze many lines of code into one line is a bad idea. It 
makes the code hard to read, and it is often impossible in Python, where 
statements and expressions are distinct and statements must sit on 
separate lines.

Extra lines are cheap. Python does not encourage people trying to cram 
as much code as possible in one line.

Your first suggestion:

code and something if except

which expands to this:

try:
code
except:
something

encourages the Most Diabolical Python Anti-Pattern:

https://realpython.com/the-most-diabolical-python-antipattern/

If you are using plain `except` like that, you probably should stop, 
99.9% of the time it is a very, very bad idea.

> It is usefull for code like this :
> #import A_WINDOW_MODULE and import A_UNIX_MODULE if except

If your code blocks are a single statement, you can write:

try: import windows_module
except ImportError: import unix_module

but most people will say that is ugly and should be spread out:

try: 
import windows_module
except ImportError:
import unix_module


Your second suggestion is just the "ternary if operator":

CODE if CONDITION else CODE

and already works, so long as the two codes and the condition are 
expressions.

result = Sign_in(user) if button_i_dont_have_an_account.press() else 
Log_in(user)

Most people will say that when the two expressions are *actions* rather 
than *values* it is better to use the if...else statement rather than 
trying to squash it into one line.

# Beautiful code :-)
if button_i_dont_have_an_account.press():
Sign_in(user)
else:
Log_in(user)

# Ugly code :-(
if button_i_dont_have_an_account.press(): Sign_in(user)
else: Log_in(user)

# Even more ugly but fortunately this is not allowed :-)
if button_i_dont_have_an_account.press(): Sign_in(user) else: Log_in(user)


Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/TUBJ4AJPGSAUBMUGWBSGR4YB777RD5QH/


[Python-ideas] Re: Multiple arguments to str.partition and bytes.partition

2023-01-08 Thread Steven D'Aprano
On Sun, Jan 08, 2023 at 05:30:30PM +0900, Stephen J. Turnbull wrote:
> Steven D'Aprano writes:
> 
>  > On Sat, Jan 07, 2023 at 10:48:48AM -0800, Peter Ludemann wrote:
>  > > You can get almost the same result using pattern matching. For example, 
> your
>  > > "foo:bar;baz".partition(":", ";")
>  > > can be done by a well-known matching idiom:
>  > > re.match(r'([^:]*):([^;]*);(.*)', 'foo:bar;baz').groups()
>  > 
>  > "Well-known" he says :-)
> 
> It *is* well-known to those who know.  Just because you don't like
> regex doesn't mean it's not well-known.

I like regexes plenty, for what they are good for. But my *liking* them 
or not is irrelevant as to whether this example is "well-known" or not.

I'm not the heaviest regex user in the world, but I've used my share, 
and I've never seen this particular line noise before. (Hey, I like 
Forth. Sometimes line noise is great.)

I mean, if all you are doing is splitting the source by some separators 
regardless of order, surely this does the same job and is *vastly* more 
obvious?

>>> re.split(r'[:;]', 'foo:bar;baz')
['foo', 'bar', 'baz']

If the order matters:

>>> re.match('(.*):(.*);(.*)', 'foo:bar;baz').groups()
('foo', 'bar', 'baz')

Or use non-greedy wildcards if you need them:

>>> re.match('(.*?):(.*?);(.*)', 'foo:b:ar;ba;z').groups()
('foo', 'b:ar', 'ba;z')



>  > I think that the regex solution is also wrong because it requires you 
>  > to know *exactly* what order the separators are found in the source 
>  > string.
> 
> But that's characteristic of many examples.

Great. Then for *those* structured examples you can happily write your 
regex and put the separators in the order you expect.

But I'm talking about *unstructured* examples where you don't know the 
order of the separators, you want to split on whichever one comes first 
regardless of the order, and you need to know which separator that was.


[...]
> Examples where the order of separators doesn't matter?  In most of the
> examples I need, swapping order is a parse error.

Okay, then you *mostly* don't need this.


>  > and it splits the string all at once instead of one split per call.
> 
> So does the original proposal, that's part of the point of it, I
> think.

str.partition does *one* three way split, into (head, sep, tail). If you 
want to continue to partition the tail, you have to call it again. To 
me, that fixed "one bite per call" design is fundamental to partition(). 
If we wanted an arbitrary number of splits we'd use, um, split() :-)
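Repeated one-bite partitioning can be spelled as a small generator 
(an illustrative sketch, not a proposal):

```python
def iter_partition(s, sep):
    """Yield successive heads from repeated str.partition calls."""
    while s:
        head, _, s = s.partition(sep)
        yield head

assert list(iter_partition('a:b:c', ':')) == ['a', 'b', 'c']
```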

Of course we can debate the pros and cons of each, that's what this 
thread is for.


> Parsing is hard.  Both regex and r?partition are best used as low-
> level tools for tokenizing, and you're asking for trouble if you try
> to use them for parsing past a certain point.

Right! I agree! And that is why I want partition to accept multiple 
separators and split on the first one found. I find myself needing to do 
that, well, not "all the time" by any means, but often enough that it's 
an itch I want scratched.


> My breaking point for
> regex is somewhere around the authority example,

Heh, I've written much more complicated examples. It was kinda fun, 
until I came back to it a month later and couldn't understand what the 
hell it did! :-)


> but I wouldn't push
> back if my project's style guide said to to break that up.  I *would*
> however often prefer regexp to r?partition because it would allow
> character classes, and in most of the areas I work with (mail, URIs,
> encodings) being able to detect lexical errors by using character
> classes is helpful.

I'm not sure I quite understand you there, but if I do, I would prefer 
to split the string and then validate the head and tail afterwards, 
rather than just have the regex fail.


> And I would prefer "one bite per call" partition
> to a partition at multiple points.  Where I'm being pretty fuzzy, the
> .split methods are fine.

I think we agree here.


-- 
Steve
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/WXWJA5ZR7XUH7D77CVSW455F36FUCIV5/


[Python-ideas] Re: Multiple arguments to str.partition and bytes.partition

2023-01-07 Thread Steven D'Aprano
On Sat, Jan 07, 2023 at 10:48:48AM -0800, Peter Ludemann wrote:
> You can get almost the same result using pattern matching. For example, your
> "foo:bar;baz".partition(":", ";")
> can be done by a well-known matching idiom:
> re.match(r'([^:]*):([^;]*);(.*)', 'foo:bar;baz').groups()

"Well-known" he says :-)

I think that is a perfect example of the ability to use regexes for 
obfuscation. It gets worse if you want to partition on a regex 
metacharacter like '.'

I think that the regex solution is also wrong because it requires you 
to know *exactly* what order the separators are found in the source 
string. If we swap the semi-colon and the colon in the source, but not 
the pattern, the idiom fails:

>>> re.match(r'([^:]*):([^;]*);(.*)', 'foo;bar:baz').groups()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'groups'

So that makes it useless for the case where you want to split on any of 
a number of separators, but don't know which order they occur in.

You call it "almost the same result" but it is nothing like the result 
from partition. The separators are lost, and it splits the string all at 
once instead of one split per call. I think this would be a closer 
match:

```
>>> re.split(r'[:;]', 'foo:bar;baz', maxsplit=1)
['foo', 'bar;baz']
```

but even there we lose the information of which separator was 
partitioned on.
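A capturing group in the pattern would keep the separator, at the cost 
of more regex machinery (shown here for comparison, not as a 
counter-proposal):

```python
import re

# Wrapping the character class in a group makes re.split return the
# matched separator as well as the pieces around it.
head, sep, tail = re.split(r'([:;])', 'foo:bar;baz', maxsplit=1)
assert (head, sep, tail) == ('foo', ':', 'bar;baz')
```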


-- 
Steve
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/MAKXHSBD2YG3JFGECGAOPOIPZHZU27SW/


[Python-ideas] Re: Multiple arguments to str.partition and bytes.partition

2023-01-07 Thread Steven D'Aprano
+1 on the idea of having `partition` and `rpartition` take multiple 
separators.

Keep it nice and simple: provided with multiple separators, `partition` 
will split the string on the first separator found in the source string.

In other words, `source.partition(a, b, c, d)` will split on a /or/ b 
/or/ c /or/ d, whichever comes first on the left.

Here is a proof of concept to give the basic idea:


```
def partition(source, *seps):
if len(seps) == 0:
raise TypeError('need at least one separator')
indices = [(i, sep) for sep in seps if (i:=source.find(sep)) != -1]
if indices:
pos, sep = min(indices, key=lambda t: t[0])
return (source[:pos], sep, source[pos + len(sep):])
else:
return (source, '', '')
```

That is not the most efficient implementation, but it shows the basic 
concept. Example:

>>> partition('abc-def+ghi;klm', ';', '-', '+')
('abc', '-', 'def+ghi;klm')
>>> partition('def+ghi;klm', ';', '-', '+')
('def', '+', 'ghi;klm')


However there are some complications that need resolving. What if the 
separators overlap? E.g. we might have '-' and '--' as two separators. 
We might want to choose the shortest separator, or the longest. That 
choice should be a keyword-only argument.
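One possible resolution, sketched below: among separators found at the 
same leftmost position, prefer the longest. (Spelled here as a separate 
function rather than the suggested keyword-only argument.)

```python
def partition_longest(source, *seps):
    # Candidates sort by (position, -length), so the earliest match
    # wins, and ties at the same position go to the longest separator.
    found = [(i, -len(sep), sep) for sep in seps
             if (i := source.find(sep)) != -1]
    if not found:
        return (source, '', '')
    pos, _, sep = min(found)
    return (source[:pos], sep, source[pos + len(sep):])

assert partition_longest('a--b', '-', '--') == ('a', '--', 'b')
```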


-- 
Steve
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/GKMJOSEJYIQIUH2S3VTLVWGICKA2AFVN/


[Python-ideas] Re: PEP 671 (late-bound arg defaults), next round of discussion!

2022-12-25 Thread Steven D'Aprano
On Sat, Dec 24, 2022 at 11:34:19AM -0500, Shironeko wrote:
> 
> Is the => syntax needed? as far as I can think of, the only time where 
> late evaluation is needed is when the expression references the other 
> arguments.

You are missing the most common case, the motivating case, for 
late-bound defaults: mutable defaults.

def spam(x, y=>[]):
pass

Here the intention is to have y's default be a *different* list each 
time you call spam(x), instead of the same list each time.
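For comparison, the long-standing workaround that the proposed `=>` 
syntax would replace, the None-sentinel idiom:

```python
def spam(x, y=None):
    if y is None:
        y = []  # a fresh list on every call
    y.append(x)
    return y

assert spam(1) == [1]
assert spam(2) == [2]  # no state shared between calls
```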

The ability for default values to refer to other parameters is a Nice To 
Have, not a Must Have. It has been a very long time since I have read 
the PEP, and I don't remember whether it reviews other languages to see 
what functionality they provide for defaults, but I don't think many 
other languages allow you to set the default of one parameter to be 
another parameter.


-- 
Steve
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/HS7WJGA4CROKOCVVIECEXBNKEFOBQ3LQ/


[Python-ideas] Re: Idea: Tagged strings in python

2022-12-23 Thread Steven D'Aprano
On Fri, Dec 23, 2022 at 06:02:39PM +0900, Stephen J. Turnbull wrote:

> Many would argue that (POSIX) locales aren't a good fit for
> anything. :-)

:-)

> I agree that it's kind of hard to see anything more complex than a
> fixed table for the entire Unicode repertoire belonging in str,
> though.

I think for practical reasons, we don't want to overload the builtin str 
class with excessive complexity. But the string module? Or third-party 
libraries?


> (I admit that my feeling toward Erdogan makes me less
> sympathetic to the Turks. :-)

Does that include the 70% or more Turks who disapprove of Erdoğan?

There are at least 35 surviving Turkic languages, including Azerbaijani, 
Turkmen, Qashqai, Balkan Gagauz, and Tatar. Although Turkish is the 
single largest of them, it only makes up about 38% of all Turkic 
speakers.

All up, there are about 200 million speakers of Turkic languages. That's 
more than Germanic languages (excluding English) or Japanese. If any 
special case should be a special case, it is the Turkish I Problem.

But as I said, probably not in the builtin str class.


-- 
Steve
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/QRKEYJ2DM2HCJWQEAVLLUNBBJYQ44WBY/


[Python-ideas] Re: Idea: Tagged strings in python

2022-12-21 Thread Steven D'Aprano
On Tue, Dec 20, 2022 at 11:55:49PM -0800, Jeremiah Paige wrote:
> @property
> def data(self):
> return f"{self}"

By my testing, on Python 3.10, this is slightly faster still:

@property
def data(self):
return "".join((self,))

That's about 14% faster than the f-string version.

Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/CCZG6ALFEV3B67LENW5ZDJG5XSHKREG4/


[Python-ideas] Re: Idea: Tagged strings in python

2022-12-20 Thread Steven D'Aprano
On Wed, Dec 21, 2022 at 01:18:46AM -0500, David Mertz, Ph.D. wrote:

> I'm on my tablet, so cannot test at the moment. But is `str.upper()` REALLY
> wrong about the Turkish dotless I (and dotted capital I) currently?!

It has to be. Turkic languages like Turkish, Azerbaijani and Tatar 
distinguish dotted and non-dotted I's, leading to a slew of problems 
infamously known as "The Turkish I problem".

(Other languages use undotted i's but not in the same way, e.g. Irish 
roadsigns in Gaelic usually drop the dot to avoid confusion with í. And 
don't confuse the undotted i with the Latin iota ɩ, which is a 
completely different letter to the Greek iota ι. Alphabets are hard.)

In Turkic languages, we have:

Letter:       ı    I    i    İ
              ---  ---  ---  ---
Lowercase:    ı    ı    i    i
Uppercase:    I    I    İ    İ

Swapping case can never add or remove a dot. (The technical name for the 
dot is "tittle".) Which is perfectly logical, of course.

But most other people with Latin-based alphabets mix the dotted and 
dotless letters together, leading to this lossy table:

Letter:       ı    I    i    İ
              ---  ---  ---  ---
Lowercase:    ı    i    i    i̇
Uppercase:    I    I    I    İ

which is the official Unicode case conversion, which Python follows.

>>> "ıIiİ".lower()
'ıiii̇'
>>> "ıIiİ".upper()
'IIIİ'

Just to make the Turkish I problem even more exciting, you aren't 
supposed to use Turkish rules when changing the case of foreign proper 
nouns. So the popular children's book "Alice Harikalar Diyarında" (Alice 
in Wonderland) should use *both* sets of rules when uppercasing to give 
"ALICE HARİKALAR DİYARINDA".
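A minimal sketch of Turkish-aware casing (an assumption-laden toy: plain 
replace-then-case, deliberately ignoring the foreign-proper-noun 
exception just described):

```python
def turkish_upper(s):
    # Map dotted i to İ before standard uppercasing
    # (dotless ı already uppercases to I).
    return s.replace('i', 'İ').upper()

def turkish_lower(s):
    # Map I to ı and İ to i before standard lowercasing, so the
    # dots are preserved rather than merged.
    return s.replace('I', 'ı').replace('İ', 'i').lower()

assert turkish_upper('istanbul') == 'İSTANBUL'
assert turkish_lower('DİYARINDA') == 'diyarında'
```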

Sometimes the dot can be very significant.

https://gizmodo.com/a-cellphones-missing-dot-kills-two-people-puts-three-m-382026


> That feels like a BPO needed if true.

We do whatever the Unicode standard says to do. They say that 
localisation issues are out of scope for Unicode.


-- 
Steve

Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/SGQAVETZR6AZ3SS55LNVYL3TLKX6SUZ4/


[Python-ideas] Re: Idea: Tagged strings in python

2022-12-20 Thread Steven D'Aprano
On Wed, Dec 21, 2022 at 09:42:51AM +1100, Cameron Simpson wrote:

> With str subtypes, the case that comes to my mind is mixing str 
> subtypes.
[...]
> So, yes, for many methods I might reasonably expect a new html(str). But 
> I can contrive situations where I'd want a plain str

The key word there is *contrive*.

Obviously there are methods that are expected to return plain old 
strings. If you have a html.extract_content() method which extracts the 
body of the html document as plain text, stripping out all markup, there 
is no point returning a html object and a str will do. But most methods 
will need to keep the markup, and so they will need to return a html 
object.

HTML is probably not the greatest example for this issue, because I 
expect that a full-blown HTML string subclass would probably have to 
override nearly all methods, so in this *specific* case the status quo 
is probably fine in practice. The status quo mostly hurts *lightweight* 
subclasses:

class TurkishString(str):
def upper(self):
return TurkishString(str.upper(self.replace('i', 'İ')))
def lower(self):
return TurkishString(str.lower(self.replace('I', 'ı')))

That's fine so long as the *only* operations you do to a TurkishString 
is upper or lower. As soon as you do concatenation, substring 
replacement, stripping, joining, etc you get a regular string.
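Concretely (repeating the class above so the snippet runs standalone):

```python
class TurkishString(str):
    def upper(self):
        return TurkishString(str.upper(self.replace('i', 'İ')))
    def lower(self):
        return TurkishString(str.lower(self.replace('I', 'ı')))

t = TurkishString('istanbul')
assert type(t.upper()) is TurkishString   # overridden method keeps the type
assert type(t.upper() + '!') is str       # concatenation falls back to str
assert type(t.strip()) is str             # as does any inherited str method
```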

So we've gone from a lightweight subclass that needs to override two 
methods, to a heavyweight subclass that needs to override 30+ methods.

This is probably why we don't rely on subclassing that much. Easier to 
just write a top-level function and forget about subclassing.


-- 
Steve
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/Q6JQVEUAQXGX6EMAFVGYGGF7ZENUSMRP/


[Python-ideas] Re: Idea: Tagged strings in python

2022-12-20 Thread Steven D'Aprano
On Mon, Dec 19, 2022 at 05:53:38PM -0800, Ethan Furman wrote:

> Personally, every other time I've wanted to subclass a built-in data type, 
> I've wanted the built-in methods to return my subclass, not the original 
> class.

Enums are special. But outside of enums, I cannot think of any useful 
situation where the desirable behaviour is for methods on a subclass to 
generally return a superclass rather than the type of self.

It's normal behaviour for operations on a class K to return K instances, 
not some superclass of K. I dare say there are a few exceptions, but 
they don't come to mind.


> All of which is to say:  sometimes you want it one way, sometimes the 
> other.  ;-)

Yes, but one way is *overwhelmingly* more common than the other. 
Builtins make the rare form easy and the common form hard.

> Metaclasses, anyone?

Oh gods, we shouldn't need to write a metaclass just to get methods that 
create instances of the calling class instead of one of its 
superclasses.


-- 
Steve
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/TOBBYPYOYBJV2FBC6PQFKZMNK46JCT3Y/


[Python-ideas] Re: Idea: Tagged strings in python

2022-12-19 Thread Steven D'Aprano
On Mon, Dec 19, 2022 at 03:48:01PM -0800, Christopher Barker wrote:
> On Mon, Dec 19, 2022 at 3:39 AM Steven D'Aprano  wrote
> 
> > In any case, I was making a larger point that this same issue applies to
> > other builtins like float, int and more.
> 
> 
> Actually, I think the issue is with immutable types, rather than builtins.

No.

>>> class MyList(list):
... def frobinate(self):
... return "something"
... 
>>> (MyList(range(5)) + []).frobinate()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'list' object has no attribute 'frobinate'

And of course, by default, MyList slices are MyLists too, right? No.

>>> type(MyList(range(5))[1:])
<class 'list'>

This is less of an issue for dicts because there are few dict methods 
and operators which return dicts.

Speaking of dicts, the dict.fromkeys method cooperates with subclasses. 
That proves that it can be done from a builtin. True, it is a 
classmethod rather than an instance method, but any instance method can 
find out its own class by calling `type()` (or the internal, C 
equivalent) on `self`. Just as we can do from Python.
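That cooperative behaviour of `dict.fromkeys` is easy to verify:

```python
class MyDict(dict):
    pass

d = MyDict.fromkeys(['a', 'b'])
assert type(d) is MyDict          # fromkeys respects the subclass
assert d == {'a': None, 'b': None}
```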

> And that’s just the nature of the beast.

Of course it is not. We can write classes in Python that cooperate with 
subclasses. The only difference is that builtins are written in C. There 
is nothing fundamental to C that forces this behaviour. It's a choice.


-- 
Steve

Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/BA6M5Y5ZLPNSGHDRU7U6SBSFCAZAU3MS/


[Python-ideas] Re: Idea: Tagged strings in python

2022-12-19 Thread Steven D'Aprano
On Mon, Dec 19, 2022 at 01:02:02AM -0600, Shantanu Jain wrote:

> collections.UserString can take away a lot of this boilerplate pain from
> user defined str subclasses.

At what performance cost?

Also:

>>> s = collections.UserString('spam and eggs')
>>> isinstance(s, str)
False

which pretty much makes UserString useless for any code that does static 
type checking or runtime isinstance checks.

In any case, I was making a larger point that this same issue applies to 
other builtins like float, int and more.


-- 
Steve
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/UYRYTKMO3L5GSB2F5A4N5I6J3LTA7DQE/


[Python-ideas] Re: Idea: Tagged strings in python

2022-12-19 Thread Steven D'Aprano
On Sun, Dec 18, 2022 at 10:23:18PM -0500, David Mertz, Ph.D. wrote:

> I'd agree to "limited", but not "hostile."  Look at the suggestions I
> mentioned: validate, canoncialize, security check.  All of those are
> perfectly fine in `.__new__()`.

No, they aren't perfectly fine, because as soon as you apply any 
operation to your string subclass, you get back a plain vanilla string 
which bypasses your custom `__new__` and so does not perform the 
validation or security check.

> But this much (say with a better validator) gets you static type checking,
> syntax highlighting, and inherent documentation of intent.

Any half-way decent static type-checker will immediately fail as soon as 
you call a method on this html string, because it will know that the 
method returns a vanilla string, not a html string. And that's exactly 
what mypy does:

[steve ~]$ cat static_check_test.py 
class html(str):
pass

def func(s:html) -> None:
pass

func(html('').lower())

[steve ~]$ mypy static_check_test.py 
static_check_test.py:7: error: Argument 1 to "func" has incompatible 
type "str"; expected "html"
Found 1 error in 1 file (checked 1 source file)


Same with auto-completion. Either auto-complete will correctly show you 
that what you thought was a html object isn't, and fail to show any 
additional methods you added; or worse, it will wrongly think it is a 
html object when it isn't, and allow you to autocorrect methods that 
don't exist.

Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/2JPILXSBEPUKHG4E5GH5KJFNOGNWXDYB/


[Python-ideas] Re: Idea: Tagged strings in python

2022-12-18 Thread Steven D'Aprano
On Sun, Dec 18, 2022 at 07:38:06PM -0500, David Mertz, Ph.D. wrote:

> However, if you want to allow these types to possibly *do* something with
> the strings inside (validate them, canonicalize them, do a security check,
> etc), I think I like the other way:
> 
> #2
> 
> class html(str): pass
> class css(str): pass

The problem with this is that the builtins are positively hostile to 
subclassing. The issue is demonstrated with this toy example:

class mystr(str):
def method(self):
return 1234

s = mystr("hello")
print(s.method())  # This is fine.
print(s.upper().method())  # This is not.


To be usable, we have to override every string method that returns a 
string, including dunders. So your class becomes full of tedious 
boilerplate:

def upper(self):
return type(self)(super().upper())
def lower(self):
return type(self)(super().lower())
def casefold(self):
return type(self)(super().casefold())
# Plus another 29 or so methods

This is not just tedious and error-prone, but it is inefficient: calling 
super returns a regular string, which then has to be copied as a 
subclassed string and the original garbage collected.
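One way to cut the boilerplate is a class decorator that wraps the 
str methods mechanically (a sketch, not a stdlib facility, and it 
still pays the copy cost described above):

```python
import functools

def str_returns_subclass(*names):
    """Class decorator: wrap the named str methods so their plain-str
    result is re-wrapped in the subclass."""
    def decorate(cls):
        for name in names:
            method = getattr(str, name)
            # Bind the current method as a default argument to avoid
            # the late-binding closure pitfall.
            def wrapper(self, *args, _m=method, **kwargs):
                return type(self)(_m(self, *args, **kwargs))
            functools.update_wrapper(wrapper, method)
            setattr(cls, name, wrapper)
        return cls
    return decorate

@str_returns_subclass('upper', 'lower', 'casefold')
class mystr(str):
    def method(self):
        return 1234

assert mystr('hello').upper().method() == 1234  # no longer fails
```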


-- 
Steve
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/O7PU5FLLGNR7IR2V667LDPBBOEXF5NFU/


[Python-ideas] Re: Better (?) PRNG - follow up

2022-12-06 Thread Steven D'Aprano
On Tue, Dec 06, 2022 at 07:58:09PM -0500, David Mertz, Ph.D. wrote:

> You have an error in the code you posted. You never use R2 after one 
> call to SystemRandom.

Ah so I do, thanks for picking that up!

James, see how *easy* it is for experts to notice bugs, at least some of 
them, in a short piece of code that gets right to the point? The more 
extraneous and irrelevant code you give, and that includes old dead 
comments, the more places for bugs to hide and the harder it is for 
people reading to follow the code and spot the error.

Here is the corrected code (only one line needs to be fixed) and its 
output. Note that the conclusion doesn't change.


```python
import random
from statistics import mean, stdev
R2 = random.SystemRandom()
d2 = [R2.randint(1, 9) for i in range(10**5)]
triples = 0  # Count the number of triples.
for i in range(len(d2)-3):
a, b, c = d2[i:i+3]
if a == b == c:  triples += 1

print("mean =", mean(d2))
print("stdev =", stdev(d2))
print("runs of three:", triples)
```

And the output:

```
mean = 4.9922
stdev = 2.5837929328526306
runs of three: 1204
```



-- 
Steve
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/6OWHLTZW5X65DRVYO4OOR2XHGRPAZ4SP/


[Python-ideas] Re: Better (?) PRNG - follow up

2022-12-06 Thread Steven D'Aprano
Thanks for posting your code, but at 178 lines (most of which are either 
commented out or irrelevant to your question) it's a hard slog to work out 
what you're doing.

And as for the seemingly endless sequence of "Random number ... Value entered", 
what information did you think we would get from that? Did you think we 
would study each and every pair of lines, looking for a pattern?

When asking questions about code, it helps to [post a **minimal** 
example](https://stackoverflow.com/help/minimal-reproducible-example) that we 
[can easily run](http://sscce.org/).

In this case, we can compare the Mersenne Twister PRNG with the operating 
system's cryptographically strong RNG and see if we get similar results.

The beauty of a PRNG like the Mersenne Twister is that it is perfectly 
repeatable if you know the seed. So if you repeat this *exact* code, you should 
get the *exact* same output.

(Well, almost. Technically only the output of `random.random` is guaranteed not 
to change from one version to another. But in practice `random.randint` is also 
very stable.)


```python
import random
from statistics import mean, stdev
random.seed(299)
d1 = [random.randint(1, 9) for i in range(10**5)]
triples = 0  # Count the number of triples.
for i in range(len(d1)-3):
a, b, c = d1[i:i+3]
if a == b == c:  triples += 1

print("mean =", mean(d1))
print("stdev =", stdev(d1))
print("runs of three:", triples)
```

If you run that code, the output you get in Python 3.10 should be:

```text
mean = 4.99025
stdev = 2.586159666323446
runs of three: 1244
```

So out of 100000-3 = 99997 groups of three random integers, 1244, or 1.2%, are a 
triplet (a run of three).

Now let's compare the same with the OS's cryptographically strong RNG. Because 
this uses environmental randomness, there's no seed, so we can't replicate the 
results, but you should usually get something similar.


```python
import random
from statistics import mean, stdev
R2 = random.SystemRandom()
d2 = [R2.randint(1, 9) for i in range(100000)]
triples = 0  # Count the number of triples.
for i in range(len(d2)-3):
    a, b, c = d2[i:i+3]
    if a == b == c:  triples += 1

print("mean =", mean(d2))
print("stdev =", stdev(d2))
print("runs of three:", triples)
```

And output should usually be something like this:

```text
mean = 4.99198
stdev = 2.5882045258492763
runs of three: 1228
```

Virtually no difference.

Of course there is a *tiny* chance that you could get something outlandishly 
unlikely, like a run of fifty thousand 6s in a row. If such a fluke was 
impossible, it wouldn't be *random*.

You can see that there is no significant difference between the Mersenne 
Twister and the much more expensive crypto-strength SystemRandom.

The MT is a **very** rigorously studied PRNG. It has **excellent** statistical 
properties, without being too expensive to generate random numbers. And it 
allows replicable results: if you know the seed, you can repeat the output.

The only real disadvantage of MT is that it is not secure for cryptographic 
purposes.

(By the way, using `SystemRandom`, I also got 134 runs of four and 14 runs of 
five.)

___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/OX3OIARPY3CV37M4LX46SFKO7FT2LCPH/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Enhancing variable scope control

2022-12-04 Thread Steven D'Aprano
On Sun, Dec 04, 2022 at 01:34:13PM -0800, Bruce Leban wrote:

>  I agree with most criticism of this proposal, although I'll note that 
> the one place where I'd like something like this is at top level. I 
> often write something like this at top level:
> 
> __part1 = (some calculation)
> __part2 = (some other calculation)
> THING = combine(__part1, __part2)
> __part1 = __part2 = None

A couple of stylistic points...


* I don't know if you have a personal naming convention for double 
  leading underscore names, but to Python and the rest of the community,
  they have no special meaning except inside a class. So you might want 
  to save your typing and just use a single leading underscore for
  private names.

* You probably don't want to assign the left over private names 
  `__part1` and `__part2` to None. Yes, that frees the references to the 
  objects they are bound to, but it still leaves the names floating 
  around in your globals.

Instead, use `del`, which explicitly removes the names from the current 
namespace, and allows the objects to be garbage collected:

_part1 = (some calculation)
_part2 = (some other calculation)
THING = combine(_part1, _part2)
del _part1, _part2

In which case I'm not sure I would even bother with the leading 
underscores.


> If they are large objects and I forget to explictly delete the 
> references, then they won't be garbage collected.

Very true. And when you do forget, what are the consequences? I daresay 
that your program still runs, and there are no observable consequences.

> Looking at all these options, is the cost of adding anything actually 
> worth the benefit? Probably not.

Agreed.

Given how rare it is for this sort of thing to actually matter, I think 
that the correct solution is "remember to del the variable when you are 
done" not "let's complicate the language".


-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/2ATDFK2JB6UGGEOCQXRP3CG7M5AA3CW3/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Enhancing variable scope control

2022-12-02 Thread Steven D'Aprano
On Fri, Dec 02, 2022 at 03:48:36AM -0700, Anony Mous wrote:
> These objections -- such as they are -- are all applicable to every
> instance of "I wrote a function" or even "I named a variable" in any
> particular namespace: not just within imports, either. Anywhere. Global
> context. Sub functions. etc.

The fact that you can't tell the difference between **non-collisions** 
of names in different namespaces, and **actual collisions** of names in 
the same namespace, makes me sad.

The idea that you have absolutely nothing to learn from the hard-won 
experience of other developers is why we have so much unmaintainable, 
bad code in the world :-(

> coddle weak and/or lazy programmers while crippling the rest.

Oh, we have a Real Programmer™ here.

http://www.catb.org/jargon/html/story-of-mel.html

I'm glad that your Python pre-processor works for you.


-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/TCGKWTKAVENHUSMB2KD7LMQFIZH23TWC/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Enhancing variable scope control

2022-12-01 Thread Steven D'Aprano
On Thu, Dec 01, 2022 at 03:19:38PM -0700, Anony Mous wrote:

> I'd love to hear a justification -- any justification -- against what I'm
> talking about, because to date, I've never run into one. Many have tried,
> too. :)

What is the `str` interface? Perhaps not the best example, because 
strings have so many methods, but it will do. Under Python's design, we 
know what methods strings have. We absolutely categorically know that if 
`type(obj) is str` then we can rely on a specific interface, namely 47 
methods and 33 dunders.

What is the `str` interface with monkey-patching allowed? It's 
impossible to predict. The string interface depends on which methods 
have been monkey-patched in.

It gets worse if we allow monkey-patching of individual instances as 
well as the entire class. Now you don't necessarily know that two 
strings support the same operations, even if their type is the same.

If you have a string, you don't know whether or not it will support the 
TestDottedQuad method.

These are points of friction that can make the language harder to use. 
How is a beginner supposed to know which methods are native to strings, 
and which are monkey-patched in? Where is `str.TestDottedQuad` 
implemented and documented?

If you have a class and method you are not familiar with, say 
`Widget.frob()`, in a large application you don't know well, how do you 
find where `frob` was added to the class? It could have come from 
anywhere.

What happens if you go to monkey-patch string with TestDottedQuad and 
*some completely unrelated library* has beaten you to it and already done 
so? Monkey-patching is safe so long as you are the only one doing it. As 
soon as libraries get in the act, things go downhill very quickly.

These are not insurmountable problems. Python supports powerful 
introspection tools. Most classes that we write in pure Python, using 
the `class` keyword, support monkey-patching not just the class but 
individual instances as well, and it is considered a feature that most 
classes are extendable in that way. We mostly deal with that feature by 
*not using it*.

The Ruby community learned the same lesson: the best way to use 
monkey-patching is to not use monkey-patching.

https://avdi.codes/why-monkeypatching-is-destroying-ruby/

So the ability to monkey-patch classes and instances is considered to be 
feature of marginal usefulness. Sure, sometimes it's handy, but mostly it 
just adds complexity.

When it comes to builtins, the deciding factor is that the builtins are 
programmed in C (in CPython, other interpreters may do differently), and 
for speed and efficiency, and immutability, they usually don't include a 
`__dict__` that you can monkey-patch into. The class itself may have a 
read-only mapping proxy rather than a dict you can add items into.

So in principle Python supports monkey-patching, but in practice CPython 
at least typically removes that support from most builtin types for 
efficiency reasons. And because monkey-patching is frowned upon, we 
don't miss it.
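This is easy to see interactively in CPython (a quick sketch, not part of the original thread; the exact error message varies by version):

```python
import types

# Attempting to monkey-patch a builtin type fails in CPython:
try:
    str.frob = lambda self: True   # hypothetical method
    patched = True
except TypeError:
    patched = False

# The class namespace is exposed as a read-only mapping proxy,
# not a plain dict you can insert new methods into:
is_proxy = type(vars(str)) is types.MappingProxyType

print(patched, is_proxy)
```

Pure-Python classes, by contrast, have a real writable `__dict__`, which is why they support monkey-patching by default.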


> In any case, I wanted that ability because I was doing a lot of interesting
> things to strings, and str was doing a stellar job of making that more
> difficult.

I find that implausible. You can do whatever interesting things you like 
with strings, you just can't use method syntax.

You can't use `mystring.TestDottedQuad()` but you can use 
`TestDottedQuad(mystring)`, which is just a change in order (and one 
character fewer to type).

I suppose the one thing you can't easily do with function syntax is give 
each individual instance its own distinct method. So if you have three 
instances, spam, eggs, cheese, with monkey-patching you can give all 
three instances their own frob() method, each method doing something 
different. But that way leads to chaos.
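For illustration only (the `Widget` class and `frob` methods here are hypothetical, not from the thread), per-instance patching can be done with `types.MethodType`:

```python
import types

class Widget:
    pass

spam, eggs = Widget(), Widget()

# Each instance gets its own, different frob() -- exactly the
# kind of divergence the paragraph above warns about.
spam.frob = types.MethodType(lambda self: "spam frob", spam)
eggs.frob = types.MethodType(lambda self: "eggs frob", eggs)

print(spam.frob(), eggs.frob())
```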


> Again, explain a danger to a nominal class user that arises due to adding
> NEW, immutable in the current instance, functions/functionality to a class
> that do not alter the existing base functionality. ANY danger.

As I said above, that's fine when you are the only one doing it. But as 
soon as monkey-patching becomes popular, and everyone starts using it, 
then you have to deal with conflicts.

Library A and library B both want to monkey-patch strings with the same 
method. Now you can only use one or the other. If they just used 
functions, they would be fine, because A.TestDottedQuad and 
B.TestDottedQuad live in separate namespaces, but with monkey-patching 
they both try to install into the same namespace, causing a clash.


-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/NM2F743TS4U6WIESVLZQ6MC3IWUCFKEE/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Enhancing variable scope control

2022-11-30 Thread Steven D'Aprano
On Wed, Nov 30, 2022 at 01:27:32PM -0700, Anony Mous wrote:

> the string class, which has some annoying tight couplings to "string" and
> 'string')

What does that mean?

It sounds like you are complaining that the syntax for creating strings 
creates strings. Or possibly the other way around: that strings are 
created by string syntax. What did you expect?

-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/OUEVCJ6JIVFW6KQHGSFUH6E45UQDEKRY/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Better (?) PRNG

2022-11-15 Thread Steven D'Aprano
Wes, what purpose do you think these data dumps have?


-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/PRKTQ6USD76KTGBHI2EEG2WLL26DZYYU/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Better (?) PRNG

2022-11-15 Thread Steven D'Aprano
On Tue, Nov 15, 2022 at 05:10:06AM -0500, Wes Turner wrote:

> There should be better software random in python.

Better is what way? What do you think the current MT PRNG lacks?


-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/CN77I6Q2K2PLB6JK2QR76L6WEE5POHKW/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Better (?) PRNG

2022-11-15 Thread Steven D'Aprano
On Sat, Nov 05, 2022 at 01:37:30AM -0500, James Johnson wrote:
> I wrote the attached python (3) code to improve on existing prng functions.
> I used the time module for one method, which resulted in
> disproportionate odd values, but agreeable means.

First the good news: your random number generator at least is well 
distributed. We can look at the mean and stdev, and it is about the same 
as the MT PRNG:

>>> import random, time, statistics
>>> def rand(n):
...     return int(time.time_ns() % n)
... 
>>> data1 = [rand(10) for i in range(1000000)]
>>> data2 = [random.randint(0, 9) for i in range(1000000)]
>>> statistics.mean(data1)
4.483424
>>> statistics.mean(data2)
4.498849
>>> statistics.stdev(data1)
2.8723056046255744
>>> statistics.stdev(data2)
2.8734388686467534

There's no real difference there.

But let's look at rising and falling sequences. Let's walk through the 
two runs of random digits, and take a +1 if the value goes up, -1 if it 
goes down, and 0 if it stays the same. With a *good quality* random 
sequence, one time in ten you should get the same value twice in a row 
(on average). And the number of positive steps and negative steps should 
be roughly the same. We can see that with the MT output:

>>> steps_mt = []
>>> for i in range(1, len(data2)):
...     a, b = data2[i-1], data2[i]
...     if a > b: steps_mt.append(1)
...     elif a < b: steps_mt.append(-1)
...     else: steps_mt.append(0)
... 
>>> steps_mt.count(0)
99826

That's quite close to the expected 100,000 zeroes we would expect. And 
the +1s and -1s almost cancel each other out, with only a small 
imbalance:

>>> sum(steps_mt)
431

The ratio of -ve steps to +ve steps should be 1, and in this sample we 
get 0.99904 which is pretty close to what we expect.

These results correspond to sample probabilities:

* Probability of getting the same as the previous number: 
  0.09983  (theoretical 0.1)
* Probability of getting a larger number than the previous:
  0.45030  (theoretical 0.45)
* Probability of getting a smaller number than the previous: 
  0.44987  (theoretical 0.45)

So pretty close, and it supports the claim that MT is a very good 
quality RNG.

But if we do the same calculation with your random function, we get 
this:

>>> steps.count(0)
96146
>>> sum(steps)
-82609

Wow! This gives us probabilities:

* Probability of getting the same as the previous number: 
  0.09615  (theoretical 0.1)
* Probability of getting a larger number than the previous:
  0.41062  (theoretical 0.45)
* Probability of getting a smaller number than the previous: 
  0.49323  (theoretical 0.45)

I ran the numbers again, with a bigger sample size (3 million instead of 
1 million) and the bias got much worse. The ratio of -ve steps to +ve 
steps, instead of being close to 1, was 1.21908. That's a 20% bias.

The bottom line here is that your random numbers, generated using the 
time, have strong correlations between values. And that makes it a much 
poorer choice for a PRNG.

-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/D6SV3NQ6OHAPKBGM27UBF5KUMUHABDP5/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Better (?) PRNG

2022-11-15 Thread Steven D'Aprano
On Tue, Nov 15, 2022 at 12:19:48AM -0600, James Johnson wrote:

> The random function in Python is not really adequate for a magic eight ball
> program,

That is an astonishing claim for you to assert without evidence.

Python's PRNG is the Mersenne Twister, which is one of the most heavily 
studied and best PRNGs known. Yes, it's a bit old now, but it still gets 
the job done, and is more than adequate for non-security related 
randomness, including magic eight ball programs. There are PRNGs which 
are faster, or use less memory, or have better statistical properties, 
but not usually all at once. The MT is the work-horse of modern PRNGs.

What properties do you think it lacks which are necessary for a magic 
eight ball program?

Arguably, the Mersenne Twister is hugely overkill for a M8B program. It 
has a period of at least 2**19937-1, or about 10**6001 (that's a 1 
followed by 6001 zeroes). Which means that if you ran your M8B a billion 
billion billion times a second, it would take more than a billion 
billion billion [five thousand more billions] billion years before the 
cycle of responses started again.
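The size of that period is easy to verify (a quick check, not from the original message):

```python
# The Mersenne Twister's period, as quoted above.
period = 2**19937 - 1
print(len(str(period)))  # 6002 decimal digits, i.e. about 10**6001
```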

Of course there are cleverer ways of predicting the output of a Mersenne 
Twister than waiting for the cycle to repeat, but it's a magic eight 
ball. Who cares? Even a 32-bit linear congruential generator, so long as 
it is not utterly dire, would do the job.

-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/DTZ4BPDUIVJGSLPRUOLPDHHAS6YMTX7T/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Add mechanism to check if a path is a junction (for Windows)

2022-11-08 Thread Steven D'Aprano
On Tue, Nov 08, 2022 at 09:55:04PM +, Barry wrote:

> But anyone that is suitably motivated can implement this.

This is true for every function in a Turing Complete language. Perhaps 
we should start using iota or jot? :-)

https://en.wikipedia.org/wiki/Iota_and_Jot

A "suitably motivated" person could implement ismount, islink, the 
entire os and Pathlib modules, and more. But they probably won't do as 
good a job of it as what we have.

On systems that support junction points, they are as much a fundamental 
file system object as symlinks, directories and mount points. 
Non-experts will probably have to google for hints how to implement 
this, and the internet is full of bad advice. On Stackoverflow, I find 
this question:

https://stackoverflow.com/questions/17174703/symlinks-on-windows

which starts off by giving false (or at least obsolete) information 
that Windows doesn't support symlinks, only shortcuts (NTFS has supported 
symlinks since at least Windows Vista, in 2006), and then later gives a 
solution for detecting junction points which requires ctypes.
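For what it's worth, on Python 3.8+ a ctypes-free check is possible on top of `os.lstat`; this is only a sketch of the idea, not a proposed stdlib implementation (the semantics are Windows-only, and it simply returns False elsewhere):

```python
import os

# IO_REPARSE_TAG_MOUNT_POINT marks NTFS junctions.
_MOUNT_POINT_TAG = 0xA0000003

def is_junction(path):
    """Return True if path is an NTFS junction; False on platforms
    whose lstat results lack st_reparse_tag (Python 3.8+, Windows)."""
    try:
        st = os.lstat(path)
    except OSError:
        return False
    return getattr(st, "st_reparse_tag", 0) == _MOUNT_POINT_TAG

print(is_junction("."))
```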

Most Python coders are using Windows. Surely it is time to do better 
for them than "just roll your own"? 


-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/VLUZSVAS6TJVRTQNRGHZJ7AQIVFEMGIS/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Add mechanism to check if a path is a junction (for Windows)

2022-11-08 Thread Steven D'Aprano
On Mon, Nov 07, 2022 at 07:31:36PM -, Charles Machalow wrote:

> I propose adding a mechanism to both pathlib.Path and os.path to check 
> if a given path is a junction or not. Currently is_symlink/islink 
> return False for junctions.

+1 on a function is_junction.

I am neutral on the question of whether that function should:

1. only exist on Windows,
2. or exist on other platforms but always return False.

Prior art suggests the second is probably better: when Python doesn't 
support symbolic links, `os.islink` exists but always returns False.

https://docs.python.org/3/library/os.path.html#os.path.islink

I am also neutral on whether ismount() on Windows should always return 
True for junctions, as well as mount points. I leave that to Windows 
experts to decide.

-1 on adding a flag parameter to existing functions.


-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/M36XXF4QZ6VJJSTEDPOTAGEODC35NVLM/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Feedback before submission of PEP 661: Sentinel Values

2022-09-10 Thread Steven D'Aprano
I haven't read the PEP yet, so this should not be read as either support 
of or opposition to the design, just commenting on one aspect.

On Sat, Sep 10, 2022 at 01:42:38PM -0700, Christopher Barker wrote:

> > The current design is that sentinels with the same name from the same
> > module will always be identical. So for example `Sentinel("name") is
> > Sentinel("name")` will be true.
> 
> 
> Hmm -- so it's a semi-singleton -- i.e. behaves like a singlton, but only
> in a particular module namespace.
[...]
> It would be nice if they could be true singletons, but that would require a
> python-global registry of some sort, which is, well, impossible.

Easy peasy. Add a second, optional, parameter to the Sentinel callable 
(class?) to specify the namespace, defaulting to "the current module", 
then register the object to `builtins`.

Of course, interpreter-global state should be discouraged, except for 
the interpreter itself :-) so we probably shouldn't do that.

But a namespace parameter will have another advantage: it will allow 
packages to refactor their initialisation code to a sub-module:

```python
# package/__init__.py
from package.setup import *

# package/setup.py
import package
import sentinel
MySentinel = sentinel.Sentinel("name", namespace=package)
```

Yes, that's a circular import, but I think it is a safe one.
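A registry keyed on (namespace, name) is enough to get the identity behaviour being discussed. This is an illustrative sketch only, not the PEP's reference implementation:

```python
import sys

_registry = {}

class _Sentinel:
    def __init__(self, name):
        self._name = name
    def __repr__(self):
        return f"<{self._name}>"

def Sentinel(name, namespace=None):
    """Same name + same namespace -> the identical object."""
    if namespace is None:
        # Default to the calling module's name.
        namespace = sys._getframe(1).f_globals.get("__name__", "__main__")
    elif not isinstance(namespace, str):
        # Accept a module object as well as a string.
        namespace = getattr(namespace, "__name__", str(namespace))
    key = (namespace, name)
    if key not in _registry:
        _registry[key] = _Sentinel(name)
    return _registry[key]
```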

> I would like to see a handful of common sentinels defined. Is there a plan
> to put a few that are useful for the stdlib in the `sentinel` module?

We already have None. What other "common sentinels" do you expect to be 
useful across the stdlib?


> Maybe that doesn't belong in the PEP, but I like the idea of
> 
> a) every similar use in the stdlib uses the same one

Do you have examples of such stdlib "similar use" of sentinels? Apart 
from None of course.


-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/Q4T4UOTNUEBRJOIYV7VDYQSSG4CPDALK/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Add InvalidStateError to the standard exception hierarchy

2022-09-02 Thread Steven D'Aprano
On Fri, Sep 02, 2022 at 12:53:47AM -, Steve Jorgensen wrote:

> I didn't say that I was talking about a file. In fact, today, I'm 
> talking about an object that manages a subprocess. If a caller tries 
> to call a method of the manager to interact with the subprocess when 
> the subprocess has not yet been started or after it has been 
> terminated, then I want to raise an appropriate exception.

Your API shouldn't allow the object to be created and returned until it 
has been started.

Long ago I read a great blog post describing this idiom as an 
anti-pattern:

obj = SomeObject(args)
obj.make_it_work()
obj.do_something()

where constructing the object isn't sufficient to start using it, you 
have to call a second, magical method (sometimes called "init()"!) to 
make it actually usable. Unfortunately I lost the URL and have never 
been able to find it again, but it made a big impression on me.

You will notice that we don't do that with Python builtins:

# we don't do this!
f = open('filename', 'w')
f.open()  # Actually open it for writing.
f.write('stuff')

# Or this:
d = dict(key=value)
d.init() # Can't use the dict until we make it work.
d.update(something)

Returning a dict in an initialised but invalid state, which needs a 
second method call to make it valid, would be an obvious antipattern. 
Why would we do such a thing? It's a bad design! Just have the dict 
constructor return the dict in the fully valid state.

Which of course we do: for most (all?) objects in the stdlib, creating 
the object *fully* initialises it, so it is ready to use, immediately, 
without needing a special "make it work" method call.

Of course there may be some exceptions to that rule, but they are 
uncommon.
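One common way to avoid the two-phase pattern for something like a subprocess manager is a factory classmethod, so no half-initialised instance ever escapes. The names here are hypothetical, a sketch rather than a recommendation for the poster's actual API:

```python
import subprocess
import sys

class ProcManager:
    """Instances always wrap an already-started subprocess."""

    def __init__(self, proc):
        self._proc = proc

    @classmethod
    def start(cls, argv):
        # The 'make it work' step happens before the caller
        # ever sees the object.
        return cls(subprocess.Popen(argv, stdout=subprocess.PIPE))

    def wait(self):
        out, _ = self._proc.communicate()
        return self._proc.returncode, out

mgr = ProcManager.start([sys.executable, "-c", "print('ok')"])
code, out = mgr.wait()
print(code, out)
```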

> I am 
> raising a custom exception, and it annoys me that it has to simply 
> inherit from Exception when I think that an invalid state condition is 
> a common enough kind of issue that it should have a standard exception 
> class in the hierarchy.

Perhaps you should take your lead from file objects and inherit 
from ValueError. Trying to interact with a terminated process is 
analogous to writing to a closed file.

What sort of features or APIs do you expect to inherit from this 
InvalidStateError class? 


-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/SCTMGDJZ3THNN4EJDBTYJUV5XW77YTQD/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Add InvalidStateError to the standard exception hierarchy

2022-09-02 Thread Steven D'Aprano
On Fri, Sep 02, 2022 at 06:49:37AM +0800, Matthias Görgens wrote:
> >
> > If the target of the call isn't in an appropriate state, isn't that a
> > bug in the constructor that it allows you to construct objects that are
> > in an invalid state?
> >
> > You should fix the object so that it is never in an invalid state rather
> > than blaming the caller.
> >
> 
> You can't really do that with files that have been closed.

Files are not *constructed* in a closed state, or at least `open` 
doesn't return file objects that have to be closed. They are returned in 
an open state. Files can be closed afterwards.

A closed file is not *invalid*, it is just closed. Typical file APIs go 
back to the earliest days of computing, long before object oriented 
programming was a thing, and we're stuck with them. If we were designing 
file I/O from scratch today, we might not have any such concept of open 
and closed files.

But regardless, we don't need an InvalidStateError exception for closed 
files, because (1) they're not invalid, and (2) we already have an 
exception for that, ValueError.

As I mentioned in a previous post, perhaps that should have been some 
sort of IOError, but we're stuck with ValueError.


-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/4FDFLQ5S2XHP2L3UACW2S2UMWRBQNKKZ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Add InvalidStateError to the standard exception hierarchy

2022-09-02 Thread Steven D'Aprano
On Thu, Sep 01, 2022 at 03:11:29PM -0700, Bruce Leban wrote:

> * a stream-like object that has been closed and you attempt to read from or
> write data to it.

That would be a ValueError:

>>> f.write('a')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: I/O operation on closed file.

Its arguable that this could (should?) have been some sort of IOError 
instead, but that ship has sailed.


> * a random number generator that has not been initialized with a seed (in
> the case where you have a constructor which doesn't also initialize it).

That would be a bug in the constructor.


> * a hash function which you try to compute the digest without having added
> any data to it.

That shouldn't be an error at all:

>>> a = hashlib.sha256()
>>> a.hexdigest()
'e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855'


-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/4XGZ7XDM5TOJPTEHQ44PBVCPQB6BJOOJ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Add InvalidStateError to the standard exception hierarchy

2022-09-01 Thread Steven D'Aprano
On Thu, Sep 01, 2022 at 09:40:05PM -, Steve Jorgensen wrote:

> I frequently find that I want to raise an exception when the target of 
> a call is not in an appropriate state to perform the requested 
> operation. Rather than choosing between `Exception` or defining a 
> custom exception, it would be nice if there were a built-in 
> `InvalidStateError` exception that my code could raise.

If the target of the call isn't in an appropriate state, isn't that a 
bug in the constructor that it allows you to construct objects that are 
in an invalid state?

You should fix the object so that it is never in an invalid state rather 
than blaming the caller.

I believe that the interpreter may sometimes raise RuntimeError for 
cases where objects are in a broken internal state, but I've never seen 
one in real life. Oh, no, I was thinking of SystemError. Nevertheless, 
for my own classes, I've sometimes used RuntimeError for broken internal 
state. I have always considered that a bug in my code.

> In cases where I want to define a custom exception anyway, I think it 
> would be nice if it could have a generic `InvalidStateError` exception 
> class for it to inherit from.

What functionality do you expect this InvalidStateError superclass to 
provide that isn't already provided by Exception?


-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/IZ7MFY36GYPMW5XPCUHZRRUB6GMK26EH/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Add __name__ to functools.partial object

2022-08-31 Thread Steven D'Aprano
On Tue, Aug 30, 2022 at 03:28:06PM -0400, Wes Turner wrote:

> - Copying __qual/name__ would definitely be a performance regression

Doubtful that it would be meaningful. It's just a lookup and assignment. 
From the perspective of the partial object, it's just

self.__name__ = func.__name__

(give or take), and that's not likely to be expensive, especially if 
implemented in C as I think partial objects are. 

The string itself doesn't have to be copied, just the reference to it, 
which is fast. And its just a once-off cost for creating the partial 
object, it isn't going to slow down calling them.

Even if the cost of this is measurable, it is unlikely to be significant 
to anyone unless their entire app is just creating millions of partial 
objects without actually calling them or using them in any way.
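The assignment under discussion can be sketched in pure Python today, since partial objects already accept new attributes (`named_partial` is a hypothetical helper, not a stdlib function):

```python
import functools

def named_partial(func, *args, **kwargs):
    p = functools.partial(func, *args, **kwargs)
    # The one-off reference copy being discussed: no string is
    # duplicated, just another name bound to the same object.
    p.__name__ = getattr(func, "__name__", repr(func))
    return p

two_to_the = named_partial(pow, 2)   # calls pow(2, exp)
print(two_to_the.__name__, two_to_the(5))
```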



-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/EVF5OM53KWWBCFXTFP2APDC4DXRFTGDV/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Add __name__ to functools.partial object

2022-08-31 Thread Steven D'Aprano
On Mon, Aug 29, 2022 at 09:31:25PM -0700, Charles Machalow wrote:
> Hey folks,
> 
> I propose adding __name__ to functools.partial.

https://github.com/python/cpython/issues/91002

___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/BBQW6Z6HQYI5DMLA4DPS5ZSQUYTKIBID/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Null wildcard in de-structuring to ignore remainder and stop iterating

2022-06-20 Thread Steven D'Aprano
On Sun, Jun 19, 2022 at 01:34:35AM +0100, Rob Cliffe via Python-ideas wrote:

> To me, the natural implementation of slicing on a non-reusable iterator 
> (such as a generator) would be that you are not allowed to go backwards 
> or even stand still:
>     mygen[42]
>     mygen[42]
> ValueError: Element 42 of iterator has already been used

How does a generic iterator, including generators, know whether or not 
item 42 has already been seen?

islice for generators is really just a thin wrapper around an iterator 
that operates something vaguely like this:

    for i in range(start):
        next(iterator)  # throw the result away
    for i in range(start, end):
        yield next(iterator)

It doesn't need to keep track of the last index seen, it just blindly 
advances through the iterator, with some short-cuts for the sake of 
efficiency.
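The real itertools.islice behaves just like that sketch: it blindly consumes the underlying iterator, so successive slices simply continue from wherever the iterator happens to be, with no index bookkeeping at all:

```python
from itertools import islice

it = iter(range(10))
print(list(islice(it, 3)))  # [0, 1, 2] -- consumes the first three items
print(list(islice(it, 2)))  # [3, 4] -- no index tracking, just keeps advancing
```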


-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/A74YRQJ4QID72GE5I2A3QKOU6NHLJNCD/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: PEP 671 (late-bound arg defaults), next round of discussion!

2022-06-20 Thread Steven D'Aprano
On Tue, Jun 21, 2022 at 12:13:08AM +1000, Chris Angelico wrote:

> Nice analogy. It doesn't hold up.
> 
> Consider this function:
> 
> def f(stuff, max=>len(stuff)):
> stuff.append(1)
> print(max)
> 
> f([1,2,3])
> 
> How would you use lazy evaluation to *guarantee* the behaviour here?

By "the behaviour" I presume you want `max` evaluated before the body of 
the function is entered, rather than at its point of use.

Same way your implementation does: ensure that the interpreter 
fully evaluates `max` before entering the body of the function.


> The only way I can imagine doing it is basically the same as I'm
> doing: that late-bound argument defaults *have special syntax and
> meaning to the compiler*. If they were implemented with some sort of
> lazy evaluation object, they would need (a) access to the execution
> context, so you can't just use a function; 

Obviously you can't just compile the default expression as a function 
*and do nothing else* and have late bound defaults magically appear from 
nowhere.

Comprehensions are implemented as functions. Inside comprehensions, the 
walrus operator binds to the caller's scope, not the comprehension scope.

>>> def frob(items):
...     thunk = ((w:=len(items)) for x in (None,))
...     next(thunk)
...     return ('w' in locals(), w)
... 
>>> frob([1, 2, 3, 4, 5])
(True, 5)

That seems to be exactly the behaviour needed for lazy evaluation 
thunks, except of course we don't need all the other goodies that 
generators provide (e.g. send and throw methods).

One obvious difference is that currently if we moved that comprehension 
into the function signature, it would use the `items` from the 
surrounding scope (because of early binding). It has to be set up in 
such a way that items comes from the correct scope too.

If we were willing to give up fast locals, I think that the normal LEGB 
lookup will do the trick. That works for locals inside classes, so I 
expect it should work here too.


> (b) guaranteed evaluation on function entry,

If that's the behaviour that people prefer, sure. Functions would need 
to know which parameters were:

1. defined with a lazy default;
2. and not passed an argument by the caller (i.e. actually using 
   the default)

and for that subset of parameters, evaluate them, before entering the 
body of the function. That's kinda what you already do, isn't it?

One interesting feature here is that you don't have to compile the 
default expressions into the body of the function. You can stick them in 
the code object, as distinct, introspectable thunks with a useful repr. 
Potentially, the only extra code that needs go inside the function body 
is a single byte-code to instantiate the late-bound defaults.

Even that might not need to go in the function body, it could be part of 
the CALL_FUNCTION and CALL_FUNCTION_KW op codes (or whatever we use).


> (c) the ability to put it in the function header.

Well sure. But if we have syntax for a lazily evaluated expression it 
would be an expression, right? So we can put it anywhere an expression 
can go. Like parameter defaults in a function header.

The point is, Rob thought (and possibly still does, for all I know) that 
lazy evaluation is completely orthogonal to late-bound defaults. The PEP 
makes that claim too, even though it is not correct. With a couple of 
tweaks that we have to do anyway, and perhaps a change of syntax (and 
maybe not even that!) we can get late-bound defaults *almost* for free 
if we had lazy evaluation.

That suggests that the amount of work to get *both* is not that much 
more than the work needed to get just one. Why have a car that only 
drives to the mall on Thursdays when you can get a car that can drive 
anywhere, anytime, and use it to drive to the mall on Thursday as well?

> Please stop arguing this point. It is a false analogy and until you
> can demonstrate *with code* that there is value in doing it, it is a
> massive red herring.

You can make further debate moot at any point by asking Python-Dev for a 
sponsor for your PEP as it stands right now. If you think your PEP is 
as strong as it can possibly be, you should do that.

(You probably want to fix the broken ReST first.)

Chris, you have been involved in the PEP process for long enough, as 
both a participant of discussions and writer of PEPs, that you know damn 
well that there is no requirement that all PEPs must have a working 
implementation before being accepted, let alone being *considered* by 
the community.

Yes, we're all very impressed that you are a competent C programmer who 
can write an initial implementation of your preferred design. But your 
repeated gate-keeping efforts to shut down debate by wrongly insisting 
that only a working implementation may be discussed is completely out of 
line, and I think you know it.

Being a C programmer with a working knowledge of the CPython internals 
is not, and never has been, a prerequisite for 

[Python-ideas] Re: PEP 671 (late-bound arg defaults), next round of discussion!

2022-06-20 Thread Steven D'Aprano
On Tue, Jun 21, 2022 at 03:15:32AM +0100, Rob Cliffe wrote:

> Why do people keep obscuring the discussion of a PEP which addresses 
> Problem A by throwing in discussion of the (unrelated) Problem B?
> (Chris, and I, have stated, ad nauseam, that these *are* unrelated 
> problems.

Chris says:

"Even if Python does later on grow a generalized lazy evaluation
feature, it will only change the *implementation* of late-bound
argument defaults, not their specification."

So you are mistaken that they are unrelated.

Chris could end this debate (and start a whole new one!) by going to the 
Python-Dev mailing list and asking for a sponsor, and if he gets one, 
for the Steering Council to make a ruling on the PEP. He doesn't *need* 
consensus on Python-Ideas. (Truth is, we should not expect 100% 
agreement on any new feature.)

But any arguments, questions and criticisms here which aren't resolved 
will just have to be re-hashed when the core devs and the Steering 
Council read the PEP. They can't be swept under the carpet.


-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/2NAYYR4YX33KRFH5NH3RNHXXTNX2OVSS/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: PEP 671 (late-bound arg defaults), next round of discussion!

2022-06-20 Thread Steven D'Aprano
On Sun, Jun 19, 2022 at 02:21:16AM +0100, Rob Cliffe via Python-ideas wrote:

> Sorry, but I think all this talk about lazy evaluation is a big red herring:
>     (1) Python is not Haskell or Dask.

Python is not Haskell, but we stole list comprehensions and pattern 
matching from it. Python steals concepts from many languages.

And Python might not be Dask, but Dask is Python.

https://www.dask.org/


>     (2) Lazy evaluation is something Python doesn't have,

Python has lazily evaluated sequences (potentially infinite sequences) 
via generators and iterators. We also have short-circuit evaluation, 
which is a form of lazy evaluation. There may be other examples as well.
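A minimal demonstration of short-circuit evaluation behaving lazily (the `expensive` function is just an illustration):

```python
def expensive():
    print("expensive() was evaluated")
    return True

# The right-hand operand is never evaluated: `and` short-circuits on False.
result = False and expensive()
print(result)  # False, and nothing else was printed
```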

We may also get lazy importing soon:

https://peps.python.org/pep-0690/

At least one of Python's direct competitors in the scientific community, 
R, has lazy evaluation built in.


> and would be 
> a HUGE amount of work for Chris (or anyone) to implement

I don't know how hard it is to implement lazy evaluation, but speaking 
with the confidence of the ignorant, I expect not that hard if you don't 
care too much about making it super efficient. A lazy expression, or 
thunk, is basically just a zero-argument function that the interpreter 
knows to call.

If you don't care about getting Haskell levels of efficiency, that's 
probably pretty simple to implement.
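As an illustration only (not a proposal for actual syntax or an efficient implementation), such a thunk can be sketched in pure Python in a few lines:

```python
class Thunk:
    """A lazily evaluated value: wraps a zero-argument function and
    caches the result the first time it is forced."""
    _UNSET = object()

    def __init__(self, func):
        self._func = func
        self._value = self._UNSET

    def force(self):
        if self._value is self._UNSET:
            self._value = self._func()
            self._func = None  # release the closure once evaluated
        return self._value

lazy_sum = Thunk(lambda: sum(range(1000)))  # nothing computed yet
print(lazy_sum.force())  # 499500 -- computed now
print(lazy_sum.force())  # 499500 -- cached, not recomputed
```

A real implementation would hide the explicit `force()` call behind the interpreter, which is where the actual work lies.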

Rewriting Python from the ground up to be completely lazy like Haskell 
would be a huge amount of work. Adding some sort of optional and 
explicit laziness, like R and F# and other languages use, would possibly 
be little more work than just adding late-bound defaults.

Maybe.


> And in the unlikely event 
> that Chris (or someone) DID implement it, I expect there would be a 
> chorus of "No, no, that's not how (I think) it should work at all".

The idea is that you plan your feature's semantics before writing an 
implementation. Even if you plan to "write one to throw away", and do 
exploratory coding, you should still have at least a vague idea of the 
desired semantics before you write a single line of code.


>     (3)  Late-bound defaults that are evaluated at function call time, 
> as per PEP 671, give you an easy way of doing something that at present 
> needs one of a number of workarounds (such as using sentinel values) all 
> of which have their drawbacks or awkward points.

Yes, we've read the PEP thank you :-)

Late-bound defaults also have their own drawbacks. It is not a question 
of whether this PEP has any advantages. It clearly does! The question is 
where the balance of pros versus cons falls.


>     (4) The guarantee that a late-bound default WILL be executed at 
> function call time, can be useful, even essential (it could be 
> time-dependent or it could depend on the values - default or otherwise - 
> of other parameters whose values might be changed in the function 
> body).

Okay. But a generalised lazy evaluation mechanism can be used to 
implement PEP 671 style evaluation.

Let me see if I can give a good analogy... generalised lazy evaluation 
is like having a car that can drive anywhere there is a road, at any 
time of the day or night. Late-bound defaults is like having a car that 
can only drive to the local mall and back, and only on Thursdays.

That's okay if you want to drive to the local mall on Thursdays, but if 
you could only have one option, which would be more useful?

> Sure, I appreciate that there are times when you might want to 
> defer the evaluation because it is expensive and might not be needed, but:
>     (5) If you really want deferred evaluation of a parameter default, 
> you can achieve that by explicitly evaluating it, *at the point you want 
> it*, in the function body.  Explicit is better than implicit.

That's not really how lazy evaluation works or why people want it.

The point of lazy evaluation is that computations are transparently and 
automatically delayed until you actually need them. Lazy evaluation is 
kind of doing the same thing for CPUs as garbage collection does for 
memory. GC kinda sorta lets you pretend you have infinite memory (so 
long as you don't actually try to use it all at once...). Lazy 
evaluation kinda sorta lets you pretend your CPU is infinitely fast (so 
long as you don't try to actually do too much all at once).

If you think about the differences between generators and lists, that 
might help. A generator isn't really like a list that you just evaluate 
a few lines later. Its a completely different way of thinking about 
code, and often (but not always) better.
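The difference is easy to see with a side effect in the element expression: the list pays the whole cost up front, while the generator pays per item, on demand:

```python
def square(n):
    print(f"computing {n}**2")
    return n * n

eager = [square(n) for n in range(3)]   # prints three times, immediately
lazy = (square(n) for n in range(3))    # prints nothing yet
print(next(lazy))                       # computes (and prints) only the first
```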


> IMO lazy evaluation IS a different, orthogonal proposal.

Late-bound defaults is a very small subset of lazy evaluation.

But yes, lazy evaluation is a different, bigger concept.


-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived 

[Python-ideas] Re: Bare wildcard in de-structuring to ignore remainder and stop iterating (restart)

2022-06-20 Thread Steven D'Aprano
On Sun, Jun 19, 2022 at 11:03:45PM -0700, Jeremiah Paige wrote:

> What if next grew a new argument? Changing the signature of a builtin is a
> big change, but surely not bigger than new syntax? If we could ask for the
> number  of items returned the original example might look like
> 
> >>> first, second = next(iter(items), count=2)

There are times where "Not everything needs to be a one liner" applies.

# You can skip the first line if you know items is already an iterator.
it = iter(items)
first, second, third = (next(it) for i in range(3))

That's crying out to be made into a helper function. Otherwise our 
one-liner is:

# It's okay to hate me for this :-)
first, second, third = (lambda obj: (it:=iter(obj)) and (next(it) for i in range(3)))(items)

But that's basically islice. So:

# It's okay to put reusable helper functions in a module.
# Not everything has to be syntax.
first, second, third = itertools.islice(items, 3)

I think that we have a working solution for this problem; the only 
argument is whether or not that problem is common enough, or special 
enough, or the solution clunky enough, to justify a syntax solution.


-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/3G4VAU3UWM7OXUO2VRYM2I2BKZBIHBIW/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Add tz argument to date.today()

2022-06-19 Thread Steven D'Aprano
On Sun, Jun 19, 2022 at 11:47:22AM -0700, Lucas Wiman wrote:
> Since "today" depends on the time zone, it should be an optional argument
> to date.today(). The interface should be the same as datetime.now(tz=None),
> with date.today() returning the date in the system time zone.

Sure, that sounds pretty straight-forward. I suggest you open a feature 
request on the bug tracker. Hopefully you won't need a PEP.


-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/ERUKPS4OYKBF3YBIBRMUXPN45ZCAMX3U/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Bare wildcard in de-structuring to ignore remainder and stop iterating (restart)

2022-06-19 Thread Steven D'Aprano
Okay, I'm convinced.

If we need this feature (and I'm not convinced about that part), then it 
makes sense to keep the star and write it as `spam, eggs, *... = items`.

-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/X3YKGYUOQ22DLCLOB42IJQFX57K5LW7F/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Bare wildcard in de-structuring to ignore remainder and stop iterating (restart)

2022-06-19 Thread Steven D'Aprano
On Sun, Jun 19, 2022 at 12:21:50AM -0700, Lucas Wiman wrote:

> Using either * or / could lead to some odd inconsistencies where a missing
> space is very consequential, eg:
> x, / = foo  # fine
> x, /= foo  # syntax error?
> x / = foo  # syntax error
> x /= foo  # fine, but totally different from the first example.

Good point!

Despite what people say, Python does not actually have significant 
whitespace (or at least, no more than most other languages). It has 
significant *indentation*, which is not quite the same.

So in general, although we *recommend* spaces around the equals sign, 
we shouldn't *require* it.

If `/=` and `/ =` have different meanings, then we shouldn't use the 
slash for this. Likewise for the asterisk `* =`.

> That said, the * syntax feels intuitive in a way that / doesn’t. I’d
> suggest:
> x, *… = foo
> This seems unambiguous and fairly self-explanatory.

"Self-explanatory". This is how we got Perl and APL o_O

What do we need the star for?

x, *... = items
x, ... = items

Convince me that adding another meaning for the star symbol is a 
good idea.

(That's assuming that we want the feature at all.)


-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/6ECHTP3L756W7B5BVLAZJHFVBCQB5Y5Q/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Bare wildcard in de-structuring to ignore remainder and stop iterating (restart)

2022-06-19 Thread Steven D'Aprano
On Fri, Jun 17, 2022 at 11:32:09AM -, Steve Jorgensen wrote:

> That leads me to want to change the proposal to say that we give the 
> same meaning to "_" in ordinary destructuring that it has in 
> structural pattern matching, and then, I believe that a final "*_" in 
> the expression on the left would end up with exactly the same meaning 
> that I originally proposed for the bare "*".
> 
> Although that would be a breaking change, it is already conventional 
> to use "_" as a variable name only when we specifically don't care 
> what it contains following its assignment, so for any code to be 
> affected by the change would be highly unusual. 

Not so: it is very common to use `_()` as a function in 
internationalisation.

https://stackoverflow.com/questions/3077227/mercurial-python-what-does-the-underscore-function-do

If we are bike-shedding symbols for this feature, I am a bit dubious 
about the asterisk. It already gets used in so many places, and it can 
be confused for `a, b, *x` with the name x lost.

What do people think about

first, second, / = items

where / stands for "don't advance the iterator"?

I like it because it reminds me of the slash in "No Smoking" signs, and 
similar. As in "No (more) iteration".


-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/B2IIGM5TMC7YWVXHQPD6HKT4IMHHSJDP/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: PEP 671 (late-bound arg defaults), next round of discussion!

2022-06-17 Thread Steven D'Aprano
On Fri, Jun 17, 2022 at 06:32:36AM +0100, Rob Cliffe wrote:

> The bar for adding a new hard keyword to Python is very high.

Likewise for new syntax.


> The suggestion is to add a new keyword to a PEP which absolutely doesn't 
> need it, 

*shrug*

The match...case statement didn't "need" keywords either, we could have 
picked symbols instead if we wanted to look like APL. Remember that 
keywords have advantages as well as disadvantages. Given the existence 
of community support for keywords, the PEP should make the case that 
symbols are better in this case.

Even if that's only "a majority prefer symbols".


> on the grounds that it **might** (**not** would - we can't know 
> without a spec) give compatibility with some fictional vapourware which 
> - for all people keep talking about it - hasn't happened in years, isn't 
> happening (AFAIK nobody is working on it), doesn't have anything 
> remotely close to even an outline specification (people disagree as to 
> what it should do), very likely never will happen, and at best won't 
> happen for years.

I think that is broadly accurate. Harsh but fair: nobody has a concrete 
plan for what a generalised "defer" keyword would do. It is still vapourware.


[...]
>     def f(x = later -y):
> Is that a late-bound default of -y?  Bad luck; it's already legal syntax 
> for an early-bound default of `later` minus `y`.

Good catch.


>         Late-bound defaults are meant to be evaluated at function 
> call time (and in particular, not some way down in the function body when 
> the parameter gets used).

Not necessarily.

I don't recall if this has been raised in this thread before, but it is 
possible to delay the evaluation of the default value until it is 
actually needed. I believe that this is how Haskell operates pretty much 
everywhere. (Haskell experts: do I have that correct?)

I expect Chris will be annoyed at me raising this, but one way of 
implementing this would be to introduce a generalised "lazy evaluation" 
mechanism, similar to what Haskell does, rather than special-casing 
late-bound defaults. Then late-bound defaults just use the same 
mechanism, and syntax, as lazily evaluated values anywhere else.

I expect that this is the point that David is making: don't introduce 
syntax for a special case that will be obsolete in (mumble mumble...) 
releases.

David's point would be stronger if he could point to a concrete plan to 
introduce lazy evaluation in Python. The Zen of Python gives us some 
hints:

Now is better than never.
Although never is often better than *right* now.

which possibly suggests that the Zen was written by elves:

"I hear it is unwise to seek the counsel of elves, for they will answer 
with yes and no."

Chris may choose to reject this generalised lazy evaluation idea, but if 
so it needs to go into a Rejected Ideas section. Or he may decide that 
actually having a generalised lazy evaluation idea is *brilliant* and 
much nicer than making defaults a special case.

(I think that the Zen has something to say about special cases too.)

This raises another choice: should lazy defaults be evaluated before 
entering the body of the function, or at the point where the parameter 
is used? Which would be more useful?

# `defer n=len(items)`
def func(items=[], n=>len(items)):
    items.append("Hello")
    print(n)

func()

Printing 1 would require a generalised lazy mechanism, but printing 0 
is independent of the mechanism. As it stands, the PEP requires 0. Which 
would be better or more useful?

I guess Chris will say 0 and David will say 1, but I might be wrong 
about either of them.

One way or the other, these are the sorts of questions that the 
discussion is supposed to work out, and the PEP is supposed to 
reference. There are well over 600 emails in this thread and the 
Steering Council should not be expected to read the whole thing, the PEP 
is supposed to be an honest and fair summary of alternatives and 
rejected ideas.

Chris is welcome to push for a particular proposal. That is the purpose 
of the PEP process. He is also supposed to give dissenting arguments and 
alternatives fair airing in the PEP itself, even if only in a Rejected 
Ideas section.


-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/KCDWSAQDA2TSBSLIQ3FBDVI2MN3OPJVD/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: PEP 671 (late-bound arg defaults), next round of discussion!

2022-06-17 Thread Steven D'Aprano
On Thu, Jun 16, 2022 at 08:31:19AM +1000, Chris Angelico wrote:
> On Thu, 16 Jun 2022 at 08:25, Steven D'Aprano  wrote:

> > Under the Specification section, the PEP explicitly refers to
> > behaviour which "may fail, may succeed", and different behaviour which
> > is "Highly likely to give an error", and states "Using names of later
> > arguments should not be relied upon, and while this MAY work in some
> > Python implementations, it should be considered dubious".
> >
> > So, yes, the PEP *punts* on the semantics of the feature, explicitly
> > leaving the specification implementation-dependent.
> >
> 
> One very very specific aspect of it is left undefined. Are you really
> bothered by that?

Yes.

This is not just some minor, trivial implementation issue, it cuts right 
to the core of this feature's semantics:

* Which arguments can a late-bound parameter access?

* When the late-bound default is evaluated, what is the name resolution 
  rule? (Which variables from which scopes will be seen?)

These are fundamental details related to the meaning of code, not relatively
minor details such as the timing of when a destructor will run.


If we have:

```
items = ['spam', 'eggs']
def frob(n=>len(items), items=[]):
    print(n)

```

we cannot even tell whether `frob()` will print 0 or 2 or raise an 
exception.

I described this underspecification as a weakness of the PEP. As I said 
at the time, that was my opinion. As the PEP author, of course it is 
your perogative to leave the semantics of this feature underspecified, 
hoping that the Steering Council will be happy with implementation- 
dependent semantics.

For the benefit of other people reading this, in case it isn't clear, 
let me try to explain what the issue is.

When late-bound defaults are simulated with the `is None` trick, we 
write:

```
def frob(n=None, items=[]):
    # If we enter the body of the function,
    # items is guaranteed to have a value.
    if n is None:
        n = len(items)
    print(n)
```

and there is never any doubt about the scoping rules for `len(items)`. 
It always refers to the parameter `items`, never to the variable in the 
surrounding scope, and because that parameter is guaranteed to be bound 
to a value, so the simulated default `len(items)` cannot fail with 
NameError. We can reason about the code's meaning very easily.

If we want "real" late-bound defaults to match that behaviour, 
`n=>len(items)` must evaluate `len(items)` *after* items is bound 
to a value, even though items occurs to the right of n.

Under the PEP though, this behaviour is underspecified. The PEP 
describes this case as implementation dependent. Any of the following 
behaviours would be legal when `frob()` is called:

* n=>len(items) evaluates the parameter `items`, *after* it gets
  bound to the default of [], and so n=0 (that is, it has the same
  semantics as the status quo);

* n=>len(items) evaluates the parameter `items`, but it isn't bound
  to a value yet (because `items` occurs to the right of n), and so
  evaluating the default raises (presumably) UnboundLocalError;

* n=>len(items) evaluates the variable items from the surrounding scope,
  and so evaluates to n=2; if no such variable exists, it will presumably
  raise NameError.

With the behaviour unspecified, we can't predict whether the above 
frob() example is legal or what it will do if it is. It could vary not 
only between CPython and other Pythons, but from one version of CPython 
and another.


-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/JDZB3TVVNJIZD4X7QJ35D2PLWLYCN5DJ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: PEP 671 (late-bound arg defaults), next round of discussion!

2022-06-17 Thread Steven D'Aprano
On Fri, Jun 17, 2022 at 02:36:48AM -, Steve Jorgensen wrote:

> Is there anything that I can do, as a random Python user to help move 
> this to the next stage?

If you think the PEP is as complete and persuasive as possible right 
now, you can offer moral support and encouragement. Or you can suggest 
some improvements, and see whether the PEP author agrees.

It is up to the PEP author to decide whether, in his opinion, the PEP is 
sufficiently complete to move forward, or whether it needs more work. 
Other options include leaving it deferred/incomplete, withdrawing it, or 
soliciting somebody to take it over.

If the PEP author abandons it, you could ask to take it over, or you 
could write your own competing PEP as an alternative.

If the author decides to move forward, he needs to ask for a core dev to 
sponsor it. Assuming he gets one, that will start the next round of 
debate, followed by a request to the Steering Council to make a decision 
whether to accept it as is, demand some changes, or reject it.



-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/QPS6IIXH6M3C545ZEQ6ALKOECMUUF5XP/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Bare wildcard in de-structuring to ignore remainder and stop iterating (restart)

2022-06-17 Thread Steven D'Aprano
On Fri, Jun 17, 2022 at 02:50:50AM -, Steve Jorgensen wrote:

> What are people's impressions of this idea. Is it valuable enough to 
> pursue writing a PEP?

I don't think it is useful enough to dedicate syntax to it. If you are 
proposing this idea, it is your job to provide evidence that it is 
useful.

That should be actual, real-world use-cases, not just toy snippets 
like `(first, second, *) = items` with no context.

Examples of where and why people would use it. **Especially** the why 
part. Examples of the work-arounds people have to use in its place, or 
reasons why islice won't work.

"People don't know about islice" is not a convincing argument -- people 
won't know about this either.

Actual code is much more convincing than made up examples. Code from the 
stdlib that would benefit from this is a good place to start.


> If so, then what should I do in writing the PEP to make sure that it's 
> somewhat close to something that can potentially be accepted? Perhaps, 
> there is a guide for doing that?

Read the PEPs. Start with the PEP 1, which is exactly the guide you are 
looking for. Then PEP 5, although it probably won't apply to this. (But 
it is useful to know regardless.)

https://peps.python.org/pep-0001/

https://peps.python.org/pep-0005/

I suggest you read a variety of both successful and unsuccessful PEPs. I 
recommend PEPs 450, 506 and 584 as *especially* good, not that I'm the 
least bit biased *wink*

This is also a good PEP to read, as it is an example of an extremely 
controversial (at the time) PEP that nevertheless was successful:

https://peps.python.org/pep-0572/

This is another PEP which was, believe it or not, controversial at the 
time:

https://peps.python.org/pep-0285/

This is an example of an excellent PEP that gathered support from 
stakeholders in the Numpy community before even raising the issue on 
this mailing list:

https://peps.python.org/pep-0465/

There are many more excellent PEPs, I have just mentioned a few of my 
personal favs. Others may have other opinions.

Remember that even the best PEPs may be rejected or deferred, and resist 
the temptation to attribute all criticism to bad faith and spite. Don't 
be That Guy.

This is an excellent blog post to read:

https://www.curiousefficiency.org/posts/2011/04/musings-on-culture-of-python-dev.html

I recommend that you gather feedback from a variety of places, starting 
here. The Ideas topic on Python's Discourse is another good place. You 
might also try Reddit's r/python and the "Python Forum" here:

https://python-forum.io

and perhaps the comp.lang.python newsgroup, also available as a mailing 
list. Be prepared for a ton of bike-shedding. People may hate the syntax 
even if they like the idea.
 
https://en.wikipedia.org/wiki/Law_of_triviality

Good luck!


-- 
Steve
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/DCC3DE7OP7OLZFYZTPCGO6MGOY2NS3KH/


[Python-ideas] Re: PEP 671 (late-bound arg defaults), next round of discussion!

2022-06-15 Thread Steven D'Aprano
On Thu, Jun 16, 2022 at 12:02:04AM +1000, Chris Angelico wrote:
> On Wed, 15 Jun 2022 at 22:38, Steven D'Aprano  wrote:
> > There's no consensus that this feature is worth the added complexity, or
> > even what the semantics are. The PEP punts on the semantics, saying that
> > the behaviour may vary across implementations.
> 
> Excuse me? I left one or two things open-ended, where they're bad code
> and I'm not going to lock the language into supporting them just
> because the reference implementation happens to be able to, but
> "punts"? That's a bit much. The semantics are QUITE specific.

Under the Specification section, the PEP explicitly refers to 
behaviour which "may fail, may succeed", and different behaviour which 
is "Highly likely to give an error", and states "Using names of later 
arguments should not be relied upon, and while this MAY work in some 
Python implementations, it should be considered dubious".

So, yes, the PEP *punts* on the semantics of the feature, explicitly 
leaving the specification implementation-dependent.


> > There's no consensus on the syntax, which may not matter, the Steering
> > Council can make the final decision if necessary. But with at least four
> > options in the PEP it would be good to narrow it down a bit. No soft
> > keywords have been considered.
> 
> """Choice of spelling. While this document specifies a single syntax
> `name=>expression`..."""
> 
> The PEP specifies *one* option.

The part of the sentence you replaced with an ellipsis says "alternate 
spellings are similarly plausible." The very next sentence says

"Open for consideration are the following"

and a couple of paragraphs later you even explicitly refer to a 
second proof-of-concept implementation.

The PEP has a preferred syntax, as is its right, but it lists three 
alternatives still under consideration.


-- 
Steve
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/23N2VZFHWOLR6ZKFMOU26QXFSFHSY7I4/


[Python-ideas] Re: PEP 671 (late-bound arg defaults), next round of discussion!

2022-06-15 Thread Steven D'Aprano
On Wed, Jun 15, 2022 at 01:58:28PM +0100, Rob Cliffe via Python-ideas wrote:
> Please.  This has been many times by several people already.  No-one is 
> going to change their mind on this by now.  There's no point in 
> rehashing it and adding noise to the thread.

Rob, there's no rule that only "people who support this PEP" are allowed 
to comment. If it is okay for you to say you like this PEP even more now 
than previously, it is okay for David to say that his opinion hasn't 
changed. Especially since David even pointed out one potential change 
which might lead him to support the PEP, or at least shift to "neutral".


-- 
Steve
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/42PANNPF4YTDOQQ76BWUPFRVV7YGBUAQ/


[Python-ideas] Re: PEP 671 (late-bound arg defaults), next round of discussion!

2022-06-15 Thread Steven D'Aprano
On Tue, Jun 14, 2022 at 11:59:44AM +0100, Rob Cliffe via Python-ideas wrote:

> I used to prefer `:=` but coming back to this topic after a long 
> interval I am happy with `=>` and perhaps I even like it more, Chris.
> The PEP status is "Draft".  What are the chances of something happening 
> any time soon, i.e. the PEP being considered by the Steering Committee?  

There's no Sponsor, so it isn't being considered by the SC. That much is 
objectively true.

Beyond that, the following is all my personal opinion, and should not be 
taken as definitive or official in any way. Importantly, I have *not* 
read back through the entire thread to refresh my memory. However, I 
have re-read the PEP in detail.

There's no consensus that this feature is worth the added complexity, or 
even what the semantics are. The PEP punts on the semantics, saying that 
the behaviour may vary across implementations.

There's no consensus on the syntax, which may not matter, the Steering 
Council can make the final decision if necessary. But with at least four 
options in the PEP it would be good to narrow it down a bit. No soft 
keywords have been considered.

In my opinion, there are weaknesses in the PEP:

- lack of any reference to previous discussions;

- no attempt to gather feedback from other forums;

- no review of languages that offer choice of early or late binding;

- little attempt to justify why this is better than the status quo; the 
  PEP seems to take the position that it is self-evident that Python 
  needs this feature, rather than being a balanced document setting out
  both pros and cons;

- little or no attempt in the PEP to answer objections;

- examples are all chosen to show the feature in the best possible 
  light, rather than to show both the good and bad; (e.g. no examples
  show the parameter with annotations)

- failure to acknowledge that at least one of the suggested syntaxes
  is visually ambiguous with existing syntax.

E.g. this would be legal with the PEP's second choice of spelling:

def func(spam, eggs:=(x:=spam)):

Even if the parser can distinguish the two uses of `:=` there, it's 
awfully cryptic.
(e.g. slicing) but the benefits have to outweigh the negatives, and the 
PEP should be a balanced discussion of both.



-- 
Steve
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/V5K2JFT44A57ZXV2GS3OS6MQW2YKXMQN/


[Python-ideas] Re: PEP 671 (late-bound arg defaults), next round of discussion!

2022-06-15 Thread Steven D'Aprano
On Mon, Jun 13, 2022 at 07:41:12AM -0400, Todd wrote:

> This has been proposed many times. You can check the mailing list history.
> Such proposals have been even less popular then PEP 671, since it requires
> a new keyword, which is generally avoided at nearly all costs,

Now that Python is using a PEG parser, adding a soft keyword is no big 
deal. We could use a named keyword:

def spam(arg = defer default_expression):
pass

without affecting code that used "defer" as a variable or function name. 
We could even write:

def spam(defer = defer defer()): ...

where the same word "defer" refers to a parameter, a soft keyword, and a 
function call, all in the same function signature. Needless to say one 
should not make a habit of this. But it would be allowed.
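For comparison, the late binding that a `defer` keyword would express is 
usually spelled today with a sentinel. This is just a sketch of the 
status-quo idiom; `default_expression` is a stand-in name, not anything 
from the PEP:

```python
_MISSING = object()  # unique sentinel: no caller can pass it by accident

def default_expression():
    return []  # stand-in for whatever the late-bound default computes

def spam(arg=_MISSING):
    # Late binding by hand: the default is evaluated at call time,
    # not once at function definition time.
    if arg is _MISSING:
        arg = default_expression()
    return arg

a, b = spam(), spam()
assert a == [] and b == [] and a is not b  # a fresh list on each call
```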


-- 
Steve
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/5V2SEWXFCVDI6RHBVX4QME4ZAYL4NPPO/


[Python-ideas] Re: PEP 671 (late-bound arg defaults), next round of discussion!

2022-06-15 Thread Steven D'Aprano
On Wed, Jun 15, 2022 at 10:44:28AM -, Mathew Elman wrote:

> Could this be the behaviour of passing in an Ellipsis? e.g.
> 
> def foo(defaults_to_one=1):
> return defaults_to_one
> 
> assert foo(...) == foo()

It isn't clear to me whether your question is a request for clarification
(does the PEP mean this...?) or a request for a change in behaviour
(could you change the PEP to do this...?).

Why would you want to type `foo(...)` when you could just type `foo()`?


> The only place that I am aware of the Ellipsis being used is in index 
> notation (numpy).
> So this would have likely an impact on __getitem__ or the slice object.

Ellipsis has been around for over twenty years so we have to assume it 
would have an impact on thousands of programs. We don't just care about 
famous, popular libraries like numpy, we care about breaking little 
scripts used by one person too.


-- 
Steve
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/M23GNYPCZAVCFOURUBCGURL64U6DEWBR/


[Python-ideas] Re: A “big picture” question.

2022-06-10 Thread Steven D'Aprano
On Fri, Jun 10, 2022 at 09:59:36PM -0500, James Johnson wrote:
> I guess I was jumping to conclusions. Thank you for taking the time to look
> at my email.
> 
> I apologize if I wasted your time.

No stress -- opening issues up for discussion is not a waste of time.

This would be a good time to mention that there have been previous 
requests to have more control of what optimizations the Python byte-code 
compiler performs, mostly for the benefit of profiling applications.

While the compiler doesn't do many, or any, large complex optimizations 
like a C compiler may do, it does do some peephole optimizations. 
Sometimes those peephole optimizations interfere with the ability of 
programs to analyse Python code and report on code coverage.

While the peephole optimization doesn't change the semantics of the code, 
it does change the structure of it, and makes it harder to analyse 
whether or not each clause in a statement is covered by tests.

So other people have also requested the ability to tell the compiler to 
turn off all optimizations.
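As a small illustration of the kind of peephole optimization at issue, 
CPython folds constant expressions at compile time. This is a sketch 
assuming CPython; the exact contents of `co_consts` can vary between 
versions:

```python
def f():
    return 2 * 3  # folded to the constant 6 when the function is compiled

# The folded result is stored directly in the code object's constants,
# so no multiplication happens at run time.
assert 6 in f.__code__.co_consts
assert f() == 6
```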

Another factor is that as we speak, Mark Shannon is doing a lot of work 
on optimization for the CPython byte-code compiler, including adding JIT 
compilation techniques. (PyPy has had this ability for many years.) So 
it is possible that future compiler optimizations may start to move into 
the same areas that C/C++ compiler optimizations take, possibly even 
changing the meaning of code.

It would be good to plan ahead, and start considering more fine-grained 
optimization control, rather than the underpowered -O and -OO flags we 
have now.


-- 
Steve
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/T3YBQRFVSQUJ2O6R6ON5ZTN77A6XGSZK/


[Python-ideas] Re: A “big picture” question.

2022-06-10 Thread Steven D'Aprano
On Wed, Jun 08, 2022 at 06:51:54AM -0500, James Johnson wrote:

> When an amateur develops code incorrectly, s/he sometimes ends up with a
> code object that doesn’t run because of intermediate compiler optimizations.

If that happens, that's a bug in the compiler. Optimizations should 
never change the meaning of code.

If you have an example of this, where the compiler optimization changes
the meaning of Python code beyond what is documented, please raise a bug 
report for it.

But I doubt you will find any, because Python performs very, very few 
optimizations of the sort you are referring to.


-- 
Steve
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/RRLF5OJFKYIO6WFCZS3RFMZDNIBPBI3P/


[Python-ideas] Re: default as a keyword argument for dict.get and dict.pop

2022-06-07 Thread Steven D'Aprano
On Tue, Jun 07, 2022 at 02:28:51PM -, martineznicolas41...@gmail.com wrote:

> Do you know if there has been discussions around why is the default 
> argument is positional only in the dict methods get and pop?

It's probably just left over from earlier versions of Python when builtin 
functions only used positional arguments.

Positional arguments are a little faster than keyword arguments, and 
especially for builtin functions, easier to program.

You could try making an enhancement request on the bug tracker and see 
if any one is willing to do the work.
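The current behaviour is easy to demonstrate in CPython (the exact 
wording of the error message may vary):

```python
d = {'a': 1}

# The default works positionally:
assert d.get('b', 0) == 0
assert d.pop('b', 0) == 0

# ...but not as a keyword, because the parameter is positional-only:
try:
    d.get('b', default=0)
except TypeError:
    pass  # expected: get() takes no keyword arguments
else:
    raise AssertionError("expected a TypeError")
```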

-- 
Steve
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/XX5IGYNZHBGVBWTKDAYRV7IH7P44TLDB/


[Python-ideas] Re: Addition to fnmatch.py

2022-06-06 Thread Steven D'Aprano
Why don't you use the version from the itertools recipes?


```
from itertools import tee, filterfalse
def partition(pred, iterable):
"Use a predicate to partition entries into false entries and true entries"
# partition(is_odd, range(10)) --> 0 2 4 6 8   and  1 3 5 7 9
t1, t2 = tee(iterable)
return filterfalse(pred, t1), filter(pred, t2)
```


-- 
Steve
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/TOQKBCZNXZRH5BVFPPFOPJTVPA3KOZ54/


[Python-ideas] Re: Addition to fnmatch.py

2022-06-06 Thread Steven D'Aprano
On Mon, Jun 06, 2022 at 06:17:32PM +0200, Benedict Verhegghe wrote:

> There still is something wrong. I get the second list twice:
> 
> odd, even = partition(lambda i: i % 2, range(20))
> print(list(odd))
> [1, 3, 5, 7, 9, 11, 13, 15, 17, 19]
> print(list(even))
> [0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

Confirmed. But if you replace range(20) with `iter(range(20))`, it 
works correctly.

The plot thickens. The duplicated list is not from the second list 
created, but the second list *evaluated*. So if you run:

odd, even = partition(...)

as before, but evaluate *even* first and odd second:

print(list(even))
print(list(odd))

it is the *odd* list that is doubled, not the *even* list.
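One way to sidestep the shared-iterator subtleties entirely is an eager, 
single-pass partition that returns plain lists. This is a sketch, not 
the tee-based itertools recipe:

```python
def partition(pred, iterable):
    """Split iterable into (false entries, true entries) in one pass."""
    falses, trues = [], []
    for item in iterable:
        (trues if pred(item) else falses).append(item)
    return falses, trues

evens, odds = partition(lambda i: i % 2, range(20))
# Plain lists: safe to consume in either order, any number of times.
assert odds == [1, 3, 5, 7, 9, 11, 13, 15, 17, 19]
assert evens == [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
```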


-- 
Steve
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/OJIA6HIH6UJYE7X7LU7W46QI2X6UPTK6/


[Python-ideas] Re: Add .except() and .only(), and .values_at(). instance methods to dict

2022-06-05 Thread Steven D'Aprano
On Sun, Jun 05, 2022 at 09:11:41PM -, Steve Jorgensen wrote:

> m = {'a': 123, 'b': 456, 'c': 789}
> m.except(('a', 'c'))  # {'b': 456}
> m.only(('b', 'c'))  # {'b': 456, 'c': 789}
> m.values_at(('a', 'b'))  # [123, 456]

Maybe I'm a bit slow because I haven't had my morning coffee yet, but I 
had to read those three times before I could work out what they actually 
do. Also because I got thrown by the use of the keyword `except`, and 
thought initially that this was related to try...except.

And lastly because you say that these are extremely common, but I've 
never used these operations in 20+ years. Or at least not often enough 
to remember using them, or wishing that they were standard methods.

These operations are so simple that it is easier to follow the code than 
to work out the meaning of the method from the name:

# Return a new dict from an old dict with all but a set of keys.
new = {key:value for key, value in old.items() if key not in exclusions}

# Return a new dict from an old dict with only the included keys.
new = {key:value for key, value in old.items() if key in inclusions}

# Same as above, but only those matching the included values.
new = {key:value for key, value in old.items() if value in inclusions}

# Return the values (as a set) of a subset of keys.
values = {value for key, value in old.items() if key in inclusions}

# Use a list instead, and exclude keys.
values = [value for key, value in old.items() if key not in exclusions]

# Same, but use a predicate function.
values = [value for key, value in old.items() if not predicate(key)]

# How about a list of values matching *either* a blacklist of keys 
# and a whitelist of values, *or* a predicate function on both?
values = [value for key, value in old.items()
  if (key not in blacklist and value in whitelist)
  or predicate(key, value)]


Comprehensions are great!

Yes, they take a few extra characters to write, but code is written much 
more than it is read. Once we have learned comprehensions, it is much
easier to read a simple comprehension than to try to decipher a short 
name like "only". Only what? Only those that match a given list of keys, 
or values, or a predicate function, or something else?


-- 
Steve
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/766W3IAURXW3FRTJNH2657XHIXYL7LHY/


[Python-ideas] Re: Null wildcard in de-structuring to ignore remainder and stop iterating

2022-06-04 Thread Steven D'Aprano
On Sun, Jun 05, 2022 at 07:03:32AM +1000, Chris Angelico wrote:

> How is redundancy fundamentally good,

I don't know, you will have to ask somebody who is arguing that 
"redundancy is fundamentally good", which is not me. Redundancy can be 
either good or bad.

https://www.informationweek.com/government/redundancy-in-programming-languages


I didn't think that would be a controversial position to take; 
programming and IT frequently make use of redundancy, e.g.

- multiple routes to a destination
- warm and hot spare servers
- RAID, backups
- documentation versus "read the source"
- unit tests, regression tests, etc
- checksums and error correcting codes
- keyword parameters when positional would do
- making allowance for "the bus factor" in projects
  ("why do we need two people who understands this?")

Python uses redundant colons after statements that introduce a block; 
other languages use redundant variable declarations and semicolons.

Outside of IT, we even have a proverb about it: don't put all your eggs 
in one basket. Redundancy is used to make systems more resilient against 
failure:

- Spare tyres, spare keys, etc.
- Subject-Verb Agreement Rules in language.
- Double-entry bookkeeping.
- Using two or more locks on doors.
- Belts and braces, seatbelts and airbags, etc.

I'm not saying that redundancy is always good, but your argument that 
"all" (your word) it does is to "introduce the possibility of error" (as 
if there are no other sources of error in programming!) doesn't stand up 
to even the most cursory consideration.

In this specific example, `spam, eggs, cheese = islice(values, 3)`, I 
think that the cost of the redundancy is minimal and the benefit 
non-zero. Which one "wins" is a matter of taste and the programmer's own 
personal value judgement of risk versus opportunity.

If you don't like it, fine, but I do, and if I were BDFL of Python, I 
wouldn't add syntax to the language just to remove this redundancy.
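For reference, the islice spelling both unpacks exactly three items and 
leaves the rest of the iterator untouched:

```python
from itertools import islice

values = iter(range(10))
spam, eggs, cheese = islice(values, 3)
assert (spam, eggs, cheese) == (0, 1, 2)

# The remainder has not been consumed:
assert next(values) == 3
```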

-- 
Steve
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/AUYTGNSKMHO7RB7ZIJRXJIAAGMEZPRNP/


[Python-ideas] Re: Null wildcard in de-structuring to ignore remainder and stop iterating

2022-06-04 Thread Steven D'Aprano
On Sat, Jun 04, 2022 at 11:16:18PM +1000, Chris Angelico wrote:

> > Redundancy is good:
> >
> > # Obviously, clearly wrong:
> > spam, eggs, cheese = islice(myvalues, 5)
> 
> Yes but which part is wrong?

You're a professional programmer, so I am confident that you know the 
answer to that :-)

It's a logic error. Unlike trivial and obvious spelling errors ('import 
colections'), most logic errors are not amenable to trivial fixes. You 
have to read the code (which may be as little as the surrounding one or 
two lines, or as much as the entire damn program) to understand what the 
logic is supposed to be before you can fix it.

Welcome to programming :-)

A day may come when computers will write and debug their own code, 
making human programmers obsolete, but it is not this day.

Until then, I will take all the help I can get. I'm not going to 
introduce unnecessary redundancy for no reason, but nor am I going to go 
out of my way to remove it when it is helpful.


> Redundancy introduces possibilities of desynchronization.

Indeed, and that is why I never write documentation or comments -- they 
are redundant when you have the source.

*wink*

All joking aside, of course you are correct. But we don't typically 
worry too much about such minor risks when we do things like imports:

from collections import Counter
# oh no, now Counter and collections.Counter may become desynced!

or assertions (which should always succeed, and so they are redundant -- 
right up to the moment when they fail). Or when we grab a temporary 
reference to a sub-expression to avoid repeating ourselves.

We balance the risks against the benefits, and if the risks are small 
compared to the benefits, we don't fret about redundancy.

And sometimes, in the face of noisy communication channels, hardware 
failure, or hostile environments, redundancy is all but essential.


> ANY code can be wrong, but it's entirely possible for the second one 
> to be right, and the first one is undoubtedly wrong.

Indeed.

When all else is equal -- which it may not always be -- we should prefer 
code which is obviously correct over code which merely has no obvious 
bugs.

A line of code like `spam, eggs, cheese = islice(myvalues, 3)` is 
obviously correct in the sense that there is no discrepency between the 
left and right hand sides of the assignment, and any change which fails 
to keep that invariant is an obvious bug.

The proposed equivalent `spam, eggs, cheese, * = myvalues` may be more 
convenient to write, but you no longer have that invariant. How much you 
care about that loss will depend on your risk tolerance compared to your 
laziness (one of Larry Wall's three virtues of programmers -- although 
opinions differ on whether he is right or not).


> Obviously sometimes it's unavoidable, but I don't think we can
> genuinely accept that the redundancy is *good*.

You have convinced me! I'm now removing all my RAID devices!

*wink*

Would it have helped if I had said redundancy is *sometimes* good?



-- 
Steve
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/JGOKYNXFQA4OG4QR7DUB2TGHUYV3OYWY/


[Python-ideas] Re: Null wildcard in de-structuring to ignore remainder and stop iterating

2022-06-04 Thread Steven D'Aprano


On Sat, Jun 04, 2022 at 10:04:39AM -, Steve Jorgensen wrote:

> OK. That's not terrible. It is a redundancy though, having to re-state 
> the count of variables that are to be de-structured into on the left.

Redundancy is good:

# Obviously, clearly wrong:
spam, eggs, cheese = islice(myvalues, 5)

# Not obviously right.
spam, eggs, cheese, * = myvalues

We don't have to squeeze every bit of redundancy out of code.



-- 
Steve
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/Y43RVWOYVZEXIUTS2D2RTLD7VJBWN4EP/


[Python-ideas] Re: Null wildcard in de-structuring to ignore remainder and stop iterating

2022-06-04 Thread Steven D'Aprano
On Sat, Jun 04, 2022 at 07:31:58AM -, Steve Jorgensen wrote:

> A contrived use case:
> 
> with open('document.txt', 'r') as io:
> (line1, line2, *) = io
> 

with open('document.txt', 'r') as io:
line1 = io.readline()
line2 = io.readline()

It would be lovely if readlines() took a parameter to specify the number 
of lines to return:

line1, line2 = io.readlines(2)  # Doesn't work :-(

but alas and alack, the readlines() method has exactly the wrong API for 
that. I don't know what use the hint parameter to readlines() is; it 
seems totally useless to me, and the wrong abstraction, counting 
bytes/characters instead of lines.

Maybe we could add a keyword only argument?

line1, line2 = io.readlines(count=2)

Or there's always the explicit:

line1, line2 = [io.readline() for j in (1, 2)]

No need for new syntax for something so easy.
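islice also reads exactly two lines without consuming the rest of the 
file (shown on a StringIO so the snippet is self-contained):

```python
from io import StringIO
from itertools import islice

io = StringIO("first\nsecond\nthird\nfourth\n")
line1, line2 = islice(io, 2)
assert (line1, line2) == ("first\n", "second\n")

# The file position is just after line 2; the rest is still unread:
assert io.readline() == "third\n"
```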


-- 
Steve
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/CJHUXBIQ2FJL332AS22YBULI2CPF2IT4/


[Python-ideas] Re: Expand the try-expect syntax to support conditional expect block

2022-06-03 Thread Steven D'Aprano
On Fri, Jun 03, 2022 at 06:43:59PM -, Nadav Misgav wrote:

> should I try to implement this? seems there is some acceptance

If you want to experiment with an implementation just for your own 
pleasure, then go ahead. Or if you think that the implementation is 
trivial and simple, and a working implementation will strengthen the 
case for this proposal. But I think it is too early for you to spend 
many hours working on an implementation.

As a syntax change, this will need a PEP before it can be accepted, and 
before you waste time writing a PEP you need a core developer sponsor.

If you think that the feedback in this thread is sufficient to justify 
it, you can ask for a sponsor here and on the Python-Dev mailing list.

-- 
Steve
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/74ICDQKFXOQG7EJP2LV2CJ2236OZ6SV5/


[Python-ideas] Re: Python array multiplication for negative values should throw an error

2022-05-31 Thread Steven D'Aprano
On Mon, May 30, 2022 at 02:31:35PM -, fjwillemsen--- via Python-ideas wrote:

> In Python, array multiplication is quite useful to repeat an existing 
> array, e.g. [1,2,3] * 2 becomes [1,2,3,4,5,6].

It certainly does not do that.

>>> [1, 2, 3]*2
[1, 2, 3, 1, 2, 3]

Also, that's a list, not an array. Python does have arrays, from the 
`array` module. And of course there are numpy arrays, which behave 
completely differently, performing scalar multiplication rather than 
sequence replication:

>>> from numpy import array
>>> array([1, 2, 3])*2
array([2, 4, 6])


> However, operations such as [numpy array] * -1 are very common to get 
> the inverse of an array of numbers.

If that is common, there's a lot of buggy code out there! *wink*

Multiplying a numpy array by the scalar -1 performs scalar 
multiplication, same as any other scalar. To get the inverse of a numpy 
array, you need to use numpy.linalg.inv:

>>> import numpy.linalg
>>> arr = array([[1, 2], [3, 4]])
>>> numpy.linalg.inv(arr)
array([[-2. ,  1. ],
   [ 1.5, -0.5]])


> The confusion here stems from the lack of type checking: while the 
> programmer should check whether the array is a NumPy array or a Python 
> array, this is not always done, giving rise to difficult to trace 
> cases where [1,2,3] * -1 yields [] instead of [-1,-2,-3].

This confusion has nothing to do with multiplication by -1. As the 
earlier example above shows, scalar multiplication on a numpy array 
and sequence replication on a list always give different results, not 
just for -1. (The only exception is multiplication by 1.)

I am afraid that this invalidates your argument from Numpy arrays. It 
simply isn't credible that people are accidentally passing lists instead 
of numpy arrays, and then getting surprised by the result **only** when 
multiplying by a negative value. It's not just negatives that are 
different.


> I can not think of good reasons why Python array multiplication should 
> not throw an error for negative multipliers, because it is meaningless 
> to have array multiplication by negative value in the way it is 
> intended in Python.

It's not meaningless; it is far more *useful* than an unnecessary and 
annoying exception would be. For example, here is how I might pad a list 
to some minimum length with zeroes:

mylist.extend([0]*(minlength - len(mylist)))

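That padding idiom works precisely because a negative repeat count 
yields an empty list, so lists that are already long enough are extended 
by nothing. A runnable sketch:

```python
def pad(lst, minlength, fill=0):
    lst = list(lst)
    # (minlength - len(lst)) may be negative; [fill] * negative == [],
    # so an already-long-enough list is left unchanged.
    lst.extend([fill] * (minlength - len(lst)))
    return lst

assert pad([1, 2], 4) == [1, 2, 0, 0]
assert pad([1, 2, 3, 4, 5], 4) == [1, 2, 3, 4, 5]
```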

If this were 1991 and Python were brand new, then the behaviour of 
sequence replication for negative values would be up for debate. But 
Python is 31 years old and there is 31 years worth of code that relies 
on this behaviour, so we would need **extraordinarily strong** reasons 
to break all that code.

Not an extraordinarily weak argument based on confusion between numpy 
array scalar multiplication and list replication. Sorry to be blunt.


-- 
Steve
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/4OVPWC35WCGLNKSACH5PWWM77XR7425P/


[Python-ideas] Re: TextIOBase: Make tell() and seek() pythonic

2022-05-26 Thread Steven D'Aprano
On Tue, May 24, 2022 at 04:31:13AM -, mguin...@gmail.com wrote:

> seek() and tell() works with opaque values, called cookies.
> This is close to low level details, but it is not pythonic.

Even after reading the issue you linked to, I am not sure I understand 
either the issue, or your suggested solution.

I *think* that the issue is this:

Suppose we have a text file containing four characters (to be precise: 
code points).

aΩλz

namely U+0061 U+03A9 U+03BB U+007A. You would like tell() and seek() to 
accept indexes 0, 1, 2, 3, 4 which would move the file pointer to:

0 moves to the start of the file, just before the a
1 moves to just before the Ω
2 moves to just before the λ
3 moves to just before the z
4 moves to after the z (EOF).

**But** in reality, the file position cookies for that file will depend 
on the encoding used. For UTF-8, the valid cookies are:

0 moves to the start of the file, just before the a
1 moves to just before the Ω
3 moves to just before the λ
5 moves to just before the z
6 moves to after the z (EOF).

Other encodings may give different cookies.

If you seek() to position 4, say, the results will be unpredictable but 
probably not anything good.

In other words, the tell() and seek() cookies represent file positions 
in **bytes**, even though we are reading or writing a text file.
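A quick experiment (using an in-memory file for convenience) shows both 
halves of this: the cookies are opaque, not code point indexes, but 
seeking back to a previously recorded cookie is reliable:

```python
import io

raw = io.BytesIO("a\u03a9\u03bbz".encode("utf-8"))  # the aΩλz example
f = io.TextIOWrapper(raw, encoding="utf-8")

# Record the opaque cookie before each character is read.
snapshot = []
while True:
    cookie = f.tell()
    ch = f.read(1)
    if not ch:
        break
    snapshot.append((cookie, ch))

assert [ch for _, ch in snapshot] == ["a", "\u03a9", "\u03bb", "z"]

# Seeking back to any recorded cookie re-reads the same character.
for cookie, ch in snapshot:
    f.seek(cookie)
    assert f.read(1) == ch
```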

You would like the cookies to be file positions measured in 
**characters** (or to be precise, code points).

Am I close?



-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/2DGW5KFVOCDSHKZH6SUQADJXC3TKKUIS/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: TextIOBase: Make tell() and seek() pythonic

2022-05-26 Thread Steven D'Aprano
On Thu, May 26, 2022 at 08:28:24PM +1000, Steven D'Aprano wrote:

> Narrow builds were UCS-2; wide builds were UTF-32.

To be more precise, narrow builds were sort of a hybrid between an 
incomplete version of UTF-16 and a superset of UCS-2.

Like UTF-16, if your code point was above U+FFFF, it would be 
represented by a pair of surrogate code points. But like UCS-2, that 
surrogate pair was seen as two characters rather than one.

(If you think this is complicated and convoluted, yes, yes it is.)


-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/FH3OQNWXY5CKA6MEEMRBZ5B4C7WY5BYK/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: TextIOBase: Make tell() and seek() pythonic

2022-05-26 Thread Steven D'Aprano
On Wed, May 25, 2022 at 06:16:50PM +0900, Stephen J. Turnbull wrote:
> mguin...@gmail.com writes:
> 
>  > There should be a safer abstraction to these two basic functions.
> 
> There is: TextIOBase.read, then treat it as an array of code units
> (NOT CHARACTERS!!)

No need to shout :-)

Reading the full thread on the bug tracker, I think that when Marcel 
(mguinhos) refers to "characters", he probably is thinking of "code 
points" (not code units, as you put it).

Digression into the confusing Unicode terminology, for the benefit of 
those who are confused... (which also includes me... I'm writing this 
out so I can get it clear in my own mind).

A *code point* is an integer between 0 and 0x10FFFF inclusive, each of 
which represents a Unicode entity.

In common language, we call those entities "characters", although they 
don't perfectly map to characters in natural language. Most code points 
are as yet unused, most of the rest represent natural language 
characters, some represent fragments of characters, and some are 
explicitly designated "non-characters".

(Even the Unicode consortium occasionally calls these abstract entities 
characters, so let's not get too uptight about mislabelling them.)

Abstract code points 0...0x10FFFF are all very well and good, but they 
have to be stored in memory somehow, and that's where *code units* come 
into it: a *code unit* is a chunk of memory, usually 8 bits, 16 bits, or 
32 bits.

https://unicode.org/glossary/#code_unit

The number of code units used to represent each code point depends on 
the encoding used:

* UCS-2 is a fixed size encoding, where 1 x 16-bit code unit represents 
  a code point between 0 and 0xFFFF.

* UTF-16 is a variable size encoding, where 1 or 2 x 16-bit code units 
  represents a code point between 0 and 0x10FFFF.

* UCS-4 and UTF-32 are (identical) fixed size encodings, where 1 x 
  32-bit code unit represents each code point.

* UTF-8 is a variable size encoding, where 1, 2, 3 or 4 x 8-bit code 
  units represent each code point.

* UTF-7 is a variable size encoding which uses 1-8 7-bit code units. 
  Let's not talk about that one.
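The code unit counts can be checked directly, since the length of the 
encoded bytes exposes them; here with one code point from each UTF-8 
size class:

```python
s = "a\u03a9\u4e2d\U0001f600"  # 1-, 2-, 3- and 4-byte characters in UTF-8

assert len(s) == 4  # four code points
assert len(s.encode("utf-8")) == 1 + 2 + 3 + 4   # ten 8-bit code units
assert len(s.encode("utf-16-le")) // 2 == 5      # surrogate pair for the emoji
assert len(s.encode("utf-32-le")) // 4 == 4      # one 32-bit unit per code point
```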

That's Unicode. But TextIOBase doesn't just support Unicode, it also 
supports legacy encodings which don't define code points or code units. 

Nevertheless we can abuse the terminology and pretend that they do, e.g. 
most such legacy encodings use a fixed 1 x 8-bit code unit (a byte) to 
represent a code point (a character). Some are variable size, e.g. 
SHIFT-JIS. So with this mild abuse of terminology, we can pretend that 
all(?) those old legacy encodings are "Unicode".

TL;DR:

Every character, or non-character, or bit of a character, which for the 
sake of brevity I will just call "character", is represented by an 
abstract numeric value between 0 and 0x10FFFF (the code point), which in 
turn is implemented by a chunk of memory between 1 and N bytes in size, 
for some value of N that depends on the encoding.


> One thing you don't seem to understand: Python does *not* know about
> characters natively.  str is an array of *code units*.

Code points, not units.

Except that even the Unicode Consortium sometimes calls them 
"characters" in plain English. E.g. the code point U+0041 which has 
numeric value 0x41 or 65 in decimal represents the character "A".

(Other code points do not represent natural language characters, but if 
ASCII can call control characters like NULL and BEL "characters", we can 
do the same for code points like U+FDD0, official Unicode terminology be 
damned.)


> This is much
> better than the pre-PEP-393 situation (where the unicode type was
> UTF-16, nowadays except for PEP 383 non-decodable bytes there are no
> surrogates to worry about), 

Narrow builds were UCS-2; wide builds were UTF-32.

The situation was complicated in that your terminal was probably UTF-16, 
and so a surrogate pair that Python saw as two code points may have been 
displayed by the terminal as a single character.


> but Python doesn't care if you use NFD,

The *normalisation forms* NFD etc operate at the level of code points, 
not encodings.

I believe you may be trying to distinguish between what Unicode calls 
"graphemes", which is very nearly the same as natural language 
characters (plus control characters, noncharacters, etc), versus plain 
old code points.

For example, the grapheme (natural character) ü may be normalised as the 
single code point

U+00FC LATIN SMALL LETTER U WITH DIAERESIS
 
or as a sequence of code points:

U+0075 LATIN SMALL LETTER U
U+0308 COMBINING DIAERESIS
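The two spellings can be verified with the unicodedata module:

```python
import unicodedata

single = "\u00fc"     # ü as one precomposed code point
combined = "u\u0308"  # u followed by a combining diaeresis

assert len(single) == 1 and len(combined) == 2
assert unicodedata.normalize("NFD", single) == combined
assert unicodedata.normalize("NFC", combined) == single
```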

I believe that dealing with graphemes is a red-herring, and that is not 
what Marcel has in mind.


-- 
Steve
(the other one)
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 

[Python-ideas] Re: Whitespace handling for indexing

2022-05-24 Thread Steven D'Aprano
There are at least three existing ways to already do this.

(foo["bar"]
 ["baz"]
 ["eggs"]
 ["spam"]) = 1

foo["bar"][
"baz"][
"eggs"][
"spam"] = 1

foo["bar"]\
 ["baz"]\
 ["eggs"]\
 ["spam"] = 1
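To confirm that the first form really is legal, here is a runnable 
version with a made-up nested dict:

```python
foo = {"bar": {"baz": {"eggs": {}}}}

# A parenthesized subscription is a valid assignment target, so the
# chain can be split across lines without backslashes.
(foo["bar"]
 ["baz"]
 ["eggs"]
 ["spam"]) = 1

assert foo["bar"]["baz"]["eggs"]["spam"] == 1
```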


I think the first one is the clear winner.

The difficulty with your proposal is that without the indent, it is 
ambiguous:

foo["bar"]
["baz"]
["eggs"]
["spam"] = value

The first three lines of that are legal code. Pointless, but legal. It 
is only when we get to the last line, the assignment, that it fails, and 
only because the unpacking assignment target is a literal. If it were a 
name, it could succeed:

[spam] = value  # succeeds with value = (1,) for example

unpacks `value` and assigns the results to the list of names `[spam]`.

So this syntax will *require* indentation to avoid the ambiguity. But 
that breaks the rule that indentation is only required for a block 
following a keyword such as class, def, for, if etc.

Okay, so perhaps it is not quite a hard rule, more of a convention or 
expectation, but *requiring* such indentation would still violate 
it. So given that there are already at least three adequate solutions to 
the problem, I don't see the need to complicate the language to support 
another.


-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/CASG7V5AN7BJCOXXCEYWD26LQOMFD3UJ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Expand the try-expect syntax to support conditional expect block

2022-05-19 Thread Steven D'Aprano
On Wed, May 18, 2022 at 10:23:16PM +0300, Serhiy Storchaka wrote:

> try:
> expression block
> expect Exception if condition else ():
> expression block

Aside from the typo "expect" when it should be `except`, that is a 
terrible thing to do to the people reading your code :-(


-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/JDOPFJEY75Z7DESTYCF3LGXRYQCXPPYR/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Awaiting until a condition is met

2022-05-15 Thread Steven D'Aprano
On Sun, May 15, 2022 at 03:34:15PM +0100, MRAB wrote:
> On 2022-05-15 03:10, Aaron Fink wrote:
> >Hello.
> >
> >I've been thinking it would be nice to be able to use await to suspend 
> >execution until a condition is met (as represented by a function or 
> >lambda which returns a boolean.
[...]
> It's still just polling, so it could be expensive to check very often. 
> All the callback is returning is True/False, which isn't very helpful 
> for deciding how often it should check.

https://www.youtube.com/watch?v=18AzodTPG5U


-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/5WKGELADXCYMD5VAIGM33IP63RVDSCEY/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Improving -x switch on CLI interface

2022-05-15 Thread Steven D'Aprano
On Sun, May 15, 2022 at 07:32:56PM -0300, Rafael Nunes wrote:

> Hi. Thanks for asking.
> In .BAT files we could write some Windows Shell commands before python code
> and python interpreter would ignore them.

Hi Rafael,

You have answered Eric's question with *what* you would do (write 
Windows Shell commands in a .BAT file) but a use-case is *why* you would 
do it.

What sort of real-world problems are you solving by turning a .py file 
into a .bat file with Windows shell commands at the beginning?

Which shell commands do you expect to use?

The next question would be, couldn't you re-write those shell commands 
as Python code? Or have the .BAT file call Python and run the .py file?



-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/S36CLS3NKSLJCQCXVOJ7ZQRUP3TZXBW3/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Heterogeneous numeric data in statistics library

2022-05-12 Thread Steven D'Aprano
Users of the statistics module, how often do you use it with 
heterogeneous data (mixed numeric types)?

Currently most of the functions try hard to honour homogeneous data, 
e.g. if your data is Decimal or Fraction, you will (usually) get Decimal 
or Fraction results:

>>> statistics.variance([Decimal('0.5'), Decimal(2)/3, Decimal(5)/2])
Decimal('1.231481481481481481481481481')
>>> statistics.variance([Fraction(1, 2), Fraction(2, 3), Fraction(5, 2)])
Fraction(133, 108)

With mixed types, the functions usually try to coerce the values into a 
sensible common type, honouring subclasses:

>>> class MyFloat(float):
... def __repr__(self):
... return "MyFloat(%s)" % super().__repr__()
... 
>>> statistics.mean([1.5, 2.25, MyFloat(1.0), 3.125, 1.75])
MyFloat(1.925)

but that's harder than you might expect and the extra complexity causes 
some significant performance costs. And not all combinations are 
supported (Decimal is particularly difficult).

If you are a user of statistics, how important to you is the ability to 
**mix** numeric types, in the same data set?

Which combinations do you care about?

Would you be satisfied with a rule that said that the statistics 
functions expect homogeneous data and that the result of calling the 
functions on mixed types is not guaranteed?



-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/AGMUQK7DQOCWU2X7VBNTCA2F3AUMDJIW/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Time to relax some restrictions on the walrus operator?

2022-05-08 Thread Steven D'Aprano
On Sun, May 08, 2022 at 04:00:47PM +0100, Rob Cliffe wrote:

> Yes, I know unrestricted use of the walrus can lead to obfuscated code 
> (and some of Steven's examples below might be cited as instances ).  

They're intended as the simplest, least obfuscatory examples of using 
the walrus operator that is not pointless. That is, an example of the 
walrus as a sub-expression embedded inside another expression.

If you think my examples are obfuscated, then that is an argument in 
favour of keeping the status quo.

I could have given an example like this:

((a, b) := [1, 2])

but there is no good reason to use the walrus operator there, it is not 
a sub-expression, and it Just Works if you use the assignment statement 
instead.

-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/S7MU7ONVRAVYPXYTYMRGW32NYU3L7RIE/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Time to relax some restrictions on the walrus operator?

2022-05-08 Thread Steven D'Aprano
On Sun, May 08, 2022 at 03:59:07PM +0100, MRAB wrote:

> > # Currently a syntax error.
> > results = (1, 2, (a, b) := (3, 4), 5)
> >
> Doesn't ':=' have a lower precedence than ',', so you're effectively 
> asking it to bind:
> 
> (1, 2, (a, b))
> 
> to:
> 
> ((3, 4), 5)

Possibly. Insert additional parentheses as needed to make it work :-)

results = (1, 2, ((a, b) := (3, 4)), 5)


-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/SLMS6N7Y6GGS7ACNWOD77QA5D5WMJZCT/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Time to relax some restrictions on the walrus operator?

2022-05-08 Thread Steven D'Aprano
I don't have an opinion one way or the other, but there is a discussion 
on Discourse about the walrus operator:

https://discuss.python.org/t/walrus-fails-with/15606/1


Just a quick straw poll, how would people feel about relaxing the 
restriction on the walrus operator so that iterable unpacking is 
allowed?

# Currently a syntax error.
results = (1, 2, (a, b) := (3, 4), 5)

which would create the following bindings:

results = (1, 2, (3, 4), 5)
a = 3
b = 4

A more complex example:

expression = "uvwxyz"
results = (1, 2, ([a, *b, c] := expression), 5)

giving:

results = (1, 2, "uvwxyz", 5)
a = "u"
b = ["v", "w", "x", "y"]
c = "z"
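For reference, the status quo is easy to confirm: the assignment 
statement already unpacks fine (and gives b as a list), while the 
walrus form is rejected at compile time:

```python
# Extended unpacking works today with the assignment *statement*.
a, *b, c = "uvwxyz"
assert (a, b, c) == ("u", ["v", "w", "x", "y"], "z")

# The walrus form with a tuple target is currently a SyntaxError.
try:
    compile("results = (1, 2, ((x, y) := (3, 4)), 5)", "<test>", "exec")
except SyntaxError:
    rejected = True
else:
    rejected = False
assert rejected
```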


Thoughts?


-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/5CWWY4EZKXLJZD47NSQA6TRD5SWMFGOJ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Auto assignment of attributes

2022-05-07 Thread Steven D'Aprano
On Sun, May 08, 2022 at 11:02:22AM +1000, Chris Angelico wrote:
> On Sun, 8 May 2022 at 10:23, Steven D'Aprano  wrote:

> > Outside of that narrow example of auto-assignment of attributes, can
> > anyone think of a use-case for this?
> >
> 
> Honestly, I don't know of any. But in response to the objection that
> it makes no sense, I offer the perfectly reasonable suggestion that it
> could behave identically to other multiple assignment in Python.

Nobody says that it makes "no sense". Stephen Turnbull suggested it 
doesn't make "much sense", but in context I think it is clear that he 
meant there are no good uses for generalising this dotted parameter 
name idea, not that we can't invent a meaning for the syntax.


> There's not a lot of places where people use "for x, x.y in iterable",
> but it's perfectly legal. Do we need a use-case for that one to
> justify having it, or is it justified by the simple logic that
> assignment targets are populated from left to right?

The analogy breaks down because we aren't talking about assignment 
targets, but function parameters. Function parameters are only 
*kinda sorta* like assignment targets, and the process of binding 
function arguments passed by the caller to those parameters is not as 
simple as

 self, x, x.y = args

The interpreter also does a second pass using keyword arguments, and a 
third pass assigning defaults if needed. Or something like that -- I 
don't think the precise implementation matters.

Of course we could make it work by giving *some* set of defined 
semantics, but unless it is actually useful, why should we bother? Hence 
my comment YAGNI.


> I'm not advocating for this, but it shouldn't be pooh-poohed just
> because it has more power than you personally can think of uses for.

Power to do *what*?

If nobody can think of any uses for this (beyond the auto-assignment of 
attributes), then what power does it really have?

I don't think "power" of a programming language feature has a purely 
objective, precise definition. But if it did, it would surely have 
something to do with the ability to solve actual problems.


-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/AJEWB5MCHPMKUVJSPMNAU5XTP4F4AX2P/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Adding a .find() method to list

2022-05-07 Thread Steven D'Aprano
On Sat, May 07, 2022 at 04:01:34AM -, python-id...@lucas.rs wrote:

> If you had to get the user where user['id'] == 2 from this list of 
> users, for example, how would you do it?
> 
> users = [
> {'id': 1,'name': 'john'},
> {'id': 2, 'name': 'anna'},
> {'id': 3, 'name': 'bruce'},
> ]

user = None
for record in users:
if record['id'] == 2:
user = record
break
else:  # for...else
raise LookupError('user id not found')

If I needed to do it more than once, or if it needed testing, I would 
change the break into `return record` and put it into a function.
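That refactoring is short enough to show in full (the `find_user` name 
is my own):

```python
def find_user(users, user_id):
    # Linear scan; fine for occasional lookups, wrong tool for many.
    for record in users:
        if record["id"] == user_id:
            return record
    raise LookupError("user id not found")

users = [
    {"id": 1, "name": "john"},
    {"id": 2, "name": "anna"},
    {"id": 3, "name": "bruce"},
]

assert find_user(users, 2)["name"] == "anna"
```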

If I really needed to lookup user IDs a lot, I wouldn't use a list, I 
would use something like this:

users = { # map user ID to user name
1: 'john',
2: 'anna',
3: 'bruce',
}

so that user ID lookups are simple and blazingly fast:

user = users[2]

rather than needing to walk through the entire list inspecting each 
item. Picking the right data structure for your problem is 9/10th of the 
battle.


> # way too verbose and not pythonic
> ids = [user['id'] for user in users]
> index = ids.index(2)
> user_2 = users[index]

Three lines is not "too verbose". I wouldn't call it "not Pythonic", I 
would just call it poor code that does too much unnecessary work.


> # short, but it feels a bit janky
> user_2 = next((user for user in users if user['id'] == 2), None)

Seems OK to me.


> # this is okay-ish, i guess
> users_dict = {user['id']: user for user in users}
> user_2 = users_dict.get(2)

More unnecessary work, building an entire temporary dict of potentially 
millions of users just to extract one. Of course if you are going to be 
using it over and over again, this is the right solution, not the list.


-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/PG32UB5DLF7SDPHSECYT6L4PDQW3D35I/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Auto assignment of attributes

2022-05-07 Thread Steven D'Aprano
On Sat, May 07, 2022 at 11:38:19AM -0700, Ethan Furman wrote:

> > I'd define it very simply. For positional args, these should be
> > exactly equivalent:
> >
> > def func(self, x, x.y):
> >  ...
> >
> > def func(*args):
> >  self, x, x.y = args
> >  ...
> 
> Simple or not, I don't think Python needs that much magic.

Indeed. Just because we can imagine semantics for some syntax, doesn't 
make it useful. Aside from the very special case of attribute binding in 
initialisation methods (usually `__init__`), and not even all, or even a 
majority, of those, this is a great example of YAGNI.

Outside of that narrow example of auto-assignment of attributes, can 
anyone think of a use-case for this?

And as far as auto-assignment of attributes goes, I want to remind 
everyone that we have that already, in a simple two-liner:

vars(self).update(locals())
del self.self

which will work for most practical cases where auto-assignment would be 
useful. (It doesn't work with slots though.)
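Here is the two-liner in action, in a throwaway class of my own 
invention:

```python
class Point:
    def __init__(self, x, y, z=0):
        # Copy every parameter (including defaults) onto the instance,
        # then remove the spurious self.self binding.
        vars(self).update(locals())
        del self.self

p = Point(1, 2)
assert (p.x, p.y, p.z) == (1, 2, 0)
assert not hasattr(p, "self")
```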

This is a useful technique that isn't widely known enough. I believe 
that if it were more widely know, we wouldn't be having this discussion 
at all.


-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/CTVO5ZA2XUYZTG45DP6HPFQNQOKJKXP7/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: infile and outfile parameters for the input function

2022-05-03 Thread Steven D'Aprano
On Mon, May 02, 2022 at 08:23:31PM -0700, Christopher Barker wrote:
> On Mon, May 2, 2022 at 11:18 AM Steven D'Aprano
> 
> > why one couldn't just use the redirect_stdout context manager.
> >
> > (Plus not-yet-existing, but hopefully soon, redirect_stdin.)
> 
> 
> I have no use for this but thread safety could be an issue.

No more of an issue than it is for other context managers, or for 
setting sys.stdio directly.

> I have no idea if that’s an issue for the kinds of programs this might be
> used in, but always good to keep in mind.
> 
> Also — is it that hard to write raw_input()?

I feared this would happen... mea culpa.

I wrote:

**Long before we had context managers**, I manually redirected stdin and 
stdout to programmatically feed input and capture output from `raw_input`.

Emphasis added. When I wrote it I feared that people wouldn't remember 
that "before we had context managers" was like Python 2.4 or older (by 
memory), and so what we call input today was called raw_input back then.

So I don't need to *write* raw_input, because it already exists :-)

But what I do need is a nice and reliable way to feed values into input 
as if they were typed by the user, and to capture the output of input.
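For the record, here is one way to do that today, by swapping sys.stdin 
and capturing stdout; a sketch, not necessarily the nicest possible 
API:

```python
import contextlib
import io
import sys

fake_in = io.StringIO("Alice\n")
captured = io.StringIO()

old_stdin = sys.stdin
sys.stdin = fake_in
try:
    # With a non-tty stdin, input() writes its prompt to sys.stdout,
    # which redirect_stdout points at our StringIO.
    with contextlib.redirect_stdout(captured):
        name = input("Name? ")
finally:
    sys.stdin = old_stdin

assert name == "Alice"
assert captured.getvalue() == "Name? "
```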


-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/R6DTLF7G2PCZJV7BAO2GYQMM66C4AFDF/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Auto assignment of attributes

2022-05-02 Thread Steven D'Aprano
On Sun, May 01, 2022 at 10:40:49PM -0700, Christopher Barker wrote:

> Yes, any class  could use this feature (though it's more limited than what
> dataclasses do) -- what I was getting is is that it would not be
> (particularly) useful for all classes -- only classes where there are a lot
> of __init__ parameters that can be auto-assigned. And that use case
> overlaps to some extent with dataclasses.

Ah, the penny drops! That makes sense.


> > Named tuples support all of that too.
> >
> 
> No, they don't -- you can add methods, though with a klunky interface,

Its the same class+def interface used for adding methods to any class, 
just with a call to namedtuple as the base class.

from collections import namedtuple

class Thingy(namedtuple("Thingy", "spam eggs cheese")):
    def method(self, arg):
        pass

I think it is a beautifully elegant interface.
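Used like that, the subclass behaves as a tuple with behaviour attached 
(the `total` method here is my own example):

```python
from collections import namedtuple

class Thingy(namedtuple("Thingy", "spam eggs cheese")):
    def total(self):
        return self.spam + self.eggs + self.cheese

t = Thingy(1, 2, 3)
assert t.total() == 6
assert t == (1, 2, 3)  # still a tuple under the hood
```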


> and they ARE tuples under the hood which does come with restrictions.

That is a very good point.


> And the
> immutability means that added methods can't actually do very much.

TIL that string and Decimal methods don't do much.

*wink*


> > One of the reasons I have not glommed onto dataclasses is that for my
> > purposes, they don't seem to add much that named tuples didn't already
> > give us.
> >
> 
> ahh -- that may be because you think of them as "mutable named tuples" --
> that is, the only reason you'd want to use them is if you want your
> "record" to be mutable. But I think you miss the larger picture.
[...]
> I suspect you may have missed the power of datclasses because you started
> with this assumption. Maybe it's because I'm not much of a database guy,
> but I don't think in terms of records.

I'm not a database guy either. When I say record, I mean in the sense of 
Pascal records, or what C calls structs. A collection of named fields 
holding data.

Objects fundamentally have three properties: identity, state, and 
behaviour. The behaviour comes from methods operating on the object's 
state. And that state is normally a collection of named fields holding 
data. That is, a record.

If your class is written in C, like the builtins, you can avoid 
exposing the names of your data fields, thus giving the illusion from 
Python that they don't have a name. But at the C level, they have a 
name, otherwise you can't refer to them from your C code.


> For me, datclasses are a way to make a general purpose class that hold a
> bunch of data, 

I.e. a bunch of named fields, or a record :-)


> and have the boilerplate written for me.

Yes, I get that part.

I just find the boilerplate to be less of a cognitive burden than 
learning the details of dataclasses. Perhaps that's because I've been 
fortunate enough to not have to deal with classes with vast amounts of 
boilerplate. Or I'm just slow to recognise Blub features :-)


> And what
> dataclasses add that makes them so flexible is that they:
> 
> - allow for various custom fields:
>- notably default factories to handle mutable defaults
> - provide a way to customise the initialization
> - and critically, provide a collection of field objects that can be used to
> customize behavior.

That sounds like a class builder mini-framework.

What you describe as "flexible" I describe as "overcomplex". All that 
extra complexity to just avoid writing a class and methods.

Anyway, I'm not trying to discourage you from using dataclasses, or 
persuade you that they are "bad". I'm sure you know your use-cases, and 
I have not yet sat down and given dataclasses a real solid workout. 
Maybe I will come around to them once I do.


> All this makes them very useful for more general purpose classes than a
> simple record.

I'm not saying that all classes *are* a simple record, heavens not!

I'm saying that all classes contain, at their core, a record of named 
fields containing data. Of course classes extend that with all sorts of 
goodies, like inheritance, object identity, methods to operate on that 
data in all sorts of ways, a nice OOP interface, and more.

Anyway, I think I now understand where you are coming from, thank you 
for taking the time to elaborate.


> I'm suggesting that folks find
> evidence for how often auto-assigned parameters would be very useful when
> dataclasses would not.

+1



-- 
Steve
___
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/S25Y3L3HAK6XR2VOI7IDFRPBUMCCBOZ3/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-ideas] Re: Auto assignment of attributes

2022-05-02 Thread Steven D'Aprano
On Mon, May 02, 2022 at 07:44:14PM +0100, Paul Moore wrote:

> I have classes with 20+ parameters (packaging metadata). You can argue
> that a dataclass would be better, or some other form of refactoring,
> and you may actually be right. But it is a legitimate design for that 
> use case.

Indeed. 20+ parameters is only a code smell, it's not *necessarily* 
wrong. Sometimes you just need lots of parameters, even if it is ugly.

For reference, open() only takes 8, so 20 is a pretty whiffy code smell, 
but it is what it is.


> In that sort of case, 20+ lines of assignments in the
> constructor *are* actually rather unreadable, not just a pain to
> write.

I don't know. Its pretty easy to skim lines when reading, especially 
when they follow a pattern:

self.spam = spam
self.eggs = eggs
self.cheese = cheese
self.aardvark = aardvark
self.hovercraft = hovercraft
self.grumpy = grumpy
self.dopey = dopey
self.doc = doc
self.happy = happy
self.bashful = bashful
self.sneezy = sneezy
self.sleepy = sleepy
self.foo = foo
self.bar = bar
self.baz = baz
self.major = major
self.minor = minor
self.minimus = minimus
self.quantum = quantum
self.aether = aether
self.phlogiston = phlogiston

Oh that was painful to write!

But I only needed to write it once, and I bet that 99% of people reading 
it will just skim down the list rather than read each line in full.

To be fair, having written it once, manual refactoring may require me to 
rewrite it again, or at least edit it. In early development, sometimes 
the parameters are in rapid flux, and that's really annoying.

But that's just a minor period of experimental coding, not an on-going 
maintenance issue.


> Of course the real problem is that you often don't want to
> *quite* assign the argument unchanged - `self.provides_extras =
> set(provides_extras or [])` or `self.requires_python = requires_python
> or specifiers.SpecifierSet()` are variations that break the whole
> "just assign the argument unchanged" pattern.

Indeed. Once we move out of that unchanged assignment pattern, we need 
to read more carefully rather than skim:

self._spam = (spam or '').lower().strip()

but you can't replace that with auto assignment.


> As a variation on the issue, which the @ syntax *wouldn't* solve, in
> classmethods for classes like this, I often find myself constructing
> dictionaries of arguments, copying multiple values from one dict to
> another, sometimes with the same sort of subtle variation as above:
> 
> @classmethod
> def from_other_args(cls, a, b, c, d):
> kw = {}
> kw["a"] = a
> kw["b"] = b
> kw["c"] = c
> kw["d"] = d
> return cls(**kw)

You may find it easier to make a copy of locals() and delete the 
parameters you don't want, rather than retype them all like that:

params = locals().copy()
for name in ['cls', 'e', 'g']:
del params[name]
return cls(**params)
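
To make that concrete, here is the trick as a self-contained sketch (the 
`Widget` class and its parameter names are invented for illustration). Note 
that `locals()` must be snapshotted before any other local names are created:

```python
class Widget:
    def __init__(self, a, b, c, d):
        self.args = (a, b, c, d)

    @classmethod
    def from_other_args(cls, a, b, c, d, verbose=False):
        # Snapshot the parameters first, then drop the ones we
        # don't want to forward (including cls itself).
        params = locals().copy()
        for name in ['cls', 'verbose']:
            del params[name]
        return cls(**params)

w = Widget.from_other_args(1, 2, 3, 4)
assert w.args == (1, 2, 3, 4)
```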


> Again, in "real code", not all of these would be copied, or some would
> have defaults, etc. The pattern's the same, though - enough args
> are copied to make the idea of marking them with an @ seem attractive.

But the @ proposal here won't help. If you mark them with @, won't they 
be auto-assigned onto cls?


-- 
Steve


[Python-ideas] Re: infile and outfile parameters for the input function

2022-05-02 Thread Steven D'Aprano
On Mon, May 02, 2022 at 07:55:16PM -, sam.z.e...@gmail.com wrote:

> Using the prospective redirect_stdin context manager, the following code
> 
> ```
> with open("/dev/tty", 'r+') as file:
> with contextlib.redirect_stdin(file), contextlib.redirect_stdout(file):
> name = input('Name: ')
> 
> print(name)
> ```
> 
> Could be rewritten like this
> 
> ```
> with open('/dev/tty', 'r+') as file:
> name = input('Name: ', infile=file, outfile=file)
> 
> print(name)
> ```

Thanks for the example, but that doesn't explain the why. Why are we 
redirecting IO to a tty? Assume your audience is not made up of expert 
Linux sys admins who even know what a tty is :-)

(I know what a tty is, kinda, but I still don't know what the above 
does.)



-- 
Steve


[Python-ideas] Re: infile and outfile parameters for the input function

2022-05-02 Thread Steven D'Aprano
On Mon, May 02, 2022 at 05:42:12PM -, sam.z.e...@gmail.com wrote:

> input(prompt=None, /, infile=None, outfile=None)

> What do people think about this?

I think I want to see some examples of how and why you would use it, and 
why one couldn't just use the redirect_stdout context manager.

(Plus not-yet-existing, but hopefully soon, redirect_stdin.)

Long before we had context managers, I manually redirected stdin and 
stdout to programmatically feed input and capture output from 
`raw_input`. It would be nice to be able to do that more easily, but I'm 
not sure that parameters to the function are better than context 
managers.
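
Pending a stdlib redirect_stdin, the manual redirection is easy to sketch. 
The context manager below is a hand-rolled stand-in, not an existing 
contextlib API; it relies on the fact that input() falls back to 
sys.stdin/sys.stdout when they are not the interactive console:

```python
import io
import sys
from contextlib import contextmanager, redirect_stdout

@contextmanager
def redirect_stdin(new_target):
    # Hand-rolled analogue of contextlib.redirect_stdout for stdin.
    saved, sys.stdin = sys.stdin, new_target
    try:
        yield new_target
    finally:
        sys.stdin = saved

fake_in = io.StringIO("Alice\n")
fake_out = io.StringIO()
with redirect_stdin(fake_in), redirect_stdout(fake_out):
    name = input("Name: ")   # prompt goes to fake_out, line comes from fake_in

assert name == "Alice"
assert fake_out.getvalue() == "Name: "
```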


-- 
Steve


[Python-ideas] Re: contextlib.redirect_stdio function

2022-05-02 Thread Steven D'Aprano
On Mon, May 02, 2022 at 05:29:56PM -, sam.z.e...@gmail.com wrote:

> > There's already contextlib.redirect_stdout() and
> > contextlib.redirect_stderr(). Adding contextlib.redirect_stdin() would
> > be logical, but I think a more flexible
> > 
> > contextlib.redirect_stdio(stdin=None, stdout=None, stderr=None)
> >
> > would be better - where None (the default) means "leave this alone".

Seems kinda useful, but I have two concerns.

(1) Perhaps this would be better as a recipe using ExitStack (plus a new 
redirect_stdin)?

https://docs.python.org/3/library/contextlib.html#contextlib.ExitStack


(2) I don't see `redirect_stdio(stout=None ...)` as meaning "leave 
stdout alone". I see it as equivalent to some variation of unsetting 
stdout, say setting it to /dev/null.

I don't know how I would *not* redirect stdout, except to just not 
redirect it.

Overall:

Add redirect_stdin: +1 (regardless of what we do with redirect_stdio)

Add an ExitStack recipe: +1

Add redirect_stdio: -0

Use None to mean "don't change": -1

Use None to mean "redirect to nowhere": +1


-- 
Steve


[Python-ideas] Re: Auto assignment of attributes

2022-05-02 Thread Steven D'Aprano
On Mon, May 02, 2022 at 10:34:56AM -0600, Pablo Alcain wrote:

> For what it's worth,
> the choice of the `@` was because of two different reasons: first, because
> we were inspired by Ruby's syntax (later on learned that CoffeeScript and
> Crystal had already taken the approach we are proposing) and because the
> `@` token is already used as an infix for `__matmul__` (
> https://peps.python.org/pep-0465/). I believe it's the only usage that it
> has, so it probably won't be that confusing to give it this new semantic as
> well.

Did you forget decorators?

What other languages support this feature, and what syntax do they use?

Personally, I don't like the idea of introducing syntax which looks 
legal in any function call at all, but is only semantically meaningful 
in methods, and not all methods. Mostly only `__init__`.

How would this feature work with immutable classes where you want to 
assign attributes to the instance in the `__new__` method?

I fear that this is too magical, too cryptic, for something that people 
only use in a tiny fraction of methods. 17% of `__init__` methods is 
probably less than 1% of methods, which means that it is going to be a 
rare and unusual piece of syntax.

Beginners and casual coders (students, scientists, sys admins, etc, 
anyone who dabbles in Python without being immersed in the language) are 
surely going to struggle to recognise where `instance.spam` gets 
assigned, when there is no `self.spam = spam` anywhere in the class or 
its superclasses. There is nothing about "@" that hints that it is an 
assignment.

(Well, I suppose there is that assignment and at-sign both start with A.)

I realise that this will not satisfy those who want to minimize the 
amount of keystrokes, but remembering that code is read perhaps 20-100 
times more than it is written, perhaps we should consider a keyword:

def __init__(self, auto spam:int, eggs:str = ''):
# spam is automatically bound to self.spam
self.eggs = eggs.lower()

I dunno... I guess because of that "code is read more than it is 
written" thing, I've never felt that this was a major problem needing 
solving. Sure, every time I've written an __init__ with a bunch of 
`self.spam = spam` bindings, I've felt a tiny pang of "There has to be a 
better way!!!".

But **not once** when I have read that same method later on have I 
regretted that those assignments are explicitly written out, or wished 
that they were implicit and invisible.

Oh, by the way, if *all* of the parameters are to be bound:

def __init__(self, spam, eggs, cheese, aardvark):
vars(self).update(locals())
del self.self

Still less magical and more explicit than this auto-assignment proposal.
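
As a runnable sketch of that trick (the class and field names are made up); 
note it only works when instances have a `__dict__`, so not with `__slots__`:

```python
class Config:
    def __init__(self, spam, eggs, cheese, aardvark):
        # Bind every parameter as an instance attribute in one go...
        vars(self).update(locals())
        # ...then remove the bogus self.self binding.
        del self.self

c = Config(1, 2, 3, 4)
assert (c.spam, c.eggs, c.cheese, c.aardvark) == (1, 2, 3, 4)
assert not hasattr(c, 'self')
```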



-- 
Steve


[Python-ideas] Re: Auto assignment of attributes

2022-05-02 Thread Steven D'Aprano
On Mon, May 02, 2022 at 05:57:45PM +0900, Stephen J. Turnbull wrote:

> def __init__(s, s.x, s.y): pass

I think that if this proposal threatens to encourage people to write 
that horror, that would be enough of a reason to reject it.

-- 
Steve


[Python-ideas] Re: Auto assignment of attributes

2022-05-02 Thread Steven D'Aprano
On Sun, May 01, 2022 at 06:22:08PM -0700, Devin Jeanpierre wrote:

> Is it unreasonable to instead suggest generalizing the assignment target
> for parameters? For example, if parameter assignment happened left to
> right, and allowed more than just variables, then one could do:
> 
> def __init__(self, self.x, self.y): pass

What would this do?

def __init__(self, spam.x, eggs.y): pass

Would it try to assign to variables spam and eggs in the surrounding 
scopes?

How about this?

def __init__(self, x, x.y): pass


-- 
Steve


[Python-ideas] Re: int.to_base, int.from_base

2022-05-02 Thread Steven D'Aprano
On Mon, May 02, 2022 at 09:58:35AM +0200, Marc-Andre Lemburg wrote:

> Just a word of warning: numeric bases are not necessarily the same
> as numeric encodings. The latter usually come with other formatting
> criteria in addition to representing numeric values, e.g. base64 is
> an encoding and not the same as representing numbers in base 64.

Correct. base64 is for encoding byte-strings, not numbers:

>>> binascii.hexlify(b"Hello world")
b'48656c6c6f20776f726c64'

Of course we can treat any byte string as a base-256 number, in which 
case "Hello world" has the value 87521618088882671231069284.

There's no obvious collation/alphabet to use for base 64, but if we use (say)

ASCII digits + uppercase + lowercase + "!@"

then that "Hello world" number 875...284 above is:

4XbR6nl87TlScna (in base 64)

which is completely different from the base64 encoding.
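
That conversion is only a few lines of Python; the alphabet below is the 
arbitrary collation suggested above (digits + uppercase + lowercase + "!@"):

```python
ALPHABET = ("0123456789"
            "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
            "abcdefghijklmnopqrstuvwxyz"
            "!@")

def to_base(n, alphabet=ALPHABET):
    # Render a non-negative int using the given digit collation.
    base = len(alphabet)
    digits = []
    while True:
        n, r = divmod(n, base)
        digits.append(alphabet[r])
        if n == 0:
            break
    return ''.join(reversed(digits))

# Treat the byte string as one big base-256 number...
n = int.from_bytes(b"Hello world", "big")
# ...and re-render it as a 15-digit base-64 numeral,
# starting "4X...", per the text above.
assert len(to_base(n)) == 15 and to_base(n).startswith("4X")
```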

By the way, in base 64 that "Hello world" number has:

* digital sum of 445;
* digital root of 4, with persistence of 3;
* digital product of 261040984907288205312;
* zero-free digital product root of 48, with persistence of 7.

There is absolutely no significance to any of this. I'm just geeking out :-)


-- 
Steve


[Python-ideas] Re: Auto assignment of attributes

2022-05-01 Thread Steven D'Aprano
On Sat, Apr 30, 2022 at 11:54:47PM -0700, Christopher Barker wrote:
> On Sat, Apr 30, 2022 at 6:40 PM Steven D'Aprano  wrote:
> 
> > On Sat, Apr 23, 2022 at 12:11:07PM -0700, Christopher Barker wrote:
> > > Absolutely. However, this is not an "all Classes" question.
> >
> > Isn't it? I thought this was a proposal to allow any class to partake in
> > the dataclass autoassignment feature.
> >
> 
> no -- it's about only a small part of that.

How so? Dataclasses support autoassignment. This proposes to allow 
**all classes** (including non-dataclasses) to also support 
autoassignment.

So can you please clarify your meaning? To me, this does look like an 
"all Classes" question. What am I missing?


> > > I don't think of dataclasses as "mutable namedtuples with defaults" at
> > all.
> > What do you think of them as?
> >
> 
> I answered that in the next line, that you quote.

Perhaps your answer isn't as clear as you think it is. See below.


> > > But do think they are for classes that are primarily about storing a
> > > defined set of data.
> >
> > Ah, mutable named tuples, with or without defaults? :-)
> >
> 
> well, no. - the key is that you can add other methods to them, and produce
> all sort of varyingly complex functionality. I have done that myself.

Named tuples support all of that too.

One of the reasons I have not glommed onto dataclasses is that for my 
purposes, they don't seem to add much that named tuples didn't already 
give us.

* Record- or struct-like named fields? Check.

* Automatic equality? Check.

* Nice repr? Check.

* Can add arbitrary methods and override existing methods? Check.

Perhaps named tuples offer *too much*:

* Instances of tuple;

* Equality with other tuples;

and maybe dataclasses offer some features I haven't needed yet, but it 
seems to me that named tuples and dataclasses are two solutions to the 
same problem: how to create a record with named fields.
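
The overlap is easy to see side by side (Point is a toy example; the 
tuple-equality behaviour at the end is one of the few visible differences):

```python
from dataclasses import dataclass
from typing import NamedTuple

class PointNT(NamedTuple):
    x: float
    y: float

    def norm(self):  # arbitrary extra method
        return (self.x ** 2 + self.y ** 2) ** 0.5

@dataclass
class PointDC:
    x: float
    y: float

    def norm(self):
        return (self.x ** 2 + self.y ** 2) ** 0.5

assert PointNT(3, 4).norm() == PointDC(3, 4).norm() == 5.0
assert PointNT(3, 4) == PointNT(3, 4)   # automatic equality for both
assert PointDC(3, 4) == PointDC(3, 4)
assert PointNT(3, 4) == (3, 4)          # but only the namedtuple *is* a tuple
assert PointDC(3, 4) != (3, 4)
```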


> > Or possibly records/structs.
> >
> 
> nope, nope, and nope.

Okay, I really have no idea what you think dataclasses are, if you don't 
think of them as something like an object-oriented kind of record or 
struct (a class with named data fields).

You even define them in terms of storing a defined set of data, except 
you clearly don't mean a set in the mathematical meaning of an unordered 
collection (i.e. set()). A set of data is another term for a record.

So I don't understand what you think dataclasses are, if you vehemently
deny that they are records (not just one nope, but three).

And since I don't understand your concept of dataclasses, I don't know 
how to treat your position in this discussion. Should I treat it as 
mainstream, or idiosyncratic? Right now, it seems pretty idiosyncratic.

Maybe that's because I don't understand you. See below.


> But anyway, the rest of my post was the real point, and we're busy arguing
> semantics here.

Well yes, because if we don't agree on semantics, we cannot possibly 
communicate. Semantics is the **meaning of our words and concepts**. If 
we don't agree on what those words mean, then how do we understand each 
other?

I've never understood people who seem to prefer to talk past one another 
with misunderstanding after misunderstanding rather than "argue 
semantics" and clarify precisely what they mean.


-- 
Steve


[Python-ideas] Re: Auto assignment of attributes

2022-04-30 Thread Steven D'Aprano
On Sat, Apr 23, 2022 at 12:11:07PM -0700, Christopher Barker wrote:
> On Sat, Apr 23, 2022 at 10:53 AM Pablo Alcain  wrote:
> 
> > Overall, I think that not all Classes can be thought of as Dataclasses
> > and, even though dataclasses solutions have their merits, they probably
> > cannot be extended to most of the other classes.
> >
> 
> Absolutely. However, this is not an "all Classes" question.

Isn't it? I thought this was a proposal to allow any class to partake in 
the dataclass autoassignment feature.

(Not necessarily the implementation.)


> I don't think of dataclasses as "mutable namedtuples with defaults" at all.

What do you think of them as?


> But do think they are for classes that are primarily about storing a
> defined set of data.

Ah, mutable named tuples, with or without defaults? :-)

Or possibly records/structs.


-- 
Steve


[Python-ideas] Re: Delete dictionary entry if key exists using -= operator via __isub__()

2022-04-28 Thread Steven D'Aprano
On Thu, Apr 28, 2022 at 01:18:09AM -, zmvic...@gmail.com wrote:

> *Background*
> It is frequently desirable to delete a dictionary entry if the key 
> exists.

Frequently?

I don't think I've ever needed to do that. Can you give an example of 
real code that does this?

> It is necessary to check that the key exists or, 
> alternatively, handle a KeyError: for, where `d` is a `dict`, and `k` 
> is a valid hashable key, `del d[k]` raises KeyError if `k` does not 
> exist.

The simplest one-liner to delete a key if and only if it exists is 
with the `pop` method:

mydict.pop(key, None)  # Ignore the return result.

That may not be the most efficient way. I expect that the most efficient 
way will depend on whether the key is more likely to exist or not:

# Not benchmarked, so take my predictions with a pinch of salt.

# Probably fastest if the key is usually present.
try:
del mydict[key]
except KeyError:
pass

# Probably fastest if the key is usually absent.
if key in mydict:
del mydict[key]


So we already have three ways to delete only an existing key from a 
dict, "optimised" (in some sense) for three scenarios:

- key expected to be present;
- key expected to be absent;
- for convenience (one-liner).
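
Whatever their relative speed (the predictions above are untested), the 
three spellings can at least be shown to behave identically:

```python
def pop_style(d, key):
    d.pop(key, None)        # one-liner; ignore the return result

def eafp_style(d, key):
    try:
        del d[key]
    except KeyError:
        pass

def lbyl_style(d, key):
    if key in d:
        del d[key]

for delete in (pop_style, eafp_style, lbyl_style):
    d = {'a': 1, 'b': 2}
    delete(d, 'a')      # key present: removed
    delete(d, 'zzz')    # key absent: silently ignored
    assert d == {'b': 2}
```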

With three existing solutions to this problem, it is unlikely that a 
fourth solution will be blessed by building it into the dict type 
itself. (Although of course you can add it to your own subclasses.)

Especially not when that solution is easily mistaken for something else:

d -= 1  # Are we subtracting 1, or deleting key 1?

Alone, that is not a fatal problem, but given that there are three other 
satisfactory solutions to the task of deleting an existing key, even 
minor or trivial problems push the cost:benefit ratio into the negative.

By the way:

> class DemoDict(dict):
> def __init__(self, obj):
> super().__init__(obj)

If the only purpose of a method is to call super and inherit from its 
parent class(es), as in the above, then you don't need to define the 
method at all. Just leave it out, and the parent's `__init__` will be 
called.


-- 
Steve


[Python-ideas] Re: NotImplementedMethod function, or making NotImplemented return itself when called

2022-04-25 Thread Steven D'Aprano
On Mon, Apr 25, 2022 at 03:38:21PM -, aanonyme.perso...@hotmail.fr wrote:

> Typically, when subclassing a NamedTuple type, you often don't want 
> the <, >, <=, >=, + or * operators to work, so in that case you would 
> want for the related methods to return NotImplemented.

When I have subclassed NamedTuple types, I have never done that.

If `obj` is a tuple, it supports those operators. If you subclass tuple, 
and get a NamedTuple, the subclass is still a tuple, and the Liskov 
Substitution Principle tells us that it should support all tuple 
operations. If you subclass the NamedTuple, that is still a tuple, and 
again Liskov tells us that it should behave like a tuple.

I'm the first person to acknowledge that Liskov is more of a guideline 
than a law, so I won't say that what you are doing is *always* wrong, 
but surely it is wrong more often than it is right.

In any case, your function is a one-line function. Not every one-line 
function needs to be a builtin. If you don't want to reimplement it each 
and every time, it is easy enough to import it from your personal 
utility library:

class Spam(MyNamedTuple):
from mytoolbox import NotImplementedMethod

__lt__ = __gt__ = NotImplementedMethod

It's not quite as convenient as a builtin, but on the plus side, you 
don't have to write a PEP and then wait until Python 3.11 or 3.12 before 
you can start using it.
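
For reference, the whole "one-line function" looks like this. `Pair` is a 
made-up tuple subclass; whether weakening Liskov like this is wise is 
exactly the question above:

```python
def NotImplementedMethod(self, *args, **kwargs):
    return NotImplemented   # let Python try the reflected operand, then raise TypeError

class Pair(tuple):
    # Keep tuple behaviour, but opt out of ordering comparisons.
    __lt__ = __gt__ = __le__ = __ge__ = NotImplementedMethod

p, q = Pair((1, 2)), Pair((3, 4))
assert p + q == (1, 2, 3, 4)    # still a tuple where it counts
try:
    p < q
except TypeError:
    pass
else:
    raise AssertionError("ordering should be disabled")
```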

-- 
Steve


[Python-ideas] Re: Conditions for a coherent MI relationship [was Re: Re: mro and super don't feel so pythonic]

2022-04-25 Thread Steven D'Aprano
On Sun, Apr 17, 2022 at 07:39:29PM +1200, Greg Ewing wrote:
> On 16/04/22 10:26 pm, Steven D'Aprano wrote:
> >C++ and Eiffel are even stricter (more restrictive) than Python. They
> >don't just exclude class hierarchies which are inconsistent, they
> >exclude class hierarchies with perfectly good linearizations because
> >they have a method conflict.
> 
> No, they don't *exclude* such hierarchies, they just require you
> to resolve the conflicts explicitly.

Okay, fair comment. Can we agree on this then?

- C++ allows hierarchies with method conflicts so long as you do not 
  implicitly inherit from those methods. (You must explicitly 
  call the superclasses you want.)

- Eiffel allows hierarchies with method conflicts so long as you remove 
  the conflict by renaming the methods.



> >no matter
> >how many times I say that other choices for MI are legitimate and maybe
> >even better than Python's choice
> 
> So by saying that something is "not full MI", you didn't mean to
> imply that it is somehow inferior and less desirable?

Well done! I'm glad we're making progress.

https://www.youtube.com/watch?v=Cl2pvI1nQVU

How many times did I suggest that the C++ or Eiffel approach might be 
better than Python's approach? How many times did I mention traits, or 
link to Michele Simionato's blog? I linked to James Knight's (misnamed) 
post "super considered harmful" at least twice. It is certainly worth 
reading to understand some of the problems with Python's MI.

If I say that Python's model of MI is more general, I mean *more general*.
It's not a dog-whistle for Python-supremacists. It just means that Python 
handles more cases than Eiffel or C++; it doesn't imply that those 
languages are "inferior".

For what its worth, I think that fully general MI does fit into what 
Paul Graham calls the "Blub Paradox". It's *more powerful* -- but it 
might be *too powerful* to use effectively, like unrestrained GOTO 
capable of jumping into the middle of functions, or Ruby's ability to 
monkeypatch everything including builtins.

If all I want to do is drive to the local corner store and buy milk, a 
rocket-car that does Mach 3 in a straight line and burns 2000 gallons of 
fuel a minute is more powerful, but not as useful as a 30 year old 
Toyota. Sometimes less is more.

If you're having problems with MI maybe what you need is *less of it* 
not more of it, and perhaps that means getting the compiler to warn you 
when you've attached a JATO rocket to your Toyota sedan, which is what 
C++ and Eiffel and Squeak traits do in different ways.



> Because that's
> what you sounded like you were saying, and why everyone is pushing
> back so hard on it.

Yeah Greg, you got me, you saw through my cunning plan to denigrate C++ 
and Eiffel by saying that they are better than Python.


> >The requirement for automatic conflict resolution is kinda necessary for
> >it to be inheritance
> 
> You seem to have a very different idea of what "inheritance" means
> from everyone else here.

Well, I don't know about "everybody else", but okay.

I think we agree that when we do this:

instance.method()

the interpreter automatically searches for some class in a tree of 
superclasses where the method is defined, and we call that inheritance.

I think that the distinguishing feature here is that the interpreter 
does it *automatically*. If we had to manually inspect the tree of 
superclasses and explicitly walk it, it wouldn't be inheritance:

# Pretend that classes have a parent() method and we only
# have single inheritance.
instance.__class__.parent().parent().parent().method(instance)

which is just another way of explicitly calling a method on a class:

GreatGrandParentClass.method(instance)

So we're not inheriting anything there, we're just calling a function:

some_namespace.function(instance)

And if you're just calling a function, then it really doesn't matter 
whether GreatGrandParentClass is in the instance's MRO or not. If the 
MRO isn't used, then why even have an MRO?

What else could distinguish inheritance from just "call some function", 
if it isn't that the interpreter works out where the function is defined 
for you, using some automatic MRO?

(That's not a rhetorical question. If you have some other distinguishing 
characteristic in mind, apart from automatically searching the MRO, then 
please do speak up, I would love to hear it.)

Take that automatic MRO search away, and you're not doing inheritance, 
you're just calling a function in a namespace.
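
That "automatic search" can itself be spelled out as a (much simplified) 
manual walk of the MRO. The classes are invented for illustration, and real 
attribute lookup also involves descriptors and metaclasses:

```python
class Base:
    def where(self):
        return "Base"

class Left(Base):
    pass

class Right(Base):
    def where(self):
        return "Right"

class Child(Left, Right):
    pass

def manual_lookup(instance, name):
    # First class in the MRO that defines the name wins.
    for klass in type(instance).__mro__:
        if name in vars(klass):
            return vars(klass)[name]
    raise AttributeError(name)

obj = Child()
assert manual_lookup(obj, "where")(obj) == "Right"
assert obj.where() == "Right"   # the interpreter's own search agrees
```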

**Which is fine.** That's the basis of composition and delegation. 
That's not worse than MI. Its better.

(But note that this doesn't distinguish between inheritance and 
generics, which also has a form of automatic delegation. Oh well.)


-- 
Steve

[Python-ideas] Re: mro and super don't feel so pythonic

2022-04-23 Thread Steven D'Aprano
On Sat, Apr 23, 2022 at 10:18:05PM +0900, Stephen J. Turnbull wrote:
> malmiteria  writes:

>  > If O1 and O2 are refactored into N1(GP) and N2(GP)
>  > the MRO as it was before refactoring was essentially N1, GP, N2, GP,
>  > as what was O1 before refactoring is equivalent to N1, GP after
>  > refactoring. 
>  > After refactoring, the MRO is now N1, N2, GP. Which do behave
>  > differently, in general.
> 
> Nobody denies that.

I denied the first part, and still do.

There is no possible valid MRO that goes [N1, GP, N2, GP] because that 
lists the same class twice. Such a thing was possible in Python 1.x and 
2.x "old-style" (classic) classes, it was a bug in the way classic 
classes generated the MRO, and it caused bugs in code that hit those 
cases. 

(Fortunately those cases were rare, so most people didn't notice.)
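
With new-style classes, the C3 linearization merges the diamond so that GP 
appears exactly once, after both of its subclasses (class names follow the 
example above):

```python
class GP: pass
class N1(GP): pass
class N2(GP): pass
class C(N1, N2): pass

mro = C.__mro__
assert mro == (C, N1, N2, GP, object)
assert len(mro) == len(set(mro))    # no class is ever listed twice
```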


-- 
Steve


[Python-ideas] Re: mro and super don't feel so pythonic

2022-04-22 Thread Steven D'Aprano
On Fri, Apr 22, 2022 at 08:46:38PM +1100, Matsuoka Takuo wrote:
> On Fri, 22 Apr 2022 at 15:47, Christopher Barker  wrote:
> >
> > Sure -- but there's nothing special or difficult here -- refactoring 
> > can create breaking changes. I believe it was part of Hettinger's 
> > thesis in "Super Considered Super" that the use of super() is part 
> > of the API of a class hierarchy. Indeed, the MRO of a class 
> > hierarchy is part of the API. If you change the MRO, it is a 
> > potentially breaking change, just as if a method is added or 
> > removed, or renamed, or ...
> 
> So I may not have been told a refactoring like that shouldn't involve
> a new instance of overriding, but may I have essentially been told I
> shouldn't refactor at all if I didn't want to create breaking changes?

Pretty much. In general, any change to the MRO is a potential breaking 
change, unless you design your classes very carefully.

See for example this Django ticket:

https://code.djangoproject.com/ticket/29735?cversion=0_hist=3

**Inheritance is hard.** The easy cases are so amazingly easy that 
people are shocked when they run into complicated inheritance designs 
and discover how hard they are.

Some people respond by refusing to believe that inheritance is hard, and 
insisting that there has to be a Magic Bullet that will make everything 
Just Work, or they blame super(). But the problems aren't with super().

Other people respond by saying that inheritance is just a tool, and if 
the tool doesn't work you should use a different tool which is better 
suited to what you are trying to do. That tool might be to restrict the 
way you use inheritance to a smaller subset, or to use delegation, etc.



-- 
Steve


[Python-ideas] Re: mro and super don't feel so pythonic

2022-04-22 Thread Steven D'Aprano
On Wed, Apr 20, 2022 at 03:43:44PM -, malmiteria  wrote:

> to give you exemples of problems :
> 1) let's start with a django problem :
> ```
> class MyView(ModelView, PermissionMixin): pass
> ```
> doesn't apply any of the PermissionMixin logic to the view.
> It doesn't raise a single error either.

Is it supposed to work, or is the bug in your code, not the framework?

If it is a bug in the framework, have you reported it as a bug to 
Django, and if not, why not?


> Since it's mostly made out of django stuff, it's likely there wouldn't 
> be automated testing to check if the permissions are indeed required 
> when attempting to visit whatever resource MyView serves. After all, 
> automated testing should test your code, not the library you're 
> importing's code.

You should be testing MyView to ensure that the permissions are 
required. If you don't have a test that MyView requires testing, then 
somebody could refactor MyView and remove the permission stuff, and you 
will never know.


> The only way to tell those permission aren't applied, in this case, is 
> to actually see the bug happen IRL.

Not the only way. The right way is to test MyView to ensure it is doing 
what you expect.


> 2) Another one, straight from django docs : 
> https://docs.djangoproject.com/fr/4.0/topics/auth/default/#django.contrib.auth.mixins.UserPassesTestMixin.get_test_func

Here's the link for those who can't read French:

https://docs.djangoproject.com/en/4.0/topics/auth/default/#django.contrib.auth.mixins.UserPassesTestMixin.get_test_func

(Your English is much better than my French.)

The problem here is explained in the docs:

"Due to the way UserPassesTestMixin is implemented, you cannot stack 
them in your inheritance list."

And then:

"If TestMixin1 would call super() and take that result into account, 
TestMixin1 wouldn’t work standalone anymore."

So the problem is that UserPassesTestMixin is not designed to be 
the top node in a diamond (that is, you can't inherit from it more than 
once).

This is an example of why multiple inheritance must be *cooperative*.

Due to the tight coupling between classes in MI, you have to carefully 
design your classes to cooperate.
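
As a sketch (with invented names) of the trap described above: a mixin 
that fails to call super() silently ends the cooperative chain, so 
anything after it in the MRO never runs, which is essentially what the 
Django docs are warning about.

```python
# CheckA cooperates (calls super()); CheckB does not, so any class
# after it in the MRO is silently skipped -- no error is raised.

class CheckA:
    def check(self):
        return ["A"] + super().check()

class CheckB:
    def check(self):
        return ["B"]  # does NOT call super(): the chain stops here

class Base:
    def check(self):
        return []

class Good(CheckA, Base):
    pass

class Broken(CheckA, CheckB, Base):
    pass

print(Good().check())    # ['A']
print(Broken().check())  # ['A', 'B'] -- Base.check never runs
```

Note that nothing fails loudly: Broken constructs and runs fine, it just 
quietly drops part of the behaviour you asked for.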

I *guess* that the fundamental problem here is that Django is where 
Zope and Plone were six years ago: stuck with an overly complicated 
and complex framework based on MI, before their move to preferring 
composition(?) in version 4. (I'm *totally* guessing here: I'm not an 
expert on Django, or Zope. I could be wrong.) 


> Some would argue the proper way to do multiple inheritance implies 
> having one single parent class all classes you inherit from do 
> themselves inherit from.
>
> This allows you to use super in those class, and still allows you to 
> inherit from each of them individually, as needed, without risking a 
> super call to target object, in which case you'd face the problem that 
> object doesn't have whatever method you're trying to access.

Correct.


> What happens when the common parent class raises NotImplemented errors?

The same as any other exception: if your class raises an exception, it 
is either an internal bug in the class, or a documented exception as 
part of the class' API.

So if your common parent class raises NotImplementedError, that is either
a bug in your common parent class, or a bug in your subclass, for 
triggering the condition that is documented as raising NotImplementedError.


> You can't use super in any of its child classes, and any of its child 
> classes using super would work only under MI, if not the last parent 
> inherited from.

The whole point of that common parent is to catch any methods before 
they hit the ultimate base class, object.
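
A minimal sketch (invented names) of that common-parent pattern: the 
root class absorbs the cooperative call, so super() never reaches 
object, which doesn't define the method.

```python
# Root deliberately does NOT call super().ping(), because object has
# no ping() method. Every other class can then call super() safely.

class Root:
    def ping(self):
        return []

class A(Root):
    def ping(self):
        return ["A"] + super().ping()

class B(Root):
    def ping(self):
        return ["B"] + super().ping()

class C(A, B):
    def ping(self):
        return ["C"] + super().ping()

print(C().ping())  # ['C', 'A', 'B']
print(A().ping())  # ['A'] -- A still works standalone
```

And unlike the UserPassesTestMixin case, A and B keep working standalone 
while still composing under MI.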



> In other terms, you render those 'independent' objects not independent anymore.

Classes in inheritance hierarchies are not independent. They are heavily 
dependent and tightly coupled.

This is why people prefer the less-tightly coupled composition pattern 
over inheritance.


> Note that this django docs explicitly states it is not possible to 
> practice multiple inheritance in this case, which is my point :

Right. Because you have to design your classes very carefully to work 
correctly under MI, and Django have not done that in this case. (They 
may or may not have a good reason for that.)


> people don't know a way out of super for MI cases, when super doesn't work.

> 
> 3) My goblin example is another case.
> What if you want to inherit from multiple parents 'as is', instead of 
> having them ignore their respective parent (GP) because this parent 
> (GP) is recurring in the inheritance tree?

We've answered this many times.


> 4) Lib refactoring are breaking changes

> A Lib author refactoring his code by extracting a class as a parent 
> class of multiple of the classes provided is introducing a breaking 
> change.
> Because any user's code inheriting from at least 2 of the classes 
> impacted by this refactoring will now exhibit a different 

[Python-ideas] Re: Conditions for a coherent MI relationship [was Re: Re: mro and super don't feel so pythonic]

2022-04-16 Thread Steven D'Aprano
On Sat, Apr 16, 2022 at 12:23:10PM -0400, David Mertz, Ph.D. wrote:

> R doesn't have inheritance, it's not OOP,

R is OOP and always has been. All values, arrays, functions etc in R are 
objects. Even expressions are objects. And it has inheritance.

https://cran.r-project.org/doc/manuals/r-release/R-lang.html#Objects

https://cran.r-project.org/doc/manuals/r-release/R-lang.html#Inheritance

R has three mechanisms for implementing classes, S3, S4 and Reference 
classes (unofficially known as S5). All three of them allow inheritance.

http://adv-r.had.co.nz/S3.html


> One thing I do find a bête noire is the silly claim, that Chris repeats,
> that inheritance expresses "Is-A" relationships.

"Is-a" is fundamental to the relationship between a class and its 
instances. Inheritance is orthogonal to that relationship, e.g. Swift 
only allows single inheritance. Every class can only have a single 
superclass, which defines what kind of thing the subclass is. But it can 
inherit from multiple mixins or traits, which allow it to inherit 
behaviour.

In Python, virtual subclassing defines that "is-a" relationship 
without inheriting anything from the parent class.
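
For example (class names invented), registering a class with an ABC 
makes issubclass report the "is-a" relationship even though the class 
inherits nothing and the ABC never appears in its MRO:

```python
from abc import ABC

class Sized(ABC):
    pass

class Box:
    def __len__(self):
        return 3

# Virtual subclassing: declare the "is-a" relationship by registration.
Sized.register(Box)

print(issubclass(Box, Sized))  # True: the "is-a" relationship holds...
print(Sized in Box.__mro__)    # False: ...but nothing was inherited
```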


> The reality is more clear in the actual primary definition of the word
> itself: "the practice of receiving private property, titles, debts,
> entitlements, privileges, rights, and obligations".

That definition is incomplete when it comes to inheritance.

If I go to the store and purchase a bottle of milk, I have received 
private property. That's not inheritance.

If I receive a knighthood for slaughtering my monarch's enemies, that 
is also not inheritance.

My obligation to pay taxes when I begin to earn income is another thing 
which I receive but is not inheritance.


> In programming, a class
> can receive methods and attributes from some other classes. That's all.
> It's just a convenience of code organization, nothing ontological.

That might be how Alan Kay originally saw OOP. He famously regretted 
using the term "object" because it distracted from what he saw as the 
genuinely fundamental parts of OOP, namely 

* Message passing
* Encapsulation
* Late (dynamic) binding 

Purists may also wish to distinguish between subclassing and subtyping. 
Raymond Hettinger has given talks about opening your mind to different 
models for subclassing, e.g. what he calls the "conceptual view" vs 
"operational view" of subclassing. Or perhaps what we might call "parent 
driven" versus "child driven".

https://www.youtube.com/watch?v=miGolgp9xq8

But whether we like it or not, the concepts of subclassing and subtyping 
are entwined. We model "is-a" concepts using classes; we implement code 
reuse using classes; we model taxonomic hierarchies using classes. 
Classes are flexible; they contain multitudes.



-- 
Steve
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/G6KLP7QVS3RXWJYNIEPX22DHJFUA6HZO/


[Python-ideas] Re: mro and super don't feel so pythonic

2022-04-16 Thread Steven D'Aprano
On Sat, Apr 16, 2022 at 05:27:57PM +1200, Greg Ewing wrote:
> On 15/04/22 10:37 pm, Steven D'Aprano wrote:
> >If you look at languages that implement MI, and pick the implementations 
> >which allow it with the fewest restrictions, then that is "full MI".
> 
> >I believe that Python (and other languages) allow MI with
> >the smallest set of restrictions, namely that there is a C3
> >linearization possible
> 
> But before Python adopted the C3 algorithm, it was less
> restrictive about the inheritance graph.

Less restrictive, and *inconsistent* (hence buggy).
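
The classic example of a hierarchy that C3 rejects, because the two 
bases disagree about the order of their shared parents (older Pythons 
would linearize such hierarchies inconsistently):

```python
class A: pass
class B: pass
class X(A, B): pass  # wants A before B
class Y(B, A): pass  # wants B before A

try:
    # No ordering can satisfy both X and Y, so class creation fails.
    class Z(X, Y):
        pass
except TypeError as exc:
    print("TypeError:", exc)  # no consistent MRO exists
```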


> So by your definition, current Python does not do full MI!

No, I'm excluding inconsistent models of MI.

Correctness is a hard requirement. Otherwise we get into the ludicrous 
territory of insisting that every feature can be implemented with one 
function:

def omnipotent_function(*args, **kwargs):
"""Function which does EVERYTHING.

(Possibly not correctly, but who cares?)
"""
return None

There you go, now we can define an MRO for any class hierarchy 
imaginable, and dispatch to the next class in that hierarchy, using the 
same function. It won't work, of course, but if correctness isn't a hard 
requirement, what does that matter? :-)


> >If you have to manually call a specific method, as shown here:
> >
> >https://devblogs.microsoft.com/oldnewthing/20210813-05/?p=105554
> >
> >you're no longer using inheritance, you're doing delegation.
> 
> You could also say that Python automatically delegates to the first
> method found by searching the MRO.
> 
> Why is one of these delegation and not the other?

That is a very interesting question.

As I mentioned earlier, there is a sense that all inheritance is a kind 
of delegation, and in languages without an explicit super() (or 
"nextclass", or whatever you want to call it), the only way to get 
inheritance when you overload a method is to use explicit delegation to 
your superclass.

So we might say that all inheritance is delegation, but not all 
delegation is inheritance. We might even go further and say that any 
delegation to a superclass (not just the direct parent) is a form of 
manual inheritance.

But in general, in ordinary language, when we talk about inheritance, 
we're talking about two (maybe three) cases:

1. Automatic inheritance, when your class doesn't define a method but 
   automatically inherits it from its superclass(es).

class Parent:
def method(self): pass

class Child(Parent):
pass

assert hasattr(Child, "method")


2. Automatic inheritance when your class overloads a method and calls
   super() manually to delegate to the superclass(es).

class Child(Parent):
def method(self):
print("Overload")
super().method()

3. And more dubiously, but commonly used, when people who don't like 
   super(), or don't know about it, explicitly delegate to their  
   parent in single inheritance:

class Child(Parent):
def method(self):
print("Overload")
Parent.method(self)  # Oooh, flashbacks to Python 1.5
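
A quick sketch (invented names) of why case 3 is fragile under multiple 
inheritance: the hard-wired Parent.method(self) call skips any class 
sitting between the child and Parent in the MRO, while super() follows 
the full MRO.

```python
class Parent:
    def method(self):
        return ["Parent"]

class Other(Parent):
    def method(self):
        return ["Other"] + super().method()

class WithSuper(Parent):
    def method(self):
        return ["Child"] + super().method()  # follows the MRO

class Manual(Parent):
    def method(self):
        return ["Child"] + Parent.method(self)  # hard-wired delegation

class GoodChild(WithSuper, Other): pass
class BadChild(Manual, Other): pass

print(GoodChild().method())  # ['Child', 'Other', 'Parent']
print(BadChild().method())   # ['Child', 'Parent'] -- Other is skipped
```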



-- 
Steve
___
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/GOMMXQW4L6OJXO2425DZQE5AERGEMC5G/


  1   2   3   4   5   6   7   8   9   10   >