[Python-Dev] Re: Python-Dev Digest, Vol 232, Issue 10

2022-11-28 Thread Yoni Lavi
> I can't see any, but then I couldn't see the security consequences of
> predictable string hashes until they were pointed out to me. So it would
> be really good to have some security experts comment on whether this is
> safe or not.

I can't either. I can point out that the complexity attack via hash
collisions is not possible on None specifically because it is a singleton,
so:
1. There may only be one None in a dict at most,
2. For composite types that depend on None and other things, like say a
tuple of string and an optional int, they become as resistant to attack as
the same types without None, in this case string and int.
3. Python only explicitly added this security (hash randomization) feature
to string and bytes hashes, and if your composite key type depends on those
types, it will still be protected regardless of what None does
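
A minimal sketch of point 2, added for illustration and not part of the original message. It assumes CPython's behavior that a tuple's hash is derived only from the hashes of its elements (and their order), so a composite key is exactly as stable as its parts. `FixedHash` is a hypothetical helper invented for this example.

```python
class FixedHash:
    """Hypothetical stand-in for an object whose hash is a known constant."""

    def __init__(self, value: int) -> None:
        self.value = value

    def __hash__(self) -> int:
        return self.value

    def __eq__(self, other: object) -> bool:
        return isinstance(other, FixedHash) and other.value == self.value


# Assumption (CPython): the tuple hash only looks at the element hashes,
# so replacing an element with another object that has the same hash
# leaves the hash of the whole tuple unchanged.
key_with_int = (5, 17)
key_with_stub = (FixedHash(5), 17)  # hash(FixedHash(5)) == hash(5) == 5

same_hash = hash(key_with_int) == hash(key_with_stub)
print(same_hash)
```

By the same reasoning, if one component's hash varies between runs, every tuple (or tuple-like key) containing it varies too.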


> Because this entire discussion is motivated by the OP who wants
> consistent set order across multiple runs of his Python application.
> That's what he needs; having None hash to a constant value is just a
> means to that end.

Not entirely. My doc explains the foundational reason: once the choice
was made to have None represent the "empty" case of Optional[T], it entered
the family of types that you would expect to have deterministic behavior.
And as the hashing function is intrinsic to types in Python, hashing is
included in that.


> Even if we grant None a constant hash, that still does not guarantee
> consistent set order across runs. At best, we might get such consistent
> order as an undocumented and changeable implementation detail, until we
> change the implementation of hashing, or of sets, or of something
> seemingly unrelated like address randomisation.

ASLR will not cause any trouble so long as I keep objects with identity
based hashing out of my sets. Or at least, any sets I later iterate on and
run imperative code with side-effects under the loop.
The possibility of disabling ASLR is not a reason to dismiss this change.
No one guarantees a user of Python is in a position to make infosec related
decisions on the computer system they are working on.

Regarding the possibilities of hashes changing for the worse (in this
regard) - sure. Anything is possible.

Regarding sets - the set implementation is deterministic, and always has
been (for decades).
Yes, in theory it is possible that a new version or implementation of
Python will make sets non-deterministic - shuffle themselves in the
background independently from their history of operations, or what have
you, and then my change loses all of its value.
Like I said, anything is possible. But I think these last two points are
essentially FUD.

I made my proposal because I believe the FUD scenarios are highly
unlikely. (And even then, at worst we end up with a "practically useless"
behavior on None, which can be freely reverted along with such other
changes anyway.)




[Python-Dev] Re: A proposal to modify `None` so that it hashes to a constant

2022-11-28 Thread Guido van Rossum
[Oscar Benjamin]
> (If you think that there might be a
> performance penalty then you haven't understood the suggestion!)

Then I don't understand the question, and I will refrain from participating
further in this discussion.

-- 
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/VLJA4NBA44JVFUIIH4UNDLWF5NKFGWWA/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: A proposal to modify `None` so that it hashes to a constant

2022-11-28 Thread Steven D'Aprano
On Tue, Nov 29, 2022 at 02:07:34AM +, Oscar Benjamin wrote:

> Let's split this into two separate questions:

Let's not. Your first question about non-deterministic set order being 
"innately good" is a straw man: as we've already discussed, set order is 
not non-deterministic (except in the informal sense of "hard to 
predict") and I don't think anyone is arguing in favour of keeping set 
order unpredictable even if there are faster, more compact, simpler 
implementations which preserve order.

Talking about determinism is just muddying the waters. Sets are 
deterministic: their order is determined by the implementation, the set 
history, and potentially any environmental sources of entropy used in 
address randomisation. Deterministic does not mean predictable.

If we want to get this discussion onto a more useful path, we should 
start with a security question:

The hash of None changes because of address randomisation. Address 
randomisation is enabled as a security measure. Are there any security 
consequences of giving None a constant hash?

I can't see any, but then I couldn't see the security consequences of 
predictable string hashes until they were pointed out to me. So it would 
be really good to have some security experts comment on whether this is 
safe or not.


> why are we even asking about "set order" rather than the
> benefits of determinism in general?

Because this entire discussion is motivated by the OP who wants 
consistent set order across multiple runs of his Python application. 
That's what he needs; having None hash to a constant value is just a 
means to that end.

His sets contain objects whose hashes depend on `Optional[int]`, which 
means sometimes they include None, and when he runs under an interpreter 
built with address randomisation, the hash of None can change.

Even if we grant None a constant hash, that still does not guarantee 
consistent set order across runs. At best, we might get such consistent 
order as an undocumented and changeable implementation detail, until we 
change the implementation of hashing, or of sets, or of something 
seemingly unrelated like address randomisation.


-- 
Steve
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/UL2RZQLAAWK6ZUQPHA3RRN22TPCE7PLL/


[Python-Dev] Re: A proposal to modify `None` so that it hashes to a constant

2022-11-28 Thread Chris Angelico
On Tue, 29 Nov 2022 at 13:12, Oscar Benjamin wrote:
> As for point 2. the fact that sets are currently non-deterministic is
> actually a relatively new thing in Python. Before hash-randomisation
> set and dict order *was* deterministic but with an arbitrary order.
> That was only changed because of a supposed security issue with hash
> collisions. Prior to that it was well understood that determinism was
> beneficial (honestly I don't understand why I have to state this point
> explicitly: determinism is almost always best in our context).

To clarify: The hash collision attack is a very real one, but specific
to dictionaries of string keys, since there are quite a few ways for
an attacker to send a string that gets automatically parsed into such
a dictionary (eg web app frameworks where the request parameters are
made available as a dictionary). But since that attack surface is *so*
specific, randomization of non-string hashes is unimportant.

ChrisA
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/PIUODXYX4ZYXHGKONYCRQKOGDYOAGDEE/


[Python-Dev] Re: A proposal to modify `None` so that it hashes to a constant

2022-11-28 Thread Oscar Benjamin
On Tue, 29 Nov 2022 at 01:33, Steven D'Aprano wrote:
>
> On Mon, Nov 28, 2022 at 11:13:34PM +, Oscar Benjamin wrote:
> > On Mon, 28 Nov 2022 at 22:56, Brett Cannon wrote:
>
> As I understand it, we could make sets ordered, but only at the cost of
> space (much more memory) or time (slower) or both.
>
> I am sure that Guido is correct that **if** somebody comes up with a
> fast, efficient ordered set implementation that doesn't perform worse
> than the current implementation, we will happily swap to giving sets a
> predictable order, as we did with dicts. (Practicality beats purity --
> even if sets are *philosophically* unordered, preserving input order is
> too useful to give up unless we gain something in return.)

Let's split this into two separate questions:

1. Is it *innately* good that set order is non-deterministic?
2. Are there some other reasons why it is good to choose a model that
implies *non-deterministic* set order?

The answer to 1. is emphatically NO. In fact the question itself is
badly posed: why are we even asking about "set order" rather than the
benefits of determinism in general? If I want my code to be
deterministic then that's just something that I want regardless of
whether sets, dicts, floats etc are involved.

As for point 2. the fact that sets are currently non-deterministic is
actually a relatively new thing in Python. Before hash-randomisation
set and dict order *was* deterministic but with an arbitrary order.
That was only changed because of a supposed security issue with hash
collisions. Prior to that it was well understood that determinism was
beneficial (honestly I don't understand why I have to state this point
explicitly: determinism is almost always best in our context).

Please everyone don't confuse arbitrary order, implementation defined
order and non-deterministic order. There is no reason why sets in
Python need to have a *non-deterministic* order or at least why there
shouldn't be a way to control that. There is no performance penalty in
making the order *deterministic*. (If you think that there might be a
performance penalty then you haven't understood the suggestion!)

> > It would be useful to have a straight-forward way to sort a set into a
> > deterministic ordering but no such feature exists after the Py3K
> > changes (sorted used to do this in Python 2.x).
>
> `sorted()` works fine on homogeneous sets. It is only heterogeneous sets
> that are a problem, and in practice, that usually means None mixed in
> with some other type.

That is of course precisely the context for this thread!

--
Oscar
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/YGXJBLQOQBZL5ZTEPJ2V2B3KUCXDXSK2/


[Python-Dev] Re: A proposal to modify `None` so that it hashes to a constant

2022-11-28 Thread Steven D'Aprano
On Tue, Nov 29, 2022 at 01:34:54PM +1300, Greg Ewing wrote:

> I got the impression that there were some internal language reasons
> to want stable dicts, e.g. so that the class dict passed to __prepare__
> preserves the order in which names are assigned in the class body. Are
> there any such use cases for stable sets?

Some people wanted order preserving kwargs, I think for web frameworks. 
There was even a discussion for a while about using OrderedDict for 
kwargs and leaving dicts unordered.

For me, the biggest advantage of preserving input order in dicts is that 
it is now easier to write doctests using the dict's repr. It would be 
nice to be able to do the same for sets, but not nice enough to justify 
making them bigger or slower.


-- 
Steve
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/5HWYYKDJLGDZT5IZIXM42EWF2WPFXKBJ/


[Python-Dev] Re: A proposal to modify `None` so that it hashes to a constant

2022-11-28 Thread Matthias Görgens
For what it's worth, as a user of the language I would like sets to behave
as much as possible as-if they were basically dicts that map all elements
to `()`.  That way I'd have to keep one less mental model in my head.

I deliberately say 'as-if' because when I'm a user of the language, I don't
care how it's implemented.  (Just like I don't have to care as a user that
we have (at least) two different ways dicts are represented internally.)
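
A minimal sketch of the "as-if" mental model described above, added for illustration. It models a set as a dict that maps every element to `()`; the function name `ordered_unique` is hypothetical. Because dicts preserve insertion order, iteration order here depends only on the order of insertion, not on any hash value.

```python
def ordered_unique(items):
    """Model a set as a dict mapping each element to (), keeping first-seen order."""
    return dict.fromkeys(items, ())


seen = ordered_unique([3, None, 1, 3, 2, None])

# Membership tests work as they do for a set.
has_none = None in seen

# Iteration follows insertion order, duplicates dropped.
order = list(seen)
print(order)
```

This is only a model of the behavior, not a claim about how sets are or should be implemented.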
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/WB5VJHTCEPDLVQV46YDXO3TRBDXJCK5S/


[Python-Dev] Re: A proposal to modify `None` so that it hashes to a constant

2022-11-28 Thread Steven D'Aprano
On Mon, Nov 28, 2022 at 11:13:34PM +, Oscar Benjamin wrote:
> On Mon, 28 Nov 2022 at 22:56, Brett Cannon wrote:

> > That's actually by design. Sets are not meant to be deterministic 
> > conceptually as they are essentially a bag of stuff. If you want 
> > deterministic ordering you should convert it to a list and sort the 
> > list.
> 
> What does "sets are not meant to be deterministic" even mean?

I'm not Brett, so I'm not answering for him, but many people (sometimes 
including me) misuse "not deterministic" to refer to set order and the 
old dict order. Of course set order is deterministic, it is determined 
by the combination of the set implementation, the hashing algorithm, and 
the history of the set -- all the items that have ever appeared in the 
set (both those removed and those that remain).
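
A small sketch of the "history" point, added for illustration. It builds three sets that compare equal but were constructed differently. Whether their iteration orders actually differ is an implementation detail, so the code prints the orders rather than asserting anything about them. The helper name `build` is hypothetical.

```python
def build(insert_order, remove=()):
    """Build a set by adding items one at a time, then discarding some."""
    result = set()
    for item in insert_order:
        result.add(item)
    for item in remove:
        result.discard(item)
    return result


forward = build([0, 8, 16])
backward = build([16, 8, 0])
# Same final contents, but many other items were present along the way.
churned = build(range(100), remove=[n for n in range(100) if n not in (0, 8, 16)])

# The three sets are equal as sets...
print(forward == backward == churned)
# ...but their iteration order is allowed to differ (implementation detail).
print(list(forward), list(backward), list(churned))
```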

Set order is deterministic in the same way that roulette wheels are 
deterministic.

We don't have a good term for this: the items don't appear in random 
order. But it is not predictable either, except in very special cases. 
"Arbitrary" is not right either, since that implies we can easily choose 
whatever set order we want.

I think that physicists call this "deterministic chaos"

https://www.quora.com/Theoretical-Physics-What-is-deterministic-but-unpredictable

(sorry for the quora link) so I guess we might say that set iteration 
order is "deterministically chaotic" if you want to be precise, but life 
is too short for that level of pedantry (and that's coming from me, a 
Grade A Pedant *wink*) so people often describe it as random, arbitrary, 
or non-deterministic when it's none of those things :-).

Maybe we could call set order "pseudo-random".

Getting back to the design part, I think that what Brett is trying to 
get across is not that the chaotic set order is in and of itself a 
requirement, but that given the other requirements, chaotic set order is 
currently considered a necessary condition.

As I understand it, we could make sets ordered, but only at the cost of 
space (much more memory) or time (slower) or both.

I am sure that Guido is correct that **if** somebody comes up with a 
fast, efficient ordered set implementation that doesn't perform worse 
than the current implementation, we will happily swap to giving sets a 
predictable order, as we did with dicts. (Practicality beats purity -- 
even if sets are *philosophically* unordered, preserving input order is 
too useful to give up unless we gain something in return.)

But I don't think it is fair or kind to call Brett's argument FUD. At 
the very least it is an uncharitable interpretation of Brett's position.

> It would be useful to have a straight-forward way to sort a set into a
> deterministic ordering but no such feature exists after the Py3K
> changes (sorted used to do this in Python 2.x).

`sorted()` works fine on homogeneous sets. It is only heterogeneous sets 
that are a problem, and in practice, that usually means None mixed in 
with some other type.
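
One possible workaround for the None-mixed-in case, sketched here for illustration: pass `sorted()` a key function that places None before everything else. The name `none_first` is hypothetical, and the sketch assumes the non-None elements are mutually comparable.

```python
def none_first(value):
    """Sort key: None sorts before every other value; the rest keep natural order."""
    # The first tuple element separates None (False) from real values (True);
    # the second is only compared between non-None values.
    return (value is not None, 0 if value is None else value)


mixed = {3, None, 1, 2}
ordered = sorted(mixed, key=none_first)
print(ordered)
```

The result does not depend on the set's iteration order, so it is the same on every run.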


-- 
Steve
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/UAVNSMS3XFAQQ6ZRD27IXPTWZFCHP6M4/


[Python-Dev] Re: A proposal to modify `None` so that it hashes to a constant

2022-11-28 Thread Guido van Rossum
Nah, `__prepare__` very much predates stable dicts and that problem was
solved differently.

On Mon, Nov 28, 2022 at 4:46 PM Greg Ewing wrote:

> On 29/11/22 12:51 pm, Guido van Rossum wrote:
> > "Sets weren't meant to be deterministic" sounds like a remnant of
> > the old philosophy, where we said the same about dicts -- until they
> > became deterministic without slowing down, and then everybody loved it.
>
> I got the impression that there were some internal language reasons
> to want stable dicts, e.g. so that the class dict passed to __prepare__
> preserves the order in which names are assigned in the class body. Are
> there any such use cases for stable sets?
>
> --
> Greg


-- 
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*

___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/O7V5XLTAFZZXNTUXT5LD7C75XNGQZFIL/


[Python-Dev] Re: A proposal to modify `None` so that it hashes to a constant

2022-11-28 Thread Greg Ewing

On 29/11/22 12:51 pm, Guido van Rossum wrote:
> "Sets weren't meant to be deterministic" sounds like a remnant of
> the old philosophy, where we said the same about dicts -- until they
> became deterministic without slowing down, and then everybody loved it.


I got the impression that there were some internal language reasons
to want stable dicts, e.g. so that the class dict passed to __prepare__
preserves the order in which names are assigned in the class body. Are
there any such use cases for stable sets?

--
Greg
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/3TEOQMV2UXOKWMHVYA63JLPLAZ2TNX55/


[Python-Dev] Re: A proposal to modify `None` so that it hashes to a constant

2022-11-28 Thread Guido van Rossum
To stir up some more fire, I would personally be fine with sets having the
same ordering guarantees as dicts, *IF* it can be done without performance
degradations. So far nobody has come up with a way to ensure that. "Sets
weren't meant to be deterministic" sounds like a remnant of the old
philosophy, where we said the same about dicts -- until they became
deterministic without slowing down, and then everybody loved it.

Regarding hash(None), I think that there is something to be said for making
that stable, and the arguments against it feel like rationalizations for
FUD. We've survived way larger controversies. I also note that hash(()) is
apparently stable.

On Mon, Nov 28, 2022 at 3:16 PM Oscar Benjamin wrote:

> On Mon, 28 Nov 2022 at 22:56, Brett Cannon wrote:
> >
> > On Sun, Nov 27, 2022 at 11:36 AM Yoni Lavi wrote:
> >>
> >> All it takes is for your program to compute a set somewhere with
> >> affected keys, and iterate on it - and determinism is lost.
> >
> > That's actually by design. Sets are not meant to be deterministic
> > conceptually as they are essentially a bag of stuff. If you want
> > deterministic ordering you should convert it to a list and sort the list.
>
> What does "sets are not meant to be deterministic" even mean?
>
> Mathematically speaking sets are not meant to be ordered in any
> particular way but a computational implementation has to have some
> order and there is no reason to prefer non-deterministic order in
> general. Actually determinism in a computational context is usually a
> very valuable feature. I find it hard to see why non-determinism is
> "by design".
>
> Also it isn't usually possible to sort a list containing None:
>
> In [9]: sorted([None, 1, 2])
> ---------------------------------------------------------------------------
> TypeError                                 Traceback (most recent call last)
> ----> 1 sorted([None, 1, 2])
>
> TypeError: '<' not supported between instances of 'int' and 'NoneType'
>
> It would be useful to have a straight-forward way to sort a set into a
> deterministic ordering but no such feature exists after the Py3K
> changes (sorted used to do this in Python 2.x).
>
> --
> Oscar


-- 
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*

___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/IGVDQ73A4PTUF42AAEA4AXS45ORUP6PB/


[Python-Dev] Re: A proposal to modify `None` so that it hashes to a constant

2022-11-28 Thread Oscar Benjamin
On Mon, 28 Nov 2022 at 22:56, Brett Cannon wrote:
>
> On Sun, Nov 27, 2022 at 11:36 AM Yoni Lavi wrote:
>>
>> All it takes is for your program to compute a set somewhere with affected 
>> keys, and iterate on it - and determinism is lost.
>
> That's actually by design. Sets are not meant to be deterministic 
> conceptually as they are essentially a bag of stuff. If you want 
> deterministic ordering you should convert it to a list and sort the list.

What does "sets are not meant to be deterministic" even mean?

Mathematically speaking sets are not meant to be ordered in any
particular way but a computational implementation has to have some
order and there is no reason to prefer non-deterministic order in
general. Actually determinism in a computational context is usually a
very valuable feature. I find it hard to see why non-determinism is
"by design".

Also it isn't usually possible to sort a list containing None:

In [9]: sorted([None, 1, 2])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
----> 1 sorted([None, 1, 2])

TypeError: '<' not supported between instances of 'int' and 'NoneType'

It would be useful to have a straight-forward way to sort a set into a
deterministic ordering but no such feature exists after the Py3K
changes (sorted used to do this in Python 2.x).

--
Oscar
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ILP2ZKVXQIF2ONOWRJCMLNHI3LFUFBD3/


[Python-Dev] Re: A proposal to modify `None` so that it hashes to a constant

2022-11-28 Thread Chris Angelico
On Tue, 29 Nov 2022 at 09:51, Brett Cannon wrote:
> ... we worked hard to stop people from relying on consistent 
> hashing/iteration from random-access data structures like dict and set.
>

Say what? Who's been working hard to stop people from relying on
consistent iteration order for a dict?

ChrisA
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/WE5D4ZVU2ZUBDW7VX27PJBPWJDWLONED/


[Python-Dev] Re: A proposal to modify `None` so that it hashes to a constant

2022-11-28 Thread Brett Cannon
On Sun, Nov 27, 2022 at 11:36 AM Yoni Lavi wrote:

> I wrote a doc stating my case here:
>
> https://docs.google.com/document/d/1et5x5HckTJhUQsz2lcC1avQrgDufXFnHMin7GlI5XPI/edit#
>
> Briefly,
>
> 1. The main motivation for it is to allow users to get a predictable
> result on a given input (for programs that are doing pure compute, in
> domains like operations research / compilation), any time they run their
> program. Having stable repro is important for debugging. Notebooks with
> statistical analysis are another similar case where this is needed: you
> might want other people to run your notebook and get the same result you
> did.
>

But the hash of an object is not guaranteed to be stable by the language,
so I would argue that someone who needs stable results should convert
random-access data structures to ones that are consistent when necessary
(e.g. sorted lists).


>
> 2. The reason the hash non-determinism of None matters in practice is that
> it can infect commonly used mapping key types, such as frozen dataclasses
> containing `Optional[int]` fields.
>

I don't see why the hashing within a dict needs to be consistent as that's
not a guarantee we make with Python.


>
> 3. Non-determinism emerging from other value types like `str` can be
> disabled by the user using `PYTHONHASHSEED`, but there's no such protection
> against `None`.
>

If I remember correctly, PYTHONHASHSEED was added to help folks migrate
when we added randomness to hashing as they had accidentally come to expect
a consistent iteration order on dictionary keys. I wouldn't take its
existence to suggest that PYTHONHASHSEED is meant to make **all** hashing
consistent (e.g. people who implement their own __hash__ don't have to
follow that expectation).
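
A hedged sketch of what `PYTHONHASHSEED` does and does not control, added for illustration. It starts child interpreters with a fixed seed and compares the hashes they report. The helper name `child_hash` is hypothetical. Fixing the seed makes `str` hashes repeatable across runs. Whether `hash(None)` is also repeatable depends on the interpreter version and build, so the sketch only prints those values.

```python
import os
import subprocess
import sys


def child_hash(expr: str, seed: str) -> int:
    """Evaluate hash(<expr>) in a fresh interpreter started with the given seed."""
    env = dict(os.environ, PYTHONHASHSEED=seed)
    result = subprocess.run(
        [sys.executable, "-c", f"print(hash({expr}))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)


# With the same seed, two separate runs agree on a str hash.
str_same_seed = child_hash("'spam'", "1") == child_hash("'spam'", "1")
print(str_same_seed)

# The seed only governs hash randomization. On interpreters where
# hash(None) is derived from the object's address, these may differ.
none_hashes = [child_hash("None", "1") for _ in range(2)]
print(none_hashes)
```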


>
> All it takes is for your program to compute a set somewhere with affected
> keys, and iterate on it - and determinism is lost.
>

That's actually by design. Sets are not meant to be deterministic
conceptually as they are essentially a bag of stuff. If you want
deterministic ordering you should convert it to a list and sort the list.


>
> The need to modify None itself is caused by two factors
> - `Optional` being implemented effectively as `T | None` in Python as a
> strongly established practice
> - The fact that `__hash__` is an intrinsic property of a type in Python,
> the hashing function cannot be externally supplied to its builtin container
> types. So we have to modify the type None itself, rather than write some
> alternative hasher that we could use if we care about deterministic
> behavior across runs.
>
> This was debated at length over the forum and in discord.
> I also posted a PR for it, and it was closed, see:
>
> https://github.com/python/cpython/issues/99540
> https://github.com/python/cpython/pull/99541
>
> Asking for opinions, and to re-open the PR, provided there is enough
> support for such a change to take place.
>

I personally agree with the arguments made in the issue, so I'm afraid I
don't support making the change as we worked hard to stop people from
relying on consistent hashing/iteration from random-access data structures
like dict and set.

-Brett


>
> ___
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/KUH4HZYKPBO57A73QKCGU4GD2JNY3VMH/
>
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/25XFRWUOUREKKY6GUIOQIIRFBNI34MNZ/