Re: Non-deterministic set ordering

2022-05-15 Thread MRAB

On 2022-05-16 04:20, Rob Cliffe via Python-list wrote:



On 16/05/2022 04:13, Dan Stromberg wrote:


On Sun, May 15, 2022 at 8:01 PM Rob Cliffe via Python-list 
 wrote:


I was shocked to discover that when repeatedly running the following
program (condensed from a "real" program) under Python 3.8.3

for p in { ('x','y'), ('y','x') }:
 print(p)

the output was sometimes

('y', 'x')
('x', 'y')

and sometimes

('x', 'y')
('y', 'x')

Can anyone explain why running identical code should result in
traversing a set in a different order?


Sets are defined as unordered so that they can be hashed internally to 
give O(1) operations for many tasks.


It wouldn't be unreasonable for sets to use a fixed-by-arbitrary 
ordering for a given group of set operations, but being unpredictable 
deters developers from mistakenly assuming they are ordered.


If you need order, you should use a tuple, list, or something like 
https://grantjenks.com/docs/sortedcontainers/sortedset.html

Thanks, I can work round this behaviour.
But I'm curious: where does the variability come from?  Is it deliberate
(as your answer seems to imply)?  AFAIK the same code within the *same
run* of a program does produce identical results.


Basically, Python uses hash randomisation in order to protect it against 
denial-of-service attacks. (Search for "PYTHONHASHSEED" in the docs.)


It also applied to dicts (the code for sets was based on that for 
dicts), but dicts now remember their insertion order.

--
https://mail.python.org/mailman/listinfo/python-list


Re: Non-deterministic set ordering

2022-05-15 Thread Rob Cliffe via Python-list

Thanks, Paul.  Question answered!
Rob Cliffe

On 16/05/2022 04:36, Paul Bryan wrote:

This may explain it:
https://stackoverflow.com/questions/27522626/hash-function-in-python-3-3-returns-different-results-between-sessions

On Mon, 2022-05-16 at 04:20 +0100, Rob Cliffe via Python-list wrote:



On 16/05/2022 04:13, Dan Stromberg wrote:


On Sun, May 15, 2022 at 8:01 PM Rob Cliffe via Python-list
 wrote:

    I was shocked to discover that when repeatedly running the following
    program (condensed from a "real" program) under Python 3.8.3

    for p in { ('x','y'), ('y','x') }:
     print(p)

    the output was sometimes

    ('y', 'x')
    ('x', 'y')

    and sometimes

    ('x', 'y')
    ('y', 'x')

    Can anyone explain why running identical code should result in
    traversing a set in a different order?


Sets are defined as unordered so that they can be hashed internally to
give O(1) operations for many tasks.

It wouldn't be unreasonable for sets to use a fixed-by-arbitrary
ordering for a given group of set operations, but being unpredictable
deters developers from mistakenly assuming they are ordered.

If you need order, you should use a tuple, list, or something like
https://grantjenks.com/docs/sortedcontainers/sortedset.html

Thanks, I can work round this behaviour.
But I'm curious: where does the variability come from?  Is it deliberate
(as your answer seems to imply)?  AFAIK the same code within the *same
run* of a program does produce identical results.
Best wishes
Rob Cliffe



--
https://mail.python.org/mailman/listinfo/python-list


Re: Non-deterministic set ordering

2022-05-15 Thread Paul Bryan
This may explain it:
https://stackoverflow.com/questions/27522626/hash-function-in-python-3-3-returns-different-results-between-sessions

On Mon, 2022-05-16 at 04:20 +0100, Rob Cliffe via Python-list wrote:
> 
> 
> On 16/05/2022 04:13, Dan Stromberg wrote:
> > 
> > On Sun, May 15, 2022 at 8:01 PM Rob Cliffe via Python-list 
> >  wrote:
> > 
> >     I was shocked to discover that when repeatedly running the
> > following
> >     program (condensed from a "real" program) under Python 3.8.3
> > 
> >     for p in { ('x','y'), ('y','x') }:
> >      print(p)
> > 
> >     the output was sometimes
> > 
> >     ('y', 'x')
> >     ('x', 'y')
> > 
> >     and sometimes
> > 
> >     ('x', 'y')
> >     ('y', 'x')
> > 
> >     Can anyone explain why running identical code should result in
> >     traversing a set in a different order?
> > 
> > 
> > Sets are defined as unordered so that they can be hashed internally
> > to 
> > give O(1) operations for many tasks.
> > 
> > It wouldn't be unreasonable for sets to use a fixed-by-arbitrary 
> > ordering for a given group of set operations, but being
> > unpredictable 
> > deters developers from mistakenly assuming they are ordered.
> > 
> > If you need order, you should use a tuple, list, or something like 
> > https://grantjenks.com/docs/sortedcontainers/sortedset.html
> Thanks, I can work round this behaviour.
> But I'm curious: where does the variability come from?  Is it
> deliberate 
> (as your answer seems to imply)?  AFAIK the same code within the
> *same 
> run* of a program does produce identical results.
> Best wishes
> Rob Cliffe

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Non-deterministic set ordering

2022-05-15 Thread Rob Cliffe via Python-list



On 16/05/2022 04:13, Dan Stromberg wrote:


On Sun, May 15, 2022 at 8:01 PM Rob Cliffe via Python-list 
 wrote:


I was shocked to discover that when repeatedly running the following
program (condensed from a "real" program) under Python 3.8.3

for p in { ('x','y'), ('y','x') }:
 print(p)

the output was sometimes

('y', 'x')
('x', 'y')

and sometimes

('x', 'y')
('y', 'x')

Can anyone explain why running identical code should result in
traversing a set in a different order?


Sets are defined as unordered so that they can be hashed internally to 
give O(1) operations for many tasks.


It wouldn't be unreasonable for sets to use a fixed-by-arbitrary 
ordering for a given group of set operations, but being unpredictable 
deters developers from mistakenly assuming they are ordered.


If you need order, you should use a tuple, list, or something like 
https://grantjenks.com/docs/sortedcontainers/sortedset.html

Thanks, I can work round this behaviour.
But I'm curious: where does the variability come from?  Is it deliberate 
(as your answer seems to imply)?  AFAIK the same code within the *same 
run* of a program does produce identical results.

Best wishes
Rob Cliffe
--
https://mail.python.org/mailman/listinfo/python-list


Re: Non-deterministic set ordering

2022-05-15 Thread Dan Stromberg
On Sun, May 15, 2022 at 8:01 PM Rob Cliffe via Python-list <
python-list@python.org> wrote:

> I was shocked to discover that when repeatedly running the following
> program (condensed from a "real" program) under Python 3.8.3
>
> for p in { ('x','y'), ('y','x') }:
>  print(p)
>
> the output was sometimes
>
> ('y', 'x')
> ('x', 'y')
>
> and sometimes
>
> ('x', 'y')
> ('y', 'x')
>
> Can anyone explain why running identical code should result in
> traversing a set in a different order?
>

Sets are defined as unordered so that they can be hashed internally to give
O(1) operations for many tasks.

It wouldn't be unreasonable for sets to use a fixed-by-arbitrary ordering
for a given group of set operations, but being unpredictable deters
developers from mistakenly assuming they are ordered.

If you need order, you should use a tuple, list, or something like
https://grantjenks.com/docs/sortedcontainers/sortedset.html
-- 
https://mail.python.org/mailman/listinfo/python-list


Non-deterministic set ordering

2022-05-15 Thread Rob Cliffe via Python-list
I was shocked to discover that when repeatedly running the following 
program (condensed from a "real" program) under Python 3.8.3


for p in { ('x','y'), ('y','x') }:
    print(p)

the output was sometimes

('y', 'x')
('x', 'y')

and sometimes

('x', 'y')
('y', 'x')

Can anyone explain why running identical code should result in 
traversing a set in a different order?

Thanks
Rob Cliffe
--
https://mail.python.org/mailman/listinfo/python-list