[issue45435] delete misleading faq entry about atomic operations

2021-10-12 Thread Ben


Ben  added the comment:

The problem with the FAQs is that it's over-simplifying things to the point 
where it can sometimes mislead.

Notably, it says the GIL protects these operations; but as Antoine points out,  
many operations on datatypes drop back into Python (including potential decrefs)

Concerns about non-atomic evaluation of the composition of operations in these 
statements is mostly due to the way the FAQ is presented,  it should be made 
clearer *which* operations it's describing to be atomic.
(Otherwise you get questions like "is x = L[x] atomic?")

graingert said the following might be useful, so:
Going through each of the points of the FAQ...

The following seem relatively straight-forward and non-controversial(?):
x = L[i]
x = L.pop()
x = y
L.append(x)
L1.extend(L2)

I'm not even sure what it *means* when it says the following:
D.keys()

The following probably have some caveats:
D[x] = y

These appear to be the suspect ones:
D1.update(D2)
L.sort()
L[i:j] = L2
x.field = y

Exploring each in more detail...

dict.keys is just a mystery to me, maybe this mattered in Python 2 but these 
are view objects now, or maybe I am missing something?

dict.__setitem__ needs clarification really, surely the actual setting of the 
item is "atomic" in that other threads will either see the dict with or without 
the item and not halfway through some resizing operation or something, but in 
doing the setting it may trigger many __eq__ calls on the other keys
(either during the resize itself, or just during probing).

The dict.update case seems like it should hold if both dicts have keys made of 
only other builtin types so that the GIL can continue to protect.  If the keys 
of either are custom objects with their own __eq__ then the "atomicity" of the 
operation is in question
as the __eq__ can happen "during" the update.
Imagine two update()s to the same dict,  if the keys have custom __eq__'s then 
the (concurrent) composition of the two may give some mix of the two 
dictionaries overlapping keys.
(Note that the __hash__ doesn't matter as it is not recomputed on an update)

For list.sort it's more subtle,
there is built-in protection to make it somewhat atomic
which means that append()s and extend()s shouldn't be lost
but concurrent threads might suddenly see the list be emptied and concurrent 
len()/L.pop() see sequentially inconsistent results.

For list.__setitem__ it's clear it's non-atomic in the case that the elements 
of the list are custom objects with their own __del__, and the FAQ does infact 
mention this case (at the bottom).

Attribute assignment is odd,  I can't see how that can be described as "atomic" 
for arbitrary objects. There is no way the FAQ really means that x and y are 
instances of `object`.

There are questions about operations that are potentially missing(?) from the 
list:
len(L)
D1.copy()
L1 += L2  (or does "extend" cover this too?)
... etc,  and other datatypes (tuples are an obvious question here)

It's not clear why the FAQ picked these exact operations out specifically.

Fundamentally this FAQ tries to be both a language definition ("You can rely on 
these operations being atomic") but also somewhat of an 
implementation-dependent description ("this is what is true in CPython").
Perhaps the best long-term solution would be to remove this "FAQ" and either 
move more detailed discussion about atomicity guarantees for various operations 
to the actual docs for the built-in data structures or to relax the guarantees 
the language gives -- asking people to use mutexes/async libraries more and 
only guaranteeing enough to make those cases work.

--
nosy: +bjs

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue45435] delete misleading faq entry about atomic operations

2021-10-12 Thread Jeff Allen


Jeff Allen  added the comment:

Thomas wrote:
> it's as part of this discussion in 
> https://mail.python.org/archives/list/python-...@python.org/thread/ABR2L6BENNA6UPSPKV474HCS4LWT26GY/#IAOCDDCJ653NBED3G2J2YBWD7HHPFHT6
>  and others in #python-dev 

That's where I noticed it, but it seemed the wrong place to explore this way.

Steven is right, I'm over-stating the case. And although valid that this is 
CPython specific, it's well sign-posted and I'm just being thin-skinned.

Serhiy writes:
> sort() is atomic, even if GIL is released during executing custom __lt__. It 
> is guaranteed that no operations on the list in other threads can affect the 
> result of sort().

The strategy noted here: 
https://github.com/python/cpython/blob/2d21612f0dd84bf6d0ce35bcfcc9f0e1a41c202d/Objects/listobject.c#L2261-L2265
does guarantee that, which I hadn't noticed. What if during the release of the 
GIL, another thread appends to L? In my simple experiment I get a ValueError 
and the modifications are lost. I think that is not thread-safe.

Serhiy also writes:

> I do not understand what non-atomic you see in x = L[i]. The value of x is 
> determined by values of L and i at the start of the operation. GIL is not 
> released during indexing L, and if it is released between indexing and 
> assignment, it does not affect the result.

and Steven:

> Does that matter though? I think that's a distinction that makes no 
difference.

> We know that another thread could change the L or the i before the 
assignment, if they are global. But once the L[i] lookup has occurred, 
it doesn't matter if they change. It's not going to affect what value 
gets bound to the x.

Fair enough. Atomicity is a bit slippery, I find. It depends where the critical 
region starts. Thinking again, it's not the assignment that's the issue ...

L is pushed
i is pushed
__getitem__ is called
x is popped

It is possible, if i and L are accessible to another thread and change after L 
is pushed, that x is given a value composed from an i and an L that never 
existed concurrently in the view of the other thread. Straining at gnats here, 
but atomicity is a strong claim.

And on the point about re-ordering and CPUs, I can't imagine re-ordering that 
effectively changes the order of byte codes. But do CPython threads run in 
separate CPUs, or is that only when we have multiple interpreters? If so, and L 
were in a hot memory location (either the variable or its content), this could 
be inconsistent between threads. Sorry, I don't know the memory coherence 
CPython has: I know I couldn't rely on it in Java.

I'm just arguing that the section gives advice that is *nearly* always right, 
which is a horrible thing to debug. I'll stop stirring.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue45435] delete misleading faq entry about atomic operations

2021-10-12 Thread Serhiy Storchaka


Serhiy Storchaka  added the comment:

sort() is atomic, even if GIL is released during executing custom __lt__. It is 
guaranteed that no operations on the list in other threads can affect the 
result of sort().

I do not understand what non-atomic you see in x = L[i]. The value of x is 
determined by values of L and i at the start of the operation. GIL is not 
released during indexing L, and if it is released between indexing and 
assignment, it does not affect the result.

The FAQ answer is specially about built-in types, it is not related to types 
with overwritten __getitem__ etc.

The most questionable examples are dict operations. But even they provide some 
kind of atomacity. But you perhaps need to know internals to understand 
limitations.

We perhaps should explicitly document what non-trivial operations are atomical 
(for example list and dict copying is atomic) and whether atomacity is the part 
of the language specification or CPython implementation detail. In many places 
in the stdlib the code relies on GIL for atomacity.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue45435] delete misleading faq entry about atomic operations

2021-10-12 Thread Steven D'Aprano


Steven D'Aprano  added the comment:

Jeff makes an excellent point about the docs failing to distinguish 
between language guarantees, implementation guarantees, and things which 
are merely true sometimes.

On the other hand, we only need document what is true *now*, not what 
may be true in some long distant future.

On Tue, Oct 12, 2021 at 07:42:07AM +, Jeff Allen wrote:

> 2. In x = L[i], the index and assignment are distinct actions (in 
> today's byte code), allowing L or i to change before x is assigned.

Does that matter though? I think that's a distinction that makes no 
difference.

We know that another thread could change the L or the i before the 
assignment, if they are global. But once the L[i] lookup has occurred, 
it doesn't matter if they change. It's not going to affect what value 
gets bound to the x.

So in a practical sense, we can say that once the lookup L[i] has 
occurred, the binding might as well be atomic. I think that's what the 
entry is trying to say. Have I missed something?

> 3. A compiler (even a CPU) is free to re-order operations and cache 
> values in unguessable ways, on the assumption of a single thread.

The CPU doesn't operate at the level of Python byte code though, and 
there are limits to what the compiler will do. It's not going to reorder 
things in ways that change the semantics of the code (that would be a 
compiler bug). Its not going to reorder this code:

x = 1
print(x)
x = 2

so that "2" gets printed. So I don't see that this objection is 
relevant.

> 4. Code written on these principals is fragile. It only takes the 
> replacement of a built-in with sub-class redefining __getitem__ (to 
> support some worthy aim elsewhere in the code) to invalidate it.

The FAQ entry could be more forceful that it is only talking about 
certain built-in types, and once you change things to another type, the 
promises may no longer hold.

But we should not hold it against the FAQs that the promises made about 
one type don't apply to other types.

> 5. sort() is not atomic if an element is of a type that overrides 
> comparison in Python. (Nor is modifying a dictionary if __hash__ or 
> __eq__ are redefined.)

Indeed, and the FAQ should be more forceful about that proviso.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue45435] delete misleading faq entry about atomic operations

2021-10-12 Thread Antoine Pitrou


Antoine Pitrou  added the comment:

I'm also surprised to learn that `L.sort()` and `D1.update(D2)` are supposed to 
be atomic. They certainly are not in the general case.

Remember, any Python code can release the GIL (because the GIL is released 
periodically in the interpreter loop). Any DECREF can also release the GIL 
(because it may trigger the execution of arbitrary destructors). This restricts 
a lot which operations can be safely considered atomic.

--
nosy: +pablogsal, pitrou, serhiy.storchaka
type:  -> behavior
versions: +Python 3.10, Python 3.11, Python 3.9

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue45435] delete misleading faq entry about atomic operations

2021-10-12 Thread Thomas Grainger


Thomas Grainger  added the comment:

it's as part of this discussion in 
https://mail.python.org/archives/list/python-...@python.org/thread/ABR2L6BENNA6UPSPKV474HCS4LWT26GY/#IAOCDDCJ653NBED3G2J2YBWD7HHPFHT6
 and others in #python-dev 

specifically 
https://github.com/python/cpython/blob/2f92e2a590f0e5d2d3093549f5af9a4a1889eb5a/Objects/dictobject.c#L2582-L2586

regarding if any of the items are builtins or not: the faq entry lists (L, L1, 
L2 are lists, D, D1, D2 are dicts, x, y are objects, i, j are ints) so I read 
that to mean x and y are user defined objects with user defined comparison and 
equality methods

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue45435] delete misleading faq entry about atomic operations

2021-10-12 Thread Jeff Allen


Jeff Allen  added the comment:

I'm interested in Thomas' reasons, but here are some of mine (as far as I 
understand things):

1. It is specific to one interpreter implemented in C, equipped with a GIL, and 
on certain assumptions about the byte code interpreter and the implementation 
of built-ins, that may not hold long-term. 

2. In x = L[i], the index and assignment are distinct actions (in today's byte 
code), allowing L or i to change before x is assigned. This applies to multiple 
other of the examples.

3. A compiler (even a CPU) is free to re-order operations and cache values in 
unguessable ways, on the assumption of a single thread.

4. Code written on these principals is fragile. It only takes the replacement 
of a built-in with sub-class redefining __getitem__ (to support some worthy aim 
elsewhere in the code) to invalidate it.

5. sort() is not atomic if an element is of a type that overrides comparison in 
Python. (Nor is modifying a dictionary if __hash__ or __eq__ are redefined.)
 

If you want retain the question, with a better answer, the last sentence is 
good: "When in doubt, use a mutex!", accompanied by "Always be in doubt."

--
nosy: +jeff.allen

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue45435] delete misleading faq entry about atomic operations

2021-10-11 Thread Steven D'Aprano


New submission from Steven D'Aprano :

Why do you say that the FAQ is misleading?

If it is misleading, it should be replaced with a more correct answer, not just 
deleted.

--
nosy: +steven.daprano

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue45435] delete misleading faq entry about atomic operations

2021-10-11 Thread Thomas Grainger


Change by Thomas Grainger :


--
keywords: +patch
pull_requests: +27181
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/28886

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue45435] delete misleading faq entry about atomic operations

2021-10-11 Thread Thomas Grainger


Change by Thomas Grainger :


--
assignee: docs@python
components: Documentation
nosy: docs@python, graingert
priority: normal
severity: normal
status: open
title: delete misleading faq entry about atomic operations

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com