[issue39974] A race condition with GIL releasing exists in stringlib_bytes_join

2020-03-19 Thread Bruce Merry


Bruce Merry  added the comment:

+tzickel I'd suggest reading the discussion in issue 36051, and maybe raising a 
new issue about it if you still have concerns. In short, dropping the GIL in 
more bytes.join cases wouldn't necessarily be wrong, but it might break code 
that made the assumption that bytes.join is atomic even though that's never 
been claimed.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39974] A race condition with GIL releasing exists in stringlib_bytes_join

2020-03-18 Thread tzickel


tzickel  added the comment:

Regarding getting the buffer and releasing the GIL, if it's wrong, why not fix 
other places in the code that do it, like:

https://github.com/python/cpython/blob/611836a69a7a98bb106b4d315ed76a1e17266f4f/Modules/posixmodule.c#L9619

The GIL is released, the syscall might be blocking, and iov contains pointers 
to buffers ?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39974] A race condition with GIL releasing exists in stringlib_bytes_join

2020-03-16 Thread Inada Naoki


Change by Inada Naoki :


--
stage:  -> resolved
status: open -> closed

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39974] A race condition with GIL releasing exists in stringlib_bytes_join

2020-03-16 Thread tzickel


tzickel  added the comment:

My mistake...

--
resolution:  -> not a bug

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39974] A race condition with GIL releasing exists in stringlib_bytes_join

2020-03-16 Thread Ma Lin


Ma Lin  added the comment:

I also planned to review this commit at some moment, I feel a bit unsteady 
about it.

If an optimization needs to be fine-tuned, and may introduces some pitfalls for 
future code maintenance, IMHO it is best to avoid doing this kind of 
optimization.

--
nosy: +Ma Lin

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39974] A race condition with GIL releasing exists in stringlib_bytes_join

2020-03-16 Thread Bruce Merry


Bruce Merry  added the comment:

> static_buffers is not a static variable. It is auto local variable.
> So I think other thread don't hijack it.

Oh yes, quite right. I should have looked closer at the code first before 
commenting. I think this can be closed as not-a-bug, unless +tzickel has 
example code that gives the wrong output?

> perhaps add an if to check if the backing object is really mutable ? 
> (Py_buffer.readonly)

It's not just the buffer data being mutable that's an issue, it's the owning 
object. It's possible for an object to expose a read-only buffer, but also 
allow the buffer (including its size or address) to be mutated through its own 
API.

> Also, semi related, (dunno where to discuss it), would a good .join() 
> optimization be to add an optional length parameter, like .join(iterable, 
> length=10)

You could always open a separate bug for it, but I can't see it catching on 
given that one needs to modify one's code for it.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39974] A race condition with GIL releasing exists in stringlib_bytes_join

2020-03-16 Thread Inada Naoki


Inada Naoki  added the comment:

>
>
>
> perhaps add an if to check if the backing object is really mutable ?
> (Py_buffer.readonly)
>
>

Py_buffer.readonly doesn't mean immutable.  You can create read only buffer
from bytearray too.

Current logic uses PyBytes_CheckExact.  It is safe and enough for most
cases.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39974] A race condition with GIL releasing exists in stringlib_bytes_join

2020-03-16 Thread Inada Naoki


Inada Naoki  added the comment:

static_buffers is not a static variable. It is auto local variable.
So I think other thread don't hijack it.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39974] A race condition with GIL releasing exists in stringlib_bytes_join

2020-03-16 Thread tzickel


tzickel  added the comment:

Also, semi related, (dunno where to discuss it), would a good .join() 
optimization be to add an optional length parameter, like .join(iterable, 
length=10), and when running in that code-path, it would skip all the calls to 
(PySequence_Fast which converts no list to list), and all the pre calculation 
of length to calculate allocation size, and instead directly start copying from 
input until length is done, and if the iterable didn't have enough length to 
fill up, only then throw an exception ?

There are places where you know how much information you expect to be .joining 
(or you want to join just a part of it) ?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39974] A race condition with GIL releasing exists in stringlib_bytes_join

2020-03-16 Thread tzickel


tzickel  added the comment:

Also, in line:

https://github.com/python/cpython/blob/d07d9f4c43bc85a77021bcc7d77643f8ebb605cf/Objects/stringlib/join.h#L85

perhaps add an if to check if the backing object is really mutable ? 
(Py_buffer.readonly)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39974] A race condition with GIL releasing exists in stringlib_bytes_join

2020-03-16 Thread Bruce Merry


Bruce Merry  added the comment:

Good catch! I'll take a look this week to see what makes sense for the use case 
for which I originally proposed this optimisation.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39974] A race condition with GIL releasing exists in stringlib_bytes_join

2020-03-16 Thread tzickel


Change by tzickel :


--
nosy: +bmerry, inada.naoki

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue39974] A race condition with GIL releasing exists in stringlib_bytes_join

2020-03-16 Thread tzickel


New submission from tzickel :

bpo 36051 added optimization to release GIL on certain conditions of bytes 
joining, but it has missed a critical path.

If the number of items joining is less or equal to NB_STATIC_BUFFERS (10) than 
static_buffers will be used to hold the buffers.

https://github.com/python/cpython/blob/5b66ec166b81c8a77286da2c0d17be3579c3069a/Objects/stringlib/join.h#L54

But then the decision to release the GIL or not (drop_gil) does not take this 
into consideration, and the GIL might be released and then another thread is 
free to do the same code path, and hijack the static_buffers for it's own 
usage, causing a race condition.

A decision should be made if it's worth for the optimization to not use the 
static buffers in this case (although it's an early part of the code...) or not 
drop the GIL anyhow if it's static buffers (a thing which might make this 
optimization not worth it, since based on length of data to join, and not 
number of items to join).

--
messages: 364288
nosy: tzickel
priority: normal
severity: normal
status: open
title: A race condition with GIL releasing exists in stringlib_bytes_join
versions: Python 3.9

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com