[issue36051] Drop the GIL during large bytes.join operations?

2020-02-03 Thread Inada Naoki
Change by Inada Naoki: stage: patch review -> resolved; status: open -> closed

2020-02-03 Thread Inada Naoki
Inada Naoki added the comment: New changeset 869c0c99b94ff9527acc1ca060164ab3d1bdcc53 by Inada Naoki in branch 'master': bpo-36051: Fix compiler warning. (GH-18325) https://github.com/python/cpython/commit/869c0c99b94ff9527acc1ca060164ab3d1bdcc53

2020-02-03 Thread Inada Naoki
Change by Inada Naoki: pull_requests: +17698; stage: resolved -> patch review; pull_request: https://github.com/python/cpython/pull/18325

2020-01-30 Thread Skip Montanaro
Skip Montanaro added the comment: To avoid compiler warnings about 'save' possibly being used uninitialized, I think it should be initialized to NULL where it is declared, on line 21 of Objects/stringlib/join.h. -- nosy: +skip.montanaro; status: closed -> open; Added file:

2020-01-28 Thread Inada Naoki
Change by Inada Naoki: resolution: -> fixed; stage: patch review -> resolved; status: open -> closed

2020-01-28 Thread Inada Naoki
Inada Naoki added the comment: New changeset d07d9f4c43bc85a77021bcc7d77643f8ebb605cf by Bruce Merry in branch 'master': bpo-36051: Drop GIL during large bytes.join() (GH-17757) https://github.com/python/cpython/commit/d07d9f4c43bc85a77021bcc7d77643f8ebb605cf

2020-01-16 Thread STINNER Victor
Change by STINNER Victor: nosy: -vstinner

2020-01-16 Thread Bruce Merry
Bruce Merry added the comment: I think I've addressed the concerns that were raised in this bug, but let me know if I've missed any.

2020-01-05 Thread Bruce Merry
Bruce Merry added the comment: I ran the test on a Xeon machine (Skylake-XP) and it also looks like performance is only improved from 1MB up (somewhat to my surprise, given how poor single-threaded memcpy performance is on that machine). So I've updated the pull request with that threshold.
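The size check behind such a threshold can be sketched in Python. This is only an illustration of the idea, not the C code in Objects/stringlib/join.h: `GIL_THRESHOLD` and `should_drop_gil` are hypothetical names, and the 1 MiB value is simply the figure reported in this comment.

```python
# Hypothetical sketch: decide whether a join is large enough that
# releasing the GIL around the copy is likely to pay off.
GIL_THRESHOLD = 1 << 20  # 1 MiB, the threshold reported in this thread


def should_drop_gil(items, sep=b""):
    """Return True if the joined output would be at least GIL_THRESHOLD bytes."""
    items = list(items)
    if not items:
        return False
    # Output size = all item lengths plus one separator between each pair.
    total = sum(len(item) for item in items) + len(sep) * (len(items) - 1)
    return total >= GIL_THRESHOLD
```

Small joins stay on the cheap path, so they never pay the cost of releasing and re-acquiring the GIL.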

2020-01-05 Thread Bruce Merry
Bruce Merry added the comment: I've written a variant of the benchmark in which one thread does joins and the other does unrelated CPU-bound work that doesn't touch memory much. It also didn't show much benefit to thresholds below 512KB. I still want to test things on a server-class CPU,

2020-01-02 Thread Inada Naoki
Inada Naoki added the comment:
> (slowdowns because releasing/acquiring the GIL is not free, particularly when contended)
Yes, it's relatively high. We shouldn't release the GIL for only ~0.5ms. That's why 1MB and up seems like a nice threshold.
> If the threshold is too low then users can always

2020-01-02 Thread Bruce Merry
Bruce Merry added the comment: I'm realising that the benchmark makes it difficult to see what's going on because it doesn't separate overhead costs (slowdowns because releasing/acquiring the GIL is not free, particularly when contended) from cache effects (slowdowns due to parallel threads

2020-01-01 Thread Inada Naoki
Inada Naoki added the comment:
> In the single-threaded case, the benchmark seems to show that for 64K+, performance is improved by dropping the GIL (which I'm guessing must be statistical noise, since there shouldn't be anything contending for it), which is my reasoning behind the

2019-12-31 Thread Bruce Merry
Bruce Merry added the comment:
> Do you think it would be sufficient to change the stress test from joining 1000 items to joining 10 items?
Actually that won't work, because the existing stress test is using a non-empty separator. I'll add another version of that stress test that uses
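A test along these lines might look like the sketch below. It is not the test that was added to CPython: the function name, item sizes, and thread count are assumptions chosen so that a few large items joined with an empty separator produce an output above 1 MiB.

```python
import threading


def join_matches_reference(item_count=10, item_size=1 << 18, thread_count=4):
    """Join the same large items from several threads and compare each result
    against a reference built without bytes.join()."""
    items = [bytes([i % 256]) * item_size for i in range(item_count)]

    # Build the expected output by plain concatenation.
    expected = bytearray()
    for item in items:
        expected += item
    expected = bytes(expected)

    results = [None] * thread_count

    def worker(index):
        # Empty separator, a handful of large items.
        results[index] = b"".join(items)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(thread_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return all(result == expected for result in results)
```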

2019-12-31 Thread Bruce Merry
Bruce Merry added the comment:
> I'll take a look at extra unit tests soon. Do you know off the top of your head where to look for existing `join` tests to add to?
Never mind, I found it:

2019-12-31 Thread Bruce Merry
Bruce Merry added the comment: I've attached a benchmark script and CSV results for master (whichever version that was at the point I forked) and with unconditional dropping of the GIL. It shows up to 3x performance improvement when using 4 threads. That's on my home desktop, which is quite
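For readers without the attachments, a benchmark of this general shape could be sketched as follows. This is not the attached benchjoin.py; the function name, sizes, and repeat counts are assumptions for illustration, and no results are implied.

```python
import threading
import time


def time_joins(thread_count, item_size, item_count=8, repeats=20):
    """Return wall-clock seconds for thread_count threads, each doing repeats joins."""
    items = [b"x" * item_size] * item_count

    def worker():
        for _ in range(repeats):
            b"".join(items)

    threads = [threading.Thread(target=worker) for _ in range(thread_count)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    # Compare a small and a large item size at 1 and 4 threads.
    for size in (1 << 16, 1 << 20):
        for n in (1, 4):
            print(f"item_size={size} threads={n}: {time_joins(n, size):.4f}s")
```

Running it on an interpreter with and without the change, and comparing the multi-threaded timings, gives the kind of comparison the CSV files record.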

2019-12-31 Thread Bruce Merry
Change by Bruce Merry: Added file: https://bugs.python.org/file48813/benchjoin.py

2019-12-31 Thread Bruce Merry
Change by Bruce Merry: Added file: https://bugs.python.org/file48812/new.csv

2019-12-31 Thread Bruce Merry
Change by Bruce Merry: Added file: https://bugs.python.org/file48811/old.csv

2019-12-30 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Please provide benchmarks that demonstrate the benefit of this change. We also need to add a test for join() which covers the new code.

2019-12-30 Thread Bruce Merry
Bruce Merry added the comment: If we want to be conservative, we could only drop the GIL if all the buffers pass the PyBytes_CheckExact test. Presumably that won't encounter any of these problems because bytes objects are immutable?
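At the Python level, the conservative check proposed here is roughly an exact-type test on every item. The sketch below is an analogue for illustration only; `all_exact_bytes` and `MyBytes` are hypothetical names, and the real check would be done in C with PyBytes_CheckExact.

```python
def all_exact_bytes(items):
    """Python-level analogue of requiring PyBytes_CheckExact for every item.

    Subclasses of bytes and other bytes-like objects (bytearray, memoryview)
    are rejected, because only exact bytes objects are known to be immutable.
    """
    return all(type(item) is bytes for item in items)


class MyBytes(bytes):
    """A bytes subclass; it fails the exact-type check."""
```

With this rule, a join over `[b"a", bytearray(b"b")]` would keep the GIL held, while a join over exact bytes objects would be eligible to release it.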

2019-12-30 Thread Inada Naoki
Inada Naoki added the comment:
> 2. If the thread tries to change the size of the bytearrays during the join (ba1 += b'123'), it'll die with a BufferError that wasn't previously possible
Makes sense. We shouldn't drop the GIL while holding buffers of arbitrary objects.
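The BufferError in the quoted point can be reproduced without threads, because a bytearray refuses to resize while any buffer export is live. In this minimal sketch a memoryview stands in for the buffer that join() holds on each item; that correspondence is an assumption made for illustration.

```python
ba = bytearray(b"\x00" * 5)
view = memoryview(ba)  # a live buffer export, standing in for the one join() holds

try:
    ba += b"123"  # resizing while the export is live
except BufferError as exc:
    resize_error = exc
else:
    resize_error = None

view.release()
ba += b"123"  # once the export is released, resizing works again
```

While the GIL is held for the whole join, no other thread can run during the window in which the export exists, so the error cannot be observed; releasing the GIL opens that window.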

2019-12-30 Thread Josh Rosenberg
Josh Rosenberg added the comment: This will introduce a risk of data races that didn't previously exist. If you do:
    ba1 = bytearray(b'\x00') * 5
    ba2 = bytearray(b'\x00') * 5
    ... pass references to thread that mutates them ...
    ba3 = b''.join((ba1, ba2))
then two

2019-12-30 Thread Bruce Merry
Change by Bruce Merry: keywords: +patch; pull_requests: +17193; stage: -> patch review; pull_request: https://github.com/python/cpython/pull/17757

2019-12-22 Thread Bruce Merry
Bruce Merry added the comment:
> It seems we can release GIL during iterating the buffer array.
That's what I had in mind. Naturally it would require a bit of benchmarking to pick a threshold such that the small case doesn't lose performance due to locking overheads. If no one else is

2019-12-22 Thread Inada Naoki
Inada Naoki added the comment: https://github.com/python/cpython/blob/068768faf6b82478de239d7ab903dfb249ad96a4/Objects/stringlib/join.h#L105-L126 It seems we can release the GIL while iterating over the buffer array. Even if there is no single big chunk, it would be beneficial if the output size is

2019-12-21 Thread Antoine Pitrou
Change by Antoine Pitrou: nosy: +inada.naoki

2019-12-21 Thread Antoine Pitrou
Antoine Pitrou added the comment: If there is a large chunk (e.g. several MBs), dropping the GIL during the memcpy of that chunk may be beneficial. This kind of optimization may be applicable in other similar cases (such as extending a bytearray or a BytesIO object).

2019-12-21 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Could you show evidence that dropping the GIL can help you? bytes.join() needs to perform operations that must hold the GIL (allocating the memory, iterating the list, getting the data of bytes-like objects). I am afraid that the cost of memcpy() is a

2019-12-20 Thread Batuhan
Change by Batuhan: versions: +Python 3.9 -Python 3.8

2019-12-20 Thread Batuhan
Change by Batuhan: nosy: +BTaskaya, vstinner

2019-02-20 Thread SilentGhost
Change by SilentGhost: title: "(Performance) Drop the GIL during large bytes.join operations?" -> "Drop the GIL during large bytes.join operations?"; type: -> performance; versions: +Python 3.8 -Python 3.7