I'm suspicious of pyperformance testing for this reason:
The point of Python is that it operates OK despite the GIL because "most of
the time is spent in 'external' libraries."
Pyperformance tests "typical" Python performance, where supposedly most
tests are "ok" despite the GIL. You need multithreading in
On Mon, Apr 25, 2022 at 2:33 PM Brett Cannon wrote:
>
>
> On Sat, Apr 23, 2022 at 8:31 AM wrote:
>
>> Hello all,
>>
>> I am very excited about a future multithreaded Python. I managed to
>> postpone some rewrites to Rust/Go in the company I work for, precisely
>> because of the potential to have a
On Sat, Apr 23, 2022 at 8:31 AM wrote:
> Hello all,
>
> I am very excited about a future multithreaded Python. I managed to
> postpone some rewrites to Rust/Go in the company I work for, precisely
> because of the potential to have a Python solution in the medium term.
>
> I was wondering. Is Sam
Hello all,
I am very excited about a future multithreaded Python. I managed to postpone
some rewrites to Rust/Go in the company I work for, precisely because of the
potential to have a Python solution in the medium term.
I was wondering. Is Sam Gross' nogil merge being seriously considered by the
Sam> I think the performance difference is because of different
versions of NumPy.
Thanks all for the help/input/advice. It never occurred to me that two
relatively recent versions of numpy would differ so much for the
simple tasks in my script (array creation & transform). I confirmed
this by
> I think the performance difference is because of different versions of
> NumPy.
>
Good reason to leave numpy completely out of it. Unless you want to test
nogil’s performance effects on numpy code — an interesting exercise in
itself.
Also — sorry I didn’t look at your code before, but you
Hi Skip,
I think the performance difference is because of different versions of
NumPy. Python 3.9 installs NumPy 1.21.3 by default for "pip install numpy".
I've only built and packaged NumPy 1.19.4 for "nogil" Python. There are
substantial performance differences between the two NumPy builds for
> Remember that pystone is a terrible benchmark.
I understand that. I was only using it as a spot check. I was surprised at
how much slower my (threaded or unthreaded) matrix multiply was on nogil vs
3.9+. I went into it thinking I would see an improvement. The Performance
section of Sam's
Remember that pystone is a terrible benchmark. It only exercises a few
bytecodes, and a modern CPU's caching and branch prediction make mincemeat
of those. Sam wrote a whole new register-based VM, so perhaps that
exercises different bytecodes.
On Sun, Oct 31, 2021 at 05:19 Skip Montanaro
Skip> 1. I use numpy arrays filled with random values, and the output array
is also a numpy array. The vector multiplication is done in a simple for
loop in my vecmul() function.
CHB> probably doesn't make a difference for this exercise, but numpy arrays
make lousy replacements for a regular
On Fri, Oct 29, 2021 at 6:10 AM Skip Montanaro
wrote:
> 1. I use numpy arrays filled with random values, and the output array is
> also a numpy array. The vector multiplication is done in a simple for loop
> in my vecmul() function.
>
probably doesn't make a difference for this exercise, but
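Skip's vecmul() body isn't shown in the thread; for readers following along, a hypothetical reconstruction of the setup he describes (NumPy arrays of random values, with the multiply done in a plain Python loop so the interpreter, not NumPy, does the work) might look like this:

```python
import numpy as np

def vecmul(a, b):
    # Hypothetical reconstruction of the function described in the thread:
    # elementwise multiply in a plain Python for loop. Each iteration goes
    # through the interpreter, which is exactly what stresses the GIL.
    out = np.empty_like(a)
    for i in range(len(a)):
        out[i] = a[i] * b[i]
    return out

a = np.random.rand(1000)
b = np.random.rand(1000)
result = vecmul(a, b)
```

The same operation written as `a * b` would run inside NumPy's C code instead, which changes what the benchmark actually measures.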
>
> Did you try running the same code with stock Python?
>
> One reason I ask is that, IIUC, you are using numpy for the individual
> vector operations, and numpy already releases the GIL in some
> circumstances.
>
I had not run the same code with stock Python (but see below). Also, I only
used
Thanks Skip — nice to see some examples.
Did you try running the same code with stock Python?
One reason I ask is that, IIUC, you are using numpy for the individual
vector operations, and numpy already releases the GIL in some
circumstances.
It would also be fun to see David Beazley’s example
Guido> To be clear, Sam’s basic approach is a bit slower for
single-threaded code, and he admits that. But to sweeten the pot he has
also applied a bunch of unrelated speedups that make it faster in general,
so that overall it’s always a win. But presumably we could upstream the
latter easily,
Mohamed> I love everything about this - but I expect some hesitancy
due to this "Multithreaded programs are prone to concurrency bugs.".
Paul> The way I see it, the concurrency model to be used is selected
by developers. They can choose between ...
I think the real intent of the statement
The way I see it, the concurrency model to be used is selected by
developers. They can choose between multi-threading, multi-process, or
asyncio, or even a hybrid. If developers select multithreading, then
they carry the burden of ensuring mutual exclusion and avoiding race
conditions, deadlocks,
I love everything about this - but I expect some hesitancy due to this
"Multithreaded programs are prone to concurrency bugs.".
If there is significant pushback, I have one suggestion:
Would it be helpful to think of the Python concurrency model as a property
of interpreters?
`interp =
> Still, I hope you at least enjoyed my enthusiasm!
I did!
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at
Oops! Sorry everybody, I meant that to be off-list.
Still, I hope you at least enjoyed my enthusiasm!
/arry
On Tue, Oct 12, 2021, 12:55 Larry Hastings wrote:
>
> (off-list)
>
>
> On 10/11/21 2:09 PM, Sam Gross wrote:
>
> The ccbench results look pretty good: about 18.1x speed-up on "pi
>
(off-list)
On 10/11/21 2:09 PM, Sam Gross wrote:
The ccbench results look pretty good: about 18.1x speed-up on "pi
calculation" and 19.8x speed-up on "regular expression" with 20
threads (turbo off). The latency and throughput results look good too.
JESUS CHRIST
//arry/
Thank you Sam, this additional detail really helps me understand your proposal.
-Barry
> On Oct 11, 2021, at 12:06, Sam Gross wrote:
>
> I’m unclear what is actually retried. You use this note throughout the
> document, so I think it would help to clarify exactly what is retried and why
>
I've updated the linked gists with the results from interpreters compiled
with PGO, so the numbers have slightly changed.
On Mon, Oct 11, 2021 at 7:04 AM Antoine Pitrou wrote:
> It's crude, but you can take a look at `ccbench` in the Tools directory.
>
Thanks, I wasn't familiar with this. The ccbench results look pretty good:
about 18.1x speed-up on "pi calculation" and 19.8x speed-up on "regular
expression" with
I have a PR to remove this FAQ entry:
https://github.com/python/cpython/pull/28886
As far as I understand, we should get a smaller improvement on a single
thread because some of the optimizations listed in this work are already
partially or totally implemented.
This is excluding any non-linear behaviour between the different
optimizations, of course, and assuming that both versions yield
On Mon, Oct 11, 2021 at 12:58 PM Thomas Grainger wrote:
> Is D1.update(D2) still atomic with this implementation?
> https://docs.python.org/3.11/faq/library.html#what-kinds-of-global-value-mutation-are-thread-safe
>
No. For example, another thread reading from the dict concurrently may
observe
When you say "an order of magnitude less overhead than the current CPython
implementation", do you mean compared with the main branch? We have already
implemented almost everything listed in this paragraph:
https://github.com/python/cpython/pull/27077
We also pack some extra similar
>
> I’m unclear what is actually retried. You use this note throughout the
> document, so I think it would help to clarify exactly what is retried and
> why that solves the particular problem. I’m confused because, is it the
> refcount increment that’s retried or the entire sequence of steps
> On 11 Oct 2021, at 18:58, Thomas Grainger wrote:
>
> Is D1.update(D2) still atomic with this implementation?
> https://docs.python.org/3.11/faq/library.html#what-kinds-of-global-value-mutation-are-thread-safe
>
>
On Fri, Oct 8, 2021 at 11:35 AM Chris Jerdonek
wrote:
> Is it also slower even when running with PYTHONGIL=1? If it could be made
> the same speed for single-threaded code when running in GIL-enabled mode,
> that might be an easier intermediate target while still adding value.
>
Running with
Is D1.update(D2) still atomic with this implementation?
https://docs.python.org/3.11/faq/library.html#what-kinds-of-global-value-mutation-are-thread-safe
On Mon, 11 Oct 2021, 17:54 Sam Gross, wrote:
> On Fri, Oct 8, 2021 at 12:04 PM Nathaniel Smith wrote:
>
>> I notice the fb.com address -- is
On Fri, Oct 8, 2021 at 12:04 PM Nathaniel Smith wrote:
> I notice the fb.com address -- is this a personal project or something
> facebook is working on? what's the relationship to Cinder, if any?
>
It is a Facebook project, at least in the important sense that I work on it
as an employee at
On Thu, 7 Oct 2021 15:52:56 -0400
Sam Gross wrote:
> Hi,
>
> I've been working on changes to CPython to allow it to run without the
> global interpreter lock. I'd like to share a working proof-of-concept that
> can run without the GIL. The proof-of-concept involves substantial changes
> to
Congrats on this impressive work Sam. I enjoyed the thorough write up of the
design. There’s one aspect that I don’t quite understand. Maybe I missed the
explanation. For example:
```
• Load the address of the item
• Increment the reference count of the item, if it is
```
On Sun, Oct 10, 2021 at 2:31 PM Dan Stromberg wrote:
>
>
> On Thu, Oct 7, 2021 at 9:10 PM Chris Angelico wrote:
>>
>> Concurrency is *hard*. There's no getting around it, there's no
>> sugar-coating it. There are concepts that simply have to be learned,
>> and the failures can be extremely hard
On Thu, Oct 7, 2021 at 9:10 PM Chris Angelico wrote:
> Concurrency is *hard*. There's no getting around it, there's no
> sugar-coating it. There are concepts that simply have to be learned,
> and the failures can be extremely hard to track down. Instantiating an
> object on the wrong thread can
On Fri, Oct 8, 2021 at 8:55 PM Sam Gross wrote:
> the "nogil" interpreter stays within the same interpreter loop for many
> Python function calls, while upstream CPython
> recursively calls into _PyEval_EvalFrameDefault.
>
Not for much longer though. https://github.com/python/cpython/pull/28488
On Fri, Oct 8, 2021 at 12:24 PM Pablo Galindo Salgado
wrote:
> When you say "an order of magnitude less overhead than the current
> CPython implementation", do you mean compared with the main branch? We
> have already implemented almost everything listed in this paragraph.
>
I think I
On Fri, Oct 8, 2021 at 12:55 PM Daniel Pope wrote:
> I'm a novice C programmer, but I'm unsure about the safety of your
> thread-safe collections description.
>
The "list" class uses a slightly different strategy than "dict", which I
forgot about
when writing the design overview. List relies on
On 10/7/21 8:52 PM, Sam Gross wrote:
I've been working on changes to CPython to allow it to run without the
global interpreter lock.
Before anybody asks: Sam contacted me privately some time ago to pick my
brain a little. But honestly, Sam didn't need any help--he'd already
taken the
On Fri, 8 Oct 2021 at 03:50, Sam Gross wrote:
> My goal with the proof-of-concept is to demonstrate that removing the GIL is
> feasible and worthwhile, and that the technical ideas of the project could
> serve as a basis of such an effort.
I'm a novice C programmer, but I'm unsure about the
>
> To speed-up function calls, the interpreter uses a linear, resizable stack
> to store function call frames, an idea taken from LuaJIT. The stack stores
> the interpreter registers (local variables + space for temporaries) plus
> some extra information per-function call. This avoids the need
On Thu, Oct 7, 2021 at 7:54 PM Sam Gross wrote:
> Design overview:
> https://docs.google.com/document/d/18CXhDb1ygxg-YXNBJNzfzZsDFosB5e6BfnXLlejd9l0/edit
Whoa, this is impressive work.
I notice the fb.com address -- is this a personal project or something
facebook is working on? what's the
On Fri, Oct 8, 2021 at 8:11 AM Guido van Rossum wrote:
> To be clear, Sam’s basic approach is a bit slower for single-threaded
> code, and he admits that.
>
Is it also slower even when running with PYTHONGIL=1? If it could be made
the same speed for single-threaded code when running in
To be clear, Sam’s basic approach is a bit slower for single-threaded code,
and he admits that. But to sweeten the pot he has also applied a bunch of
unrelated speedups that make it faster in general, so that overall it’s
always a win. But presumably we could upstream the latter easily,
separately
> On 8 Oct 2021, at 10:13, Steven D'Aprano wrote:
>
> Hi Sam,
>
> On Thu, Oct 07, 2021 at 03:52:56PM -0400, Sam Gross wrote:
>
>> I've been working on changes to CPython to allow it to run without the
>> global interpreter lock. I'd like to share a working proof-of-concept that
>> can run
Hi Sam,
On Thu, Oct 07, 2021 at 03:52:56PM -0400, Sam Gross wrote:
> I've been working on changes to CPython to allow it to run without the
> global interpreter lock. I'd like to share a working proof-of-concept that
> can run without the GIL.
Getting Python to run without the GIL has never
On Fri, Oct 8, 2021 at 1:51 PM Sam Gross wrote:
>
> Hi,
>
> I've been working on changes to CPython to allow it to run without the global
> interpreter lock. I'd like to share a working proof-of-concept that can run
> without the GIL. The proof-of-concept involves substantial changes to CPython