Hi everyone,
CPython is slow. We all know that, yet little is done to fix it.
I'd like to change that.
I have a plan to speed up CPython by a factor of five over the next few
years. But it needs funding.
I am aware that there have been several promised speed ups in the past
that have
No problem, I did not think you were attacking me or find your
response rude.
On Wed, May 18, 2016, at 01:06 PM, Cesare Di Mauro wrote:
> If you feel like I've attacked you, I apologize: it wasn't my
> intention. Please don't take it personally: I only reported my honest
> opinion, albeit after
If you feel like I've attacked you, I apologize: it wasn't my intention.
Please don't take it personally: I only reported my honest opinion,
although on a re-read it looks too rude, and I'm sorry for that.
Regarding the post-bytecode optimization issues, they are mainly
represented by the constant
Your criticisms may very well be true. IIRC though, I wrote that pass
because what was available was not general enough. The stackdepth_walk
function made assumptions that, while true of code generated by the
current cpython frontend, were not universally true. If a goal is to
move this
2016-05-17 8:25 GMT+02:00 :
> In the project https://github.com/zachariahreed/byteasm I mentioned on
> the list earlier this month, I have a pass that computes stack usage
> for a given sequence of bytecodes. It seems to be a fair bit more
> aggressive than cpython. Maybe
In the project https://github.com/zachariahreed/byteasm I mentioned on
the list earlier this month, I have a pass that computes stack usage
for a given sequence of bytecodes. It seems to be a fair bit more
aggressive than cpython. Maybe it's more generally useful. It's pure
python rather than C
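For readers unfamiliar with the problem, here is a rough sketch of a (much simpler) linear stack-usage walk using the stdlib `dis` module. This is not byteasm's actual pass: it walks instructions in order and deliberately ignores control flow, so it is only a plausible upper bound for straight-line code.

```python
import dis

def max_stack_depth(func):
    """Upper-bound stack usage for straight-line bytecode.

    A toy illustration only: it ignores jumps, so it is not the
    CFG-aware analysis a real pass (or byteasm) performs.
    """
    depth = max_depth = 0
    for ins in dis.get_instructions(func):
        # stack_effect gives the net push/pop count of one instruction.
        depth += dis.stack_effect(ins.opcode, ins.arg)
        max_depth = max(max_depth, depth)
    return max_depth

def f(a, b):
    return a + b

print(max_stack_depth(f))  # a small positive bound for this function
```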
2016-05-16 17:55 GMT+02:00 Meador Inge :
> On Sun, May 15, 2016 at 2:23 AM, Cesare Di Mauro <
> cesare.di.ma...@gmail.com> wrote:
>> Just one thing that comes to my mind: is the stack depth calculation
>> routine changed? It was suboptimal, and calculating a better number
On Sun, May 15, 2016 at 2:23 AM, Cesare Di Mauro
wrote:
> Just one thing that comes to my mind: is the stack depth calculation
> routine changed? It was suboptimal, and calculating a better number
> decreases stack allocation, and increases the frame usage.
>
This is
2016-02-01 17:54 GMT+01:00 Yury Selivanov :
> Thanks for bringing this up!
>
> IIRC wpython was about using "fat" bytecodes, i.e. using 64 bits per
> bytecode instead of 8.
No, it used 16, 32, or 48 bits per opcode (1, 2, or 3 16-bit words).
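To make the encoding concrete, here is a hypothetical sketch of packing an opcode plus argument into one to three 16-bit words. The layout and names are illustrative only, not wpython's actual format:

```python
import struct

def pack_instruction(opcode, arg=0):
    """Pack one instruction as 1-3 little-endian 16-bit words.

    Illustrative layout (an assumption, not wpython's real encoding):
    the first word holds an 8-bit opcode plus an 8-bit argument; any
    wider argument spills into one or two extra 16-bit words.
    """
    words = [opcode | ((arg & 0xFF) << 8)]
    arg >>= 8
    while arg:                      # 32- or 48-bit forms for big args
        words.append(arg & 0xFFFF)
        arg >>= 16
    return struct.pack('<%dH' % len(words), *words)

print(len(pack_instruction(1, 2)))        # 2 bytes: one 16-bit word
print(len(pack_instruction(1, 0x12345)))  # 4 bytes: two words
```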
> That allows minimizing
2016-02-02 10:28 GMT+01:00 Victor Stinner :
> 2016-01-27 19:25 GMT+01:00 Yury Selivanov :
> > tl;dr The summary is that I have a patch that improves CPython performance
> > up to 5-10% on macro benchmarks. Benchmarks results on Macbook Pro/Mac
On 3 February 2016 at 03:52, Brett Cannon wrote:
> Fifth, if we manage to show that a C API can easily be added to CPython to
> make a JIT something that can simply be plugged in and be useful, then we
> will also have a basic JIT framework for people to use. As I said, our use
Also, modern compiler technology tends to use "infinite register" machines
for the intermediate representation, then uses register coloring to assign
the actual registers (and generate spill code if needed). I've seen work on
inter-function optimization for avoiding some register loads and stores
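As a toy illustration of the coloring step described above (greedy and heavily simplified compared to real Chaitin-style allocators, and the graph shape is invented for the example):

```python
def allocate(interference, k):
    """Greedy "register coloring" over an interference graph.

    interference: {vreg: set of vregs live at the same time}.
    Returns vreg -> physical register index, or 'spill' when the
    k registers are exhausted (where real allocators emit spill code).
    """
    assignment = {}
    # Color the most-constrained virtual registers first.
    for vreg in sorted(interference, key=lambda v: -len(interference[v])):
        taken = {assignment[n] for n in interference[vreg] if n in assignment}
        assignment[vreg] = next((r for r in range(k) if r not in taken), 'spill')
    return assignment

# Three values all live at once: they mutually interfere.
triangle = {'a': {'b', 'c'}, 'b': {'a', 'c'}, 'c': {'a', 'b'}}
print(allocate(triangle, 3))  # three registers suffice: no spills
print(allocate(triangle, 2))  # two registers: one vreg must spill
```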
On Tue, 2 Feb 2016 at 01:29 Victor Stinner wrote:
> Hi,
>
> I'm back from the FOSDEM event in Bruxelles, it was really cool. I gave
> a talk about FAT Python and I got good feedback. But friends told me
> that people now have expectations of FAT Python. It looks like
On 2016-02-02 4:28 AM, Victor Stinner wrote:
[..]
> I took a first look at your patch and sorry, I'm skeptical about the
> design. I have to play with it a little bit more to check if there is
> no better design.
Thanks for the initial code review!
So far I see two things you are worried
Hi,
I'm back from the FOSDEM event in Bruxelles, it was really cool. I gave
a talk about FAT Python and I got good feedback. But friends told me
that people now have expectations of FAT Python. It looks like people
care about Python performance :-)
FYI the slides of my talk:
On 02.02.2016 00:27, Greg Ewing wrote:
Sven R. Kunze wrote:
Are there some resources on why register machines are considered
faster than stack machines?
If a register VM is faster, it's probably because each register
instruction does the work of about 2-3 stack instructions,
meaning less
On 01.02.2016 19:28, Brett Cannon wrote:
A search for [stack vs register based virtual machine] will get you
some information.
Alright. :) Will go for that.
You aren't really supposed to yet. :) In Pyjion's case we are still
working on compatibility, let alone trying to show a speed
On Mon, 1 Feb 2016 at 09:08 Yury Selivanov wrote:
> > On 2016-01-29 11:28 PM, Steven D'Aprano wrote:
> > On Wed, Jan 27, 2016 at 01:25:27PM -0500, Yury Selivanov wrote:
> >> Hi,
> >> tl;dr The summary is that I have a patch that improves CPython
On 01.02.2016 18:18, Brett Cannon wrote:
On Mon, 1 Feb 2016 at 09:08 Yury Selivanov wrote:
On 2016-01-29 11:28 PM, Steven D'Aprano wrote:
> On Wed, Jan 27, 2016 at 01:25:27PM -0500, Yury Selivanov wrote:
>> Hi,
On Mon, 1 Feb 2016 at 10:21 Sven R. Kunze wrote:
> On 01.02.2016 18:18, Brett Cannon wrote:
> On Mon, 1 Feb 2016 at 09:08 Yury Selivanov <
> yselivanov...@gmail.com> wrote:
>> On 2016-01-29 11:28 PM, Steven D'Aprano wrote:
>> > On Wed,
On 01/02/2016 16:54, Yury Selivanov wrote:
On 2016-01-29 11:28 PM, Steven D'Aprano wrote:
On Wed, Jan 27, 2016 at 01:25:27PM -0500, Yury Selivanov wrote:
Hi,
tl;dr The summary is that I have a patch that improves CPython
performance up to 5-10% on macro benchmarks. Benchmarks results on
Sven R. Kunze wrote:
Are there some resources on why register machines are considered faster
than stack machines?
If a register VM is faster, it's probably because each register
instruction does the work of about 2-3 stack instructions,
meaning less trips around the eval loop, so less
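Concretely, `a = b + c` costs a LOAD/LOAD/ADD/STORE sequence on CPython's stack VM, where a three-address register VM would emit a single ADD. The stdlib `dis` module shows the stack version (exact opcode names vary by CPython version):

```python
import dis

def f(b, c):
    a = b + c
    return a

# Stack VM: roughly LOAD_FAST b; LOAD_FAST c; BINARY_ADD/BINARY_OP;
# STORE_FAST a -- four trips around the eval loop for one addition.
# A register VM equivalent would be a single ADD r_a, r_b, r_c.
ops = [ins.opname for ins in dis.get_instructions(f)]
print(ops)
```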
Hi Brett,
On 2016-02-01 12:18 PM, Brett Cannon wrote:
On Mon, 1 Feb 2016 at 09:08 Yury Selivanov > wrote:
[..]
If I were to do some big refactoring of the ceval loop, I'd probably
consider implementing a register VM.
On 2016-01-29 11:28 PM, Steven D'Aprano wrote:
On Wed, Jan 27, 2016 at 01:25:27PM -0500, Yury Selivanov wrote:
Hi,
tl;dr The summary is that I have a patch that improves CPython
performance up to 5-10% on macro benchmarks. Benchmarks results on
Macbook Pro/Mac OS X, desktop CPU/Linux,
On 01.02.2016 17:54, Yury Selivanov wrote:
If I were to do some big refactoring of the ceval loop, I'd probably
consider implementing a register VM. While register VMs are a bit
faster than stack VMs (up to 20-30%), they would also allow us to
apply more optimizations, and even bolt on a
Hi Yury,
> An off-topic: have you ever tried hg.python.org/benchmarks
> or compared MicroPython vs CPython? I'm curious if MicroPython
> is faster -- in that case we'll try to copy some optimization
> ideas.
I've tried a small number of those benchmarks, but not in any rigorous
way, and not
Yury Selivanov schrieb am 27.01.2016 um 19:25:
> tl;dr The summary is that I have a patch that improves CPython performance
> up to 5-10% on macro benchmarks. Benchmarks results on Macbook Pro/Mac OS
> X, desktop CPU/Linux, server CPU/Linux are available at [1]. There are no
> slowdowns that I
On 2016-01-29 5:00 AM, Stefan Behnel wrote:
Yury Selivanov schrieb am 27.01.2016 um 19:25:
[..]
LOAD_METHOD looks at the object on top of the stack, and checks if the name
resolves to a method or to a regular attribute. If it's a method, then we
push the unbound method object and the object
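In pure-Python terms, the point is to avoid materializing a bound-method object on every call. A sketch of the semantics (not the C implementation):

```python
class C:
    def method(self):
        return 42

obj = C()

# Plain LOAD_ATTR path: each obj.method allocates a bound-method wrapper.
bound = obj.method
assert bound() == 42
assert obj.method is not obj.method   # a fresh wrapper on every access

# LOAD_METHOD path, roughly: push the plain function and the receiver,
# then the call invokes func(self, ...) with no wrapper allocated.
func, receiver = C.method, obj
assert func(receiver) == 42
```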
Hi Damien,
BTW I just saw (and backed!) your new Kickstarter campaign
to port MicroPython to ESP8266, good stuff!
On 2016-01-29 7:38 AM, Damien George wrote:
Hi Yury,
[..]
Do you use opcode dictionary caching only for LOAD_GLOBAL-like
opcodes? Do you have an equivalent of LOAD_FAST, or you
On Wed, Jan 27, 2016 at 01:25:27PM -0500, Yury Selivanov wrote:
> Hi,
>
> tl;dr The summary is that I have a patch that improves CPython
> performance up to 5-10% on macro benchmarks. Benchmarks results on
> Macbook Pro/Mac OS X, desktop CPU/Linux, server CPU/Linux are available
> at [1].
Hi Yuri,
I think these are great ideas to speed up CPython. They are probably
the simplest yet most effective ways to get performance improvements
in the VM.
MicroPython has had LOAD_METHOD/CALL_METHOD from the start (inspired
by PyPy, and the main reason to have it is because you don't need to
BTW, this optimization also makes some old optimization tricks obsolete.
1. No need to write 'def func(len=len)'. Globals lookups will be fast.
2. No need to save bound methods:
    obj = []
    obj_append = obj.append
    for _ in range(10**6):
        obj_append(something)
This hand-optimized code would
On 2016-01-27 3:46 PM, Glenn Linderman wrote:
On 1/27/2016 12:37 PM, Yury Selivanov wrote:
MicroPython also has dictionary lookup caching, but it's a bit
different to your proposal. We do something much simpler: each opcode
that has a cache ability (eg LOAD_GLOBAL, STORE_GLOBAL,
On 1/27/2016 12:37 PM, Yury Selivanov wrote:
MicroPython also has dictionary lookup caching, but it's a bit
different to your proposal. We do something much simpler: each opcode
that has a cache ability (eg LOAD_GLOBAL, STORE_GLOBAL, LOAD_ATTR,
etc) includes a single byte in the opcode which
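A toy pure-Python model of that scheme (an illustrative sketch only, not MicroPython's actual C implementation, and all names here are invented): the cache byte selects a slot that remembers the dict entry found last time, so the fast path skips the hash lookup while still seeing updates to the value.

```python
class Cell:
    """Stands in for one slot of a namespace dict."""
    __slots__ = ('name', 'value')
    def __init__(self, name, value):
        self.name, self.value = name, value

class Namespace:
    def __init__(self):
        self.entries = {}                   # name -> Cell

    def define(self, name, value):
        if name in self.entries:
            self.entries[name].value = value
        else:
            self.entries[name] = Cell(name, value)

class Frame:
    """Each cache-able opcode carries one byte indexing into `cache`."""
    def __init__(self, namespace, n_slots=256):
        self.namespace = namespace
        self.cache = [None] * n_slots

    def load_global(self, name, slot):
        cell = self.cache[slot]
        if cell is None or cell.name != name:
            cell = self.namespace.entries[name]   # slow path: dict lookup
            self.cache[slot] = cell               # remember the entry
        return cell.value                         # fast path: no hashing

ns = Namespace()
ns.define('x', 1)
frame = Frame(ns)
print(frame.load_global('x', 0))   # slow path fills cache slot 0
ns.define('x', 2)
print(frame.load_global('x', 0))   # fast path still sees the update
```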
On Wed, 27 Jan 2016 at 10:26 Yury Selivanov wrote:
> Hi,
>
> tl;dr The summary is that I have a patch that improves CPython
> performance up to 5-10% on macro benchmarks. Benchmarks results on
> Macbook Pro/Mac OS X, desktop CPU/Linux, server CPU/Linux are available
>
Hi Yury,
(Sorry for misspelling your name previously!)
> Yes, we'll need to add CALL_METHOD{_VAR|_KW|etc} opcodes to optimize all
> kind of method calls. However, I'm not sure how big the impact will be,
> need to do more benchmarking.
I never did such fine grained analysis with MicroPython.
Damien,
On 2016-01-27 4:20 PM, Damien George wrote:
Hi Yury,
(Sorry for misspelling your name previously!)
NP. As long as the first letter is "y" I don't care ;)
Yes, we'll need to add CALL_METHOD{_VAR|_KW|etc} opcodes to optimize all
kind of method calls. However, I'm not sure how big
On 2016-01-27 3:10 PM, Damien George wrote:
Hi Yuri,
I think these are great ideas to speed up CPython. They are probably
the simplest yet most effective ways to get performance improvements
in the VM.
Thanks!
MicroPython has had LOAD_METHOD/CALL_METHOD from the start (inspired
by PyPy,
Hi,
tl;dr The summary is that I have a patch that improves CPython
performance up to 5-10% on macro benchmarks. Benchmarks results on
Macbook Pro/Mac OS X, desktop CPU/Linux, server CPU/Linux are available
at [1]. There are no slowdowns that I could reproduce consistently.
There are
On 2016-01-27 3:01 PM, Brett Cannon wrote:
[..]
We can also optimize LOAD_METHOD. There's a high chance that 'obj' in
'obj.method()' will be of the same type every time we execute the code
object. So if we'd have an opcodes cache, LOAD_METHOD could then cache a
As Brett suggested, I've just run the benchmarks suite with memory
tracking on. The results are here:
https://gist.github.com/1st1/1851afb2773526fd7c58
Looks like the memory increase is around 1%.
One synthetic micro-benchmark, unpack_sequence, contains hundreds of
lines that load a global