Re: [IronPython] speed

Jim Hugunin Wed, 26 Jul 2006 00:30:16 -0700

You're correct that most of our work in getting to IronPython 1.0 has been 
focused on completeness and correctness rather than performance. IronPython 1.0 
is roughly as fast as IronPython 0.1 was - which is reasonably fast (see more 
at the end of this message). As anyone who's built a large system knows, not 
losing performance while achieving completeness and correctness is a challenge. 
When we talk about IronPython performance, we try to reference specific 
benchmarks. The standard line you'll see is, "IronPython is fast - up to 1.8x 
faster than CPython on the standard pystone benchmark." Performance will vary 
on different tasks.


Even though performance will vary, any time that IronPython is 21x slower than 
CPython that should be considered a bug in IronPython and you should file it as 
an issue on CodePlex. I ran your test script on my ThinkPad X60 laptop with a 
1.83GHz Intel Core Duo processor and 1.5GB of RAM under Windows XP SP2 with the 
final RTM release of .NET 2.0. Running your test with the 1.0beta9 release of 
IronPython, I find that it is ~8x slower in IronPython than in CPython-2.4. 
This is much better than your result, but still is not acceptable performance 
for such a simple test case.

Over the past week, we looked into this more closely. There were two major 
performance issues revealed by your test case. The first was that the way we 
package our signed release builds caused worse performance than the standard 
internal builds we tested on. The second was poor performance when calling 
methods on builtin types. Both of these issues have been fixed in the 
soon-to-be-released IronPython 1.0 RC 1 (which you can build from the current 
CodePlex sources today). After the fix, I find that IronPython-1.0rc1 
is about 2.2x slower than CPython-2.4 on your benchmark code. While I wish that 
IronPython was faster on this test, for this stage of the project a ~2x 
performance hit on some benchmarks is considered acceptable. There are other 
benchmarks where IronPython will be 2x faster than CPython. In fact, I can 
modify your test below to write it in a more abstract style and it will run 
with roughly the same performance on IronPython as CPython.

import time

def do_x(i):
    if i % 2:
        return 10
    else:
        return "a string"

def do_z(i):
    if isinstance(i, str):
        return i.upper()
    else:
        return i*3

def test():
    start= time.clock()

    x = [do_x(i) for i in xrange(1000000)]
    z = [do_z(i) for i in x]

    end= time.clock() - start
    print end

test() # pre-run to ignore initialization time
test()

I can't stress enough how much we appreciate this kind of performance bug 
report. Because you included a small self-contained test script without any 
external dependencies, it was easy for us to isolate the issues in IronPython 
and get them fixed. Right now, we don't have the time to help people who are 
encountering performance issues in complete apps, but we can address issues 
when they are reported this clearly and are this easy to reproduce. The only 
additional thing that I would have liked to see here would be a more complete 
description of the machine and version of .NET and IronPython that you were 
running against.
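
A short script run under each interpreter being compared is one way to collect most of that description. This is only a sketch: `describe_environment` is a hypothetical helper name, and `platform.python_implementation()` was added in Python 2.6, so it assumes a newer interpreter than the releases discussed in this thread. The .NET runtime version is not reported and would still need to be noted by hand.

```python
import platform
import sys

def describe_environment():
    """Collect interpreter and machine details worth pasting into a bug report."""
    return {
        "implementation": platform.python_implementation(),  # e.g. CPython or IronPython
        "python_version": platform.python_version(),
        "full_version": sys.version,
        "os": platform.platform(),
        "machine": platform.machine(),
    }

info = describe_environment()
for key in sorted(info):
    print("%-15s %s" % (key, info[key]))
```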

I mentioned at the start of this email that IronPython's performance hasn't 
changed much from the 0.1 version. Keep in mind that IronPython 0.1 was a tiny 
little translator that I wrote from Python to C# that had everything it needed 
to run the pystone benchmark and nothing else. I'm quite excited that we've 
been able to keep the good performance aspects of that initial prototype in the 
1.0 release. Here's a copy of data from the original email that I sent about 
IronPython 0.1:

--------------------------------------------------------------------
Date: Mon, 8 Dec 2003 17:16:15 -0800
From: Jim Hugunin <[EMAIL PROTECTED]>
Subject: Python can run fast on the CLR
To: [EMAIL PROTECTED]

            IronPython-0.1  Python-2.3      Python-2.1
pystone         0.58            1.00            1.29

function call   0.19            1.00            1.12
integer add     0.59            1.00            1.18
string.replace  0.92            1.00            1.00
range(bigint)   5.57            1.00            1.09
eval("2+2")     66.97           1.00            1.58
-------------------------------------------------------------------

These numbers are measuring time to run each of the benchmarks and are all 
relative to Python-2.3. Smaller numbers are better and indicate faster 
performance.  For IronPython 0.1, the performance on function calls and integer 
add were both considerably faster than CPython, string.replace was roughly the 
same speed, range was too slow and performance on eval("2+2") was horrible.

Out of curiosity, I reran these same benchmarks on 1.0rc1 as well as 
CPython-2.4 and 2.5beta2 using the same machine as described above.

          IronPython-1.0rc1  Python-2.5b2      Python-2.4
pystone         0.55            1.00            1.01

function call   0.14            1.00            0.98
integer add     0.46            1.00            1.04
string.replace  0.92            1.00            1.45
range(bigint)   6.07            1.00            1.00
eval("2+2")    14.03            1.00            0.76

If you compare the numbers closely, you should be somewhat stunned by how 
similar they are. I certainly was. A lot of things have changed, but the 
underlying results still show that IronPython is blindingly fast on function 
calls, ~2x faster than CPython on pystone and simple math, about the same speed 
on many library function calls and ~6x slower for range(bigint). The good news 
is that the one place where IronPython was previously 67x slower than CPython 
is the one place where a huge improvement can be measured and IronPython is now 
only 14x slower on eval("2+2").
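
The "Nx faster" and "Nx slower" figures follow directly from the relative run times in the 1.0rc1 table: a ratio below 1.00 is inverted to give a speedup, and a ratio above 1.00 is read as a slowdown. The sketch below shows that arithmetic using the table's first column; `describe` is a hypothetical helper name, and the 1.00 column is treated as the baseline.

```python
# Relative run times copied from the IronPython-1.0rc1 column of the table
# (the column showing 1.00 is the baseline).
relative_time = {
    "pystone": 0.55,
    "function call": 0.14,
    "integer add": 0.46,
    "string.replace": 0.92,
    "range(bigint)": 6.07,
    'eval("2+2")': 14.03,
}

def describe(ratio):
    """Turn a relative run time into an 'Nx faster/slower' phrase."""
    if ratio < 1.0:
        return "%.1fx faster" % (1.0 / ratio)
    return "%.1fx slower" % ratio

for name in sorted(relative_time):
    print("%-16s %s" % (name, describe(relative_time[name])))
```

For pystone, 1 / 0.55 is about 1.8, which is where the "up to 1.8x faster" line comes from.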

Everywhere that IronPython is more than 2x slower than CPython is something we 
need to better understand. For this set of microbenchmarks, there are two cases 
of this. The first case is range(bigint). I believe that this 6x perf hit on 
range(bigint) has the same underlying cause as the 2x perf hit on Luis's 
benchmark below. Something about building up large lists of numbers gives 
IronPython trouble. This is a performance issue that we will certainly be 
looking into more deeply, and I hope to see major improvements here in the 
future.

The second performance issue is with eval("2+2"). The improvements to 
performance here came as a result of a new API (DynamicMethod) added in .NET 
2.0 that lets us generate code for small methods much more efficiently. We'd 
like to see this get still faster, but it isn't an area of top concern to me 
since eval is very rarely used in performance-critical scenarios. In fact, I've 
heard privately from a number of people that they wish CPython's eval could be 
slowed down by 10x to discourage its use except where it is truly needed. 
IronPython does a lot more work compiling code in order to get its performance 
boosts and some additional cost to eval seems reasonable to pay here.
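
Where eval of a fixed expression does end up in a hot loop, the usual Python-level workaround is to compile the expression once with the builtin compile() and evaluate the resulting code object repeatedly. The sketch below illustrates that pattern. The function names are hypothetical, and how much this helps on any given IronPython release is an assumption that would need to be measured.

```python
import time

def eval_string(n):
    """Pass the source string to eval on every iteration, as eval("2+2") does."""
    total = 0
    for _ in range(n):
        total += eval("2+2")
    return total

def eval_precompiled(n):
    """Compile once, then evaluate the resulting code object repeatedly."""
    code = compile("2+2", "<expr>", "eval")
    total = 0
    for _ in range(n):
        total += eval(code)
    return total

def timed(func, n):
    """Return (result, elapsed seconds) for one call of func(n)."""
    start = time.time()
    result = func(n)
    return result, time.time() - start

for func in (eval_string, eval_precompiled):
    result, seconds = timed(func, 10000)
    print("%-18s result=%d  %.4f seconds" % (func.__name__, result, seconds))
```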

Looking at comparisons of microbenchmarks is an interesting way to identify 
possible low-hanging optimization fruit for IronPython. Last week I also looked 
at the pybench benchmark by Marc-Andre Lemburg which includes a large number of 
microbenchmark tests. This was helpful to more clearly identify the same 
performance issue Luis's test shows for builtin method calls. From pybench, we 
also noticed a surprising >10x performance hit in IronPython for list and tuple 
slicing. This has been reduced dramatically in the 1.0rc1 release. I expect 
that we will look at more of these kinds of comparisons in the future as we 
work to further optimize IronPython.
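
A slicing comparison of this kind can be approximated with the standard timeit module, run once under each interpreter. This is a minimal sketch and not pybench's own test code: the statements, sizes and names below are illustrative assumptions.

```python
import timeit

SETUP = "data = list(range(1000)); pair = tuple(data)"

# Illustrative statements only; pybench's slicing tests are not reproduced here.
STATEMENTS = {
    "list slice": "data[100:900]",
    "tuple slice": "pair[100:900]",
    "list copy": "data[:]",
}

def time_statement(stmt, number=10000):
    """Return total seconds for `number` executions of stmt."""
    return timeit.Timer(stmt, setup=SETUP).timeit(number=number)

results = dict((name, time_statement(stmt)) for name, stmt in STATEMENTS.items())
for name in sorted(results):
    print("%-12s %.4f seconds" % (name, results[name]))
```

Dividing each interpreter's time by the time from a chosen baseline interpreter gives relative numbers in the same form as the tables above.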

Thanks - Jim
--- bench.py used to generate the tables above (copy pystone.py from Lib/test) ---
import time
N = 1000000

import pystone

def test_pystone(L):
    pystone.pystones(200000)

def test_call(L):
    def f(a,b): return
    for i in L:
        f(1,2); f(1,2); f(1,2); f(1,2); f(1,2)
        f(1,2); f(1,2); f(1,2); f(1,2); f(1,2)

def test_add(L):
    x = 10
    for i in L:
        y = x + 1; y = x + 1; y = x + 1; y = x + 1; y = x + 1
        y = x + 1; y = x + 1; y = x + 1; y = x + 1; y = x + 1

def test_replace(L):
    s = "abcdefghi"
    for i in L:
        s.replace('def', 'DEF')
        s.replace('def', 'DEF')
        s.replace('def', 'DEF')
        s.replace('def', 'DEF')
        s.replace('def', 'DEF')
        s.replace('def', 'DEF')
        s.replace('def', 'DEF')
        s.replace('def', 'DEF')
        s.replace('def', 'DEF')
        s.replace('def', 'DEF')

def test_range(L):
    for i in range(100):
        r = range(N)

def test_eval(L):
    for i in range(10000):
        x = eval("2+2")


def bench(func, L):
    start = time.clock()
    func(L)
    end = time.clock()

    print 'ran %s in \t%.2f seconds' % (func.__name__, end-start)

tests = [test_pystone, test_call, test_add, test_replace, test_range, test_eval]
L = range(N)
for i in range(2):
    for test in tests:
        bench(test, L)
------------------------------------------------------------------------------------

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Luis M. Gonzalez
Sent: Tuesday, July 11, 2006 8:10 PM
To: [email protected]
Subject: [IronPython] speed

Hi everyone,
I'd like to ask you a question about IronPython's speed and performance:
I imagine that so far, you've been concentrating on completeness and 
compatibility more than performance, and I guess you'll address this issue 
after version 1.0.
However, although you claimed that IronPython is faster than CPython, I see 
cases where it is slower by a large margin.
For example, the script below is up to 21x slower than in CPython.

My question is: what are your expectations regarding IronPython's speed in the 
future?
Based on your experience so far, are you confident that it will match or 
surpass CPython's?
Where do you think it will be better and where will it be worse?

script
-----------------------------
import time

def test():
    start= time.clock()

    z=[]

    x=range(1000000)

    for i in range(1000000):
        if i % 2:
            x[i] = 10
        else:
            x[i] = "a string"

    for i in x:
        if type(i)==str:
            z.append(i.upper())
        else:
            z.append(i*3)

    end= time.clock() - start
    print end

test()
-----------------------------------
end script
Regards,
Luis

