You're correct that most of our work in getting to IronPython 1.0 has been
focused on completeness and correctness rather than performance. IronPython 1.0
is roughly as fast as IronPython 0.1 was - which is reasonably fast (see more
at the end of this message). As anyone who's built a large system knows, not
losing performance while achieving completeness and correctness is a challenge.
When we talk about IronPython performance, we try to reference specific
benchmarks. The standard line you'll see is, "IronPython is fast - up to 1.8x
faster than CPython on the standard pystone benchmark." Performance will vary
on different tasks.
Even though performance will vary, any time that IronPython is 21x slower than
CPython that should be considered a bug in IronPython and you should file it as
an issue on CodePlex. I ran your test script on my ThinkPad X60 laptop with a
1.83GHz Intel Core Duo processor and 1.5GB of RAM under Windows XP SP2 with the
final RTM release of .NET 2.0. Running your test with the 1.0beta9 release of
IronPython, I find that it is ~8x slower in IronPython than in CPython-2.4.
This is much better than your result, but still is not acceptable performance
for such a simple test case.
Over the past week, we looked into this more closely. There were two major
performance issues revealed by your test case. The first was that the way we
package our signed release builds caused worse performance than the standard
internal builds we tested on. The second was that method calls on builtin
types were slow. Both of these issues have been fixed
in the soon-to-be-released IronPython 1.0 RC 1 (which you can build from the
current CodePlex sources today). After the fix, I find that IronPython-1.0rc1
is about 2.2x slower than CPython-2.4 on your benchmark code. While I wish that
IronPython was faster on this test, for this stage of the project a ~2x
performance hit on some benchmarks is considered acceptable. There are other
benchmarks where IronPython will be 2x faster than CPython. In fact, I can
rewrite your test below in a more abstract style so that it runs with roughly
the same performance on IronPython as on CPython:
import time

def do_x(i):
    if i % 2:
        return 10
    else:
        return "a string"

def do_z(i):
    if isinstance(i, str):
        return i.upper()
    else:
        return i*3

def test():
    start= time.clock()
    x = [do_x(i) for i in xrange(1000000)]
    z = [do_z(i) for i in x]
    end= time.clock() - start
    print end

test() # pre-run to ignore initialization time
test()
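To compare the two styles side by side, a sketch along the following lines could be used. It is not the script used for the numbers in this message: it is written for a current Python 3 interpreter (so it uses range and time.perf_counter in place of xrange and time.clock), and the names loop_style, abstract_style and timed are illustrative assumptions.

```python
import time

def loop_style(n):
    # Index assignment plus append, as in the originally reported script.
    x = list(range(n))
    for i in range(n):
        if i % 2:
            x[i] = 10
        else:
            x[i] = "a string"
    z = []
    for item in x:
        if type(item) == str:
            z.append(item.upper())
        else:
            z.append(item * 3)
    return z

def do_x(i):
    return 10 if i % 2 else "a string"

def do_z(item):
    return item.upper() if isinstance(item, str) else item * 3

def abstract_style(n):
    # Helper functions plus list comprehensions, as in the rewritten test.
    x = [do_x(i) for i in range(n)]
    return [do_z(item) for item in x]

def timed(func, n):
    # Return (elapsed seconds, result) for a single call.
    start = time.perf_counter()
    result = func(n)
    return time.perf_counter() - start, result

if __name__ == "__main__":
    n = 1000000
    for func in (loop_style, abstract_style):
        timed(func, n)  # pre-run to ignore initialization time
        elapsed, _ = timed(func, n)
        print("%s: %.3f s" % (func.__name__, elapsed))
```

Both functions build the same result list, so any timing difference comes from how the work is expressed rather than from what is computed.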
I can't stress enough how much we appreciate this kind of performance bug
report. Because you included a small self-contained test script without any
external dependencies, it was easy for us to isolate the issues in IronPython
and get them fixed. Right now, we don't have the time to help people who are
encountering performance issues in complete apps, but we can address issues
when they are reported this clearly and are this easy to reproduce. The only
additional thing that I would have liked to see here would be a more complete
description of the machine and version of .NET and IronPython that you were
running against.
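As a rough illustration of the kind of environment description that helps, a report could start with output from a snippet like the one below. This is only a sketch for a current CPython: it assumes the standard platform module is available, which may not hold on every IronPython build, and environment_report is a made-up helper name. The .NET and IronPython versions would still need to be stated by hand.

```python
import platform
import sys

def environment_report():
    """Return a few lines describing the interpreter and the machine."""
    return [
        "implementation: %s" % platform.python_implementation(),
        "python version: %s" % sys.version.replace("\n", " "),
        "platform:       %s" % platform.platform(),
        "machine:        %s" % platform.machine(),
        "processor:      %s" % (platform.processor() or "unknown"),
    ]

print("\n".join(environment_report()))
```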
I mentioned at the start of this email that IronPython's performance hasn't
changed much from the 0.1 version. Keep in mind that IronPython 0.1 was a tiny
little translator that I wrote from Python to C# that had everything it needed
to run the pystone benchmark and nothing else. I'm quite excited that we've
been able to keep the good performance aspects of that initial prototype in the
1.0 release. Here's a copy of data from the original email that I sent about
IronPython 0.1:
--------------------------------------------------------------------
Date: Mon, 8 Dec 2003 17:16:15 -0800
From: Jim Hugunin <[EMAIL PROTECTED]>
Subject: Python can run fast on the CLR
To: [EMAIL PROTECTED]
                 IronPython-0.1   Python-2.3   Python-2.1
pystone                0.58          1.00         1.29
function call          0.19          1.00         1.12
integer add            0.59          1.00         1.18
string.replace         0.92          1.00         1.00
range(bigint)          5.57          1.00         1.09
eval("2+2")           66.97          1.00         1.58
-------------------------------------------------------------------
These numbers are measuring time to run each of the benchmarks and are all
relative to Python-2.3. Smaller numbers are better and indicate faster
performance. For IronPython 0.1, the performance on function calls and integer
add were both considerably faster than CPython, string.replace was roughly the
same speed, range was too slow and performance on eval("2+2") was horrible.
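For readers who want to produce a table in the same relative form, the normalization can be sketched as follows. The raw timings and interpreter names here are invented purely for illustration (they are not measurements from this message), and relative_times is a hypothetical helper.

```python
def relative_times(seconds, baseline):
    """Divide each timing by the baseline interpreter's timing for the same test."""
    table = {}
    for test, per_interp in seconds.items():
        base = per_interp[baseline]
        table[test] = {name: round(t / base, 2) for name, t in per_interp.items()}
    return table

# Hypothetical raw timings in seconds; NOT real measurements.
raw = {
    "function call": {"interp_a": 0.5, "interp_b": 2.0},
    "integer add": {"interp_a": 1.5, "interp_b": 1.0},
}

rel = relative_times(raw, "interp_b")
for test, row in rel.items():
    print(test, row)
```

The baseline column is always 1.00 by construction, and a value below 1.00 means the interpreter finished that test in less time than the baseline.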
Out of curiosity, I reran these same benchmarks on 1.0rc1 as well as
CPython-2.4 and 2.5beta2 using the same machine as described above.
                 IronPython-1.0rc1   Python-2.5b2   Python-2.4
pystone                 0.55             1.00          1.01
function call           0.14             1.00          0.98
integer add             0.46             1.00          1.04
string.replace          0.92             1.00          1.45
range(bigint)           6.07             1.00          1.00
eval("2+2")            14.03             1.00          0.76
If you compare the numbers closely, you should be somewhat stunned by how
similar they are. I certainly was. A lot of things have changed, but the
underlying results still show that IronPython is blindingly fast on function
calls, ~2x faster than CPython on pystone and simple math, about the same speed
on many library function calls and ~6x slower for range(bigint). The good news
is that the one place where IronPython was previously 67x slower than CPython
is the one place where a huge improvement can be measured and IronPython is now
only 14x slower on eval("2+2").
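The change between the two IronPython columns can be worked out directly from the tables in this message. The sketch below copies those values and divides old by new. Keep in mind that the two tables use different baselines (Python-2.3 and Python-2.5b2), so the ratios are only a rough indication, not a like-for-like speedup; the variable and function names are illustrative.

```python
# Relative times for IronPython, copied from the two tables in this message.
ironpython_0_1 = {
    "pystone": 0.58,
    "function call": 0.19,
    "integer add": 0.59,
    "string.replace": 0.92,
    "range(bigint)": 5.57,
    'eval("2+2")': 66.97,
}
ironpython_1_0rc1 = {
    "pystone": 0.55,
    "function call": 0.14,
    "integer add": 0.46,
    "string.replace": 0.92,
    "range(bigint)": 6.07,
    'eval("2+2")': 14.03,
}

def ratio_old_to_new(old, new):
    # A value above 1.0 means the newer relative time is smaller (better).
    return {name: old[name] / new[name] for name in old}

change = ratio_old_to_new(ironpython_0_1, ironpython_1_0rc1)
for name in ironpython_0_1:
    print("%-16s %6.2f -> %6.2f   ratio %.2f"
          % (name, ironpython_0_1[name], ironpython_1_0rc1[name], change[name]))
```

Read this way, eval("2+2") is the only row that moves by more than a small factor, which matches the observation above.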
Everywhere that IronPython is more than 2x slower than CPython is something we
need to better understand. For this set of microbenchmarks, there are two cases
of this. The first case is range(bigint). I believe that this 6x perf hit on
range(bigint) has the same underlying cause as the 2x perf hit on Luis's
benchmark below. Something about building up large lists of numbers gives
IronPython trouble. This is a performance issue that we will certainly be
looking into more deeply, and I hope to see major improvements here in the
future.
The second performance issue is with eval("2+2"). The improvements to
performance here came as a result of a new API (DynamicMethod) added in .NET
2.0 that lets us generate code for small methods much more efficiently. We'd
like to see this get still faster, but it isn't an area of top concern to me
since eval is very rarely used in performance critical scenarios. In fact, I've
heard privately from a number of people that they wish CPython's eval could be
slowed down by 10x to discourage its use except where it is truly needed.
IronPython does a lot more work compiling code in order to get its performance
boosts and some additional cost to eval seems reasonable to pay here.
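Where the same expression has to be evaluated many times, one standard Python technique is to compile it once with the builtin compile() and pass the resulting code object to eval(). The sketch below compares the two approaches on a current Python 3 interpreter. Whether, and by how much, this helps on IronPython is an assumption that has not been measured here, and the function names are illustrative.

```python
import time

def eval_string(n):
    # The expression is parsed and compiled again on every iteration.
    for _ in range(n):
        eval("2+2")

def eval_precompiled(n):
    # Compile once, then evaluate the cached code object.
    code = compile("2+2", "<expr>", "eval")
    for _ in range(n):
        eval(code)

def timed(func, n):
    start = time.perf_counter()
    func(n)
    return time.perf_counter() - start

n = 10000
for func in (eval_string, eval_precompiled):
    print("%s: %.4f s" % (func.__name__, timed(func, n)))
```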
Looking at comparisons of microbenchmarks is an interesting way to identify
possible low-hanging optimization fruit for IronPython. Last week I also looked
at the pybench benchmark by Marc-Andre Lemburg which includes a large number of
microbenchmark tests. This was helpful to more clearly identify the same
performance issue Luis's test shows for builtin method calls. From pybench, we
also noticed a surprising >10x performance hit in IronPython for list and tuple
slicing. This has been reduced dramatically in the 1.0rc1 release. I expect
that we will look at more of these kinds of comparisons in the future as we
work to further optimize IronPython.
Thanks - Jim
--- bench.py used to generate the tables above (copy pystone.py from Lib/test) ---
import time
N = 1000000

import pystone
def test_pystone(L):
    pystone.pystones(200000)

def test_call(L):
    def f(a,b): return
    for i in L:
        f(1,2); f(1,2); f(1,2); f(1,2); f(1,2)
        f(1,2); f(1,2); f(1,2); f(1,2); f(1,2)

def test_add(L):
    x = 10
    for i in L:
        y = x + 1; y = x + 1; y = x + 1; y = x + 1; y = x + 1
        y = x + 1; y = x + 1; y = x + 1; y = x + 1; y = x + 1

def test_replace(L):
    s = "abcdefghi"
    for i in L:
        s.replace('def', 'DEF')
        s.replace('def', 'DEF')
        s.replace('def', 'DEF')
        s.replace('def', 'DEF')
        s.replace('def', 'DEF')
        s.replace('def', 'DEF')
        s.replace('def', 'DEF')
        s.replace('def', 'DEF')
        s.replace('def', 'DEF')
        s.replace('def', 'DEF')

def test_range(L):
    for i in range(100):
        r = range(N)

def test_eval(L):
    for i in range(10000):
        x = eval("2+2")

def bench(func, L):
    start = time.clock()
    func(L)
    end = time.clock()
    print 'ran %s in \t%.2f seconds' % (func.__name__, end-start)

tests = [test_pystone, test_call, test_add, test_replace, test_range, test_eval]
L = range(N)
for i in range(2):
    for test in tests:
        bench(test, L)
------------------------------------------------------------------------------------
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Luis M. Gonzalez
Sent: Tuesday, July 11, 2006 8:10 PM
To: [email protected]
Subject: [IronPython] speed
Hi everyone,
I'd like to ask you a question about Ironpython's speed and performance:
I imagine that so far, you've been concentrating on completeness and
compatibility more than performance, and I guess you'll address this issue
after version 1.0.
However, and although you claimed that Ironpython is faster than cpython, I see
cases where it is slower by a large margin.
For example, the script below is up to 21x slower than cpython.
My question is: what are your expectations regarding ironpython's speed in the
future?
According to your experience so far, are you confident that it will match or
surpass cpython's?
Where do you think it will be better and where it will be worse?
script
-----------------------------
import time
def test():
    start= time.clock()
    z=[]
    x=range(1000000)
    for i in range(1000000):
        if i % 2:
            x[i] = 10
        else:
            x[i] = "a string"
    for i in x:
        if type(i)==str:
            z.append(i.upper())
        else:
            z.append(i*3)
    end= time.clock() - start
    print end
test()
-----------------------------------
end script
Regards,
Luis