Re: [IronPython] Weird performance issue

Michael Foord Thu, 08 Jan 2009 15:38:21 -0800

Hi Dino,

This could be at the root of *several* of the really odd performanceissues we've been having with our IronPython 2 port. That is quite animprovement and mirrors the kind of behaviour we see.


Can we get a patch against the 2.0 source code for your dictionary fix?

On the other hand, what do we lose by moving to the 2.1 branch?Obviously this is less tested (in the wild) code, but we have prettycomprehensive tests. 2.1 is also a bit of a Python 2.5 / 2.6 hybrid atthe moment I guess - are there any potential gotchas?


Michael

Dino Viehland wrote:

Ok, I’ve looked at this more closely now and I believe I have a fix.I’ve also discovered that changes in the DLR make this much better aswell in the 2.1 branch. But the numbers I get are:
2.0: 59.9065353243

2.1: 4.28863087116

2.1 + dict fixes: 1.22241745833
These numbers are quite dependent upon the number of types that havebeen seen for hashing and equality. In 2.0 increasing the number oftypes will cause the long running loop to become progressively worse.In 2.1 we’re smart enough to re-organize the L2 cache based upon themost recently used items which makes this much faster. W/ my changesthe dictionaries are smarter about managing call sites based upontheir type of keys.
My guess is we can add my dictionary fix to 2.0 but back porting allof the call site changes might be too much.
Here’s my repro:

import System

class x(object):

pass
data = [1, 1.0, 1L, 1.0, 'abc', 1L, frozenset(), xrange(2),frozenset(), BaseException(), ord, type, True, x(),System.Collections.ArrayList(), System.Collections.Hashtable(),System, KeyError('x'), AttributeError('foo'), Warning(), 1L]
d = {}

for i in data:

d[i] = 0

i in d

del d[i]

import time

start = time.clock()

data = data[-5:]

x = {}

for i in xrange(100000):

for val in data:

x[val] = 0

val in x

del x[val]

end = time.clock()

print end - start
*From:* [email protected][mailto:[email protected]] *On Behalf Of *Dino Viehland
*Sent:* Tuesday, January 06, 2009 11:25 AM
*To:* [email protected]; Discussion of IronPython
*Subject:* Re: [IronPython] Weird performance issue
I think we are thrashing the hash code call site inCommonDictionaryStorage.HashSite._HashSite. Alternately it could bethe equals call site in PythonContext._equalRetBoolSite. Similar tothe property issue you can look at the call site object and see ifit’s going to the fixed read-only EmptyRuleSet.
We can probably add 2 sites which are per-dictionary but we might needto do something more clever as that could regress general dictionary perf.
The DLR team is going to be looking into general issues withunderstanding what’s going on with call sites. Yesterday I pointed outperformance investigations as being something important for them tolook at.
*From:* [email protected][mailto:[email protected]] *On Behalf Of *Glenn Jones
*Sent:* Tuesday, January 06, 2009 6:07 AM
*To:* Discussion of IronPython
*Subject:* [IronPython] Weird performance issue

Hi all,

More from the Resolver One upgrade:
We are seeing several performance failures in our tests, which weexpect because the performance profile of IPy2 is different to IPy1.There is one that is perplexing and we hope that someone can give ussome insight into possible causes for us to investigate.
One of our performance tests only fails when it is preceded by othertests. There is (as far as we can tell) no shared state between thetests. After much poking around, we discovered that replacing a setwith a list caused the performance to improve from ~700ms down to~100ms. We have been unable get a minimal repro, so far. Here's alittle more detail on the how the code works:
class Thing(object):
def run(arg1, arg2):
* do some stuff
s = set()
* do stuff that populates s
return list(s)
We create a Thing and call run so that the set contains an entry, ifwe then create a new Thing and call the run method it takes 10 timeslonger than expected.If we change s = set to s = [] (and convert s.add to s.append, whereappropriate), and do the same as above, the second call to run behavesas we expect. There is (at least in our code) nothing shared betweenthe first call to run and the second call.
Does anyone have any ideas about where we can look to work out what'shappening? Any suggestions for ways to diagnose this?
Thanks
Glenn & Kamil

------------------------------------------------------------------------

_______________________________________________
Users mailing list
[email protected]
http://lists.ironpython.com/listinfo.cgi/users-ironpython.com



--
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog


_______________________________________________
Users mailing list
[email protected]
http://lists.ironpython.com/listinfo.cgi/users-ironpython.com

Re: [IronPython] Weird performance issue

Reply via email to