Hello David,

On 10/08/11 21:27, David Naylor wrote:
> Hi,
>
> I needed to create a cache of date and time objects and I wondered what
> the best way to handle the cache was.  For comparison I put together the
> following test:
>
> [cut]
>
> PyPy shows a significant slowdown in the defaultdict function, but
> otherwise shows its usual speedup.  To check what the cause was I
> replaced i.date() with i.day and found no major difference in times.  It
> appears that dict.setdefault (or its interaction with the JIT) is causing
> the slowdown.
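Since the test itself is cut above, here is a minimal sketch of the kind of cache being described, i.e. a dict keyed by date objects and filled via setdefault (a reconstruction for illustration only, not the code from your original mail):

import datetime

# Sketch only (not the original test): a cache keyed by datetime.date
# objects and filled via dict.setdefault.  Note that every setdefault()
# call has to hash the date key.
def date_cache(timestamps):
    cache = {}
    for ts in timestamps:
        d = ts.date()           # datetime.date used as the dict key
        cache.setdefault(d, d)  # hashes d on every call
    return cache

now = datetime.datetime.now()
stamps = [now + datetime.timedelta(minutes=m) for m in range(100000)]
print(len(date_cache(stamps)))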

I don't think that setdefault is the culprit here, as shown by this benchmark:

import bench  # local timing helper (not shown): prints "<name>: N seconds"

@bench.bench
def setdef():
    d = {}
    for i in range(10000000):
        d.setdefault(i, i)
    return d

@bench.bench
def tryexcept():
    d = {}
    for i in range(10000000):
        try:
            d[i]
        except KeyError:
            d[i] = i
    return d

setdef()
tryexcept()

$ python dictbench.py
setdef: 2.03 seconds
tryexcept: 8.54 seconds

$ pypy-c dictbench.py
setdef: 1.31 seconds
tryexcept: 1.37 seconds

As you can see, in PyPy there is almost no difference between using try/except and using setdefault.


What does seem to be very slow on PyPy is hashing datetime objects:

import bench  # same local timing helper as above
import datetime

@bench.bench
def hashdate():
    res = 0
    for i in range(100000):
        now = datetime.datetime.now()
        res ^= hash(now)
    return res

hashdate()

$ pypy-c dictbench.py
hashdate: 0.83 seconds

$ python dictbench.py
hashdate: 0.22 seconds

I had a quick look at the code (which in PyPy is written at applevel) and it does a lot of unnecessary work. In particular, __hash__ calls __getstate(), which formats a dynamically created string just to call hash() on it. I suppose this code can (and should) be optimized a lot. I may try to look at it, but it's unclear when, since I'm about to go on vacation.
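To make that concrete, the pattern is roughly the one sketched below, and hashing a tuple of the fields directly would avoid building the intermediate string.  This is only a sketch of the idea, not the actual applevel datetime.py code:

# Sketch of the idea only, not the real applevel datetime.py code.
class SlowHashDate(object):
    def __init__(self, year, month, day):
        self.year, self.month, self.day = year, month, day

    def __getstate(self):
        # builds a fresh string on every call, just so it can be hashed
        return ("%04d-%02d-%02d" % (self.year, self.month, self.day),)

    def __hash__(self):
        return hash(self.__getstate()[0])

class FastHashDate(SlowHashDate):
    def __hash__(self):
        # hash the fields directly: no intermediate string is created
        return hash((self.year, self.month, self.day))

(The real fix belongs in PyPy's applevel datetime.py, of course; the two classes above just illustrate the string-vs-tuple difference.)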

ciao,
Anto
