[issue14621] Hash function is not randomized properly

Christian Heimes Wed, 07 Nov 2012 03:53:18 -0800

Christian Heimes added the comment:

Serhiy, the performance of hash() for long strings isn't very relevant for the 
general performance of a Python program. Short strings dominate. I've modified 
the timeit invocation to create a new string object on every call.


for I in 5 10 15 20 30 40 50 60; do
    echo -ne "$I\t"
    ./python -m timeit -n100000 -r30 -s "h = hash; x = 'ä' * $I" -- "h(x + 'a')" \
        | awk '{print $6}'
done
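
The concatenation in the timed statement matters because CPython stores a
str object's hash in the object after the first call, so timing h(x) on the
same object mostly measures a cached lookup. A minimal sketch of the
difference, assuming a stock CPython 3; the name time_hash and its default
arguments are made up for illustration and are not part of the patch:

```python
import timeit


def time_hash(length, number=100_000, repeat=5):
    """Return (cached, fresh) best per-call times in seconds for hash().

    Hypothetical helper for illustration only.
    """
    setup = f"h = hash; x = 'a' * {length}"
    # Same object on every call: after the first call the stored hash is reused.
    cached = min(timeit.repeat("h(x)", setup=setup,
                               number=number, repeat=repeat)) / number
    # New object on every call: the concatenation forces hash() to recompute.
    # Note that this figure also includes the cost of building the new string.
    fresh = min(timeit.repeat("h(x + 'a')", setup=setup,
                              number=number, repeat=repeat)) / number
    return cached, fresh


if __name__ == "__main__":
    for n in (5, 20, 60):
        cached, fresh = time_hash(n)
        print(f"{n}\tcached={cached * 1e6:.4f} usec\tfresh={fresh * 1e6:.4f} usec")
```

The "fresh" column is the one comparable to the numbers below; both
implementations pay the same concatenation overhead, so the difference between
them still reflects the hash function.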

ASCII:
#       SIP        FNV
5       0.112      0.0979
10      0.115      0.103
15      0.12       0.107
20      0.124      0.112
30      0.126      0.127
40      0.136      0.142
50      0.142      0.147
60      0.146      0.159

UCS-2:
#       SIP        FNV
5       0.114      0.0977
10      0.117      0.0988
15      0.12       0.11
20      0.126      0.109
30      0.13       0.122
40      0.14       0.132
50      0.144      0.147
60      0.152      0.157

For short strings the additional rounds and setup costs make hash() roughly 
10 to 20% slower with SIP. For long strings (from roughly 30 to 50 characters 
in the runs above) SIP is faster, as it processes 8 bytes at once instead of 
1 to 4 bytes.
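
As a rough model of why the curves cross, the sketch below counts main-loop
steps for the two styles: one step per character (1, 2 or 4 bytes wide) for
the FNV-style loop, and one step per 8-byte word for SipHash, whose padded
message always has floor(len / 8) + 1 words. The function name is hypothetical,
and the model ignores setup, finalization and the fact that a SipHash step
costs more than an FNV step, so it only illustrates the trend:

```python
def loop_steps(n_chars, char_width):
    """Return (fnv_steps, sip_words) for a string of n_chars characters.

    char_width is the storage width per character in bytes (1, 2 or 4).
    Hypothetical helper; a counting model, not a hash implementation.
    """
    if char_width not in (1, 2, 4):
        raise ValueError("char_width must be 1, 2 or 4")
    n_bytes = n_chars * char_width
    fnv_steps = n_chars          # one multiply/xor step per character
    sip_words = n_bytes // 8 + 1  # full 8-byte words plus the padded final word
    return fnv_steps, sip_words


if __name__ == "__main__":
    for n in (5, 10, 20, 40, 60):
        print(n, loop_steps(n, 1), loop_steps(n, 2))
```

For a 5-character, 1-byte-wide string the model gives 5 FNV steps against a
single SipHash word, so fixed costs dominate; at 60 characters it is 60 steps
against 8 words.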

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue14621>
_______________________________________