New submission from David Albert Torpey <dt...@users.sourceforge.net>:
Dictionary keys are commonly numbers, strings, or tuples. Python has optimized numbers and strings to remember their hash values on successive calls. Tuples should do this too since their recursive hash function can take a long time to compute. Tuples are Python's official record type and the one obvious way of making non-scalar dictionary keys. The code to do this in stringobject.c is short and sweet, so this major speed boost should be an easy thing to. static long string_hash(PyStringObject *a) { register Py_ssize_t len; register unsigned char *p; register long x; if (a->ob_shash != -1) <== return a->ob_shash; <== len = Py_SIZE(a); p = (unsigned char *) a->ob_sval; x = *p << 7; while (--len >= 0) x = (1000003*x) ^ *p++; x ^= Py_SIZE(a); if (x == -1) <== x = -2; <== a->ob_shash = x; return x; } The code in tupleobject.c would just need to add the four lines marked above. Here's what is looks like now. static long tuplehash(PyTupleObject *v) { register long x, y; register Py_ssize_t len = Py_SIZE(v); register PyObject **p; long mult = 1000003L; x = 0x345678L; p = v->ob_item; while (--len >= 0) { y = PyObject_Hash(*p++); if (y == -1) return -1; x = (x ^ y) * mult; /* the cast might truncate len; that doesn't change hash stability */ mult += (long)(82520L + len + len); } x += 97531L; if (x == -1) x = -2; return x; } Thank you guys for all of your work. *David ---------- messages: 114929 nosy: dtorp priority: normal severity: normal status: open title: tuples should remember their hash value type: resource usage versions: Python 2.6 _______________________________________ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue9685> _______________________________________ _______________________________________________ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com