The point is that the OP doesn't want to write his own hash function, but wants Python to provide a standard way of hashing an iterable. Today, the standard way is to convert to tuple and call hash on that. That may not be efficient. FWIW from a style perspective, I agree with OP.
On Thu, Jan 5, 2017 at 4:30 AM M.-A. Lemburg <m...@egenix.com> wrote: > On 28.12.2016 04:13, j...@math.brown.edu wrote: > > Suppose you have implemented an immutable Position type to represent > > the state of a game played on an MxN board, where the board size can > > grow quite large. > > ... > > > > According to > https://docs.python.org/3/reference/datamodel.html#object.__hash__ > > : > > > > > > """ > > it is advised to mix together the hash values of the components of the > > object that also play a part in comparison of objects by packing them > > into a tuple and hashing the tuple. Example: > > > > def __hash__(self): > > return hash((self.name, self.nick, self.color)) > > > > """ > > > > > > Applying this advice to the use cases above would require creating an > > arbitrarily large tuple in memory before passing it to hash(), which > > is then just thrown away. It would be preferable if there were a way > > to pass multiple values to hash() in a streaming fashion, such that > > the overall hash were computed incrementally, without building up a > > large object in memory first. > > I think there's a misunderstanding here: the hash(obj) built-in > merely interfaces to the obj.__hash__() method (or the tp_hash slot > for C types) and returns whatever these methods give. > > It doesn't implement any logic by itself. > > If you would like to implement a more efficient hash algorithm > for your types, just go ahead and write them as .__hash__() > method or tp_hash slot method and you're done. > > The example from the docs is just to showcase an example of > how such a hash function should work, i.e. to mix in all > relevant data attributes. > > In your case, you'd probably use a simple for loop to calculate > the hash without creating tuples or any other temporary > structures. > > Here's the hash implementation tuples use as an example > > /* The addend 82520, was selected from the range(0, 1000000) for > generating the greatest number of prime multipliers for tuples > upto length eight: > > 1082527, 1165049, 1082531, 1165057, 1247581, 1330103, 1082533, > 1330111, 1412633, 1165069, 1247599, 1495177, 1577699 > > Tests have shown that it's not worth to cache the hash value, see > issue #9685. > */ > > static Py_hash_t > tuplehash(PyTupleObject *v) > { > Py_uhash_t x; /* Unsigned for defined overflow behavior. */ > Py_hash_t y; > Py_ssize_t len = Py_SIZE(v); > PyObject **p; > Py_uhash_t mult = _PyHASH_MULTIPLIER; > x = 0x345678UL; > p = v->ob_item; > while (--len >= 0) { > y = PyObject_Hash(*p++); > if (y == -1) > return -1; > x = (x ^ y) * mult; > /* the cast might truncate len; that doesn't change hash > stability */ > mult += (Py_hash_t)(82520UL + len + len); > } > x += 97531UL; > if (x == (Py_uhash_t)-1) > x = -2; > return x; > } > > As you can see, there's some magic going on there to make > sure that the hash values behave well when used as "keys" > for the dictionary implementation (which is their main > purpose in Python). > > You are free to create your own hash implementation. > The only characteristic to pay attention to is to have > objects which compare equal give the same hash value. > This is needed to be able to map such objects to the same > dictionary slots. > > There should be no need to have a special hash function which > works on iterables. As long as those iterable objects define > their own .__hash__() method or tp_slot, the hash() built-in > (and Python's dict implementation) will use these and, if needed, > those methods can then use an approach to build hash values > using iterators on the object's internal data along similar > lines as the above tuple implementation. > > -- > Marc-Andre Lemburg > eGenix.com > > Professional Python Services directly from the Experts (#1, Jan 05 2017) > >>> Python Projects, Coaching and Consulting ... http://www.egenix.com/ > >>> Python Database Interfaces ... http://products.egenix.com/ > >>> Plone/Zope Database Interfaces ... http://zope.egenix.com/ > ________________________________________________________________________ > > ::: We implement business ideas - efficiently in both time and costs ::: > > eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 > D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg > Registered at Amtsgericht Duesseldorf: HRB 46611 > http://www.egenix.com/company/contact/ > http://www.malemburg.com/ > > _______________________________________________ > Python-ideas mailing list > Python-ideas@python.org > https://mail.python.org/mailman/listinfo/python-ideas > Code of Conduct: http://python.org/psf/codeofconduct/ > > -- > > --- > You received this message because you are subscribed to a topic in the > Google Groups "python-ideas" group. > To unsubscribe from this topic, visit > https://groups.google.com/d/topic/python-ideas/XcuC01a8SYs/unsubscribe. > To unsubscribe from this group and all its topics, send an email to > python-ideas+unsubscr...@googlegroups.com. > For more options, visit https://groups.google.com/d/optout. >
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/