On Thu, Dec 23, 2021 at 5:40 PM Stephen J. Turnbull <stephenjturnb...@gmail.com> wrote:
>
> Hao Hu writes:
> > On 12/18/21 08:44, Stephen J. Turnbull wrote:
> > > Hao Hu writes:
> > > > > >> For instance, if we create a caching programming interface that
> > > > >> relies on a distributed kv store,
> > >
> > > I would be very suspicious of using Python's hash builtin for such a
> > > purpose. The Python hash functions are very carefully tuned for high
> > > performance in one application only: equality testing in Python,
> > > especially for dicts. [...]
> >
> > It is pretty much the same use case as python's dictionary though, the
> > goal is just to generalize it to use with a distributed kv store.
>
> Sure, you know that because it's your application. But I don't know
> that, and it's only an example you give to justify a change to
> Python. The burden on you is not to argue that it works in one
> application; it's to argue that it works broadly enough to be worth
> changing a lot stuff in Python, imposing a change burden on any
> project that implements __hash__ for any of its classes, and for
> anybody who supports both pre- and post-change version of Python, they
> need to support both __hash__(object) and __hash__(object, salt)
> (probably trivial, just def __hash__(self, salt=None):, but I haven't
> thought about it).
>
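To make the quoted suggestion concrete, here is a minimal sketch of a class that tries to support both call styles. The salt parameter is hypothetical: no released Python passes one to __hash__, and the builtin hash() takes exactly one argument. Point is just an illustrative class.

```python
# Sketch only: "salt" is a hypothetical extra argument from the proposal.
# Current Python never passes it; hash(obj) always calls __hash__ with no salt.

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __hash__(self, salt=None):
        if salt is None:
            # Pre-change call style: the ordinary unsalted hash.
            return hash((self.x, self.y))
        # Assumed post-change call style. Folding the salt into a tuple is
        # only a stand-in for whatever a real salted hash would do.
        return hash((salt, self.x, self.y))


p = Point(1, 2)
unsalted = hash(p)            # works today, equivalent to p.__hash__()
salted = p.__hash__(salt=42)  # today only reachable by calling the dunder directly
```

For a leaf class like this the default argument really is about all it takes.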
A bit more complicated for anything that builds its hash out of other
objects' hashes (e.g. a tuple), since it would have to avoid calling
hash(object, salt) if it was called as __hash__(self, None). Changing
the signature of a dunder is generally a pain.

Python's hashing function is designed with some extremely specific
use cases in mind. For example, small integers hash to themselves,
because this gives good results for dictionaries whose keys are all
small integers. That won't be as beneficial if the key-value store is
distributed (since each node will only have part of the full
dictionary), and it also means that the application would be
vulnerable to hash collision attacks. As soon as something is
networked, the rules change, and I do not see this as a safe choice
for a distributed kv store.

Exposing the string hashing algorithm *only*, as a convenient and fast
way to hash strings, would have some value. Trying to expose, but also
control, the overall hash function? Not something I would recommend.

ChrisA
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/2KLISIHSW2OLDWPP36CILY5PGZGWZDZ5/
Code of Conduct: http://python.org/psf/codeofconduct/