Mark Dickinson <[email protected]> added the comment:
This shouldn't be a problem: there's no rule that says that different objects
must have different hashes. Indeed, with a countable infinity of possible
hashable inputs, a deterministic hashing algorithm, and only finitely many
outputs, such a rule would be a mathematical impossibility: by the pigeonhole
principle, some inputs must collide. For example:
>>> hash(-1) == hash(-2)
True
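(That particular collision is a CPython implementation detail: -1 is reserved
as an error marker in the C-level hash API, so hash(-1) is remapped to -2.)
More generally, CPython reduces integer hashes modulo a fixed prime, so
collisions among plain ints are easy to construct. A sketch, assuming a
typical 64-bit build where sys.hash_info.modulus is 2**61 - 1:

>>> import sys
>>> sys.hash_info.modulus  # 2**61 - 1 on most 64-bit builds
2305843009213693951
>>> hash(1) == hash(1 + sys.hash_info.modulus)
True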
Are these hash collisions causing real issues in your code? A single hash
collision like this shouldn't be an issue, but many collisions within a
single (non-artificial) dataset _can_ lead to performance issues.
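To make "performance issues" concrete, here's a hedged sketch (the Collider
class is hypothetical, written only to force every key onto one hash bucket):
with n colliding keys, each insertion has to probe past all previous keys, so
building a set costs O(n**2) equality checks instead of O(n).

import time

class Collider:
    # Hypothetical key type: all instances share one hash value.
    def __init__(self, n):
        self.n = n
    def __hash__(self):
        return 42  # deliberate universal collision
    def __eq__(self, other):
        return isinstance(other, Collider) and self.n == other.n

def build_time(keys):
    start = time.perf_counter()
    set(keys)
    return time.perf_counter() - start

n = 5000
print(build_time([Collider(i) for i in range(n)]))  # quadratic: seconds
print(build_time(list(range(n))))                   # linear: well under a millisecond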
Looking at the code, we could probably do a better job of making the hash
collisions less predictable. The current code looks like:
    def __hash__(self):
        return hash(int(self.network_address) ^ int(self.netmask))
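One consequence of the xor: any network whose network address equals its own
netmask (0.0.0.0/0, 128.0.0.0/1, ..., 255.255.255.255/32) xors to 0 and
therefore hashes to hash(0). A quick check, assuming the xor-based __hash__
above is what's running:

>>> import ipaddress
>>> nets = [ipaddress.ip_network(s) for s in
...         ('0.0.0.0/0', '128.0.0.0/1', '255.255.255.255/32')]
>>> [hash(n) for n in nets]
[0, 0, 0]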
I'd propose hashing a tuple instead of using the xor. For example:
    def __hash__(self):
        return hash((int(self.network_address), int(self.netmask)))
Hash collisions would almost certainly still occur with this scheme, but they'd
be a tiny bit less obvious and harder to find.
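For instance, the three all-zero-xor networks from the earlier example get
distinct hashes under the tuple scheme. A sketch using a hypothetical
standalone helper rather than a patched class:

>>> import ipaddress
>>> def tuple_hash(net):
...     # hypothetical helper mirroring the proposed __hash__
...     return hash((int(net.network_address), int(net.netmask)))
>>> nets = [ipaddress.ip_network(s) for s in
...         ('0.0.0.0/0', '128.0.0.0/1', '255.255.255.255/32')]
>>> len({tuple_hash(n) for n in nets})  # distinct on a typical build
3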
----------
nosy: +mark.dickinson
_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue33784>
_______________________________________