On 14.11.2012 01:50, Richard wrote:
> These URL IDs would just be used internally for quick lookups, not exposed 
> publicly in a web application.
> 
> Ideally I would want to avoid collisions altogether. But if that means 
> significant extra CPU time then 1 collision in 10 million hashes would be 
> tolerable.

Are you storing the URLs in a database, such as a SQL database? A
proper index on the URL column avoids full table scans. With a hash
index, lookups are close to O(1) on average and degrade to O(n) only
in the worst case of many collisions. A B-tree index, the usual default
in SQL databases, gives O(log n) lookups.
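
As a rough sketch of what I mean, assuming SQLite through Python's
sqlite3 module: the table, column, index, and function names below are
made up for illustration, and your schema will differ. The database
hands out the integer IDs itself, so there are no collisions to worry
about, and the unique index keeps the lookup by URL away from a full
table scan. SQLite indexes are B-trees, so the lookup is O(log n)
rather than strictly O(1).

```python
import sqlite3

# In-memory database for illustration only; a real application would use a file.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE urls (id INTEGER PRIMARY KEY, url TEXT NOT NULL)")
# The unique index lets lookups by URL avoid a full table scan.
conn.execute("CREATE UNIQUE INDEX idx_urls_url ON urls (url)")


def url_id(conn, url):
    """Return the integer ID for url, inserting the URL first if it is new."""
    row = conn.execute("SELECT id FROM urls WHERE url = ?", (url,)).fetchone()
    if row is not None:
        return row[0]
    cur = conn.execute("INSERT INTO urls (url) VALUES (?)", (url,))
    return cur.lastrowid


first = url_id(conn, "http://example.com/a")
second = url_id(conn, "http://example.com/b")
again = url_id(conn, "http://example.com/a")  # same ID as `first`

# Ask SQLite how it would run the lookup; the plan should name the index.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT id FROM urls WHERE url = ?",
    ("http://example.com/a",),
).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan)
```

Checking plan_text for the index name is a quick way to confirm the
query really uses the index instead of scanning the table.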


-- 
http://mail.python.org/mailman/listinfo/python-list