Yeah, good point. I have gone with MD5 for now.
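
For illustration, here is a minimal sketch of how an MD5 digest of the URL could drive the sub folder layout described in the quoted message. The function name `url_to_path`, the `cache` root directory, and the three-level fan-out are assumptions made for this example, not details from the thread.

```python
import hashlib
import os


def url_to_path(url, root="cache", depth=3):
    """Map a URL to a nested file path derived from its MD5 hex digest.

    Hypothetical helper: the root directory and fan-out depth are
    illustrative choices, not taken from the original post.
    """
    digest = hashlib.md5(url.encode("utf-8")).hexdigest()
    # The first `depth` hex characters become one-character directory names,
    # so no single directory holds more than 16 subdirectories at each level.
    fan_out = list(digest[:depth])
    # The full digest names the leaf directory, keeping distinct URLs apart.
    return os.path.join(root, *fan_out, digest, "index.html")


if __name__ == "__main__":
    print(url_to_path("abc"))
```

Because the digest is fixed-length hexadecimal, the path never contains characters that are awkward in file names, which a layout built directly from the URL (a/b/c/index.html) cannot guarantee.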
On Wednesday, November 14, 2012 3:06:18 PM UTC+11, Chris Angelico wrote:
> On Wed, Nov 14, 2012 at 2:25 PM, Richard <richar...@gmail.com> wrote:
> > So the use case - I'm storing webpages on disk and want a quick retrieval
> > system based on URL.
> > I can't store the files in a single directory because of OS limitations so
> > have been using a sub folder structure.
> > For example to store data at URL "abc": a/b/c/index.html
> > This data is also viewed locally through a web app.
> >
> > If you can suggest a better approach I would welcome it.
>
> The cost of a crypto hash on the URL will be completely dwarfed by the
> cost of storing/retrieving on disk. You could probably do some
> arithmetic and figure out exactly how many URLs (at an average length
> of, say, 100 bytes) you can hash in the time of one disk seek.
>
> ChrisA
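
The arithmetic suggested above can be roughed out with `timeit`. This is only a sketch: the 10 ms seek time is an assumed figure for a spinning disk, not a measurement, and `hashes_per_seek` is a name invented for this example. Results will vary by machine.

```python
import hashlib
import timeit

# Assumption: roughly 10 ms per seek on a spinning disk. Adjust for your hardware.
ASSUMED_SEEK_SECONDS = 0.010


def hashes_per_seek(url_length=100, repeats=100_000,
                    seek_seconds=ASSUMED_SEEK_SECONDS):
    """Estimate how many MD5 hashes of a URL fit into one assumed disk seek."""
    url = b"a" * url_length  # stand-in for a URL of average length
    total = timeit.timeit(lambda: hashlib.md5(url).hexdigest(), number=repeats)
    per_hash = total / repeats  # average seconds per hash
    return seek_seconds / per_hash


if __name__ == "__main__":
    print(f"about {hashes_per_seek():.0f} MD5 hashes per assumed seek")
```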