>        Then I think your only solution is to do it yourself (the
> "tinyurl" service), ensuring uniqueness. Basically, you "only" need to
> mix simple datastore Gets with hashtable behaviour (bearing in mind
> that two different urls can, though rarely, result in the same hash
> key). You need to query by hash key and then, among the results, look
> for the matching url. Most of the time (depending on the url->key
> algorithm) you should get few results, so overhead won't be a problem.

The above assumes there'll be collisions when you map from urls to the
key name.  I would naively assume that with a large enough key space
(500 characters), you should be able to find a hashing function that
makes duplicate keys vanishingly unlikely.
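A minimal sketch of that assumption, using SHA-256 purely as an example: a 256-bit digest used directly as the key name makes a collision between two distinct URLs astronomically unlikely, and the hex form is only 64 characters, well under a 500-character limit. The function name here is hypothetical, not App Engine API.

```python
# Sketch: derive a datastore key name from a URL, assuming the hash
# itself is the key.  With a 256-bit digest, the birthday bound puts a
# collision around 2^128 entries -- negligible for any real service.
import hashlib

def url_to_key_name(url):
    # Hex digest is 64 characters, comfortably inside a 500-character
    # key-name limit.  It is also deterministic, so resubmitting the
    # same URL maps to the same entity (uniqueness via a single Get).
    return hashlib.sha256(url.encode("utf-8")).hexdigest()
```

Because the mapping is deterministic, a duplicate-URL check collapses to one Get by key name rather than a query-and-scan.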

My first question, though, would be whether urls longer than 500
characters can be truncated or simply rejected.  When I've seen really
long urls on previous apps, they've been attacks by bots or urls
carrying all kinds of state information.  You could impose a length
restriction on submitted URLs, and I suspect only a very small
percentage of legitimate URLs would trip it.  Anyone know stats on
valid URLs by length?
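That restriction could look like the sketch below; the 500-character cap is an assumed policy value from this thread, not any HTTP standard. Rejecting rather than truncating seems safer, since truncation would silently change where the short link points.

```python
# Hypothetical length check for submitted URLs.  MAX_URL_LENGTH is an
# assumed application policy, matching the datastore key-name limit
# discussed above, not a limit imposed by HTTP itself.
MAX_URL_LENGTH = 500

def validate_url(url):
    """Reject over-long URLs outright instead of truncating them."""
    if len(url) > MAX_URL_LENGTH:
        raise ValueError("URL exceeds %d characters" % MAX_URL_LENGTH)
    return url
```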

Since I also plan on implementing a Digg-like feature on one of my
websites, I'm interested in good hashing functions that produce a
reasonably sized hash (under 500 characters) efficiently.  The SHA
hashes seem like overkill.
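For what it's worth, output size isn't the problem with any of the common digests: all of them come out far below 500 characters, so "overkill" would only be about CPU cost. A quick comparison of hex-digest lengths:

```python
# Compare hex-digest lengths of the usual hashlib algorithms for an
# example URL.  All are tiny relative to a 500-character key budget.
import hashlib

url = "http://example.com/some/long/path"
for name in ("md5", "sha1", "sha256"):
    digest = hashlib.new(name, url.encode("utf-8")).hexdigest()
    print(name, len(digest))  # md5: 32, sha1: 40, sha256: 64 hex chars
```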
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Google App Engine" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/google-appengine?hl=en
-~----------~----~----~----~------~----~------~--~---
