RE: SES URL handling

Paul Vernon Mon, 23 Apr 2007 08:11:23 -0700

> I'm not sure about the algorithms you're using, but it sounds OK to me.
> Are you experiencing a significant performance hit using your algorithm?


I'm not seeing any performance hit at the moment; in fact, it seems to be
very fast. I could have used the Levenshtein algorithm on its own, but it
would then have had to run against every link stored in the DB. Using
DIFFERENCE first pulls out a tiny subset (3 or 4 rows at most), so the
Levenshtein algorithm has much less work to do. Also, if DIFFERENCE returns
only one hit, the Levenshtein step is skipped entirely.
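To make the two-stage idea concrete, here is a minimal sketch in Python rather than SQL/ColdFusion. It is not the code from this thread. The `soundex` and `difference` functions are only a rough stand-in for the database's SOUNDEX/DIFFERENCE functions (the real DIFFERENCE may score some pairs differently), and the function names, the `min_difference` threshold, and the sample URL slugs are all assumptions made for illustration.

```python
def soundex(word):
    """Classic 4-character Soundex code (approximation of SQL SOUNDEX)."""
    codes = {}
    for group, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
        for ch in group:
            codes[ch] = digit
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return "0000"
    result = letters[0]
    prev = codes.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate repeated codes
        code = codes.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # vowels reset prev, so a repeated code after a vowel counts
    return (result + "000")[:4]


def difference(a, b):
    """Count matching Soundex positions, 0 (no match) to 4 (same code)."""
    return sum(x == y for x, y in zip(soundex(a), soundex(b)))


def levenshtein(a, b):
    """Edit distance: insertions, deletions and substitutions cost 1 each."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or keep
        prev = cur
    return prev[-1]


def best_match(typed, known_urls, min_difference=3):
    """Cheap phonetic prefilter first, Levenshtein only on the survivors."""
    candidates = [u for u in known_urls
                  if difference(typed, u) >= min_difference]
    if not candidates:
        return None
    if len(candidates) == 1:
        return candidates[0]  # single hit: the Levenshtein step is skipped
    return min(candidates, key=lambda u: levenshtein(typed, u))


if __name__ == "__main__":
    urls = ["products", "contact", "prices"]  # hypothetical SES slugs
    print(best_match("prodcuts", urls))
    print(best_match("contcat", urls))
```

In the database the prefilter would be the WHERE clause of the query, so only the handful of surviving rows ever reach the more expensive edit-distance comparison.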

> Why are you looking for a better way?

It's my first stab, just one solution to a problem we are seeing, where
people transpose, miss, or add characters when typing in a URL from
printed documentation.

Being a small shop, I sometimes like to get opinions on other ways of doing
things...

> An alternative solution, I believe, would be to use Lucene and build an
> index of your SES URLs. I believe you can then use the "Did you mean"
> (similarity scores) functionality to grab the most likely URL. This
> might perform better than the SQL equivalent.

I hadn't considered this; it may well be an option. I'll take a look,
thanks... :)

Paul





