On Wed, Jul 08, 2009 at 08:33:34PM -0400, Steve Tolkin wrote: > I want to search a sorted array to see if which strings, if any, have the > same prefix as my search string. For example, given the search string of > "tempest" I want to find "tempests" and "tempestuous". My word list has > about 250,000 words. Even though only about 7500 will be used to start as > search I think brute force would be too slow. It would take O (n*n) time. > > Because I believe in measuring performance I actually wrote the brute force > code. It only uses 2500 strings to strart the search, rather than the 7500 > I plan to use. In fact it is very slow. It is still running 15 minutes > later, and I can tell it is only half done.
I'm very surprised it is so slow. I wrote code to test the brute force approach and it searches ~250,000 words for 2500 strings in just 3 minutes. > Alternatively, can someone recommend another perl module for prefix > searching on a moderately large set of strings, e.g. trie or suffix tree or > suffix array, etc.. It must be reliable and easy to use. Another option would be a dictionary tree, in which each node is a single letter, so each word is represented by a path through the tree. I don't know how suitable it would be for this purpose. It is a cool structure though. :) Ronald _______________________________________________ Boston-pm mailing list [email protected] http://mail.pm.org/mailman/listinfo/boston-pm

