On Mar 30, 2006, at 11:36 PM, Guyren Howe wrote:


You could certainly do this in REALbasic. A useful, free resource for the dictionary, etc. would be WordNet <http://wordnet.princeton.edu/>.

But you're getting into some moderately complex stuff here. See, for example, <http://citeseer.ist.psu.edu/context/20836/0>, which shows that this is an area of active research for computer scientists. There are standard approaches, though. A typical one is to index all the substrings of a particular length of each word, along with other information such as the order in which they occur. As you can imagine, the index can wind up being many times the size of the original data.
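To make the substring-index idea concrete, here is a rough sketch in Python (not REALbasic, purely for illustration). The function names, the choice of length 3, and the decision to store each substring's position are all assumptions made for the example, not a description of any particular library.

```python
from collections import defaultdict

def ngrams(word, n=3):
    """Yield (position, substring) for every length-n substring of word."""
    for i in range(len(word) - n + 1):
        yield i, word[i:i + n]

def build_ngram_index(words, n=3):
    """Map each length-n substring to the set of (word, position) pairs containing it."""
    index = defaultdict(set)
    for word in words:
        for pos, gram in ngrams(word.lower(), n):
            index[gram].add((word, pos))
    return index

def candidates(index, query, n=3):
    """Words containing every n-gram of the query (a superset of the real matches)."""
    grams = [g for _, g in ngrams(query.lower(), n)]
    if not grams:
        # Queries shorter than n cannot be answered from this index.
        return set()
    sets = [{w for w, _ in index.get(g, ())} for g in grams]
    return set.intersection(*sets)

def search(index, query, n=3):
    """Narrow down with the index, then confirm with a real substring check."""
    q = query.lower()
    return sorted(w for w in candidates(index, query, n) if q in w.lower())
```

For instance, indexing "searching", "research", "reach" and "teacher" and then searching for "earch" should only need to look at the words that share the substrings "ear", "arc" and "rch". Note that every word contributes one entry per substring, which is where the index growth mentioned above comes from.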

Depending on your project, a compelling alternative might be Lucene: <http://lucene.apache.org/java/docs/>. Lucene is very, very good at this kind of thing.

OTOH, I'd love to see someone put together an all-REALbasic solution for this. Extra bonus points if you implement an index that lets me do grep searches. :-)

I just did a brute-force inverted index with a very basic definition of "word".

It gave me a pretty decent search and, combined with a LIKE clause, was useful for my case.
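For anyone curious what that might look like, here is a minimal sketch in Python with SQLite (not my actual code, and not REALbasic). The table name `postings`, the regular expression used as the "very basic" word definition, and the helper names are all made up for the example.

```python
import re
import sqlite3

# Assumed "very basic" definition of a word: a run of ASCII letters or digits.
WORD_RE = re.compile(r"[a-z0-9]+")

def build_index(docs):
    """Build an in-memory inverted index: one row per (word, document) pair."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE postings (word TEXT, doc_id INTEGER, "
        "PRIMARY KEY (word, doc_id))"
    )
    for doc_id, text in docs.items():
        words = set(WORD_RE.findall(text.lower()))
        conn.executemany(
            "INSERT INTO postings VALUES (?, ?)",
            [(w, doc_id) for w in words],
        )
    conn.commit()
    return conn

def search(conn, pattern):
    """Return ids of documents having a word that matches the LIKE pattern."""
    rows = conn.execute(
        "SELECT DISTINCT doc_id FROM postings WHERE word LIKE ? ORDER BY doc_id",
        (pattern.lower(),),
    )
    return [row[0] for row in rows]
```

A pattern such as `index` matches only that exact word, while `index%` also picks up "indexing" and `%search%` finds "searching". A leading `%` generally forces a scan of the word table, but that table is usually much smaller than the full text.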
