Picking up on Siddharth's posting, I was thinking this is the most natural starting point anyway. I had to assess the similarity of word pairs and wrote an R script that computes a variety of word similarity measures: Levenshtein distance, Dice, XDice (you also mentioned character bigrams), longest common subsequence, and variations of these. It works just fine (slower than Perl, but it can do hundreds of thousands of comparisons even on a slow machine). However, I don't quite see why one would want to use NSP for that.

STG
--
Stefan Th. Gries
----------------------------------------
University of California, Santa Barbara
http://people.freenet.de/Stefan_Th_Gries
----------------------------------------
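
For readers who want to see what three of the measures named above look like in practice, here is a minimal Python sketch. It is not the R script mentioned in the post, and the function names (levenshtein, dice, lcs_length) are made up for illustration. It assumes Dice is computed over character bigrams counted as a multiset, which is one common variant; XDice and the other variations are left out.

```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def bigrams(word: str) -> list:
    """Overlapping character bigrams, e.g. 'night' -> ni, ig, gh, ht."""
    return [word[i:i + 2] for i in range(len(word) - 1)]


def dice(a: str, b: str) -> float:
    """Dice coefficient over character bigrams (multiset variant, an assumption)."""
    x, y = Counter(bigrams(a)), Counter(bigrams(b))
    total = sum(x.values()) + sum(y.values())
    if total == 0:
        # Both words are too short to have bigrams; fall back to string equality.
        return 1.0 if a == b else 0.0
    shared = sum((x & y).values())  # multiset intersection
    return 2 * shared / total


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    for w1, w2 in [("kitten", "sitting"), ("night", "nacht")]:
        print(w1, w2, levenshtein(w1, w2), dice(w1, w2), lcs_length(w1, w2))
```

Each function keeps only two rows of the dynamic-programming table, so memory stays proportional to the length of the second word, which matters when running hundreds of thousands of pairs.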