On 27 Jun 2008, at 14:30, francogrex wrote:


Hello,
It's just a strange coincidence that someone posted a question about matching very recently. I know there are several matching functions in the base package (such as match, pmatch, charmatch, and gsub), but I can't
seem to use them well enough to get what I need.
Suppose I have the following strings:
"tets"
"estt"
"rtes7"
"gstes"
"tes5t"

Is there an R procedure to determine how related each string is to the
reference string "test", for example to say that "tets" is similar to "test"
with a probability of 0.9 or something of that sort?

Have a look at ?agrep.
One could loop over different max.distance values to see how closely each string is related to the reference.
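
A minimal sketch of that loop, assuming integer values of max.distance are tried in increasing order. The helper name min_agrep_distance is hypothetical and not part of base R. Note that agrep() looks for an approximate match of the pattern anywhere inside the target string, so the result is a rough measure of relatedness, not a probability.

```r
# Candidate strings from the question and the reference string.
candidates <- c("tets", "estt", "rtes7", "gstes", "tes5t")
reference <- "test"

# Hypothetical helper: the smallest integer max.distance at which agrep()
# reports an approximate match of `pattern` somewhere within `x`.
# Returns NA if no match is found up to `max_tries`.
min_agrep_distance <- function(pattern, x, max_tries = nchar(pattern)) {
  for (d in 0:max_tries) {
    hit <- agrep(pattern, x, max.distance = d, fixed = TRUE)
    if (length(hit) > 0) return(d)
  }
  NA_integer_
}

# One value per candidate; smaller means more closely related.
distances <- vapply(candidates,
                    function(s) min_agrep_distance(reference, s),
                    integer(1))
print(distances)
```

A value of 0 means the reference occurs unchanged inside the candidate; larger values mean more edits were needed before agrep() accepted the match.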

Another way is to calculate the edit distance (Levenshtein or Damerau-Levenshtein). A starting point could be:

http://wiki.r-project.org/rwiki/doku.php?id=tips:data-strings:levenshtein
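
As a rough illustration of the edit-distance idea, more recent versions of R ship adist() in the utils package, which computes a generalized Levenshtein distance (without the Damerau transposition step). The sketch below assumes that function is available. Dividing by the longer string length to get a score between 0 and 1 is just one possible normalization chosen for this example, and the score is not a probability.

```r
# Candidate strings from the question and the reference string.
candidates <- c("tets", "estt", "rtes7", "gstes", "tes5t")
reference <- "test"

# adist() returns a matrix of edit distances (rows: x, columns: y);
# drop() turns the single row into a plain vector.
d <- drop(utils::adist(reference, candidates))
names(d) <- candidates

# Assumed normalization: 1 means identical, 0 means nothing in common.
similarity <- 1 - d / pmax(nchar(reference), nchar(candidates))

print(d)
print(round(similarity, 2))
```

With plain Levenshtein distance, a swap of two adjacent letters (as in "tets") counts as two edits; a Damerau-Levenshtein variant would count it as one.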

--Hans

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
