On 27 Jun 2008, at 14:30, francogrex wrote:
Hello,
It's just a strange coincidence that someone very recently posted a
question about matching. I know there are several matching functions in
the base package (such as match, pmatch, charmatch, gsub, etc.), but I
can't seem to use them well enough to get what I need.
Suppose I have the following strings:
"tets"
"estt"
"rtes7"
"gstes"
"tes5t"
Is there an R procedure to determine how related each string is to the
reference string "test", for example to say that "tets" is similar to
"test" with a probability of 0.9 or something of that sort?
Have a look at ?agrep.
One could loop over increasing values of max.distance to see how closely
each string matches the reference.
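
A minimal sketch of that loop, assuming a version of R that provides
agrepl() (the logical variant of agrep). The helper name
min_agrep_distance is hypothetical, not part of base R. Note that agrep
looks for an approximate match of the pattern anywhere within each string,
so the values it yields can be smaller than a whole-string edit distance.

```r
strings <- c("tets", "estt", "rtes7", "gstes", "tes5t")

# Hypothetical helper: for each element of x, find the smallest integer
# max.distance at which agrepl() reports an approximate match of `pattern`.
min_agrep_distance <- function(pattern, x, max_d = nchar(pattern)) {
  result <- rep(NA_integer_, length(x))
  names(result) <- x
  for (d in 0:max_d) {
    hit <- agrepl(pattern, x, max.distance = d)
    newly <- hit & is.na(result)   # strings matched for the first time at d
    result[newly] <- d
  }
  result                           # NA means no match up to max_d
}

agrep_dist <- min_agrep_distance("test", strings)
print(agrep_dist)
```

Smaller values mean a closer match; this gives a ranking rather than a
probability.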
Another way is to calculate the Levenshtein(-Damerau) edit distance.
A starting point could be:
http://wiki.r-project.org/rwiki/doku.php?id=tips:data-strings:levenshtein
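
As a rough sketch of the edit-distance route, assuming an R version recent
enough to ship adist() in the utils package: adist() computes the plain
(generalized) Levenshtein distance, so a transposition counts as two edits
rather than one as in the Damerau variant. The scaling to a 0..1 score
below is one possible normalization chosen for illustration; it is a
similarity score, not a probability.

```r
strings <- c("tets", "estt", "rtes7", "gstes", "tes5t")
reference <- "test"

# Levenshtein distance of each string to the reference (5 x 1 matrix).
d <- adist(strings, reference)[, 1]
names(d) <- strings

# Illustrative normalization (an assumption, not a standard definition):
# divide by the length of the longer string, so 1 means identical.
similarity <- 1 - d / pmax(nchar(strings), nchar(reference))

print(d)
print(round(similarity, 2))
```

With these inputs "tes5t" comes out closest to "test" (one edit) and
"gstes" farthest (three edits).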
--Hans
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help