[R] Pattern names matching
Dear R magic guys,

I have two tables (they will actually be data frames), both with names to be matched. The names in the first data frame are from a study of antenatal visits at some health centers here. It happens that we need the delivery information, and somewhat more than half of the women decided to deliver somewhere other than our health units. We managed to get the names from some other places, but now we have to match our 4000 original names with over 2 other names. To make things more bitter, some names are badly written. So I need some algorithm like Levenshtein or Soundex or Phonix, or something better, already available in R.

Can you help me?

Orvalho

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
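Since the question mentions Soundex, here is a minimal sketch of the classic American Soundex code in plain R. The function names `soundex_one` and `soundex` are hypothetical helpers written for illustration, not part of base R or of any package named in this thread. Note the assumption that names are plain ASCII: accented letters are simply dropped, and Soundex was designed for English surnames, so it may behave poorly on other naming traditions.

```r
# Classic American Soundex for a single name (illustrative helper, not a base R function).
soundex_one <- function(name) {
  # Keep ASCII letters only; accented characters are dropped (a known limitation).
  s <- toupper(gsub("[^A-Za-z]", "", name))
  if (nchar(s) == 0) return(NA_character_)
  chars <- strsplit(s, "")[[1]]
  # Consonant groups map to digits 1-6, vowels and Y to 0, H and W to 9.
  codes <- chartr("BFPVCGJKQSXZDTLMNRAEIOUYHW",
                  "11112222222233455600000099", chars)
  # H and W do not separate letters with the same code, so drop them
  # everywhere except in the first position.
  keep <- codes != "9"
  keep[1] <- TRUE
  codes <- codes[keep]
  # Collapse runs of identical adjacent codes into one.
  codes <- codes[c(TRUE, codes[-1] != codes[-length(codes)])]
  # The first letter is kept as a letter; vowels are dropped from the rest.
  digits <- codes[-1]
  digits <- digits[digits != "0"]
  # Pad with zeros and truncate to one letter plus three digits.
  substr(paste0(chars[1], paste(digits, collapse = ""), "000"), 1, 4)
}

# Vectorised wrapper over a character vector of names.
soundex <- function(names) {
  vapply(names, soundex_one, character(1), USE.NAMES = FALSE)
}

soundex(c("Robert", "Rupert"))    # both give "R163"
soundex(c("Machava", "Mashava"))  # both give "M210"
```

Two names could then be treated as candidate matches when their codes are equal, with a stricter check (for example an edit distance) applied afterwards to the candidates.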
Re: [R] Pattern names matching
See the stringMatch function in the MiscPsycho package for an implementation of Levenshtein distance.

From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of Orvalho Augusto [orvaq...@gmail.com]
Sent: Saturday, August 20, 2011 11:08 AM
To: r-help@r-project.org
Subject: [R] Pattern names matching
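As a rough illustration of Levenshtein-based matching between two data frames, the sketch below uses `adist()` from the utils package that ships with current R releases. This is a swapped-in technique: it is not the MiscPsycho function recommended above, and `adist()` is not mentioned anywhere in the thread. The data frames `study` and `delivery` and their contents are made up for the example.

```r
# Hypothetical example data; the real names would come from the two data frames.
study <- data.frame(
  name = c("Maria Joao", "Ana Paula", "Rosa Machava"),
  stringsAsFactors = FALSE
)
delivery <- data.frame(
  name = c("Ana Paola", "Roza Machava", "Maria Joao", "Celeste Tembe"),
  stringsAsFactors = FALSE
)

# Matrix of edit distances: one row per study name, one column per delivery name.
d <- adist(study$name, delivery$name, ignore.case = TRUE)

# For each study name, the column index of the closest delivery name.
best <- apply(d, 1, which.min)

result <- data.frame(
  study_name = study$name,
  best_match = delivery$name[best],
  distance   = d[cbind(seq_along(best), best)],
  stringsAsFactors = FALSE
)
print(result)
```

The closest candidate is not necessarily a true match, so in practice one would probably accept only pairs below some distance threshold and review the remainder by hand. For 4000 names against a few thousand others the full distance matrix should still fit comfortably in memory.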
Re: [R] Pattern names matching
On Aug 20, 2011, at 11:25 AM, Doran, Harold wrote:
See the stringMatch function in the MiscPsycho package for an implementation of Levenshtein

The agrep function in base R also does approximate matching based on a generalized Levenshtein edit distance; it returns the matching elements (or their indices) rather than the distance itself.

--
David.

David Winsemius, MD
West Hartford, CT
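A minimal sketch of how `agrep()` might be applied to this problem follows. The two name vectors are invented for illustration, and the threshold `max.distance = 1` (at most one edit) is an arbitrary assumption that would need tuning on real data.

```r
# Hypothetical example vectors standing in for the name columns of the two data frames.
study_names    <- c("Maria Joao", "Ana Paula", "Rosa Machava")
delivery_names <- c("Ana Paola", "Roza Machava", "Maria Joao", "Celeste Tembe")

# For each study name, collect every delivery name that contains an
# approximate match allowing at most one insertion, deletion or substitution.
candidates <- lapply(study_names, function(nm) {
  agrep(nm, delivery_names, max.distance = 1, ignore.case = TRUE, value = TRUE)
})
names(candidates) <- study_names

print(candidates)
```

Because `agrep()` looks for the pattern approximately anywhere inside each target string, a short name can match inside a longer one. Each study name may also return zero, one or several candidates, so the result is kept as a list rather than forced into a single column.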
Re: [R] Pattern names matching
Thank you.

Orvalho

On Sat, Aug 20, 2011 at 6:02 PM, David Winsemius dwinsem...@comcast.net wrote:
The agrep function in base R also does approximate matching based on a generalized Levenshtein edit distance; it returns the matching elements (or their indices) rather than the distance itself.