I'm not sure what the R equivalent of Python's difflib (SequenceMatcher) is. If 
you have one, it should work for this.
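
In case it helps, here is a rough sketch of what I mean on the Python side, using only the standard library. The helper names (normalize, similarity, group_duplicates), the 0.75 threshold, and the choice to drop text in parentheses are my own assumptions for illustration, not a tested recipe; dropping the parenthesised part may wrongly merge different branches of the same hospital, so check the groups by hand.

```python
import re
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase, drop parenthesised qualifiers, collapse punctuation (assumed cleanup)."""
    name = name.lower()
    name = re.sub(r"\(.*?\)", " ", name)     # drop "(GUNTUR)", "( A UNIT OF ... )"
    name = re.sub(r"[^a-z0-9]+", " ", name)  # turn punctuation and extra spaces into one space
    return name.strip()


def similarity(a, b):
    """SequenceMatcher ratio between the two normalized names, from 0.0 to 1.0."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def group_duplicates(names, threshold=0.75):
    """Greedy grouping: a name joins the first group whose first member is similar enough."""
    groups = []
    for name in names:
        for group in groups:
            if similarity(name, group[0]) >= threshold:
                group.append(name)
                break
        else:
            # No existing group matched, so this name starts a new group.
            groups.append([name])
    return groups


names = [
    "SANKARA EYE HOSPITAL(GUNTUR)",
    "SANKARA EYE HOSPITAL",
    "SANKARA EYE HOSPITAL ( A UNIT OF SRI KANCHI KAMA KOTI MEDICAL TRUST)",
    "ASHIRWAD HEART HOSPITAL ( GHATKOPAR )",
    "Ashirwad Heart Hospital",
    "ASHIRWAD HEART HOSPITAL ( GHATKOPAR )",
    "Ashirwad Heart Hospita-Ghatkopar",
]

groups = group_duplicates(names)
for group in groups:
    print(group)
```

Each printed list is one candidate set of duplicates; you can keep the first entry of each group as the canonical name. The threshold is a guess and will need tuning on the real data.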

On Aug 25, 2020, 20:09 +0530, [email protected] <[email protected]>, 
wrote:
> Hi,
>
> I have collected hospital data from multiple sources. However, each source 
> uses a different name for the same hospital. I am trying to build a clean 
> list with no duplicates. I am using R and couldn't resolve this with 
> stringdist_join. I would appreciate it if you could suggest an approach.
>
> For example, Guntur (A.P) is listed with the following names. Can we mark 
> (or eliminate) the duplicates?
>
> Example 1
> SANKARA EYE HOSPITAL(GUNTUR)
> SANKARA EYE HOSPITAL
> SANKARA EYE HOSPITAL ( A UNIT OF SRI KANCHI KAMA KOTI MEDICAL TRUST)
>
>
> Example 2
> ASHIRWAD HEART HOSPITAL ( GHATKOPAR )
> Ashirwad Heart Hospital
> ASHIRWAD HEART HOSPITAL ( GHATKOPAR )
> Ashirwad Heart Hospita-Ghatkopar
>
> Thanks
> Ram
> --
> Datameet is a community of Data Science enthusiasts in India. Know more about 
> us by visiting http://datameet.org
> ---
> You received this message because you are subscribed to the Google Groups 
> "datameet" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to [email protected].
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/datameet/19ee8101-84ec-42b0-974a-43035b5902f1n%40googlegroups.com.

To view this discussion on the web visit 
https://groups.google.com/d/msgid/datameet/f69e252d-a5fb-4a34-afc3-67958614c8f3%40Spark.
