On 25/04/2006 6:26 PM, Iain King wrote:
> hawkesed wrote:
>> If I have a list, say of names. And I want to count all the people
>> named, say, Susie, but I don't care exactly how they spell it (ie,
>> Susy, Susi, Susie all work.) how would I do this? Set up a regular
>> expression inside the count? Is there a wildcard variable I can use?
>> Here is the code for the non-fuzzy way:
>> lstNames.count("Susie")
>> Any ideas? Is this something you wouldn't expect count to do?
>> Thanks y'all from a newbie.
>> Ed
>
> Dare I suggest using REs?  This looks like something they'd be good
> for:
>
> import re
>
> def countMatches(names, namePattern):
>     count = 0
>     for name in names:
>         if namePattern.match(name):
>             count += 1
>     return count
>
> susie = re.compile("Su(s|z)(i|ie|y)")
>
> print(countMatches(["John", "Suzy", "Peter", "Steven", "Susie",
>                     "Susi"], susie))
>
>
> some other patterns:
>
> iain = re.compile("(Ia(i)?n|Eoin)")
> steven = re.compile("Ste(v|ph|f)(e|a)n")
What about Steffan, Etienne, Esteban, István, ... ?

> john = re.compile("Jo(h)?n")
>

IMHO, the amount of hand-crafting that goes into a *general-purpose*
phonetic matching algorithm is already bordering on overkill. Your
method using REs would not appear to scale well at all.
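
For comparison, here is a rough sketch of what a general-purpose phonetic key can look like, using a simplified American Soundex. This is only an illustration, not something posted in the thread: the names soundex and count_sounds_like are made up for the example, and it assumes plain ASCII names (any other character is simply dropped before coding).

```python
# Letter-to-digit table for American Soundex.
# Vowels, h, w and y have no code and are left out of the table.
_CODES = {}
for digit, letters in enumerate(["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], start=1):
    for letter in letters:
        _CODES[letter] = str(digit)


def soundex(name):
    """Return a 4-character Soundex key for a name ('' if it has no ASCII letters)."""
    letters = [ch for ch in name.lower() if "a" <= ch <= "z"]
    if not letters:
        return ""
    key = letters[0].upper()
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = _CODES.get(ch, "")
        # Emit a digit only when it differs from the previous letter's code.
        if code and code != prev:
            key += code
        # h and w do not separate two letters that share a code; vowels do.
        if ch not in "hw":
            prev = code
    # Pad with zeros or truncate to the conventional length of four.
    return (key + "000")[:4]


def count_sounds_like(names, target):
    """Count the names whose Soundex key equals the key of target."""
    wanted = soundex(target)
    return sum(1 for name in names if soundex(name) == wanted)


names = ["John", "Suzy", "Peter", "Steven", "Susie", "Susi"]
print(count_sounds_like(names, "Susie"))
```

With this key, Suzy, Susie and Susi all collapse to the same value, and so do Steven and Steffan, without a pattern being written per name. The key always keeps the first letter, though, so Esteban and Etienne still would not match Steven; a scheme like this reduces the hand-crafting but does not remove the problem.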