On 29/06/2006 10:07 AM, BBands wrote:
> On 6/28/06, John Machin <[EMAIL PROTECTED]> wrote:
>> On 29/06/2006 9:28 AM, BBands wrote:
>> > I'd like to see if a string exists, even approximately, in another. For
>> > example if "black" exists in "blakbird" or if "beatles" exists in
>> > "beatlemania". The application is to look through a long list of songs
>> > and return any approximate matches along with a confidence factor. I
>> > have looked at edit distance, but that isn't a good choice for finding
>> > a short string in a longer one.
>>
>> There is a trivial difference between the traditional
>> distance-matrix-based Levenshtein algorithm for edit distance and the
>> corresponding one for approximate string searching. Ditto between
>> finite-state-machine approaches. Ditto between modern bit-bashing
>> approaches.
>>
>> > I have also explored
>> > difflib.SequenceMatcher and .get_close_matches, but what I'd really
>> > like is something like:
>> >
>> > a = FindApprox("beatles", "beatlemania")
>> > print a
>> > 0.857
>> >
>> > Any ideas?
>>
>> You got no ideas from googling "approximate string search python"???
>
> Yes, many including agrepy and soundex in addition to those I
> mentioned already, but none seem really handy at approximately looking
> up smaller strings in larger ones. I also note that this has been the
> topic of prior discussion without resolution.
>
> jab

It helps if you tell all that you've done. Otherwise people will tell you
to do what you've done already :-)

It helps if you reply on-list so that others can see. You may get better
answers sooner.

I have to vanish now. Will check back tonight.

Cheers,
John

agrepy = approximate-grep-python -- why doesn't that suit you?
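
To make the "trivial difference" mentioned above concrete, here is a minimal sketch of the distance-matrix approach adapted for searching. It is one way to do it, not necessarily what agrepy does. The only change from plain Levenshtein is that the cost of matching an empty pattern is 0 at every position in the text, so a match may start anywhere. The names approx_find and find_approx are hypothetical, and scaling the distance by the pattern length to get a "confidence" is an assumption; the thread does not say how the 0.857 figure was meant to be computed.

```python
def approx_find(pattern, text):
    """Return (distance, end): the smallest edit distance between pattern
    and any substring of text, and the text index just past that match."""
    m = len(pattern)
    # Column for the empty text prefix: pattern[:i] must be dropped entirely.
    prev = list(range(m + 1))
    best, best_end = prev[m], 0
    for j, ch in enumerate(text, 1):
        # Plain Levenshtein would start this column with j. Starting with 0
        # is the whole difference: the match may begin anywhere in text.
        cur = [0]
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            cur.append(min(prev[i - 1] + cost,  # match or substitute
                           prev[i] + 1,         # extra character in text
                           cur[i - 1] + 1))     # pattern character missing
        if cur[m] < best:
            best, best_end = cur[m], j
        prev = cur
    return best, best_end


def find_approx(pattern, text):
    """Hypothetical confidence score in [0, 1]: 1 - distance / len(pattern)."""
    if not pattern:
        return 1.0
    distance, _ = approx_find(pattern, text)
    return 1.0 - distance / len(pattern)


if __name__ == "__main__":
    print(round(find_approx("beatles", "beatlemania"), 3))  # 0.857
    print(round(find_approx("black", "blakbird"), 3))       # 0.8
```

This runs in O(len(pattern) * len(text)) time per song title, which should be fine for a song list; the bit-parallel variants are worth a look only if that turns out to be too slow.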