On Tue, Aug 31, 2010 at 9:12 PM, <pyt...@bdurham.com> wrote: > I'm parsing a simple, domain specific scripting language that has commands > like the following: *page, *title, *text, *footer, etc. There are about 100 > of these '*' commands. When we see a command that we don't recognize, I > would like to find the closest match possible (from a list of all legal > commands) and include this suggestion in our diagnostic output. > > I'm not sure what you would call the type of algorithm I'm looking for: > closest matching string or auto-correct? > > Any suggestions on algorithms or python libraries that would help me do > what I'm looking for? > > Here's my short-list of ideas based on my research so far: > - Soundex > - Lawrence Philips' Metaphone Algorithm (from aspell?) > - Edit Distance Algorithm > - Levenstein Word Distance > > Any suggestions, comments on the above techniques, or ideas on a simpler > algorithm for finding close string matches based on a list, dict, or set of > possible strings? >
You are looking possibly for autocompletion/spell correction algorithm. You can write a HMM based Pos Tagger. > Thank you, > Malcolm > > > -- > http://mail.python.org/mailman/listinfo/python-list > > -- ~l0nwlf
-- http://mail.python.org/mailman/listinfo/python-list