Here is a fairly easy way to search a body of text for a string when the match may be off by a single character, like searching for "butterfly" when the text has "budterfly". The paper, which I found some years ago, dates back to I think the early 1990s. It has a basic method with a series of potential refinements. The basic method is well suited for programming in Python. I'm not sure that the refinements would work well, because they involve tinkering with the hash table design and keeping the hash tables small enough that they can stay in cache memory, which is not what interpreted languages are especially good at.
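To give a feel for how little code the basic idea needs, here is a minimal sketch for a single pattern. It rests on an assumption of mine, not something spelled out above: that the basic method amounts to hashing every copy of the pattern with one character deleted, then doing the same for each window of the text and looking for a hit. The function names are my own, and this version only catches substitutions (one wrong character), not a missing or extra character.

```python
def deletion_variants(s):
    """Yield (i, s with the character at index i removed) for every index i."""
    for i in range(len(s)):
        yield i, s[:i] + s[i + 1:]


def find_with_one_substitution(pattern, text):
    """Return start offsets where text matches pattern with at most one
    substituted character (exact matches are included)."""
    m = len(pattern)
    if m == 0:
        return []
    # Hash table of the pattern's deletion variants. Keeping the deleted
    # position in the key means a hit implies the two strings agree
    # everywhere except possibly at that one position.
    table = set(deletion_variants(pattern))
    hits = []
    for start in range(len(text) - m + 1):
        window = text[start:start + m]
        if any(variant in table for variant in deletion_variants(window)):
            hits.append(start)
    return hits


if __name__ == "__main__":
    print(find_with_one_substitution("butterfly", "a budterfly flew"))
```

This builds m variants for every window, so it does roughly m squared character copies per text position. That is fine for trying the idea out, but it makes no attempt at the hash table tuning the refinements are about.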
Actually, the paper covers searching for multiple patterns at one time, i.e., in a single pass. But you can easily do it for just one pattern if that's what you want. I thought this might be of interest to some people on the list. Here's a link to the paper: APPROXIMATE MULTIPLE STRING SEARCH (Muth and Manber) <http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=F5D9579C47892924DAF93B36A9445424?doi=10.1.1.21.3317&rep=rep1&type=pdf>
