Hi everyone, I'd like to introduce a Python library I've been working on for a while: fuzzysearch. I would love to get as much feedback as possible: comments, suggestions, bugs and more are all very welcome!

fuzzysearch is useful for searching when you'd like to find nearly-exact matches. What should be considered a "nearly matching" sub-string is defined by a maximum allowed Levenshtein distance[1]. This can be further refined by indicating the maximum allowed number of substitutions, insertions and/or deletions, each separately.

Here is a basic example:

    >>> from fuzzysearch import find_near_matches
    >>> find_near_matches('PATTERN', 'aaaPATERNaaa', max_l_dist=1)
    [Match(start=3, end=9, dist=1)]

The library supports Python 2.6+ and 3.2+ with a single code base. It is extensively tested with 97% code coverage. There are many optimizations under the hood, including custom algorithms and C extensions implemented in C and Cython.

Install as usual:

    $ pip install fuzzysearch

The repo is on GitHub: https://github.com/taleinat/fuzzysearch

Let me know what you think!

- Tal Einat

.. [1]: http://en.wikipedia.org/wiki/Levenshtein_distance
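
P.S. For anyone unfamiliar with the term, here is a minimal sketch of what a Levenshtein distance of 1 means for the match in the example above. It is a textbook dynamic-programming version written only for illustration; the helper name `levenshtein` is made up for this note, and this is not a claim about how fuzzysearch computes matches internally.

```python
def levenshtein(a, b):
    """Return the Levenshtein (edit) distance between sequences a and b."""
    # prev[j] is the distance between the already-processed prefix of a
    # and the first j items of b; it starts as the distance from "".
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]  # turning the first i items of a into "" takes i deletions
        for j, y in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # substitute x with y (free if equal)
            ))
        prev = curr
    return prev[-1]


text = 'aaaPATERNaaa'
matched = text[3:9]  # the span reported as Match(start=3, end=9, ...)
print(matched)                          # PATERN
print(levenshtein('PATTERN', matched))  # 1
```

The matched sub-string 'PATERN' is one deleted 'T' away from 'PATTERN', which is why it is accepted with max_l_dist=1.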