Howdy all :)

We recently ran into an interesting issue while developing our automated packaging tool (`boulder new`): the computation time for `levenshteinDistance` severely impeded performance when scanning LICENSE files and matching them against the SPDX data set.

As an experiment, we've begun porting Python's `difflib` to D. The port can be found here:

https://gitlab.com/serpent-os/dlang/difflib-d

It's a bit ugly internally right now but it does implement the bare basics, i.e. `SequenceMatcher` with the following APIs:

- `longestMatch` - Return the longest match between two sequences
- `matchingBlocks` - Return a `Match[]` of all encountered matches
- `ratio` - Similarity between two sequences, from 0.0f to 1.0f
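
For anyone unfamiliar with the original, here is a rough sketch of what those three calls compute, written against Python's own `difflib` rather than the D port. The Python methods `find_longest_match`, `get_matching_blocks` and `ratio` are the upstream counterparts; the `compare` helper is just a hypothetical wrapper for illustration, and the D signatures may well differ.

```python
from difflib import SequenceMatcher


def compare(a, b):
    """Return (longest match, matching blocks, ratio) for two sequences.

    `compare` is a hypothetical helper for this example only; it is not
    part of difflib or of the D port.
    """
    # First argument is the "junk" predicate; None means nothing is junk.
    # autojunk=False turns off the heuristic that ignores very common
    # elements in long sequences.
    matcher = SequenceMatcher(None, a, b, autojunk=False)

    # Longest contiguous run shared by a[0:len(a)] and b[0:len(b)].
    longest = matcher.find_longest_match(0, len(a), 0, len(b))

    # All matching runs, in order, ending with a zero-length sentinel.
    blocks = matcher.get_matching_blocks()

    # 2 * (matched elements) / (len(a) + len(b)), in the range 0.0 to 1.0.
    return longest, blocks, matcher.ratio()


longest, blocks, ratio = compare("abcd", "bcde")
print(longest)  # Match(a=1, b=0, size=3)
print(blocks)   # [Match(a=1, b=0, size=3), Match(a=4, b=4, size=0)]
print(ratio)    # 0.75
```

Here "bcd" is the only shared run, so the ratio works out to 2 * 3 / 8 = 0.75. For license matching, the idea would be to compute this ratio against each SPDX text and keep the best score.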

Anyway, thought it may come in handy for people in future, so if there's interest
we can flesh it out beyond our own use case. :)
