On 10/8/18 2:34 PM, Stefan Monnier wrote:
Is there a chance this performance behavior is the
result of a performance bug, or is the algorithm really that costly?

It's O(N*D), where N is the input length and D is the number of differences. In your case that might be about 10*11, which is getting up there.

tries to do word-level diffs (by basically turning every word
into N copies of this word, each one on its own line (where N is the
number of chars in the word, used to indicate to `diff` that long words
are "more costly" than short ones)

Ouch. That's a really inefficient way of mucking with the cost algorithm. Diff should be able to do taht directly, without your having to repeat the words. (Just a simple matter of programming....)




Reply via email to