Linus Torvalds wrote:
On Fri, Jun 27, 2014 at 10:48 AM, Junio C Hamano <> wrote:

Even though the original question mentioned "delta discovery", I
think what was being asked is not "delta" in the Git sense (which
your answer is about) but is "can we diff two long sequences of text
(that happens to consist of only 4-letter alphabet but that is a
irrelevant detail) without holding both in-core in their entirety?",
which is a more relevant question/desire from the application point
of view.

.. even there, there's another issue. With enough memory, the diff
itself should be fairly reasonable to do, but we do not have any sane
*format* for diffing those kinds of things.

The regular textual diff is line-based, and is not amenable to
comparing two long lines. You'll just get a diff that says "the two
really long lines are different".

The binary diff option should work, but it is a horrible output
format, and not very helpful. It contains all the relevant data ("copy
this chunk from here to here"), but it's then shown in a binary
encoding that isn't really all that useful if you want to say "what
are the differences between these two chromosomes".

There is also --word-diff[=<mode>] word-based textual diff,
and I think one can abuse --word-diff-regex=<regex> for
character-based diff... or maybe not, as <regex> specifies
word characters, not words or word separators.

Jakub Narębski

To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to
More majordomo info at

Reply via email to