Excerpts from Mads Kiilerich's message of 2016-11-03 22:34:15 +0100: > # HG changeset patch > # User Mads Kiilerich <mad...@unity3d.com> > # Date 1478208837 -3600 > # Thu Nov 03 22:33:57 2016 +0100 > # Node ID be83c5f4ec8931cb7e771a80a3ef5e2042a005c1 > # Parent 3e0216b2a0995cb21946bc13fb21391013332c57 > bdiff: make sure we append to repeated lines instead of inserting into range > > This will mitigate the symptoms that tests exposed in the previous changeset. > > Arguably, we need similar handling for longer sequences of repeated lines...
I'm concerned about this. This patch looks like a workaround for the simplest case. While you are aware it could have issues with "longer sequences of repeated lines". And that's not that uncommon. There are real-world JSON files like: [ ... { "a": "b", }, { "a": "b", }, { "a": "b", }, ... ] Actually, those kinds of JSON files are part of the motivation of mpm's f1ca249696ed. This series would surely affect them, while I'm not sure the change is in a good way or not yet. Note that after f1ca249696ed, there are still unsolved issues with some large JSON files - we have seen (from user report) diff output being "wrong", where human could generate much smaller diff easily. If you are not in a hurry, I'd like to learn this area deeper before commenting confidently. That may take me 1 to 2 weeks. > But also, we already have examples of how the heuristics handle other cases in > a similar way. > > diff --git a/mercurial/bdiff.c b/mercurial/bdiff.c > --- a/mercurial/bdiff.c > +++ b/mercurial/bdiff.c > @@ -187,7 +187,7 @@ static int longest_match(struct bdiff_li > } else if (i == mi && findbetterb) { > /* better j in first upper half */ > mj = j; > - if (j <= bhalf) > + if (j <= bhalf && !(j > 0 && k == 1 && b[j - 1].e == > b[j].e)) > findbetterb = 0; > } > } > diff --git a/tests/test-annotate.t b/tests/test-annotate.t > --- a/tests/test-annotate.t > +++ b/tests/test-annotate.t > @@ -91,8 +91,8 @@ annotate (JSON) > annotate -n b > > $ hg annotate -n b > + 0: a > 1: a > - 0: a > 1: a > 3: b4 > 3: b5 > @@ -111,8 +111,8 @@ annotate --no-follow b > annotate -nl b > > $ hg annotate -nl b > - 1:1: a > 0:1: a > + 1:2: a > 1:3: a > 3:4: b4 > 3:5: b5 > @@ -121,8 +121,8 @@ annotate -nl b > annotate -nf b > > $ hg annotate -nf b > + 0 a: a > 1 a: a > - 0 a: a > 1 a: a > 3 b: b4 > 3 b: b5 > @@ -131,8 +131,8 @@ annotate -nf b > annotate -nlf b > > $ hg annotate -nlf b > - 1 a:1: a > 0 a:1: a > + 1 a:2: a > 1 a:3: a > 3 b:4: b4 > 3 b:5: b5 > @@ -156,8 +156,8 @@ annotate -nlf b > annotate after merge > > $ hg annotate -nf b > + 0 a: a > 1 a: a > - 0 a: a > 1 a: a > 3 b: b4 > 4 b: c > @@ -166,8 +166,8 @@ annotate after merge > annotate after merge with -l > > $ hg annotate -nlf b > - 1 a:1: a > 0 a:1: a > + 1 a:2: a > 1 a:3: a > 3 b:4: b4 > 4 b:5: c > @@ -198,7 +198,7 @@ annotate after merge with -l > annotate after rename merge > > $ hg annotate -nf b > - 1 a: a > + 0 a: a > 6 b: z > 1 a: a > 3 b: b4 > @@ -209,7 +209,7 @@ annotate after rename merge > annotate after rename merge with -l > > $ hg annotate -nlf b > - 1 a:1: a > + 0 a:1: a > 6 b:2: z > 1 a:3: a > 3 b:4: b4 > @@ -226,7 +226,7 @@ Issue2807: alignment of line numbers wit > $ echo more >> b > $ hg ci -mmore -d '7 0' > $ hg annotate -nlf b > - 1 a: 1: a > + 0 a: 1: a > 6 b: 2: z > 1 a: 3: a > 3 b: 4: b4 > @@ -240,15 +240,15 @@ Issue2807: alignment of line numbers wit > linkrev vs rev > > $ hg annotate -r tip -n a > + 0: a > 1: a > - 0: a > 1: a > > linkrev vs rev with -l > > $ hg annotate -r tip -nl a > - 1:1: a > 0:1: a > + 1:2: a > 1:3: a > > Issue589: "undelete" sequence leads to crash > diff --git a/tests/test-bhalf.t b/tests/test-bhalf.t > --- a/tests/test-bhalf.t > +++ b/tests/test-bhalf.t > @@ -105,8 +105,8 @@ Explore some bdiff implementation edge c > --- a/x > +++ b/x > @@ -1,1 +1,3 @@ > + a > +a > - a > +a > diff --git a/y b/y > --- a/y > diff --git a/tests/test-commit-amend.t b/tests/test-commit-amend.t > --- a/tests/test-commit-amend.t > +++ b/tests/test-commit-amend.t > @@ -47,8 +47,8 @@ Amending changeset with changes in worki > --- a/a Thu Jan 01 00:00:00 1970 +0000 > +++ b/a Thu Jan 01 00:00:00 1970 +0000 > @@ -1,1 +1,3 @@ > + a > +a > - a > +a > $ hg log > changeset: 1:43f1ba15f28a > @@ -122,13 +122,13 @@ No changes, just a different message: > uncompressed size of bundle content: > 254 (changelog) > 163 (manifests) > - 141 a > + 129 a > saved backup bundle to > $TESTTMP/.hg/strip-backup/74609c7f506e-1bfde511-amend-backup.hg (glob) > 1 changesets found > uncompressed size of bundle content: > 250 (changelog) > 163 (manifests) > - 141 a > + 129 a > adding branch > adding changesets > adding manifests > @@ -140,8 +140,8 @@ No changes, just a different message: > --- a/a Thu Jan 01 00:00:00 1970 +0000 > +++ b/a Thu Jan 01 00:00:00 1970 +0000 > @@ -1,1 +1,3 @@ > + a > +a > - a > +a > $ hg log > changeset: 1:1cd866679df8 > @@ -266,13 +266,13 @@ then, test editing custom commit message > uncompressed size of bundle content: > 249 (changelog) > 163 (manifests) > - 143 a > + 131 a > saved backup bundle to > $TESTTMP/.hg/strip-backup/5f357c7560ab-e7c84ade-amend-backup.hg (glob) > 1 changesets found > uncompressed size of bundle content: > 257 (changelog) > 163 (manifests) > - 143 a > + 131 a > adding branch > adding changesets > adding manifests > @@ -309,13 +309,13 @@ Same, but with changes in working dir (d > uncompressed size of bundle content: > 464 (changelog) > 322 (manifests) > - 261 a > + 249 a > saved backup bundle to > $TESTTMP/.hg/strip-backup/7ab3bf440b54-8e3b5088-amend-backup.hg (glob) > 1 changesets found > uncompressed size of bundle content: > 257 (changelog) > 163 (manifests) > - 145 a > + 133 a > adding branch > adding changesets > adding manifests _______________________________________________ Mercurial-devel mailing list Mercurial-devel@mercurial-scm.org https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel