Re: [PATCH v2 5/7] xdl_change_compact(): introduce the concept of a change group

2016-08-23 Thread Junio C Hamano
Michael Haggerty  writes:

> The idea of xdl_change_compact() is fairly simple:
>
> * Proceed through groups of changed lines in the file to be compacted,
>   keeping track of the corresponding location in the "other" file.
>
> * If possible, slide the group up and down to try to give the most
>   aesthetically pleasing diff. Whenever it is slid, the current location
>   in the other file needs to be adjusted.
>
> But these simple concepts are obfuscated by a lot of index handling that
> is written in terse, subtle, and varied patterns. I found it very hard
> to convince myself that the function was correct.
>
> So introduce a "struct group" that represents a group of changed lines
> in a file. Add some functions that perform elementary operations on
> groups:
>
> * Initialize a group to the first group in a file
> * Move to the next or previous group in a file
> * Slide a group up or down
>
> Even though the resulting code is longer, I think it is easier to
> understand and review.

Yup.  The important thing is that the core logic of sliding up and
down shrinks, and so becomes easier to read; the mechanics of sliding
up and down may need more lines of boilerplate, but they are isolated
"do one thing and do it well" helpers.

Nice.




[PATCH v2 5/7] xdl_change_compact(): introduce the concept of a change group

2016-08-22 Thread Michael Haggerty
The idea of xdl_change_compact() is fairly simple:

* Proceed through groups of changed lines in the file to be compacted,
  keeping track of the corresponding location in the "other" file.

* If possible, slide the group up and down to try to give the most
  aesthetically pleasing diff. Whenever it is slid, the current location
  in the other file needs to be adjusted.

But these simple concepts are obfuscated by a lot of index handling that
is written in terse, subtle, and varied patterns. I found it very hard
to convince myself that the function was correct.

So introduce a "struct group" that represents a group of changed lines
in a file. Add some functions that perform elementary operations on
groups:

* Initialize a group to the first group in a file
* Move to the next or previous group in a file
* Slide a group up or down

Even though the resulting code is longer, I think it is easier to
understand and review. Its performance is not changed
appreciably (though it would be if `group_next()` and `group_previous()`
were not inlined).
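
To make the group abstraction concrete, here is a minimal standalone
sketch; it is not the code from this patch. It assumes a toy `rchg`
array (1 = changed, 0 = unchanged) that is padded with a zero at
positions -1 and N, and it ignores the bookkeeping for the "other"
file entirely. The names `toy_group`, `toy_group_init()`,
`toy_group_next()` and `toy_first_changed_group()` are hypothetical
and exist only for this illustration.

```c
#include <assert.h>

/* Hypothetical stand-in for the "struct group" described above. */
struct toy_group {
	long start; /* first changed line, or the unchanged line an empty group sits above */
	long end;   /* first unchanged line after the group; equals start if empty */
};

/* Point g at the first (possibly empty) group in rchg. */
static void toy_group_init(const char *rchg, struct toy_group *g)
{
	g->start = g->end = 0;
	while (rchg[g->end]) /* stops at the sentinel zero at the latest */
		g->end++;
}

/* Advance g to the next group; return -1 if g was already the last one. */
static int toy_group_next(const char *rchg, long nrec, struct toy_group *g)
{
	assert(g->end <= nrec);
	if (g->end == nrec)
		return -1;
	g->start = g->end + 1;
	for (g->end = g->start; rchg[g->end]; g->end++)
		;
	return 0;
}

/* Walk the groups in order; stop at the first non-empty one (return 1). */
static int toy_first_changed_group(const char *rchg, long nrec,
				   struct toy_group *g)
{
	toy_group_init(rchg, g);
	do {
		if (g->end > g->start)
			return 1;
	} while (!toy_group_next(rchg, nrec, g));
	return 0;
}
```

With a seven-line file whose flags are 0,0,1,1,1,1,0 (stored after one
leading sentinel zero and followed by another), the walk visits the
empty groups at 0 and 1 and then stops at the changed group with
start = 2 and end = 6.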

...and in fact, the rewriting helped me discover another bug in the
--compaction-heuristic code: The update of blank_lines was never done
for the highest possible position of the group. This means that it could
fail to slide the group to its highest possible position, even if that
position had a blank line as its last line. So for example, it yielded
the following diff:

$ git diff --no-index --compaction-heuristic a.txt b.txt
diff --git a/a.txt b/b.txt
index e53969f..0d60c5fe 100644
--- a/a.txt
+++ b/b.txt
@@ -1,3 +1,7 @@
 1
 A
+
+B
+
+A
 2

when in fact the following diff is better (according to the rules of
--compaction-heuristic):

$ git diff --no-index --compaction-heuristic a.txt b.txt
diff --git a/a.txt b/b.txt
index e53969f..0d60c5fe 100644
--- a/a.txt
+++ b/b.txt
@@ -1,3 +1,7 @@
 1
+A
+
+B
+
 A
 2

The new code gives the bottom answer.
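
The sliding in this example can be reproduced with a small standalone
sketch. It is a simplification under stated assumptions, not the code
from this patch: lines are plain C strings, groups are never merged,
the "other" file is not tracked, and the heuristic is reduced to "if
any slide position ends in a blank line, pick the lowest such
position". All `toy_*` names are hypothetical.

```c
#include <assert.h>
#include <string.h>

struct toy_group {
	long start; /* first changed line */
	long end;   /* first unchanged line after the group */
};

/* Slide up if the group's last line equals the line just above the group. */
static int toy_slide_up(const char **lines, struct toy_group *g)
{
	if (g->start > 0 && g->end > g->start &&
	    !strcmp(lines[g->start - 1], lines[g->end - 1])) {
		g->start--;
		g->end--;
		return 0;
	}
	return -1;
}

/* Slide down if the group's first line equals the line just below the group. */
static int toy_slide_down(const char **lines, long nrec, struct toy_group *g)
{
	if (g->end < nrec && g->end > g->start &&
	    !strcmp(lines[g->start], lines[g->end])) {
		g->start++;
		g->end++;
		return 0;
	}
	return -1;
}

/* Does the group's last line consist of nothing at all? */
static int toy_ends_blank(const char **lines, const struct toy_group *g)
{
	return g->end > g->start && lines[g->end - 1][0] == '\0';
}

static void toy_compact(const char **lines, long nrec, struct toy_group *g)
{
	int blank_seen;

	assert(g->start >= 0 && g->start <= g->end && g->end <= nrec);
	blank_seen = toy_ends_blank(lines, g);

	/* Check every position, including the highest one. */
	while (!toy_slide_up(lines, g))
		blank_seen |= toy_ends_blank(lines, g);
	while (!toy_slide_down(lines, nrec, g))
		blank_seen |= toy_ends_blank(lines, g);

	/* From the lowest position, climb until the group ends in a blank. */
	if (blank_seen)
		while (!toy_ends_blank(lines, g) && !toy_slide_up(lines, g))
			;
}
```

For the seven lines of b.txt ("1", "A", "", "B", "", "A", "2") and an
initial group of start = 2, end = 6, toy_compact() leaves the group at
start = 1, end = 5, which corresponds to the second diff above. The
blank line is only noticed because the check also runs after the final
slide up, which is the position the old code skipped.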

Signed-off-by: Michael Haggerty 
---
 xdiff/xdiffi.c | 293 +++--
 1 file changed, 203 insertions(+), 90 deletions(-)

diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c
index 8a5832a..44fded6 100644
--- a/xdiff/xdiffi.c
+++ b/xdiff/xdiffi.c
@@ -413,126 +413,239 @@ static int recs_match(xrecord_t *rec1, xrecord_t *rec2, long flags)
 flags));
 }
 
-int xdl_change_compact(xdfile_t *xdf, xdfile_t *xdfo, long flags) {
-   long ix, ixo, ixs, ixref, grpsiz, nrec = xdf->nrec;
-   char *rchg = xdf->rchg, *rchgo = xdfo->rchg;
-   unsigned int blank_lines;
-   xrecord_t **recs = xdf->recs;
+/*
+ * Represent a group of changed lines in an xdfile_t (i.e., a contiguous group
+ * of lines that was inserted or deleted from the corresponding version of the
+ * file). We consider there to be such a group at the beginning of the file, at
+ * the end of the file, and between any two unchanged lines, though most such
+ * groups will usually be empty.
+ *
+ * If the first line in a group is equal to the line following the group, then
+ * the group can be slid down. Similarly, if the last line in a group is equal
+ * to the line preceding the group, then the group can be slid up. See
+ * group_slide_down() and group_slide_up().
+ *
+ * Note that loops that are testing for changed lines in xdf->rchg do not need
+ * index bounding since the array is prepared with a zero at position -1 and N.
+ */
+struct group {
+   /*
+    * The index of the first changed line in the group, or the index of
+    * the unchanged line above which the (empty) group is located.
+    */
+   long start;
 
    /*
-    * This is the same of what GNU diff does. Move back and forward
-    * change groups for a consistent and pretty diff output. This also
-    * helps in finding joinable change groups and reduce the diff size.
+    * The index of the first unchanged line after the group. For an empty
+    * group, end is equal to start.
     */
-   for (ix = ixo = 0;;) {
-       /*
-        * Find the first changed line in the to-be-compacted file.
-        * We need to keep track of both indexes, so if we find a
-        * changed lines group on the other file, while scanning the
-        * to-be-compacted file, we need to skip it properly. Note
-        * that loops that are testing for changed lines on rchg* do
-        * not need index bounding since the array is prepared with
-        * a zero at position -1 and N.
-        */
-       for (; ix < nrec && !rchg[ix]; ix++)
-           while (rchgo[ixo++]);
-       if (ix == nrec)
-           break;
+   long end;
+};
+
+/*
+ * Initialize g to point at the first group in xdf.
+ */
+static void group_init(xdfile_t *xdf, struct group *g)
+{
+