Re: diff -e: mishandling of bare dots

Stefan Kempf Sun, 21 Feb 2016 06:42:39 -0800

Martin Natano wrote:
> When diff encounters a line that consists of a single dot, it emits two
> dots instead, stops the current command and emits a substitute command
> to replace the double dot with a single one. Then it restarts the
> (original) command if necessary and inserts further lines. This is done
> because a single dot on a line does have special meaning in ed. (It
> stops text insertion.)
> 
> However, there are multiple issues with the current implementation,
> resulting in mangled output:
> 
> - The line number for the substitute command should be the number of the
> most recently inserted line. diff instead uses the number of the first
> inserted line of the current hunk. The first character of that line is
> removed when applying the diff, while the superfluous dot is not.
> 
> - The line number of the restarted command is not adjusted for the
> number of lines already inserted, resulting in the reordering of lines..
> 
> - When there is a bare dot in the replacement text of a change command,
> too many lines are deleted, because a second change command is emitted.
> An append command should be emitted instead, because the target lines
> have already been removed by the first change command.
> 
> The following patch fixes all those issues.


Hmm, something still does not seem right. For example, try
diff -e on the contents below (some other tests attached):

A
B
C

vs.

W
X
.
Y
Z

> Index: diffreg.c
> ===================================================================
> RCS file: /cvs/src/usr.bin/diff/diffreg.c,v
> retrieving revision 1.90
> diff -u -p -r1.90 diffreg.c
> --- diffreg.c 26 Oct 2015 12:52:27 -0000      1.90
> +++ diffreg.c 13 Feb 2016 16:35:08 -0000
> @@ -1075,8 +1075,12 @@ proceed:
>                * back and restart where we left off.
>                */
>               diff_output(".\n");
> -             diff_output("%ds/.//\n", a);
> +             diff_output("%ds/.//\n", a + i - 1);

This I understand. We really need to substitute the line that contains
the ..

About the changes below, is something like this the idea behind it?
The first change command has already removed lines [a,b] and replaced them
with lines [c, c + i]. So for the rest of the lines in [c + i + 1, d], we
need to make sure that they are appended after the line that follows the
inserted '.'

Can't we set b = a + i - 1 and a = b + 1 to make sure that the rest
of the 'to' file is appended after the proper line?

>               a += i;
> +             if (a > b)
> +                     b += i;
> +             else
> +                     b = a - 1;
>               c += i;
>               goto restart;
>       }

W
X
.
Y
Z

W
X
.
Y
Z

A
B
C

W
X
.
Y
Z

A
B
C

W
X
.
Y
Z
.

Re: diff -e: mishandling of bare dots

Reply via email to