Re: [RFC PATCH v5 2/4] add -p: select modified lines correctly

2018-07-27 Thread Junio C Hamano
Phillip Wood  writes:

> The code actually looks at the lines that are selected rather than
> omitted. So in the example above it groups them as [1,2] (because they
> are contiguous), [4],[5] (these are split because one is an insertion
> and one a deletion) and [7]. It then sees that there are two groups of
> deletions ([1,2],[4]) and two groups of insertions ([5],[7]) and so
> pairs up the deletions in [12] with the insertion in [5] and likewise
> with [4] and [7]. Lines 3 and 6 are never explicitly paired, although
> they basically behave as if they were. One the insertions are all paired
> up it walks over the list and creates a new hunk where the paired
> insertions come immediately after their corresponding deletions,
> unselected deletions are converted to context lines and unselected
> additions are dropped.

Now, without that much explanation in help text, can an average end
user use the feature, specifically, understand the reason why the
tool says it cannot handle a particular set of selected lines, and
follow the workaround suggested by the tool to do it in two (or
more) batches?  That was the real question I was getting at.  I
haven't played with the feature long enough to answer that question.

>>> Reported-by: Ævar Arnfjörð Bjarmason 
>> 
>> Is this fixing any bug?  I usually see "Reported-by" only for a
>> bugfix patch but this seems to be adding a new feature (and lack of
>> feature is usually not a bug).
>
> I guess I meant that the previous series was effectively buggy as it
> would give the wrong result for modified lines. I wanted to acknowledge
> that Ævar spent some time testing it and pointed that out.

Ah, I see.  We generally throw these into "Helped-by", I'd think.

Thanks.



Re: [RFC PATCH v5 2/4] add -p: select modified lines correctly

2018-07-27 Thread Phillip Wood
Hi Junio, thanks for the comments

On 26/07/18 20:30, Junio C Hamano wrote:
> Phillip Wood  writes:
> 
> An interesting problem you are solving ;-)
> 
>> For example given the hunk
>>   1 -* a longer description of the
>>   2 -  first item
>>   3 -* second
>>   4 -* third
>>   5 +* first
>>   6 +  second item
>>   7 +* the third item
>>
>> If the user selects 1,2,4–5,7 then we should generate
>>  -* a longer description of the
>>  -  first item
>>  +* first
>>   * second
>>  -* third
>>  +* the third item
> 
> I understood this example as "3 that is removal and 6 that is
> addition are excluded---we consider that these two lines (one in the
> pre-image and the other in the post-image) are _matching" As we> are 
> excluding a deletion, it becomes the common context line, and
> any removal or addition that appear before that must stay to happen
> before the common context line (i.e. removal of 1 and 2, and
> addition of 5, both precede common context line "second") and any
> removal or addition that appear after that must stay after the
> common context (i.e. removal of "third" and addition of "the third
> item" come after "second").
> 
> But then it is not clear to me what you mean by "group" below.  What
> groups does the above example have?  Ones before the retained
> "second" (i.e. removal 1, 2, 4 and addition 5) form one group and
> ones after it (i.e. removal 4 and addition 7) form another group?

The code actually looks at the lines that are selected rather than
omitted. So in the example above it groups them as [1,2] (because they
are contiguous), [4],[5] (these are split because one is an insertion
and one a deletion) and [7]. It then sees that there are two groups of
deletions ([1,2],[4]) and two groups of insertions ([5],[7]) and so
pairs up the deletions in [12] with the insertion in [5] and likewise
with [4] and [7]. Lines 3 and 6 are never explicitly paired, although
they basically behave as if they were. One the insertions are all paired
up it walks over the list and creates a new hunk where the paired
insertions come immediately after their corresponding deletions,
unselected deletions are converted to context lines and unselected
additions are dropped.

> 
>> Reported-by: Ævar Arnfjörð Bjarmason 
> 
> Is this fixing any bug?  I usually see "Reported-by" only for a
> bugfix patch but this seems to be adding a new feature (and lack of
> feature is usually not a bug).

I guess I meant that the previous series was effectively buggy as it
would give the wrong result for modified lines. I wanted to acknowledge
that Ævar spent some time testing it and pointed that out.

Best Wishes

Phillip



Re: [RFC PATCH v5 2/4] add -p: select modified lines correctly

2018-07-26 Thread Junio C Hamano
Phillip Wood  writes:

An interesting problem you are solving ;-)

> For example given the hunk
>   1 -* a longer description of the
>   2 -  first item
>   3 -* second
>   4 -* third
>   5 +* first
>   6 +  second item
>   7 +* the third item
>
> If the user selects 1,2,4–5,7 then we should generate
>   -* a longer description of the
>   -  first item
>   +* first
>* second
>   -* third
>   +* the third item

I understood this example as "3 that is removal and 6 that is
addition are excluded---we consider that these two lines (one in the
pre-image and the other in the post-image) are _matching".  As we
are excluding a deletion, it becomes the common context line, and
any removal or addition that appear before that must stay to happen
before the common context line (i.e. removal of 1 and 2, and
addition of 5, both precede common context line "second") and any
removal or addition that appear after that must stay after the
common context (i.e. removal of "third" and addition of "the third
item" come after "second").

But then it is not clear to me what you mean by "group" below.  What
groups does the above example have?  Ones before the retained
"second" (i.e. removal 1, 2, 4 and addition 5) form one group and
ones after it (i.e. removal 4 and addition 7) form another group?

> Reported-by: Ævar Arnfjörð Bjarmason 

Is this fixing any bug?  I usually see "Reported-by" only for a
bugfix patch but this seems to be adding a new feature (and lack of
feature is usually not a bug).


[RFC PATCH v5 2/4] add -p: select modified lines correctly

2018-07-26 Thread Phillip Wood
From: Phillip Wood 

When a set of lines is modified the hunk contains deletions followed
by insertions. To correctly stage a subset of the modified lines we
need to match up the selected deletions with the selected insertions
otherwise we end up with deletions and context lines followed by
insertions which is not what we want.

For example given the hunk
  1 -* a longer description of the
  2 -  first item
  3 -* second
  4 -* third
  5 +* first
  6 +  second item
  7 +* the third item

If the user selects 1,2,4–5,7 then we should generate
-* a longer description of the
-  first item
+* first
 * second
-* third
+* the third item

not
-* a longer description of the
-  first item
 * second
-* third
+* first
+* the third item

Currently the code can only cope with selections that contain the same
number of groups of insertions and deletions, though each group need
not contain the same number of insertions and deletions. If the user
wants to stage an unpaired deletion or insertion in a hunk where they
also want to stage modified lines they have to do it with two
invocations of 'git add -p'.

It would be possible to add some syntax to allow lines to be excluded
from groups to allow the user to stage such changes in a single
go. It may also be useful to allow users to explicitly group lines. If
in the example above the second item is deleted we have

  1 -* a longer description of the
  2 - first item
  3 -* second
  4 -* third
  5 +* first
  6 +* the third item

Selecting 1,2,4–6 will give an error. If lines could be grouped
explicitly then it would be possible to type something like
1,2,4,[5],6 to indicate that there are two groups of insertions giving

-* a longer description of the
- first item
+* first
 * second
-* third
+* the third item

We may want to be able to stage an insertion before an unselected
deletion to allow the user to stage a new paragraph before the
unmodified original in

  1 -original
  2 +a new paragraph before
  3 +original
  4 +
  5 +modified original

by specifying something like ^2-4 to give

+a new paragraph before
+original
+
 original

I'm not sure how common these cases are in real life and how much
effort it's worth putting into handling them at the moment when the
user can edit the hunk if need be. Perhaps it would be better to leave
it for future extensions when it becomes clearer what would be most
useful.

Reported-by: Ævar Arnfjörð Bjarmason 
Signed-off-by: Phillip Wood 
---
 git-add--interactive.perl  | 147 ++---
 t/t3701-add-interactive.sh | 108 ++-
 2 files changed, 242 insertions(+), 13 deletions(-)

diff --git a/git-add--interactive.perl b/git-add--interactive.perl
index cbc9e5698a..7e4daee2fc 100755
--- a/git-add--interactive.perl
+++ b/git-add--interactive.perl
@@ -1015,26 +1015,47 @@ sub color_diff {
 
 use constant {
NO_NEWLINE => 1,
+   LAST_ADD_DEL => 2,
+   FIRST_ADD => 4
 };
 
 sub label_hunk_lines {
my $hunk = shift;
my $text = $hunk->{TEXT};
-   my (@line_flags, @lines);
+   # A block contains the insertions and deletions occurring context
+   # lines.
+   my (@blocks, @line_flags, @lines, @modes);
my ($block, $label, $last_mode) = (0, 0, '');
for my $line (1..$#{$text}) {
$line_flags[$line] = 0;
my $mode = substr($text->[$line], 0, 1);
if ($mode eq '\\') {
$line_flags[$line - 1] |= NO_NEWLINE;
}
+   if ($mode ne '-' and $last_mode eq '-' or
+   $mode ne '+' and $last_mode eq '+') {
+   $line_flags[$line - 1] |= LAST_ADD_DEL;
+   }
+   if ($mode eq '+' and $last_mode ne '+') {
+   $line_flags[$line] |= FIRST_ADD;
+   }
if ($mode eq '-' or $mode eq '+') {
-   $lines[++$label] = $line;
+   $blocks[++$label] = $block;
+   $lines[$label] = $line;
+   $modes[$label] = $mode;
+   } elsif ($mode eq ' ' and $last_mode ne ' ') {
+   $block++;
}
+   $last_mode = $mode;
+   }
+   if ($last_mode eq '-' or $last_mode eq '+') {
+   $line_flags[-1] |= LAST_ADD_DEL;
}
if ($label > 1) {
$hunk->{LABELS} = {
+   BLOCKS => \@blocks,
LINES => \@lines,
+   MODES => \@modes
};
$hunk->{LINE_FLAGS} = \@line_flags;
return 1;
@@ -1061,11 +1082,14 @@ sub select_hunk_lines {
}
};
 
-   my ($lo,