Re: [PATCHv5 00/17] Diff machine: highlight moved lines.

2017-05-26 Thread Stefan Beller
On Thu, May 25, 2017 at 6:20 PM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> As you turn on/off normal coloring via "color.diff" and this only extends
>> the coloring scheme, I have the impression "color" is the right section.
>> Maybe color.diffStyle=[normal/enhanced] and "enhanced" means this
>> feature for now?
>
> Hmph, I thought the intent of color.diff is "is the diff command
> itself is colored?"  In other words, color.diff=false should give
> you monochrome if you say "diff --word-diff", etc.

Yes, but in my understanding the "diff" doesn't apply to
the command, but the part of the output. I arrived at that
understanding as other commands (show/log/..) also respect
that setting, so the "diff" in color.diff is not the command, but
referring to something else, the output being the closest match. :)

And by that understanding color.diffStyle is just natural?

But I think that setting would be a bad idea as we'd rather have
multiple uncorrelated settings for coloring, which a style argument
would not.

>> The only option in the "diff" section related to color is 
>> diff.wsErrorHighlight
>> which has a very similar purpose, so "diff.colorMoved" would fit in that
>> scheme.

With the above reasoning, this may be missnamed and should rather be
color.wsErrorHighlight as it applies to more than just the diff command.

Note: The average user may not aware that diff/show/log are tiny wrappers
around the same backend for the heavy lifting.

> I didn't have "should diff output highlight whitespace errors?" in
> mind when I wrote the message you are responding to, but yes, that
> is quite similar to "should diff output show lines moved and lines
> deleted/added differently?".
>
>> So with these questions, I wonder if we want to color moved lines
>> as "color.diff.context" (i.e. plain white text in the normal coloring scheme)
>> This would serve the intended purpose of
>> dimming the attention to moved lines.
>
> Yes, but two points.
>
>  (1) We want to do so while making it obvious where the boundary
>  between two moved blocks of text whose destination (for
>  moved-deleted lines) or source (for moved-added lines) is.

Yes, that is what the boundary color would accomplish.
Any two adjacent blocks with the same sign would have
their boundary line colored this way.

>  (2) My message was an impression from using the code to review a
>  patch that is meant to be "move without changing other things".

Ok, glad you found it somewhat useful already.

>  For other purposes, there may be cases where moved ones may
>  want to be highlighted, not dimmed.

I wonder what these use cases would be?
(barring a --find-copies harder extension that would not just search the
current diff, but the whole tree)

That hints at just having an extra slot for the moved block, but the default
color could be the same as color.diff.context for dimming.

By now I also think we may not need different colors for additions
or removals, but keeping two colors is easy enough.

>> Regarding the last point of marking up adjacent blocks (which would
>> indicate that there is a coherency issue or just moving from different
>> places), we could highlight the last line of the previous block
>> and first line of the next block in their "normal" colors (i.e.
>> color.diff.old and color.diff.new).
>
> Hmm.  That is an interesting thought.

I'll try the implementation for that and see if it looks good.

Thanks,
Stefan


Re: [PATCHv5 00/17] Diff machine: highlight moved lines.

2017-05-25 Thread Junio C Hamano
Stefan Beller  writes:

> As you turn on/off normal coloring via "color.diff" and this only extends
> the coloring scheme, I have the impression "color" is the right section.
> Maybe color.diffStyle=[normal/enhanced] and "enhanced" means this
> feature for now?

Hmph, I thought the intent of color.diff is "is the diff command
itself is colored?"  In other words, color.diff=false should give
you monochrome if you say "diff --word-diff", etc.

> The only option in the "diff" section related to color is 
> diff.wsErrorHighlight
> which has a very similar purpose, so "diff.colorMoved" would fit in that
> scheme.

I didn't have "should diff output highlight whitespace errors?" in
mind when I wrote the message you are responding to, but yes, that
is quite similar to "should diff output show lines moved and lines
deleted/added differently?".

> So with these questions, I wonder if we want to color moved lines
> as "color.diff.context" (i.e. plain white text in the normal coloring scheme)
> This would serve the intended purpose of
> dimming the attention to moved lines.

Yes, but two points.

 (1) We want to do so while making it obvious where the boundary
 between two moved blocks of text whose destination (for
 moved-deleted lines) or source (for moved-added lines) is.

 (2) My message was an impression from using the code to review a
 patch that is meant to be "move without changing other things".
 For other purposes, there may be cases where moved ones may
 want to be highlighted, not dimmed.

> Regarding the last point of marking up adjacent blocks (which would
> indicate that there is a coherency issue or just moving from different
> places), we could highlight the last line of the previous block
> and first line of the next block in their "normal" colors (i.e.
> color.diff.old and color.diff.new).

Hmm.  That is an interesting thought.


Re: [PATCHv5 00/17] Diff machine: highlight moved lines.

2017-05-25 Thread Stefan Beller
On Wed, May 24, 2017 at 11:44 PM, Junio C Hamano  wrote:
> I was trying to see how this is useful in code moving change, with
> this command line:
>
> $ git -c color.moved diff js/blame-lib~6 js/blame-lib blame.c blame.h 
> builtin/blame.c
>
> Some random observations:
>
>  * You do not seem to have any command line option yet.  I guess
>that is OK while the series is still in RFC state, but when we
>are ready to seriously start considering this for 'next', we'd
>need something like --color-moved that can be used across "diff",
>"log -p" and "show".

There is "--color-moved" as a command line option. (See diff.c:4290)
Oh, it is not documented! That will be fixed.

>  * As configuration variable names go, "color.moved" is probably in
>a wrong section.  Perhaps "diff.colorMoved" or something?

As you turn on/off normal coloring via "color.diff" and this only extends
the coloring scheme, I have the impression "color" is the right section.
Maybe color.diffStyle=[normal/enhanced] and "enhanced" means this
feature for now?

The only option in the "diff" section related to color is diff.wsErrorHighlight
which has a very similar purpose, so "diff.colorMoved" would fit in that
scheme.

>
>  * The fact that I am using
>
>  [diff "color"]
> old = red reverse
> whitespace = blue reverse
>
>on a "black ink on white paper" terminal might have an effect on
>this,

Yes very much!

>but lines printed in either bold green and on green
>background (i.e. not new ones but are the ones moved from
>elsewhere) stood out a lot more prominently than lines printed in
>green (i.e. truly new additions), and I felt that this is totally
>backwards from what I needed for this exercise.  That is, while
>reviewing a code moving change, I want to be able to concentrate
>primarily of three things:
>
>- What are the new lines that are *not* moved from elsewhere?
>  Did the submitter try to sneak in unrelated changes?
>
>- What are the changes that are truly lost, not moved to
>  elsewhere?
>
>- Do the lines moved from elsewhere form a coherent whole?  Where
>  is the boundary between blocks of text that are copied?  Do
>  adjacent two blocks of moved text come from the same old place
>  next to each other?

So with these questions, I wonder if we want to color moved lines
as "color.diff.context" (i.e. plain white text in the normal coloring scheme)
This would serve the intended purpose of
dimming the attention to moved lines.

Regarding the last point of marking up adjacent blocks (which would
indicate that there is a coherency issue or just moving from different
places), we could highlight the last line of the previous block
and first line of the next block in their "normal" colors (i.e.
color.diff.old and color.diff.new).

The very first version had some boundary coloring, but then
I switched to alternating block coloring based on an idea
by Jonathan Tan.

Maybe it is time to go back to boundary coloring, but optional
and apply the boundary color only if two blocks are adjacent?

Example of how it could look:
---8<---
diff --git a/poetry.txt b/poetry.txt
index 9d32b3b..cc50ca1 100644
--- a/poetry.txt
+++ b/poetry.txt
@@ -1,12 +1,4 @@
[W] -A simple text is
[W] -written in paragraphs and
[W] -many more lines.
[R] -
[W]  A diff is smaller
[W]  than the pre or post image text form
[W]  used for review
[W]
[W] -In between focus
[W] -is hard to keep consistently
[W] -errors may occur
[W] -
diff --git a/engineer.txt b/engineer.txt
new file mode 100644
index 000..8fbc0ce
--- /dev/null
+++ b/engineer.txt
@@ -0,0 +1,7 @@
[W] +A simple text is
[W] +written in paragraphs and
[B] +many more lines.
[B] +In between focus
[W] +is hard to keep consistently
[W] +errors may occur
[W] +
---8<---

W -> simple White text (diff.color.context)
R -> Red text (diff.color.old)
B -> Boundary marker (Maybe just diff.color.{old/new}
  or a new color to configure)

>
>Using colors that stand out more prominently than for the regular
>new/old lines is counter-productive for all of these, especially
>for the first two purposes.  I may suggest using cyan (or any
>color that is very close to the background) so that the presense
>of moved lines are merely felt without being distracting.  IOW,
>while reviewing code moving patch, moved parts want to be dimmed,
>not highlighted.

I agree. So I could resend the algorithm used with other defaults
or try out the "boundary only iff adjacent blocks, else context color".

Thanks,
Stefan


Re: [PATCHv5 00/17] Diff machine: highlight moved lines.

2017-05-24 Thread Junio C Hamano
I was trying to see how this is useful in code moving change, with
this command line:

$ git -c color.moved diff js/blame-lib~6 js/blame-lib blame.c blame.h 
builtin/blame.c

Some random observations:

 * You do not seem to have any command line option yet.  I guess
   that is OK while the series is still in RFC state, but when we
   are ready to seriously start considering this for 'next', we'd
   need something like --color-moved that can be used across "diff",
   "log -p" and "show".

 * As configuration variable names go, "color.moved" is probably in
   a wrong section.  Perhaps "diff.colorMoved" or something?

 * The fact that I am using

 [diff "color"] 
old = red reverse
whitespace = blue reverse

   on a "black ink on white paper" terminal might have an effect on
   this, but lines printed in either bold green and on green
   background (i.e. not new ones but are the ones moved from
   elsewhere) stood out a lot more prominently than lines printed in
   green (i.e. truly new additions), and I felt that this is totally
   backwards from what I needed for this exercise.  That is, while
   reviewing a code moving change, I want to be able to concentrate
   primarily of three things:

   - What are the new lines that are *not* moved from elsewhere?
 Did the submitter try to sneak in unrelated changes?

   - What are the changes that are truly lost, not moved to
 elsewhere?

   - Do the lines moved from elsewhere form a coherent whole?  Where
 is the boundary between blocks of text that are copied?  Do
 adjacent two blocks of moved text come from the same old place
 next to each other?

   Using colors that stand out more prominently than for the regular
   new/old lines is counter-productive for all of these, especially
   for the first two purposes.  I may suggest using cyan (or any
   color that is very close to the background) so that the presense
   of moved lines are merely felt without being distracting.  IOW,
   while reviewing code moving patch, moved parts want to be dimmed,
   not highlighted.



[PATCHv5 00/17] Diff machine: highlight moved lines.

2017-05-24 Thread Stefan Beller
v5:
* removed the color passing to the submodule to make the tests pass again.
* fixed an indentation issue that was introduced from v3 -> v4.
* I merged it with origin/next and tests pass here.

Thanks,
Stefan

diff to v4:
diff --git a/diff.c b/diff.c
index 23e70d348e..1292d3c4ad 100644
--- a/diff.c
+++ b/diff.c
@@ -751,7 +751,7 @@ static void mark_color_as_moved(struct diff_options *o,
 }
 
 static void emit_diff_line(struct diff_options *o,
-struct diff_line *e)
+  struct diff_line *e)
 {
const char *ws;
int has_trailing_newline, has_trailing_carriage_return;
@@ -804,7 +804,7 @@ static void emit_diff_line(struct diff_options *o,
 }
 
 static void append_diff_line(struct diff_options *o,
-  struct diff_line *e)
+struct diff_line *e)
 {
struct diff_line *f;
ALLOC_GROW(o->line_buffer,
diff --git a/submodule.c b/submodule.c
index 428c996c97..19c63197fb 100644
--- a/submodule.c
+++ b/submodule.c
@@ -550,8 +550,6 @@ void show_submodule_inline_diff(struct diff_options *o, 
const char *path,
 
/* TODO: other options may need to be passed here. */
argv_array_push(&cp.args, "diff");
-   if (o->use_color)
-   argv_array_push(&cp.args, "--color=always");
argv_array_pushf(&cp.args, "--line-prefix=%s", diff_line_prefix(o));
if (DIFF_OPT_TST(o, REVERSE_DIFF)) {
argv_array_pushf(&cp.args, "--src-prefix=%s%s/",

v4:
* interdiff to v3 (what is currently origin/sb/diff-color-move) below.
* renamed the "buffered_patch_line" to "diff_line". Originally I planned
  to not carry the "line" part as it can be a piece of a line as well.
  But for the intended functionality it is best to keep the name.
  If we'd want to add more functionality to say have a move detection
  for words as well, we'd rename the struct to have a better name then.
  For now diff_line is the best. (Thanks Jonathan Nieder!)
* tests to demonstrate it doesn't mess with --color-words as well as
  submodules. (Thanks Jonathan Tan!)
* added in the statics (Thanks Ramsay!)
* smaller scope for the hashmaps (Thanks Jonathan Tan!)
* some commit messages were updated, prior patch 4-7 is squashed into one
  (Thanks Jonathan Tan!)
* the tests added revealed an actual fault: now that the submodule process
  is not attached to a dupe of our stdout, it would stop coloring the
  output. We need to pass on use-color explicitly.
* updated the NEEDSWORK comment in the second last patch.

Thanks for bearing,
Stefan

v3:
* see interdiff below.
* fixing one invalid computation (Thanks Junio!)
* I reasoned more about submodule and word diffing, see the commit message
  of the last patch:
  
A note on the options '--submodule=diff' and '--color-words/--word-diff':
In the conversion to use emit_line in the prior patches both submodules
as well as word diff output carefully chose to call emit_line with sign=0.
All output with sign=0 is ignored for move detection purposes in this
patch, such that no weird looking output will be generated for these
cases. This leads to another thought: We could pass on '--color-moved' to
submodules such that they color up moved lines for themselves. If we'd do
so only line moves within a repository boundary are marked up.

* better name for emit_line outside of diff.[ch]

v2:
* emit_line now takes an argument that indicates if we want it
  to emit the line prefix as well. This should allow for a more faithful
  refactoring in the beginning. (Thanks Jonathan!)
* fixed memleaks (Thanks Brandon!)
* "git -c color.moved=true log -p" works now! (Thanks Jeff)
* interdiff below, though it is large.
* less intrusive than v1 (Thanks Jonathan!)

v1:

For details on *why* see the commit message of the last commit.

The first five patches are slight refactorings to get into good
shape, the next patches are funneling all output through emit_line_*.

The second last patch introduces an option to buffer up all output
before printing, and then the last patch can color up moved lines
of code.

Any feedback welcome.

Thanks,
Stefan

Stefan Beller (17):
  diff: readability fix
  diff: move line ending check into emit_hunk_header
  diff.c: factor out diff_flush_patch_all_file_pairs
  diff: introduce more flexible emit function
  diff.c: convert fn_out_consume to use emit_line
  diff.c: convert builtin_diff to use emit_line_*
  diff.c: convert emit_rewrite_diff to use emit_line_*
  diff.c: convert emit_rewrite_lines to use emit_line_*
  submodule.c: convert show_submodule_summary to use emit_line_fmt
  diff.c: convert emit_binary_diff_body to use emit_line_*
  diff.c: convert show_stats to use emit_line_*
  diff.c: convert word diffing to use emit_line_*
  diff.c: convert diff_flush to use emit_line_*
  diff.c: convert diff_summary to use emit_line_*
  diff.c: emit_line includes whitespace highlighting
  diff: buffer all ou