Re: What's cooking in git.git (Jun 2017, #03; Mon, 5)

2017-06-14 Thread Stefan Beller
On Wed, Jun 14, 2017 at 2:54 AM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>> The color experts agreed that (3) might be the best solution
>> as this gives most flexibility:
>>
>> "I would be happy as I can configure the bounds highlighting
>> to not exist, it would degenerate to a pure Zebra, which is
>> very simple to understand. Junio seemed to like (2) a lot, so
>> he would configure both dim colors to be 'context', but configure
>> the highlight colors to be attention drawing. So everybody would
>> be happy. It is also not too many colors, we are good at for loops."
>
> Another thing I found a bit confusing in the description of choices
> in the documentation was that description for some began with "Based
> on X.", and as a plain reader, I couldn't tell if that is saying
> "the implementation happens to be similar or shares code with X"
> (which is not all that interesting to the end user) or "the meaning
> this mode tries to convey is the same as X but the presentation is a
> bit different" (in which case the end user is hinted that it is
> benefitial to understand what informacion the mode X shows and how).

it is the latter, as I wanted to save typing. Maybe it was savings
at the wrong place.

>
> For example, I view what I prefer (i.e. (2)) as a variant of Zebra
> (i.e. (1)).

Ok, so in a reroll I'll add Zebra as the base (It conveys all
information that we have, any further modifications is just giving
a better finish to the painting.) and then make commits on top of it
adding the taping and water-dunking method on top.

By organizing it in multiple commits the algorithm may become clearer.

> Conceptually, you paint the diff output using water
> soluble paint into a Zebra pattern, then apply thin strips of
> protective tape to places where two Zebra colors are adjacent to
> each other (i.e. do not cover the boundary between a block of a
> Zebra colored moved lines and a block of context lines), dunk the
> whole thing in water and then remove the strips of tape.  Regions
> covered by the strips of tape will retain the Zebra colors, while
> the remainder of the Zebra colored part are colored in a much subdued
> way.  Understanding how Zebra mode marks the moved lines would help
> understanding its output, but your implementation may not share much
> code with the actual implementation of Zebra-painting.
>
> Thanks.
>

Thanks,
Stefan


Re: What's cooking in git.git (Jun 2017, #03; Mon, 5)

2017-06-14 Thread Junio C Hamano
Stefan Beller  writes:

> The color experts agreed that (3) might be the best solution
> as this gives most flexibility:
>
> "I would be happy as I can configure the bounds highlighting
> to not exist, it would degenerate to a pure Zebra, which is
> very simple to understand. Junio seemed to like (2) a lot, so
> he would configure both dim colors to be 'context', but configure
> the highlight colors to be attention drawing. So everybody would
> be happy. It is also not too many colors, we are good at for loops."

Another thing I found a bit confusing in the description of choices
in the documentation was that description for some began with "Based
on X.", and as a plain reader, I couldn't tell if that is saying
"the implementation happens to be similar or shares code with X"
(which is not all that interesting to the end user) or "the meaning
this mode tries to convey is the same as X but the presentation is a
bit different" (in which case the end user is hinted that it is
benefitial to understand what informacion the mode X shows and how).

For example, I view what I prefer (i.e. (2)) as a variant of Zebra
(i.e. (1)).  Conceptually, you paint the diff output using water
soluble paint into a Zebra pattern, then apply thin strips of
protective tape to places where two Zebra colors are adjacent to
each other (i.e. do not cover the boundary between a block of a
Zebra colored moved lines and a block of context lines), dunk the
whole thing in water and then remove the strips of tape.  Regions
covered by the strips of tape will retain the Zebra colors, while
the remainder of the Zebra colored part are colored in a much subdued
way.  Understanding how Zebra mode marks the moved lines would help
understanding its output, but your implementation may not share much
code with the actual implementation of Zebra-painting.

Thanks.



Re: What's cooking in git.git (Jun 2017, #03; Mon, 5)

2017-06-13 Thread Stefan Beller
On Wed, Jun 7, 2017 at 10:41 PM, Jacob Keller  wrote:
> On Mon, Jun 5, 2017 at 11:52 PM, Jacob Keller  wrote:
>>
>> I will try to find some time tomorrow to go over it in detail.
>>

https://public-inbox.org/git/20170613023151.9688-1-sbel...@google.com/
restarted the discussion on this feature, so I assembled a team of color experts
(my office mates), and we came to this conclusion more or less:

There are different ways to implement a move detection. Different
  people prefer different things, the main contenders are:

1)"Zebra mode" (in the last patch known as "alternate" mode.

This alternates between 2 colors for blocks of moved code.
This may or may not reset to the first color, but the main features are:
a) the whole block is in one color
b) adjacent blocks have different colors.

For this implementation we need 4 new colors (2 colors for +, 2 for -)

2)"Highlight bounds + dimming in between"

A block is generally in a dim color. Depending on further configuration
the boundaries are highlighted. There are different modes for that:
a) all boundaries are highlighted
b) bounds adjacent to other moved blocks of same sign are highlighted
c) no bounds highlighted, moved code is only dimmed.
(sucks for seeing blocks)
d) [NEW] only the first (last) border line.
This is consistent with the thread that started the discussion as
this is done in the RFC for blame.

  The configured colors "oldMoved" and "oldMovedAlternative"
  are assumed that one is highlighting and the other is the dim color.

  Depending on the exact implementation we'd need up to 6 new
  colors (dim, highlight, highlight alternative for each + and -)
  The user may configure these to be the same though, e.g.
  both dims could be 'context'.

3) Combine 1) and 2)

The algorithm would look like this:
a) Find moved blocks
b) Take one option from 2) to highlight the bounds. (all bounds
  or adjacent bounds are the only serious contenders IMHO)

  This needs 8 new colors configured:
  for each of (new, old):
for each of (Zebra black, Zebra white):
  for each of (dim, highlight):
define_color()

The color experts agreed that (3) might be the best solution
as this gives most flexibility:

"I would be happy as I can configure the bounds highlighting
to not exist, it would degenerate to a pure Zebra, which is
very simple to understand. Junio seemed to like (2) a lot, so
he would configure both dim colors to be 'context', but configure
the highlight colors to be attention drawing. So everybody would
be happy. It is also not too many colors, we are good at for loops."

I hope to have summarized this adequately.

Thanks,
Stefan


Re: What's cooking in git.git (Jun 2017, #03; Mon, 5)

2017-06-07 Thread Jacob Keller
On Mon, Jun 5, 2017 at 11:52 PM, Jacob Keller  wrote:
>
> I will try to find some time tomorrow to go over it in detail.
>
> Thanks,
> Jake

For the record, I got pulled into a project at work, so I won't have
spare time to look into this for the near future. :( Probably not
until next week.

Thanks,
Jake


Re: What's cooking in git.git (Jun 2017, #03; Mon, 5)

2017-06-07 Thread Stefan Beller
On Wed, Jun 7, 2017 at 2:58 PM, Ævar Arnfjörð Bjarmason
 wrote:
> On Wed, Jun 7, 2017 at 8:28 PM, Stefan Beller  wrote:
>> On Tue, Jun 6, 2017 at 3:05 PM, Jacob Keller  wrote:
>>> On Tue, Jun 6, 2017 at 2:50 AM, Michael Haggerty  
>>> wrote:
 On Mon, Jun 5, 2017 at 8:23 PM, Stefan Beller  wrote:
>
> > [...]
> >  "git diff" has been taught to optionally paint new lines that are
> >  the same as deleted lines elsewhere differently from genuinely new
> >  lines.
> >
> >  Are we happy with these changes?


 I've been studiously ignoring this patch series due to lack of bandwidth.

> [...]
> Things to come, but not in this series as they are more advanced:
>
> Discuss if a block/line needs a minimum requirement.
>
> When doing reviews with this series, a couple of lines such
> as "\t\t}" were marked as a moved, which is not wrong as they
> really occurred in the text with opposing sign.
> But it was annoying as it drew my attention to just closing
> braces, which IMO is not the point of code review.
>
> To solve this issue I had the idea of a "minimum requirement", e.g.
> * at least 3 consecutive lines or
> * at least one line with at least 3 non-ws characters or
> * compute the entropy of a given moved block and if it is too low, do
>   not mark it up.

 Shooting from the hip here...

 It seems obvious that for a line to be marked as moved, a minimum
 requirement is that

 1. The line appears as both "+" and "-".

 That doesn't seem strong enough evidence though, and if that is the
 only criterion, I would expect a lot of boilerplate lines like "\t\t}"
 to be marked as moved. It seems like a lot of noise could be
 eliminated by *also* requiring that

 2a. The line doesn't appear elsewhere in the file(s) concerned.
>>
>> 'elsewhere' in the opposing sign (+,-) or all the diff (including ' ' 
>> context)?
>>
>> This rule opens up the discussion on multi-copies, which I imagine
>> happens a lot in configuration files. So say you have a prod and staging
>> environment, then you might be tempted to make patches titled as:
>>   "1. preparation: duplicate common code into prod and staging"
>>   "2. Make an actual change to staging"
>>
>> For 1. you still want to see that there is faithful copy, but we'd have
>> 2 postimages having these lines.
>>
>> Also what about de-duplication?
>> I just stumbled upon edb0c72428 ([PATCH] diff: consolidate test
>> helper script pieces., 2005-05-31) for unrelated reasons,
>> but the move coloring of the same content multiple times
>> helped me there to focus on the relevant part.
>>

 Rule (2a) would probably get rid of most boilerplate lines without
 having to try to measure entropy.
>>
>> But it would also get rid of good use cases when not being very careful.
>> I intentionally left out the (2a) as I am not yet sure how the move
>> detection for multiple occurrences in post and preimage should
>> work in the desired case. The suppression of little-entropy closing braces
>> might be a side effect of just this. Or it can be treated separately.
>>

 Maybe you are already using both criteria? I didn't see it in a quick
 perusal of the code.

 OTOH, it would be silly to refuse to mark lines like "\t\t}" as moved
 *only* because they appear elsewhere in the file(s). If you did so,
 you would have gaps of supposedly non-moved lines in the middle of
 moved blocks. This suggests marking as moved lines matching (1) and
 (2a) but also lines matching (1) and the following:

 2b. The line is adjacent to to another line that is thought to have
 moved from the same old location to the same new location.
>>
>> This is what we do, a "block detection" by comparing "line runs" against
>> the current lines. Based on these line runs we detect one block and
>> color up adjacent blocks.
>>

 Rule (2b) would be applied recursively, with the net effect being that
 any line satisfying (1) and (2a) is allowed to carry along any
 neighboring lines within the same "+"/"-" block even if they are not
 unique.
>>
>> So you are saying each block has to have at least one unique line?
>> That doesn't go well with (de-)duplication IMHO.
>>
>> Thanks for your shot from the hip. I'll think about these rules more to see
>> if I can make sense of them for duplication still.
>
> I've just been skimming this topic so far, but a question, what variant of:
>
> git diff ... | grep ...
>
> Can I use to see whether the diff that's being emitted has hunks
> marked as moved? Presumably this needs -c ui.color=always & grepping
> for the color codes.
>
> The use-case being to say add that diff | grep -q to a for-loop to
> find all diffs in a repo that have hunks marked as 

Re: What's cooking in git.git (Jun 2017, #03; Mon, 5)

2017-06-07 Thread Ævar Arnfjörð Bjarmason
On Wed, Jun 7, 2017 at 8:28 PM, Stefan Beller  wrote:
> On Tue, Jun 6, 2017 at 3:05 PM, Jacob Keller  wrote:
>> On Tue, Jun 6, 2017 at 2:50 AM, Michael Haggerty  
>> wrote:
>>> On Mon, Jun 5, 2017 at 8:23 PM, Stefan Beller  wrote:

 > [...]
 >  "git diff" has been taught to optionally paint new lines that are
 >  the same as deleted lines elsewhere differently from genuinely new
 >  lines.
 >
 >  Are we happy with these changes?
>>>
>>>
>>> I've been studiously ignoring this patch series due to lack of bandwidth.
>>>
 [...]
 Things to come, but not in this series as they are more advanced:

 Discuss if a block/line needs a minimum requirement.

 When doing reviews with this series, a couple of lines such
 as "\t\t}" were marked as a moved, which is not wrong as they
 really occurred in the text with opposing sign.
 But it was annoying as it drew my attention to just closing
 braces, which IMO is not the point of code review.

 To solve this issue I had the idea of a "minimum requirement", e.g.
 * at least 3 consecutive lines or
 * at least one line with at least 3 non-ws characters or
 * compute the entropy of a given moved block and if it is too low, do
   not mark it up.
>>>
>>> Shooting from the hip here...
>>>
>>> It seems obvious that for a line to be marked as moved, a minimum
>>> requirement is that
>>>
>>> 1. The line appears as both "+" and "-".
>>>
>>> That doesn't seem strong enough evidence though, and if that is the
>>> only criterion, I would expect a lot of boilerplate lines like "\t\t}"
>>> to be marked as moved. It seems like a lot of noise could be
>>> eliminated by *also* requiring that
>>>
>>> 2a. The line doesn't appear elsewhere in the file(s) concerned.
>
> 'elsewhere' in the opposing sign (+,-) or all the diff (including ' ' 
> context)?
>
> This rule opens up the discussion on multi-copies, which I imagine
> happens a lot in configuration files. So say you have a prod and staging
> environment, then you might be tempted to make patches titled as:
>   "1. preparation: duplicate common code into prod and staging"
>   "2. Make an actual change to staging"
>
> For 1. you still want to see that there is faithful copy, but we'd have
> 2 postimages having these lines.
>
> Also what about de-duplication?
> I just stumbled upon edb0c72428 ([PATCH] diff: consolidate test
> helper script pieces., 2005-05-31) for unrelated reasons,
> but the move coloring of the same content multiple times
> helped me there to focus on the relevant part.
>
>>>
>>> Rule (2a) would probably get rid of most boilerplate lines without
>>> having to try to measure entropy.
>
> But it would also get rid of good use cases when not being very careful.
> I intentionally left out the (2a) as I am not yet sure how the move
> detection for multiple occurrences in post and preimage should
> work in the desired case. The suppression of little-entropy closing braces
> might be a side effect of just this. Or it can be treated separately.
>
>>>
>>> Maybe you are already using both criteria? I didn't see it in a quick
>>> perusal of the code.
>>>
>>> OTOH, it would be silly to refuse to mark lines like "\t\t}" as moved
>>> *only* because they appear elsewhere in the file(s). If you did so,
>>> you would have gaps of supposedly non-moved lines in the middle of
>>> moved blocks. This suggests marking as moved lines matching (1) and
>>> (2a) but also lines matching (1) and the following:
>>>
>>> 2b. The line is adjacent to to another line that is thought to have
>>> moved from the same old location to the same new location.
>
> This is what we do, a "block detection" by comparing "line runs" against
> the current lines. Based on these line runs we detect one block and
> color up adjacent blocks.
>
>>>
>>> Rule (2b) would be applied recursively, with the net effect being that
>>> any line satisfying (1) and (2a) is allowed to carry along any
>>> neighboring lines within the same "+"/"-" block even if they are not
>>> unique.
>
> So you are saying each block has to have at least one unique line?
> That doesn't go well with (de-)duplication IMHO.
>
> Thanks for your shot from the hip. I'll think about these rules more to see
> if I can make sense of them for duplication still.

I've just been skimming this topic so far, but a question, what variant of:

git diff ... | grep ...

Can I use to see whether the diff that's being emitted has hunks
marked as moved? Presumably this needs -c ui.color=always & grepping
for the color codes.

The use-case being to say add that diff | grep -q to a for-loop to
find all diffs in a repo that have hunks marked as moved.


Re: What's cooking in git.git (Jun 2017, #03; Mon, 5)

2017-06-07 Thread Stefan Beller
On Tue, Jun 6, 2017 at 3:05 PM, Jacob Keller  wrote:
> On Tue, Jun 6, 2017 at 2:50 AM, Michael Haggerty  wrote:
>> On Mon, Jun 5, 2017 at 8:23 PM, Stefan Beller  wrote:
>>>
>>> > [...]
>>> >  "git diff" has been taught to optionally paint new lines that are
>>> >  the same as deleted lines elsewhere differently from genuinely new
>>> >  lines.
>>> >
>>> >  Are we happy with these changes?
>>
>>
>> I've been studiously ignoring this patch series due to lack of bandwidth.
>>
>>> [...]
>>> Things to come, but not in this series as they are more advanced:
>>>
>>> Discuss if a block/line needs a minimum requirement.
>>>
>>> When doing reviews with this series, a couple of lines such
>>> as "\t\t}" were marked as a moved, which is not wrong as they
>>> really occurred in the text with opposing sign.
>>> But it was annoying as it drew my attention to just closing
>>> braces, which IMO is not the point of code review.
>>>
>>> To solve this issue I had the idea of a "minimum requirement", e.g.
>>> * at least 3 consecutive lines or
>>> * at least one line with at least 3 non-ws characters or
>>> * compute the entropy of a given moved block and if it is too low, do
>>>   not mark it up.
>>
>> Shooting from the hip here...
>>
>> It seems obvious that for a line to be marked as moved, a minimum
>> requirement is that
>>
>> 1. The line appears as both "+" and "-".
>>
>> That doesn't seem strong enough evidence though, and if that is the
>> only criterion, I would expect a lot of boilerplate lines like "\t\t}"
>> to be marked as moved. It seems like a lot of noise could be
>> eliminated by *also* requiring that
>>
>> 2a. The line doesn't appear elsewhere in the file(s) concerned.

'elsewhere' in the opposing sign (+,-) or all the diff (including ' ' context)?

This rule opens up the discussion on multi-copies, which I imagine
happens a lot in configuration files. So say you have a prod and staging
environment, then you might be tempted to make patches titled as:
  "1. preparation: duplicate common code into prod and staging"
  "2. Make an actual change to staging"

For 1. you still want to see that there is faithful copy, but we'd have
2 postimages having these lines.

Also what about de-duplication?
I just stumbled upon edb0c72428 ([PATCH] diff: consolidate test
helper script pieces., 2005-05-31) for unrelated reasons,
but the move coloring of the same content multiple times
helped me there to focus on the relevant part.

>>
>> Rule (2a) would probably get rid of most boilerplate lines without
>> having to try to measure entropy.

But it would also get rid of good use cases when not being very careful.
I intentionally left out the (2a) as I am not yet sure how the move
detection for multiple occurrences in post and preimage should
work in the desired case. The suppression of little-entropy closing braces
might be a side effect of just this. Or it can be treated separately.

>>
>> Maybe you are already using both criteria? I didn't see it in a quick
>> perusal of the code.
>>
>> OTOH, it would be silly to refuse to mark lines like "\t\t}" as moved
>> *only* because they appear elsewhere in the file(s). If you did so,
>> you would have gaps of supposedly non-moved lines in the middle of
>> moved blocks. This suggests marking as moved lines matching (1) and
>> (2a) but also lines matching (1) and the following:
>>
>> 2b. The line is adjacent to to another line that is thought to have
>> moved from the same old location to the same new location.

This is what we do, a "block detection" by comparing "line runs" against
the current lines. Based on these line runs we detect one block and
color up adjacent blocks.

>>
>> Rule (2b) would be applied recursively, with the net effect being that
>> any line satisfying (1) and (2a) is allowed to carry along any
>> neighboring lines within the same "+"/"-" block even if they are not
>> unique.

So you are saying each block has to have at least one unique line?
That doesn't go well with (de-)duplication IMHO.

Thanks for your shot from the hip. I'll think about these rules more to see
if I can make sense of them for duplication still.

Thanks,
Stefan


Re: What's cooking in git.git (Jun 2017, #03; Mon, 5)

2017-06-06 Thread Jacob Keller
On Tue, Jun 6, 2017 at 2:50 AM, Michael Haggerty  wrote:
> On Mon, Jun 5, 2017 at 8:23 PM, Stefan Beller  wrote:
>>
>> > [...]
>> >  "git diff" has been taught to optionally paint new lines that are
>> >  the same as deleted lines elsewhere differently from genuinely new
>> >  lines.
>> >
>> >  Are we happy with these changes?
>
>
> I've been studiously ignoring this patch series due to lack of bandwidth.
>
>> [...]
>> Things to come, but not in this series as they are more advanced:
>>
>> Discuss if a block/line needs a minimum requirement.
>>
>> When doing reviews with this series, a couple of lines such
>> as "\t\t}" were marked as a moved, which is not wrong as they
>> really occurred in the text with opposing sign.
>> But it was annoying as it drew my attention to just closing
>> braces, which IMO is not the point of code review.
>>
>> To solve this issue I had the idea of a "minimum requirement", e.g.
>> * at least 3 consecutive lines or
>> * at least one line with at least 3 non-ws characters or
>> * compute the entropy of a given moved block and if it is too low, do
>>   not mark it up.
>
> Shooting from the hip here...
>
> It seems obvious that for a line to be marked as moved, a minimum
> requirement is that
>
> 1. The line appears as both "+" and "-".
>
> That doesn't seem strong enough evidence though, and if that is the
> only criterion, I would expect a lot of boilerplate lines like "\t\t}"
> to be marked as moved. It seems like a lot of noise could be
> eliminated by *also* requiring that
>
> 2a. The line doesn't appear elsewhere in the file(s) concerned.
>
> Rule (2a) would probably get rid of most boilerplate lines without
> having to try to measure entropy.
>
> Maybe you are already using both criteria? I didn't see it in a quick
> perusal of the code.
>
> OTOH, it would be silly to refuse to mark lines like "\t\t}" as moved
> *only* because they appear elsewhere in the file(s). If you did so,
> you would have gaps of supposedly non-moved lines in the middle of
> moved blocks. This suggests marking as moved lines matching (1) and
> (2a) but also lines matching (1) and the following:
>
> 2b. The line is adjacent to to another line that is thought to have
> moved from the same old location to the same new location.
>
> Rule (2b) would be applied recursively, with the net effect being that
> any line satisfying (1) and (2a) is allowed to carry along any
> neighboring lines within the same "+"/"-" block even if they are not
> unique.
>
> Michael

This sounds reasonable to me, though I'm not sure how easy it is to implement.

Thanks,
Jake


Re: What's cooking in git.git (Jun 2017, #03; Mon, 5)

2017-06-06 Thread Michael Haggerty
On Mon, Jun 5, 2017 at 8:23 PM, Stefan Beller  wrote:
>
> > [...]
> >  "git diff" has been taught to optionally paint new lines that are
> >  the same as deleted lines elsewhere differently from genuinely new
> >  lines.
> >
> >  Are we happy with these changes?


I've been studiously ignoring this patch series due to lack of bandwidth.

> [...]
> Things to come, but not in this series as they are more advanced:
>
> Discuss if a block/line needs a minimum requirement.
>
> When doing reviews with this series, a couple of lines such
> as "\t\t}" were marked as a moved, which is not wrong as they
> really occurred in the text with opposing sign.
> But it was annoying as it drew my attention to just closing
> braces, which IMO is not the point of code review.
>
> To solve this issue I had the idea of a "minimum requirement", e.g.
> * at least 3 consecutive lines or
> * at least one line with at least 3 non-ws characters or
> * compute the entropy of a given moved block and if it is too low, do
>   not mark it up.

Shooting from the hip here...

It seems obvious that for a line to be marked as moved, a minimum
requirement is that

1. The line appears as both "+" and "-".

That doesn't seem strong enough evidence though, and if that is the
only criterion, I would expect a lot of boilerplate lines like "\t\t}"
to be marked as moved. It seems like a lot of noise could be
eliminated by *also* requiring that

2a. The line doesn't appear elsewhere in the file(s) concerned.

Rule (2a) would probably get rid of most boilerplate lines without
having to try to measure entropy.

Maybe you are already using both criteria? I didn't see it in a quick
perusal of the code.

OTOH, it would be silly to refuse to mark lines like "\t\t}" as moved
*only* because they appear elsewhere in the file(s). If you did so,
you would have gaps of supposedly non-moved lines in the middle of
moved blocks. This suggests marking as moved lines matching (1) and
(2a) but also lines matching (1) and the following:

2b. The line is adjacent to to another line that is thought to have
moved from the same old location to the same new location.

Rule (2b) would be applied recursively, with the net effect being that
any line satisfying (1) and (2a) is allowed to carry along any
neighboring lines within the same "+"/"-" block even if they are not
unique.

Michael


Re: What's cooking in git.git (Jun 2017, #03; Mon, 5)

2017-06-06 Thread Jacob Keller
On Mon, Jun 5, 2017 at 6:10 PM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>>>  "git diff" has been taught to optionally paint new lines that are
>>>  the same as deleted lines elsewhere differently from genuinely new
>>>  lines.
>>>
>>>  Are we happy with these changes?
>>
>> I advertised this series e.g. for reviewing Brandons
>> repo object refactoring series and used it myself to inspect
>> some patches there[1]. I am certainly happy (but biased) with
>> what we have available there.
>>
>> Jacob intended to use this series
>> for review as well, but has given no opinion yet.
>>
>> You seemed to have used it for js/blame-lib?
>>
>> --
>> Those patches had a wide reviewer audience cc'd,
>> so I would think people are aware of this series.
>
> I tried to, yes.  I haven't had a chance to see how well the current
> iteration fares "does the externally-visible goal make sense?" test.
>
> I do not think I saw a negative "an approach to show this kind of
> output would not be useful" reaction, so I assume at least people
> would want an alternative output format that would help reviewing a
> change that moves blocks of lines around.
>
> In any case, that is not a review.  A patch series wanting to do a
> good thing, and people agreeing that the externally visible effect
> it produces matches that good thing, is one thing.  A review that
> makes sure the code achieves the externally visible effect well
> (e.g. without overly inefficient algorithm, without buffer overflows
> or underflows, off-by-ones, etc.) is another thing, and I haven't
> seen anybody going with fine toothed comb to do that kind of review,
> hence my "are we happy?" inquiry.
>

I will try to find some time tomorrow to go over it in detail.

Thanks,
Jake


Re: What's cooking in git.git (Jun 2017, #03; Mon, 5)

2017-06-06 Thread Jacob Keller
On Mon, Jun 5, 2017 at 11:23 AM, Stefan Beller  wrote:
>> * sb/diff-color-move (2017-06-01) 17 commits
>>  - diff.c: color moved lines differently
>>  - diff: buffer all output if asked to
>>  - diff.c: emit_line includes whitespace highlighting
>>  - diff.c: convert diff_summary to use emit_line_*
>>  - diff.c: convert diff_flush to use emit_line_*
>>  - diff.c: convert word diffing to use emit_line_*
>>  - diff.c: convert show_stats to use emit_line_*
>>  - diff.c: convert emit_binary_diff_body to use emit_line_*
>>  - submodule.c: convert show_submodule_summary to use emit_line_fmt
>>  - diff.c: convert emit_rewrite_lines to use emit_line_*
>>  - diff.c: convert emit_rewrite_diff to use emit_line_*
>>  - diff.c: convert builtin_diff to use emit_line_*
>>  - diff.c: convert fn_out_consume to use emit_line
>>  - diff: introduce more flexible emit function
>>  - diff.c: factor out diff_flush_patch_all_file_pairs
>>  - diff: move line ending check into emit_hunk_header
>>  - diff: readability fix
>>
>>  "git diff" has been taught to optionally paint new lines that are
>>  the same as deleted lines elsewhere differently from genuinely new
>>  lines.
>>
>>  Are we happy with these changes?
>
> I advertised this series e.g. for reviewing Brandons
> repo object refactoring series and used it myself to inspect
> some patches there[1]. I am certainly happy (but biased) with
> what we have available there.
>
> Jacob intended to use this series
> for review as well, but has given no opinion yet.

I haven't had any problems thus far. Been using it for the past few
days at $DAYJOB.

Haven't said anything yet because I haven't really had anything to
add. I like it.

Thanks,
Jake

>
> You seemed to have used it for js/blame-lib?
>
> --
> Those patches had a wide reviewer audience cc'd,
> so I would think people are aware of this series.
>
> --
> Things to come, but not in this series as they are more advanced:
>
> Discuss if a block/line needs a minimum requirement.
>
> When doing reviews with this series, a couple of lines such
> as "\t\t}" were marked as a moved, which is not wrong as they
> really occurred in the text with opposing sign.
> But it was annoying as it drew my attention to just closing
> braces, which IMO is not the point of code review.
>
> To solve this issue I had the idea of a "minimum requirement", e.g.
> * at least 3 consecutive lines or
> * at least one line with at least 3 non-ws characters or
> * compute the entropy of a given moved block and if it is too low, do
>   not mark it up.
>
> I am not sure if such a "minimum requirement" is the right approach
> at all. The nature of this discussion comes close to the diff heuristics
> at which Michael did present a wonderful solution, hence I had him cc'd
> on the series as he may have some good insights on how to improve
> the diffs. :)
>
> --
> In conclusion:
>
> We are happy to move to next as it seems technically sound.
>
> But we want more exposure on usage to point out UX bugs.
> (e.g. is the default mode for just giving --color-moved good for the
> majority of people/use cases? Are there subtle annoyances such
> as the closing braces?)
>
> So maybe merge to next with the strong option to evict it
> when finding more fundamentally wrong things?
>
> Thanks,
> Stefan
>
> [1]
> https://public-inbox.org/git/cagz79kzjf9idsvgyi-hskb6n8w0uhvcu4w-r89f0erjpxe_...@mail.gmail.com/


Re: What's cooking in git.git (Jun 2017, #03; Mon, 5)

2017-06-05 Thread Junio C Hamano
Stefan Beller  writes:

>>  "git diff" has been taught to optionally paint new lines that are
>>  the same as deleted lines elsewhere differently from genuinely new
>>  lines.
>>
>>  Are we happy with these changes?
>
> I advertised this series e.g. for reviewing Brandons
> repo object refactoring series and used it myself to inspect
> some patches there[1]. I am certainly happy (but biased) with
> what we have available there.
>
> Jacob intended to use this series
> for review as well, but has given no opinion yet.
>
> You seemed to have used it for js/blame-lib?
>
> --
> Those patches had a wide reviewer audience cc'd,
> so I would think people are aware of this series.

I tried to, yes.  I haven't had a chance to see how well the current
iteration fares "does the externally-visible goal make sense?" test.

I do not think I saw a negative "an approach to show this kind of
output would not be useful" reaction, so I assume at least people
would want an alternative output format that would help reviewing a
change that moves blocks of lines around.

In any case, that is not a review.  A patch series wanting to do a
good thing, and people agreeing that the externally visible effect
it produces matches that good thing, is one thing.  A review that
makes sure the code achieves the externally visible effect well
(e.g. without overly inefficient algorithm, without buffer overflows
or underflows, off-by-ones, etc.) is another thing, and I haven't
seen anybody going with fine toothed comb to do that kind of review,
hence my "are we happy?" inquiry.



Re: What's cooking in git.git (Jun 2017, #03; Mon, 5)

2017-06-05 Thread Stefan Beller
> * sb/diff-color-move (2017-06-01) 17 commits
>  - diff.c: color moved lines differently
>  - diff: buffer all output if asked to
>  - diff.c: emit_line includes whitespace highlighting
>  - diff.c: convert diff_summary to use emit_line_*
>  - diff.c: convert diff_flush to use emit_line_*
>  - diff.c: convert word diffing to use emit_line_*
>  - diff.c: convert show_stats to use emit_line_*
>  - diff.c: convert emit_binary_diff_body to use emit_line_*
>  - submodule.c: convert show_submodule_summary to use emit_line_fmt
>  - diff.c: convert emit_rewrite_lines to use emit_line_*
>  - diff.c: convert emit_rewrite_diff to use emit_line_*
>  - diff.c: convert builtin_diff to use emit_line_*
>  - diff.c: convert fn_out_consume to use emit_line
>  - diff: introduce more flexible emit function
>  - diff.c: factor out diff_flush_patch_all_file_pairs
>  - diff: move line ending check into emit_hunk_header
>  - diff: readability fix
>
>  "git diff" has been taught to optionally paint new lines that are
>  the same as deleted lines elsewhere differently from genuinely new
>  lines.
>
>  Are we happy with these changes?

I advertised this series e.g. for reviewing Brandons
repo object refactoring series and used it myself to inspect
some patches there[1]. I am certainly happy (but biased) with
what we have available there.

Jacob intended to use this series
for review as well, but has given no opinion yet.

You seemed to have used it for js/blame-lib?

--
Those patches had a wide reviewer audience cc'd,
so I would think people are aware of this series.

--
Things to come, but not in this series as they are more advanced:

Discuss if a block/line needs a minimum requirement.

When doing reviews with this series, a couple of lines such
as "\t\t}" were marked as a moved, which is not wrong as they
really occurred in the text with opposing sign.
But it was annoying as it drew my attention to just closing
braces, which IMO is not the point of code review.

To solve this issue I had the idea of a "minimum requirement", e.g.
* at least 3 consecutive lines or
* at least one line with at least 3 non-ws characters or
* compute the entropy of a given moved block and if it is too low, do
  not mark it up.

I am not sure if such a "minimum requirement" is the right approach
at all. The nature of this discussion comes close to the diff heuristics
at which Michael did present a wonderful solution, hence I had him cc'd
on the series as he may have some good insights on how to improve
the diffs. :)

--
In conclusion:

We are happy to move to next as it seems technically sound.

But we want more exposure on usage to point out UX bugs.
(e.g. is the default mode for just giving --color-moved good for the
majority of people/use cases? Are there subtle annoyances such
as the closing braces?)

So maybe merge to next with the strong option to evict it
when finding more fundamentally wrong things?

Thanks,
Stefan

[1]
https://public-inbox.org/git/cagz79kzjf9idsvgyi-hskb6n8w0uhvcu4w-r89f0erjpxe_...@mail.gmail.com/


What's cooking in git.git (Jun 2017, #03; Mon, 5)

2017-06-04 Thread Junio C Hamano
Here are the topics that have been cooking.  Commits prefixed with
'-' are only in 'pu' (proposed updates) while commits prefixed with
'+' are in 'next'.  The ones marked with '.' do not appear in any of
the integration branches, but I am still holding onto them.

With many fixes accumulated since v2.13.0, the first maintenance
release v2.13.1 has been tagged.  Thanks for all the help.  I'll be
mostly offline in the later part of this week, so please expect slow
response.

You can find the changes described here in the integration branches
of the repositories listed at

http://git-blame.blogspot.com/p/git-public-repositories.html

--
[Graduated to "master"]

* ab/c-translators-comment-style (2017-05-31) 1 commit
  (merged to 'next' on 2017-06-01 at df7fb9199a)
 + C style: use standard style for "TRANSLATORS" comments

 Update the C style recommendation for notes for translators, as
 recent versions of gettext tools can work with our style of
 multi-line comments.


* ab/sha1dc-maint (2017-05-22) 1 commit
  (merged to 'next' on 2017-05-30 at 9eb40bf912)
 + sha1dc: update from upstream
 (this branch is used by ab/sha1dc.)

 The "collision detecting" SHA-1 implementation shipped with 2.13
 was quite broken on some big-endian platforms and/or platforms that
 do not like unaligned fetches.  Update to the upstream code which
 has already fixed these issues.


* ab/t3070-test-dedup (2017-05-29) 1 commit
  (merged to 'next' on 2017-05-30 at 71eadec33b)
 + wildmatch test: remove redundant duplicate test

 Test cleanup.


* ad/pull-remote-doc (2017-06-02) 1 commit
  (merged to 'next' on 2017-06-02 at 32915e88db)
 + docs: fix formatting and grammar

 Docfix.


* ah/doc-filter-branch-export-env (2017-05-29) 1 commit
  (merged to 'next' on 2017-05-30 at a62168bf77)
 + doc: filter-branch does not require re-export of vars

 Docfix.


* ah/doc-rev-parse-short-default (2017-06-01) 1 commit
  (merged to 'next' on 2017-06-01 at f64fcc2a26)
 + doc: rewrite description for rev-parse --short

 Doc update.


* jh/close-index-before-stat (2017-04-28) 1 commit
  (merged to 'next' on 2017-05-16 at 0c0372eb02)
 + read-cache: close index.lock in do_write_index

 Originally merged to 'next' on 2017-04-30

 The timestamp of the index file is now taken after the file is
 closed, to help Windows, on which a stale timestamp is reported by
 fstat() on a file that is opened for writing and data was written
 but not yet closed.


* jk/connect-symref-info-leak-fix (2017-05-26) 1 commit
  (merged to 'next' on 2017-05-30 at d8b75d2dd9)
 + connect.c: fix leak in parse_one_symref_info()

 Leakfix.


* jk/drop-free-refspecs (2017-06-01) 1 commit
  (merged to 'next' on 2017-06-01 at 8f455319fc)
 + remote: drop free_refspecs() function

 Code clean-up.


* jk/unbreak-am-h (2017-05-30) 1 commit
  (merged to 'next' on 2017-06-01 at ee2233d409)
 + am: handle "-h" argument earlier
 (this branch is used by jk/consistent-h.)

 "git am -h" triggered a BUG().


* jk/url-insteadof-config (2017-06-01) 1 commit
  (merged to 'next' on 2017-06-01 at e7ea6032c2)
 + docs/config: mention protocol implications of url.insteadOf

 The interaction of "url.*.insteadOf" and custom URL scheme's
 whitelisting is now documented better.


* js/blame-lib (2017-05-25) 29 commits
  (merged to 'next' on 2017-05-30 at b4678b36a7)
 + blame: move entry prepend to libgit
 + blame: move scoreboard setup to libgit
 + blame: move scoreboard-related methods to libgit
 + blame: move fake-commit-related methods to libgit
 + blame: move origin-related methods to libgit
 + blame: move core structures to header
 + blame: create entry prepend function
 + blame: create scoreboard setup function
 + blame: create scoreboard init function
 + blame: rework methods that determine 'final' commit
 + blame: wrap blame_sort and compare_blame_final
 + blame: move progress updates to a scoreboard callback
 + blame: make sanity_check use a callback in scoreboard
 + blame: move no_whole_file_rename flag to scoreboard
 + blame: move xdl_opts flags to scoreboard
 + blame: move show_root flag to scoreboard
 + blame: move reverse flag to scoreboard
 + blame: move contents_from to scoreboard
 + blame: move copy/move thresholds to scoreboard
 + blame: move stat counters to scoreboard
 + blame: rename nth_line function
 + blame: rename ent_score function
 + blame: rename coalesce function
 + blame: rename origin-related functions
 + blame: rename scoreboard structure to blame_scoreboard
 + blame: rename origin structure to blame_origin
 + blame: remove unused parameters
 + blame: move textconv_object with related functions
 + blame: remove unneeded dependency on blob.h

 The internal logic used in "git blame" has been libified to make it
 easier to use by cgit.


* mb/diff-default-to-indent-heuristics (2017-05-09) 4 commits
  (merged to 'next' on 2017-05-29 at 7645575e21)
 + add--interactive: drop diff.indentHeuristic handling
 + diff: enable indent heuristic