Re: [PATCH 2/2] diffcore-pickaxe doc: document -S and -G properly

2013-06-03 Thread Junio C Hamano
Ramkumar Ramachandra artag...@gmail.com writes:

 The documentation of -S and -G is very sketchy.  Completely rewrite the
 sections in Documentation/diff-options.txt and
 Documentation/gitdiffcore.txt.

Will queue; thanks.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] diffcore-pickaxe doc: document -S and -G properly

2013-06-02 Thread Junio C Hamano
Ramkumar Ramachandra artag...@gmail.com writes:

 Without
 --pickaxe-all, only the filepairs matching the given
 criterion are left in the output; all filepairs are left in
 the output when --pickaxe-all is used and if at least one
 filepair matches the given criterion.

 Why do a poor-man's version of --pickaxe-all here, when the last
 paragraph already does justice to this?

The point of the first paragraph is to serve to help both:

 (1) people who read about this rather technical part of the
 diffcore pipeline machinery, to prepare them by listing what
 they will learn about in the section; and

 (2) those who already have read and want to skim over, by giving a
 concise summary.

It may have been poor because it was merely a "something like this"
patch, though.


 While what you're saying is technically true, I think it is important
 to explain the interaction between diffcore-pickaxe and
 diffcore-rename as I have done.  Someone who wants to understand what
 `git log -S` does will come to this page and read this section:
 without reading diffcore-rename, she will have an incomplete picture;
 what's the harm in explaining diffcore-rename in the context of
 diffcore-pickaxe?

That is a red herring; that is exactly why we say upfront that the
diffcore machinery is a pipeline and we describe upstream processing
like rename before we talk about pickaxe in the same document.

This document is the most accurate _technical_ documentation of how
the pipeline works (and it is not in section 1 of the manual set).
If you want to improve end-user documentation by adding explanation
for interactions between pipeline stages and also pathspec, I am all
for it, but I think that belongs to the larger "git help diff", not
"git help diffcore".


Re: [PATCH 2/2] diffcore-pickaxe doc: document -S and -G properly

2013-06-02 Thread Ramkumar Ramachandra
Junio C Hamano wrote:
 Why do a poor-man's version of --pickaxe-all here, when the last
 paragraph already does justice to this?

 The point of the first paragraph is to serve to help both:

My question pertains to whether or not the explanation of
--pickaxe-all can wait till the last paragraph.  You want to
explain all the options in the first paragraph, but not
--pickaxe-regex?

 While what you're saying is technically true, I think it is important
 to explain the interaction between diffcore-pickaxe and
 diffcore-rename as I have done.  Someone who wants to understand what
 `git log -S` does will come to this page and read this section:
 without reading diffcore-rename, she will have an incomplete picture;
 what's the harm in explaining diffcore-rename in the context of
 diffcore-pickaxe?

 This document is the most accurate _technical_ documentation of how
 the pipeline works (and it is not in section 1 of the manual set).
 If you want to improve end-user documentation by adding explanation
 for interactions between pipeline stages and also pathspec, I am all
 for it, but I think that belongs to the larger "git help diff", not
 "git help diffcore".

It's impossible to explain in diff-options, because I do not even
have access to the word "filepair", and no diffcore-rename
machinery to point to.  I could attempt a vague approximation, but I
do not think it's worth it.  I'll remove it if you insist.


Re: [PATCH 2/2] diffcore-pickaxe doc: document -S and -G properly

2013-05-31 Thread Ramkumar Ramachandra
Junio C Hamano wrote:
 [...]

I agree with everything else, and made changes accordingly.

 This transformation limits the set of filepairs to those
 that change specified strings between the preimage and the
 postimage in a certain way.

Definitely good.

 -S<block of text> and -G<regex> options are used to specify
 different ways these strings are sought.

Definitely better than the "two kinds of pickaxe" thing.

 Without
 --pickaxe-all, only the filepairs matching the given
 criterion are left in the output; all filepairs are left in
 the output when --pickaxe-all is used and if at least one
 filepair matches the given criterion.

Why do a poor-man's version of --pickaxe-all here, when the last
paragraph already does justice to this?

When `-S` or `-G` are used without `--pickaxe-all`, only filepairs
that match their respective criterion are kept in the output.  When
`--pickaxe-all` is used, if even one filepair matches their respective
criterion in a changeset, the entire changeset is kept.  This behavior
is designed to make reviewing changes in the context of the whole
changeset easier.
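
The rule described in this paragraph can be modelled in a few lines of Python. This is a sketch for illustration, not Git's implementation: the `Filepair` tuple, `pickaxe_filter`, and the `touches_frotz` criterion are all hypothetical names, and a changeset is modelled as a plain list of filepairs.

```python
from typing import Callable, List, Tuple

# A filepair is modelled here as (path, preimage, postimage); this tuple
# layout is an assumption for illustration, not Git's internal struct.
Filepair = Tuple[str, str, str]


def pickaxe_filter(
    filepairs: List[Filepair],
    matches: Callable[[Filepair], bool],
    pickaxe_all: bool = False,
) -> List[Filepair]:
    """Keep matching filepairs, or the whole changeset with pickaxe_all."""
    hits = [fp for fp in filepairs if matches(fp)]
    if pickaxe_all:
        # One hit is enough to keep every filepair; no hit empties the output.
        return list(filepairs) if hits else []
    return hits


def touches_frotz(fp: Filepair) -> bool:
    # Hypothetical -S style criterion: the count of "frotz" differs.
    _, pre, post = fp
    return pre.count("frotz") != post.count("frotz")


changeset = [
    ("a.c", "int frotz;\n", "int nitfol;\n"),
    ("b.c", "int x;\n", "int y;\n"),
]
```

With these definitions, `pickaxe_filter(changeset, touches_frotz)` keeps only `a.c`, while passing `pickaxe_all=True` keeps both filepairs because one of them matched.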

 I am not sure why it is necessary to say anything about what the
 previous step (diffcore-rename) might have done.  The input of this
 (or any other) step in the diffcore pipeline is a set of
 preimage-postimage filepairs, and to this transformation the filename
 does not matter.  Whether a file was moved (either wholesale, implying
 nothing changed, or renamed with modification at the same time) without
 touching the block of text, or a file did not get involved in any
 renaming, the only thing that matters is what the preimage and the
 postimage in a filepair have (or do not have).

While what you're saying is technically true, I think it is important
to explain the interaction between diffcore-pickaxe and
diffcore-rename as I have done.  Someone who wants to understand what
`git log -S` does will come to this page and read this section:
without reading diffcore-rename, she will have an incomplete picture;
what's the harm in explaining diffcore-rename in the context of
diffcore-pickaxe?

I did do a s/rename detection/diffcore-rename/ though, so the user
knows where to look for more on this rename thing.


[PATCH 2/2] diffcore-pickaxe doc: document -S and -G properly

2013-05-31 Thread Ramkumar Ramachandra
The documentation of -S and -G is very sketchy.  Completely rewrite the
sections in Documentation/diff-options.txt and
Documentation/gitdiffcore.txt.

References:
52e9578 ([PATCH] Introducing software archaeologist's tool pickaxe.)
f506b8e (git log/diff: add -Gregexp that greps in the patch text)

Inputs-from: Phil Hord phil.h...@gmail.com
Co-authored-by: Junio C Hamano gits...@pobox.com
Signed-off-by: Ramkumar Ramachandra artag...@gmail.com
---
 Documentation/diff-options.txt | 38 +++
 Documentation/gitdiffcore.txt  | 45 +-
 2 files changed, 57 insertions(+), 26 deletions(-)

diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index b8a9b86..a85288f 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -383,14 +383,36 @@ ifndef::git-format-patch[]
that matches other criteria, nothing is selected.
 
 -S<string>::
-   Look for differences that introduce or remove an instance of
-   <string>. Note that this is different than the string simply
-   appearing in diff output; see the 'pickaxe' entry in
-   linkgit:gitdiffcore[7] for more details.
+   Look for differences that change the number of occurrences of
+   the specified string (i.e. addition/deletion) in a file.
+   Intended for the scripter's use.
++
+It is useful when you're looking for an exact block of code (like a
+struct), and want to know the history of that block since it first
+came into being: use the feature iteratively to feed the interesting
+block in the preimage back into `-S`, and keep going until you get the
+very first version of the block.
 
 -G<regex>::
-   Look for differences whose added or removed line matches
-   the given <regex>.
+   Look for differences whose patch text contains added/removed
+   lines that match <regex>.
++
+To illustrate the difference between `-S<regex> --pickaxe-regex` and
+`-G<regex>`, consider a commit with the following diff in the same
+file:
++
+----
++return !regexec(regexp, two->ptr, 1, regmatch, 0);
+...
+-hit = !regexec(regexp, mf2.ptr, 1, regmatch, 0);
+----
++
+While `git log -G"regexec\(regexp"` will show this commit, `git log
+-S"regexec\(regexp" --pickaxe-regex` will not (because the number of
+occurrences of that string did not change).
++
+See the 'pickaxe' entry in linkgit:gitdiffcore[7] for more
+information.
 
 --pickaxe-all::
When `-S` or `-G` finds a change, show all the changes in that
@@ -398,8 +420,8 @@ ifndef::git-format-patch[]
in <string>.
 
 --pickaxe-regex::
-   Make the <string> not a plain string but an extended POSIX
-   regex to match.
+   Treat the <string> given to `-S` as an extended POSIX regular
+   expression to match.
 endif::git-format-patch[]
 
 -Oorderfile::
diff --git a/Documentation/gitdiffcore.txt b/Documentation/gitdiffcore.txt
index 568d757..c8b3e51 100644
--- a/Documentation/gitdiffcore.txt
+++ b/Documentation/gitdiffcore.txt
@@ -222,26 +222,35 @@ version prefixed with '+'.
 diffcore-pickaxe: For Detecting Addition/Deletion of Specified String
 ---------------------------------------------------------------------
 
-This transformation is used to find filepairs that represent
-changes that touch a specified string, and is controlled by the
--S option and the `--pickaxe-all` option to the 'git diff-*'
-commands.
-
-When diffcore-pickaxe is in use, it checks if there are
-filepairs whose result side and whose origin side have
-different number of specified string.  Such a filepair represents
-the string appeared in this changeset.  It also checks for the
-opposite case that loses the specified string.
-
-When `--pickaxe-all` is not in effect, diffcore-pickaxe leaves
-only such filepairs that touch the specified string in its
-output.  When `--pickaxe-all` is used, diffcore-pickaxe leaves all
-filepairs intact if there is such a filepair, or makes the
-output empty otherwise.  The latter behaviour is designed to
-make reviewing of the changes in the context of the whole
+This transformation limits the set of filepairs to those that change
+specified strings between the preimage and the postimage in a certain
+way.  -S<block of text> and -G<regular expression> options are used to
+specify different ways these strings are sought.
+
+-S<block of text> detects filepairs whose preimage and postimage
+have different number of occurrences of the specified block of text.
+By definition, it will not detect in-file moves.  Also, when a
+changeset moves a file wholesale without affecting the interesting
+string, diffcore-rename kicks in as usual, and `-S` omits the filepair
+(since the number of occurrences of that string didn't change in that
+rename-detected filepair).  When used with `--pickaxe-regex`, treat
+the block of text as an extended POSIX regular expression to match,
+instead of a literal string.
+
+-G<regular expression> (mnemonic: grep) detects filepairs whose
+textual diff 

Re: [PATCH 2/2] diffcore-pickaxe doc: document -S and -G properly

2013-05-24 Thread Ramkumar Ramachandra
Junio C Hamano wrote:
 [...]

I agree with the other comments, and have made suitable changes.
Let's review your block now.

 This transformation is used to find filepairs that represent
 two kinds of changes, and is controlled by the -S, -G and
 --pickaxe-all options.

Why do you call this a transformation?  Is "git log --author=foo" a
transformation on the git-log output?  Then how is "git log -Sfoo" a
transformation?

Two kinds of changes controlled by three different options?  Isn't the
original much clearer?

The title says diffcore-pickaxe, and the first paragraph says:

There are two kinds of pickaxe: the S kind (corresponding to 'git log
-S') and the G kind (mnemonic: grep; corresponding to 'git log -G').

 The -S<block of text> option tells Git to consider that a
 filepair has differences only if the number of occurrences
 of the specified block of text is different between its
 preimage and its postimage, and treat other filepairs as if
 they did not have any change.

I'll rewrite this without the trailing "and treat other filepairs as
if they did not have any change" (which I'm not fond of).

 This is meant to be used with
 a block of text that is unique enough to occur only once (so
 the expected number of occurrences is 1 vs 0 or 0 vs 1) to
 use with git log to find a commit that touched the block
 of text the last time.

You're saying how you think it's meant to be used, but in doing so
you've failed to describe its operation faithfully.  I've already
described how it's meant to be used in diff-options (digging a block
of text iteratively) and this is the place to explain what it is doing
faithfully.  Hence my previous writeup on changes in number of
occurrences and rename detection: I _had_ to read the code to
understand it properly, and your writeup is not helping by telling me
about idiomatic usage.
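
For readers who have not read the code, the counting behaviour referred to here can be sketched in Python. This is a model built on assumptions, not the C implementation: `count_occurrences` and `s_kind_hit` are hypothetical names, and Python's `re` module stands in for POSIX extended regular expressions when `pickaxe_regex` is set.

```python
import re


def count_occurrences(text: str, needle: str, pickaxe_regex: bool = False) -> int:
    """Count non-overlapping occurrences of needle in text."""
    if pickaxe_regex:
        # Python's re is only a stand-in for POSIX ERE here.
        return sum(1 for _ in re.finditer(needle, text))
    return text.count(needle)


def s_kind_hit(pre: str, post: str, needle: str, pickaxe_regex: bool = False) -> bool:
    """True when preimage and postimage hold a different number of needles."""
    return (count_occurrences(pre, needle, pickaxe_regex)
            != count_occurrences(post, needle, pickaxe_regex))


# An in-file move: the block changes position, the count stays at 1.
moved_pre = "frotz();\nnitfol();\n"
moved_post = "nitfol();\nfrotz();\n"

# A genuine addition: the count goes from 0 to 1.
added_pre = "nitfol();\n"
added_post = "nitfol();\nfrotz();\n"
```

Here `s_kind_hit(moved_pre, moved_post, "frotz")` is false, because an in-file move is invisible to a count, while `s_kind_hit(added_pre, added_post, "frotz")` is true. The same reasoning covers a wholesale file move once rename detection has paired the old and new paths.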

Also, you've dropped computational expense which was there in the original.

 When used with the --pickaxe-regex
 option, the block of text is used as a POSIX extended
 regular expression to match, instead of a literal string.

Better.

 The -G<regular expression> option tells Git to consider
 that a filepair has differences only if a textual diff
 between its preimage and postimage would indicate a line
 that matches the given regular expression is changed, and
 treat other filepairs as if they did not have any change.

"would indicate"?  Really?  I'll rewrite this without the trailing
"and treat other filepairs ..."

You've once again dropped what it means in the context of in-file
moves (rename detection), and computational expense from the original.
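
The contrast with the S kind is easier to see with a toy model. The following Python sketch is illustrative only: `changed_lines` and `g_kind_hit` are hypothetical helpers, `difflib` stands in for Git's own diff machinery, and Python's `re` stands in for POSIX regular expressions. The sample text is adapted from the `regexec` example in the patch.

```python
import difflib
import re
from typing import List


def changed_lines(pre: str, post: str) -> List[str]:
    """Return lines a textual diff would mark as removed or added."""
    a, b = pre.splitlines(), post.splitlines()
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    lines: List[str] = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            lines.extend(a[i1:i2])  # removed from the preimage
            lines.extend(b[j1:j2])  # added in the postimage
    return lines


def g_kind_hit(pre: str, post: str, pattern: str) -> bool:
    """True when any added or removed line matches pattern."""
    regex = re.compile(pattern)
    return any(regex.search(line) for line in changed_lines(pre, post))


pre = (
    "int x;\n"
    "middle();\n"
    "hit = !regexec(regexp, mf2.ptr, 1, regmatch, 0);\n"
)
post = (
    "return !regexec(regexp, two->ptr, 1, regmatch, 0);\n"
    "int x;\n"
    "middle();\n"
)

pattern = r"regexec\(regexp"
g_hit = g_kind_hit(pre, post, pattern)
# The S kind with --pickaxe-regex compares match counts instead.
s_hit = len(re.findall(pattern, pre)) != len(re.findall(pattern, post))
```

In this model `g_hit` is true because both changed lines match the pattern, while `s_hit` is false because each side contains exactly one match.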

 When -S or -G option is used without --pickaxe-all option,
 only filepairs that match their respective criterion are
 kept in the output.

Much better.

 When `--pickaxe-all` is used, all
 filepairs intact if there is such a filepair, or makes the
 output empty otherwise.

-ENOPARSE.  I didn't particularly like the original, and this isn't
better.  I'll rewrite it.

 This behaviour is designed to make
 reviewing of the changes in the context of the whole
 changeset easier.

Same as original.  Okay.

Thanks.


[PATCH 2/2] diffcore-pickaxe doc: document -S and -G properly

2013-05-24 Thread Ramkumar Ramachandra
The documentation of -S and -G is very sketchy.  Completely rewrite the
sections in Documentation/diff-options.txt and
Documentation/gitdiffcore.txt.

References:
52e9578 ([PATCH] Introducing software archaeologist's tool pickaxe.)
f506b8e (git log/diff: add -Gregexp that greps in the patch text)

Inputs-from: Phil Hord phil.h...@gmail.com
Co-authored-by: Junio C Hamano gits...@pobox.com
Signed-off-by: Ramkumar Ramachandra artag...@gmail.com
---
 Documentation/diff-options.txt | 38 +
 Documentation/gitdiffcore.txt  | 43 --
 2 files changed, 55 insertions(+), 26 deletions(-)

diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index 104579d..2835eef 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -383,14 +383,36 @@ ifndef::git-format-patch[]
that matches other criteria, nothing is selected.
 
 -S<string>::
-   Look for differences that introduce or remove an instance of
-   <string>. Note that this is different than the string simply
-   appearing in diff output; see the 'pickaxe' entry in
-   linkgit:gitdiffcore[7] for more details.
+   Look for differences that change the number of occurrences of
+   the specified string (i.e. addition/deletion) in a file.
+   Intended for the scripter's use.
++
+It is especially useful when you're looking for an exact block of code
+(like a struct), and want to know the history of that block since it
+first came into being: use the feature iteratively to feed the
+interesting block in the preimage back into `-S`, and keep going until
+you get the very first version of the block.
 
 -G<regex>::
-   Look for differences whose added or removed line matches
-   the given <regex>.
+   Look for differences whose patch text contains added/removed
+   lines that match <regex>.
++
+To illustrate the difference between `-S<regex> --pickaxe-regex` and
+`-G<regex>`, consider a commit with the following diff in the same
+file:
++
+----
++return !regexec(regexp, two->ptr, 1, regmatch, 0);
+...
+-hit = !regexec(regexp, mf2.ptr, 1, regmatch, 0);
+----
++
+While `git log -G"regexec\(regexp"` will show this commit, `git log
+-S"regexec\(regexp" --pickaxe-regex` will not (because the number of
+occurrences of that string did not change).
++
+See the 'pickaxe' entry in linkgit:gitdiffcore[7] for more
+information.
 
 --pickaxe-all::
When `-S` or `-G` finds a change, show all the changes in that
@@ -398,8 +420,8 @@ ifndef::git-format-patch[]
in <string>.
 
 --pickaxe-regex::
-   Make the <string> not a plain string but an extended POSIX
-   regex to match.
+   Treat the <string> given to `-S` as an extended POSIX regular
+   expression to match.
 endif::git-format-patch[]
 
 -Oorderfile::
diff --git a/Documentation/gitdiffcore.txt b/Documentation/gitdiffcore.txt
index 568d757..ef4c04a 100644
--- a/Documentation/gitdiffcore.txt
+++ b/Documentation/gitdiffcore.txt
@@ -222,26 +222,33 @@ version prefixed with '+'.
 diffcore-pickaxe: For Detecting Addition/Deletion of Specified String
 ---------------------------------------------------------------------
 
-This transformation is used to find filepairs that represent
-changes that touch a specified string, and is controlled by the
--S option and the `--pickaxe-all` option to the 'git diff-*'
-commands.
-
-When diffcore-pickaxe is in use, it checks if there are
-filepairs whose result side and whose origin side have
-different number of specified string.  Such a filepair represents
-the string appeared in this changeset.  It also checks for the
-opposite case that loses the specified string.
-
-When `--pickaxe-all` is not in effect, diffcore-pickaxe leaves
-only such filepairs that touch the specified string in its
-output.  When `--pickaxe-all` is used, diffcore-pickaxe leaves all
-filepairs intact if there is such a filepair, or makes the
-output empty otherwise.  The latter behaviour is designed to
-make reviewing of the changes in the context of the whole
+There are two kinds of pickaxe: the S kind (corresponding to 'git log
+-S') and the G kind (mnemonic: grep; corresponding to 'git log -G').
+
+-S<block of text> detects filepairs whose preimage and postimage
+have different number of occurrences of the specified block of text.
+By definition, it will not detect in-file moves.  Also, when a
+changeset moves a file wholesale without affecting the interesting
+string, rename detection kicks in as usual, and `-S` omits the
+filepair (since the number of occurrences of that string didn't change
+in that rename-detected filepair).  The implementation essentially
+runs a count, and is significantly cheaper than the G kind.  When used
+with `--pickaxe-regex`, treat the block of text as an extended POSIX
+regular expression to match, instead of a literal string.
+
+-G<regular expression> detects filepairs whose textual diff has an
+added or a deleted 

Re: [PATCH 2/2] diffcore-pickaxe doc: document -S and -G properly

2013-05-24 Thread Phil Hord
On Fri, May 24, 2013 at 5:37 AM, Ramkumar Ramachandra
artag...@gmail.com wrote:
 Junio C Hamano wrote:
 [...]

 I agree with the other comments, and have made suitable changes.
 Let's review your block now.

 This transformation is used to find filepairs that represent
 two kinds of changes, and is controlled by the -S, -G and
 --pickaxe-all options.

 Why do you call this a transformation?  Is "git log --author=foo" a
 transformation on the git-log output?  Then how is "git log -Sfoo" a
 transformation?

 Two kinds of changes controlled by three different options?  Isn't the
 original much clearer?

They are all three filters.  They transform the output by limiting it
to commits which meet specific conditions.  Transformation is used in
the network-graphs sense of the word.  It fits the beginning of the
document where it says this:

  The diffcore mechanism is fed a list of such comparison results
  (each of which is called filepair, although at this point each
  of them talks about a single file), and transforms such a list
  into another list.  There are currently 5 such transformations:

  - diffcore-break
  - diffcore-rename
  - diffcore-merge-broken
  - diffcore-pickaxe
  - diffcore-order
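
The list-in, list-out shape quoted above can be sketched as function composition. This Python model is only an illustration: the `Filepair` tuple and the two toy stages are hypothetical stand-ins and do not reproduce what the real diffcore stages do.

```python
from functools import reduce
from typing import Callable, List, Tuple

Filepair = Tuple[str, str, str]  # (path, preimage, postimage), an assumed layout
Transformation = Callable[[List[Filepair]], List[Filepair]]


def run_pipeline(filepairs: List[Filepair],
                 stages: List[Transformation]) -> List[Filepair]:
    """Feed the output list of each stage to the next one."""
    return reduce(lambda pairs, stage: stage(pairs), stages, filepairs)


def toy_pickaxe(needle: str) -> Transformation:
    """Build a stage that keeps filepairs whose needle count changed."""
    def stage(pairs: List[Filepair]) -> List[Filepair]:
        return [fp for fp in pairs if fp[1].count(needle) != fp[2].count(needle)]
    return stage


def toy_order(pairs: List[Filepair]) -> List[Filepair]:
    """Toy ordering stage: sort by path."""
    return sorted(pairs, key=lambda fp: fp[0])


pairs = [
    ("z.c", "", "frotz\n"),
    ("m.c", "same\n", "same\n"),
    ("a.c", "frotz\n", ""),
]
result = run_pipeline(pairs, [toy_pickaxe("frotz"), toy_order])
```

Each stage sees only the list produced by the stage before it, so a filtering stage such as the toy pickaxe needs no knowledge of how its input filepairs were formed.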


Re: [PATCH 2/2] diffcore-pickaxe doc: document -S and -G properly

2013-05-24 Thread Ramkumar Ramachandra
Phil Hord wrote:
 It fits the beginning of the
 document where it says this:

Ah, I missed that.  Either way, I'm quite happy with v3: we can change
the first paragraph to use the word "transformation" if we really
want.


Re: [PATCH 2/2] diffcore-pickaxe doc: document -S and -G properly

2013-05-24 Thread Junio C Hamano
Phil Hord phil.h...@gmail.com writes:

 On Fri, May 24, 2013 at 5:37 AM, Ramkumar Ramachandra
 artag...@gmail.com wrote:
 Junio C Hamano wrote:
 [...]

 I agree with the other comments, and have made suitable changes.
 Let's review your block now.

 This transformation is used to find filepairs that represent
 two kinds of changes, and is controlled by the -S, -G and
 --pickaxe-all options.

 Why do you call this a transformation?  Is "git log --author=foo" a
 transformation on the git-log output?  Then how is "git log -Sfoo" a
 transformation?

 Two kinds of changes controlled by three different options?  Isn't the
 original much clearer?

 They are all three filters.  They transform the output by limiting it
 to commits which meet specific conditions.  Transformation is used in
 the network-graphs sense of the word.  It fits the beginning of the
 document where it says this:

   The diffcore mechanism is fed a list of such comparison results
   (each of which is called filepair, although at this point each
   of them talks about a single file), and transforms such a list
   into another list.  There are currently 5 such transformations:

   - diffcore-break
   - diffcore-rename
   - diffcore-merge-broken
   - diffcore-pickaxe
   - diffcore-order

Thanks.


Re: [PATCH 2/2] diffcore-pickaxe doc: document -S and -G properly

2013-05-24 Thread Junio C Hamano
Ramkumar Ramachandra artag...@gmail.com writes:

 The documentation of -S and -G is very sketchy.  Completely rewrite the
 sections in Documentation/diff-options.txt and
 Documentation/gitdiffcore.txt.

 References:
 52e9578 ([PATCH] Introducing software archaeologist's tool pickaxe.)
 f506b8e (git log/diff: add -Gregexp that greps in the patch text)

 Inputs-from: Phil Hord phil.h...@gmail.com
 Co-authored-by: Junio C Hamano gits...@pobox.com
 Signed-off-by: Ramkumar Ramachandra artag...@gmail.com
 ---
  Documentation/diff-options.txt | 38 +
  Documentation/gitdiffcore.txt  | 43 
 --
  2 files changed, 55 insertions(+), 26 deletions(-)

 diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
 index 104579d..2835eef 100644
 --- a/Documentation/diff-options.txt
 +++ b/Documentation/diff-options.txt
 @@ -383,14 +383,36 @@ ifndef::git-format-patch[]
   that matches other criteria, nothing is selected.
  
  -S<string>::
 - Look for differences that introduce or remove an instance of
 - <string>. Note that this is different than the string simply
 - appearing in diff output; see the 'pickaxe' entry in
 - linkgit:gitdiffcore[7] for more details.
 + Look for differences that change the number of occurrences of
 + the specified string (i.e. addition/deletion) in a file.
 + Intended for the scripter's use.
 ++
 +It is especially useful when you're looking for an exact block of code
 +(like a struct), and want to know the history of that block since it
 +first came into being: use the feature iteratively to feed the
 +interesting block in the preimage back into `-S`, and keep going until
 +you get the very first version of the block.

OK, even though I would not say "especially" nor "useful" if I were
writing it, as it is the only use case it was designed for.

  -G<regex>::
 + Look for differences whose patch text contains added/removed
 + lines that match <regex>.
 ++
 +To illustrate the difference between `-S<regex> --pickaxe-regex` and
 +`-G<regex>`, consider a commit with the following diff in the same
 +file:
 ++
 +----
 ++return !regexec(regexp, two->ptr, 1, regmatch, 0);
 +...
 +-hit = !regexec(regexp, mf2.ptr, 1, regmatch, 0);
 +----
 ++
 +While `git log -G"regexec\(regexp"` will show this commit, `git log
 +-S"regexec\(regexp" --pickaxe-regex` will not (because the number of
 +occurrences of that string did not change).
 ++
 +See the 'pickaxe' entry in linkgit:gitdiffcore[7] for more
 +information.

OK.

  --pickaxe-regex::
 - Make the <string> not a plain string but an extended POSIX
 - regex to match.
 + Treat the <string> given to `-S` as an extended POSIX regular
 + expression to match.

OK.

 diff --git a/Documentation/gitdiffcore.txt b/Documentation/gitdiffcore.txt
 index 568d757..ef4c04a 100644
 --- a/Documentation/gitdiffcore.txt
 +++ b/Documentation/gitdiffcore.txt
 @@ -222,26 +222,33 @@ version prefixed with '+'.
  diffcore-pickaxe: For Detecting Addition/Deletion of Specified String
  ---------------------------------------------------------------------
  
 -This transformation is used to find filepairs that represent
 -changes that touch a specified string, and is controlled by the
 --S option and the `--pickaxe-all` option to the 'git diff-*'
 -commands.
 -
 -When diffcore-pickaxe is in use, it checks if there are
 -filepairs whose result side and whose origin side have
 -different number of specified string.  Such a filepair represents
 -the string appeared in this changeset.  It also checks for the
 -opposite case that loses the specified string.
 -
 -When `--pickaxe-all` is not in effect, diffcore-pickaxe leaves
 -only such filepairs that touch the specified string in its
 -output.  When `--pickaxe-all` is used, diffcore-pickaxe leaves all
 -filepairs intact if there is such a filepair, or makes the
 -output empty otherwise.  The latter behaviour is designed to
 -make reviewing of the changes in the context of the whole


 +There are two kinds of pickaxe: the S kind (corresponding to 'git log
 +-S') and the G kind (mnemonic: grep; corresponding to 'git log -G').

This is good as the beginning of a second paragraph or the second
sentence of the first paragraph.  This patch loses the description
of the general purpose of this machinery that should come at the
very beginning of the section (the original had a very good one, but
valid only back when we had only -S; my "how about this" text did not
have a good one).

For example, the rename is about taking one set of filepairs and
expressing (some of) them as renames or copies by merging a deletion
filepair and a creation filepair into a rename-modify filepair, or
turning a creation filepair into a copy-modify filepair by finding a
preimage.  What does this transformation do?

Again here is my attempt for that missing first paragraph:

This transformation limits the set of filepairs to those
that 

Re: [PATCH 2/2] diffcore-pickaxe doc: document -S and -G properly

2013-05-19 Thread Junio C Hamano
Junio C Hamano gits...@pobox.com writes:

 Ramkumar Ramachandra artag...@gmail.com writes:
 ...
  -G<regex>::
 -Look for differences whose added or removed line matches
 -the given <regex>.
 +Grep through the patch text of commits for added/removed lines
 +that match <regex>.  `--pickaxe-regex` is implied in this
 +mode.

 The same comment on differences vs commits apply to this.
 ...
 it will _not_ apply to users of -G.

s/.$/ unless they say --pickaxe-regex./; so -G does not imply it at
all.

 "grep through", if the reader knows "grep", with "match regex", it
 is crystal clear that this expects a regular expression.  And that
 is the only thing that makes -G and --pickaxe-regex superficially
 related.

s/^The description begins with /;  Sorry, but I couldn't write
complete sentences on a bus ;-)

 -This transformation is used to find filepairs that represent
 -changes that touch a specified string, and is controlled by the
 --S option and the `--pickaxe-all` option to the 'git diff-*'
 -commands.
 -
 -When diffcore-pickaxe is in use, it checks if there are
 -filepairs whose result side and whose origin side have
 -different number of specified string.  Such a filepair represents
 -the string appeared in this changeset.  It also checks for the
 -opposite case that loses the specified string.
 -
 -When `--pickaxe-all` is not in effect, diffcore-pickaxe leaves
 -only such filepairs that touch the specified string in its
 -output.  When `--pickaxe-all` is used, diffcore-pickaxe leaves all
 -filepairs intact if there is such a filepair, or makes the
 -output empty otherwise.  The latter behaviour is designed to
 -make reviewing of the changes in the context of the whole
 -changeset easier.

 This part is impossible to review on a bus, so I won't comment in
 this message.

 Why did you even have to touch the paragraph for --pickaxe-all?
 That applies to both -S and -G.  I thought it would be just the
 matter of slightly tweaking the introductory paragraph (which was
 written back when there was only -S), keeping the second paragraph
 for -S as-is, and insert an additional paragraph for -G before
 --pickaxe-all.

Now I see that the paragraph for --pickaxe-all needs to be touched;
the original talks about "touch the specified string", which only
applies to -S and needs to be adjusted.

So here is my attempt of clarifying it.

This transformation is used to find filepairs that represent
two kinds of changes, and is controlled by the -S, -G and
--pickaxe-all options.

The -S<block of text> option tells Git to consider that a
filepair has differences only if the number of occurrences
of the specified block of text is different between its
preimage and its postimage, and treat other filepairs as if
they did not have any change.  This is meant to be used with
a block of text that is unique enough to occur only once (so
the expected number of occurrences is 1 vs 0 or 0 vs 1) to
use with git log to find a commit that touched the block
of text the last time.  When used with the --pickaxe-regex
option, the block of text is used as a POSIX extended
regular expression to match, instead of a literal string.

The -G<regular expression> option tells Git to consider
that a filepair has differences only if a textual diff
between its preimage and postimage would indicate a line
that matches the given regular expression is changed, and
treat other filepairs as if they did not have any change.

When the -S or -G option is used without the --pickaxe-all
option, only filepairs that match their respective criterion
are kept in the output.  When `--pickaxe-all` is used, all
filepairs are kept intact if at least one filepair matches,
and the output is made empty otherwise.  This behaviour is
designed to make reviewing of the changes in the context of
the whole changeset easier.
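
A minimal sketch of the filtering described above, for the -S case
with a literal string.  It assumes a filepair can be modelled as a
(path, preimage, postimage) tuple; the names s_kind_differs, pickaxe
and touches_frotz are hypothetical and do not correspond to anything
in Git's source.

```python
from typing import Callable, List, Tuple

# Assumption: a filepair is modelled as (path, preimage, postimage).
Filepair = Tuple[str, str, str]


def s_kind_differs(needle: str, pair: Filepair) -> bool:
    """-S criterion: the number of occurrences of needle changed."""
    _path, pre, post = pair
    return pre.count(needle) != post.count(needle)


def pickaxe(pairs: List[Filepair],
            matches: Callable[[Filepair], bool],
            pickaxe_all: bool = False) -> List[Filepair]:
    """Keep matching filepairs; with pickaxe_all, keep all or nothing."""
    hits = [pair for pair in pairs if matches(pair)]
    if pickaxe_all:
        return list(pairs) if hits else []
    return hits


changeset = [
    ("a.c", "int frotz;\n", "int nitfol;\n"),     # loses "frotz"
    ("b.c", "int xyzzy;\n", "int xyzzy = 0;\n"),  # unrelated change
]


def touches_frotz(pair: Filepair) -> bool:
    return s_kind_differs("frotz", pair)


print([p[0] for p in pickaxe(changeset, touches_frotz)])
print([p[0] for p in pickaxe(changeset, touches_frotz, pickaxe_all=True)])
```

Without pickaxe_all only a.c survives; with it, b.c is kept as well
because one filepair in the changeset matched.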


[PATCH 2/2] diffcore-pickaxe doc: document -S and -G properly

2013-05-17 Thread Ramkumar Ramachandra
The documentation of -S and -G is very sketchy.  Completely rewrite the
sections in Documentation/diff-options.txt and
Documentation/gitdiffcore.txt.

References:
52e9578 ([PATCH] Introducing software archaeologist's tool pickaxe.)
f506b8e (git log/diff: add -Gregexp that greps in the patch text)

Inputs-from: Phil Hord phil.h...@gmail.com
Signed-off-by: Ramkumar Ramachandra artag...@gmail.com
---
 Documentation/diff-options.txt | 37 ++---
 Documentation/gitdiffcore.txt  | 47 +-
 2 files changed, 57 insertions(+), 27 deletions(-)

diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index 104579d..b61a666 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -383,14 +383,35 @@ ifndef::git-format-patch[]
that matches other criteria, nothing is selected.
 
 -S<string>::
-   Look for differences that introduce or remove an instance of
-   <string>. Note that this is different than the string simply
-   appearing in diff output; see the 'pickaxe' entry in
-   linkgit:gitdiffcore[7] for more details.
+   Look for commits that change the number of occurrences of the
+   specified string (i.e. addition/ deletion) in a file.
+   Intended for the scripter's use.
++
+It is especially useful when you're looking for an exact block of code
+(like a struct), and want to know the history of that block since it
+first came into being.
 
 -G<regex>::
-   Look for differences whose added or removed line matches
-   the given <regex>.
+   Grep through the patch text of commits for added/removed lines
+   that match <regex>.  `--pickaxe-regex` is implied in this
+   mode.
++
+To illustrate the difference between `-S<regex> --pickaxe-regex` and
+`-G<regex>`, consider a commit with the following diff in the same
+file:
++
+----
++    return !regexec(regexp, two->ptr, 1, &regmatch, 0);
+...
+-    hit = !regexec(regexp, mf2.ptr, 1, &regmatch, 0);
+----
++
+While `git log -G"regexec\(regexp"` will show this commit, `git log
+-S"regexec\(regexp" --pickaxe-regex` will not (because the number of
+occurrences of that string didn't change).
++
+See the 'pickaxe' entry in linkgit:gitdiffcore[7] for more
+information.
 
 --pickaxe-all::
When `-S` or `-G` finds a change, show all the changes in that
@@ -398,8 +419,8 @@ ifndef::git-format-patch[]
in <string>.
 
 --pickaxe-regex::
-   Make the <string> not a plain string but an extended POSIX
-   regex to match.
+   Treat the <string> not as a plain string, but an extended
+   POSIX regex to match.  It is implied when `-G` is used.
 endif::git-format-patch[]
 
 -O<orderfile>::
diff --git a/Documentation/gitdiffcore.txt b/Documentation/gitdiffcore.txt
index 568d757..d0f2b91 100644
--- a/Documentation/gitdiffcore.txt
+++ b/Documentation/gitdiffcore.txt
@@ -222,25 +222,34 @@ version prefixed with '+'.
 diffcore-pickaxe: For Detecting Addition/Deletion of Specified String
 ---------------------------------------------------------------------
 
-This transformation is used to find filepairs that represent
-changes that touch a specified string, and is controlled by the
--S option and the `--pickaxe-all` option to the 'git diff-*'
-commands.
-
-When diffcore-pickaxe is in use, it checks if there are
-filepairs whose result side and whose origin side have
-different number of specified string.  Such a filepair represents
-the string appeared in this changeset.  It also checks for the
-opposite case that loses the specified string.
-
-When `--pickaxe-all` is not in effect, diffcore-pickaxe leaves
-only such filepairs that touch the specified string in its
-output.  When `--pickaxe-all` is used, diffcore-pickaxe leaves all
-filepairs intact if there is such a filepair, or makes the
-output empty otherwise.  The latter behaviour is designed to
-make reviewing of the changes in the context of the whole
-changeset easier.
-
+There are two kinds of pickaxe: the S kind (corresponding to 'git log
+-S') and the G kind (mnemonic: grep; corresponding to 'git log -G').
+
+The S kind detects filepairs whose result side and origin side
+have different number of occurrences of specified string.  By
+definition, it will not detect in-file moves.  Also, when a commit
+moves a file wholesale without affecting the string being looked at,
+rename detection kicks in as usual, and 'git log -S' omits the commit
+(since the number of occurrences of that string didn't change in that
+rename-detected filepair).  The implementation essentially runs a
+count, and is significantly cheaper than the G kind.
+
+The G kind detects filepairs whose patch text has an added or a
+deleted line that matches the given regexp.  This means that it can
+detect in-file (or what rename-detection considers the same file)
+moves.  The implementation of 'git log -G' runs diff twice and greps,
+and this can be quite expensive.
+
+When `--pickaxe-regex` is used with 

Re: [PATCH 2/2] diffcore-pickaxe doc: document -S and -G properly

2013-05-17 Thread Junio C Hamano
Ramkumar Ramachandra artag...@gmail.com writes:

 diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
 index 104579d..b61a666 100644
 --- a/Documentation/diff-options.txt
 +++ b/Documentation/diff-options.txt
 @@ -383,14 +383,35 @@ ifndef::git-format-patch[]
   that matches other criteria, nothing is selected.
  
  -S<string>::
 - Look for differences that introduce or remove an instance of
 - <string>. Note that this is different than the string simply
 - appearing in diff output; see the 'pickaxe' entry in
 - linkgit:gitdiffcore[7] for more details.
 + Look for commits that change the number of occurrences of the

The first part of the change is misguided.  First of all, this text
also appears in the documentation of "git diff", where -S limits the
output to those filepairs that match its criterion.

"git log A..B -- foo" looks for commits that have changes in paths
that match 'foo'.  Which commits "git log" selects is affected by
the outcome of what "diff" shows.  This "looks for *commits*" is a
characteristic of "log", not specific to the -S option.

The aspect of "diff" behaviour that is affected by -S is what is
considered a "difference".  Usually "diff" says preimage and
postimage are different for any change, but -S changes that.  The
preimage and postimage have to contain a different number of
occurrences of the specified block of text to be considered
different.

And that is why the original says "look for differences".

 + specified string (i.e. addition/ deletion) in a file.
 + Intended for the scripter's use.
 ++
 +It is especially useful when you're looking for an exact block of code
 +(like a struct), and want to know the history of that block since it
 +first came into being.

I am not sure this half-way description is a good idea.  If you want
to use it to discover what happened since it first came into being,
you need to use this iteratively (and the feature is designed for
such a use pattern): find the first commit that changes the block
of text that appears in the latest version, find the corresponding
block of interest in the preimage, then feed that to -S and start
digging from that first-discovered commit.  If the text describes
that iteration fully, I think that is fine, but if you read the
above literally, it looks as if you feed one version to -S and it
would do the necessary adjustment on its own, which is misleading.

  -G<regex>::
 - Look for differences whose added or removed line matches
 - the given <regex>.
 + Grep through the patch text of commits for added/removed lines
 + that match <regex>.  `--pickaxe-regex` is implied in this
 + mode.

The same comment on "differences" vs "commits" applies to this.

If there were any behaviour change --pickaxe-regex introduces to -S
that also applies to -G, other than "the pattern is used as a regular
expression", I would agree it is a good idea to say that the other
behaviour is implied, but as far as I know, there is no such
implication.  Drop --pickaxe-regex, as it is not even implied in the
code:

    } else if ((argcount = short_opt('G', av, &optarg))) {
        options->pickaxe = optarg;
        options->pickaxe_opts |= DIFF_PICKAXE_KIND_G;
        return argcount;
    }

so even if there were some code added in the future that does

    if (options->pickaxe_opts & DIFF_PICKAXE_REGEX) {
        /*
         * do this thing only when the user said
         * --pickaxe-regex from the command line
         */
    }

it will _not_ apply to users of -G.

With "grep through" (if the reader knows "grep") and "match <regex>",
it is crystal clear that this expects a regular expression.  And that
is the only thing that makes -G and --pickaxe-regex superficially
related.
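
The point about the option bits can be sketched as follows.  Only the
flag names come from the code discussed above; the numeric values are
illustrative assumptions rather than copies of what diff.h defines,
and parse_opts is a hypothetical stand-in for the real option parser.

```python
# Illustrative bit values (assumption); the real ones live in diff.h.
DIFF_PICKAXE_REGEX = 1 << 1
DIFF_PICKAXE_KIND_S = 1 << 2
DIFF_PICKAXE_KIND_G = 1 << 3


def parse_opts(args):
    """Hypothetical parser: return the pickaxe_opts bitmask for args."""
    opts = 0
    for arg in args:
        if arg.startswith("-S"):
            opts |= DIFF_PICKAXE_KIND_S
        elif arg.startswith("-G"):
            # Only the "kind" bit is set; the REGEX bit stays clear.
            opts |= DIFF_PICKAXE_KIND_G
        elif arg == "--pickaxe-regex":
            opts |= DIFF_PICKAXE_REGEX
    return opts


for args in (["-Gfoo"], ["-Sfoo", "--pickaxe-regex"]):
    regex_bit = bool(parse_opts(args) & DIFF_PICKAXE_REGEX)
    print(args, "REGEX bit set:", regex_bit)
```

Any code guarded by a test on the REGEX bit would therefore run for
the second invocation but not for the first.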

 ++
 +To illustrate the difference between `-S<regex> --pickaxe-regex` and
 +`-G<regex>`, consider a commit with the following diff in the same
 +file:
 ++
 +----
 ++    return !regexec(regexp, two->ptr, 1, &regmatch, 0);
 +...
 +-    hit = !regexec(regexp, mf2.ptr, 1, &regmatch, 0);
 +----
 ++
 +While `git log -G"regexec\(regexp"` will show this commit, `git log
 +-S"regexec\(regexp" --pickaxe-regex` will not (because the number of
 +occurrences of that string didn't change).
 ++
 +See the 'pickaxe' entry in linkgit:gitdiffcore[7] for more
 +information.

This is a readable example (I think we tend not to use contractions,
so if I were writing it, I wouldn't have written "didn't change",
though).
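
For readers who want to check the claim in the quoted example, here
is a rough sketch of the two criteria applied to that one-line
change.  It is not how Git implements them: the textual diff comes
from Python's difflib rather than Git's own diff machinery, and the
helper names s_regex_differs and g_differs are made up for this
illustration.

```python
import difflib
import re

PRE = "hit = !regexec(regexp, mf2.ptr, 1, &regmatch, 0);\n"
POST = "return !regexec(regexp, two->ptr, 1, &regmatch, 0);\n"
PATTERN = re.compile(r"regexec\(regexp")


def s_regex_differs(pattern, pre, post):
    """-S with --pickaxe-regex: compare the number of matches."""
    return len(pattern.findall(pre)) != len(pattern.findall(post))


def g_differs(pattern, pre, post):
    """-G: does any added or removed line of a textual diff match?"""
    delta = difflib.ndiff(pre.splitlines(), post.splitlines())
    # ndiff marks removed lines with "- " and added lines with "+ ".
    changed = [line[2:] for line in delta
               if line.startswith(("+ ", "- "))]
    return any(pattern.search(line) for line in changed)


print("-G selects the filepair:", g_differs(PATTERN, PRE, POST))
print("-S --pickaxe-regex selects it:",
      s_regex_differs(PATTERN, PRE, POST))
```

The pattern matches once on each side, so the count-based check sees
no difference, while both changed lines match the pattern.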

  --pickaxe-regex::
 - Make the <string> not a plain string but an extended POSIX
 - regex to match.
 + Treat the <string> not as a plain string, but an extended
 + POSIX regex to match.  It is implied when `-G` is used.

Ditto.  Rather,

Treat the <string> given to the -S option as an extended
POSIX regular expression to match.

 diff --git a/Documentation/gitdiffcore.txt b/Documentation/gitdiffcore.txt
 index 568d757..d0f2b91 100644
 --- a/Documentation/gitdiffcore.txt
 +++ b/Documentation/gitdiffcore.txt
 @@ -222,25 +222,34 @@ version