Re: [GUILT 06/28] Fix and simplify the do_get_patch function.

2014-05-07 Thread Per Cederqvist
n Tue, May 6, 2014 at 9:08 PM, Jeff Sipek jef...@josefsipek.net wrote:
 On Sun, Mar 23, 2014 at 10:03:48PM +0100, Per Cederqvist wrote:
 On Sun, Mar 23, 2014 at 6:09 PM, Jeff Sipek jef...@josefsipek.net wrote:

  On Fri, Mar 21, 2014 at 08:31:44AM +0100, Per Cederqvist wrote:
  When extracting the patch, we only want the actual patches.  We don't
  want the --- delimiter.  Simplify the extraction by simply deleting
  everything before the first diff  line.  (Use sed instead of awk to
  simplify the code.)
 
  Without this patch, guilt fold and guilt push sometimes fails if
  guilt.diffstat is true.

 Hrm, I just did a test and guilt-push seems to work with a patch such as:

 aoeuaoeu
 this is
 ---
 not a patch!
 ---
  foo |1 +
  1 files changed, 1 insertions(+), 0 deletions(-)

 diff --git a/foo b/foo
 index e69de29..203a557 100644
 --- a/foo
 +++ b/foo
 @@ -0,0 +1 @@
 +aoue

 What sort of thing breaks fold/push?

This sequence breaks:

  mkdir guiltdemo
  cd guiltdemo
  git init
  git commit --allow-empty -mstart
  guilt init
  git config guilt.diffstat true
  guilt new empty-1
  guilt new empty-2
  guilt pop
  guilt fold empty-2
  guilt pop
  guilt push

That is, create two empty patches, fold them together, pop them, and try to push
the combined (still empty) patch.

The last command fails with:

  Applying patch..empty-1
  fatal: unrecognized input
  To force apply this patch, use 'guilt push -f'

At this point, the patch series consists of a single patch, empty-1,
which contains
the five characters -, -, -, newline, newline.

The 06 patch (Added test cases for guilt fold.) contains a test case
for this issue.
Which opens a style issue. When you fix a bug, should you:

 - commit the bug fix first and the test case later, so that git bisect always
   finds commits where make test works, or
 - commit the test case first, and the bug fix later, so that it is more obvious
   that you are actually fixing a pre-existing bug, or
 - commit the test case and the bug fix in the same commit?

In this series I have committed bug fixes first and test cases later, but
I'm not convinced that is the right thing to do.

 ...
  diff --git a/guilt b/guilt
  index 8701481..c59cd0f 100755
  --- a/guilt
  +++ b/guilt
  @@ -332,12 +332,7 @@ do_make_header()
   # usage: do_get_patch patchfile
   do_get_patch()
   {
  - awk '
  -BEGIN{}
  -/^(diff |---$|--- )/ {patch = 1}
  -patch == 1 {print $0}
  -END{}
  -'  $1
  + sed -n '/^diff /,$p'  $1
 
  So, the thought behind this mess was to allow minimal patches to work.  The
  'diff' line is *not* required by patch(1)!

 I see. I can see that this is sometimes useful, even though I think
 it is more important that guilt actually works with the patches itself
 creates.

 Fair enough.  I'm convinced that we should assume that all patches start
 with 'diff '.

 ...
 I had to solve a similar problem when I wrote my utility for diffing two
 diff files (https://git.lysator.liu.se/pdiffdiff). Based on the experience
 I got doing that, I propose this new do_get_patch function:
 ...

 The idea is to collect lines that *might* start the patch in
 patchheader. Once we detect the start of the patch (via a
 line that matches ---  or any of the mode change lines)
 we print the patcheader and the current line. If we get a
 line that neither looks like a patchheader nor starts a patch,
 we discard the patcheader. This is the case of a commit
 message with a line that starts with diff .

 The function above solves the issue with lines that
 start with diff  in the commit message.  On the other
 hand, it introduces the same issue for lines that start
 with for instance old mode .

 Hrm.  I'm open to revisiting the get-patch/get-header functions to address
 the ambiguity issues in the future.

Postponing that for the future seems like a good plan. I think
there are three possible ways to deal with the problem.

 - Store the commit message, diffstat, and diffs in separate files.
   I don't like this option, as it would make it a lot less convenient
   to work with the patch files.

 - Add a quoting mechanism, so that you need to write diff or
   --- instead of just diff or --- if you want to include those
   characters at the start of the line in a commit message.

 - Ignore the problem. This is what guilt has done in the past,
   and I see no reason why it needs to change *in this patch series*.

 Actually, a smaller change that probably fixes the
 issue with diffstat is to replace

 /^(diff |---$|--- )/ {patch = 1}

 witih

 /^(diff |--- )/ {patch = 1}

 as the problem with the original implementation is that
 the ---\n line that starts the diffstat should not start
 the patch.

 (Thinking out loud...) I suppose there are three portions to a patch file...

 (1) the description
 (2) optional diffstat
 (3) the patch

 You just convinced me that the patch should start with '^diff '.  Currently,
 the diffstat begins with '^---$'.  Sadly, the description can contain what
 looks like the 

Re: [GUILT 06/28] Fix and simplify the do_get_patch function.

2014-05-06 Thread Jeff Sipek
On Sun, Mar 23, 2014 at 10:03:48PM +0100, Per Cederqvist wrote:
 On Sun, Mar 23, 2014 at 6:09 PM, Jeff Sipek jef...@josefsipek.net wrote:
 
  On Fri, Mar 21, 2014 at 08:31:44AM +0100, Per Cederqvist wrote:
  When extracting the patch, we only want the actual patches.  We don't
  want the --- delimiter.  Simplify the extraction by simply deleting
  everything before the first diff  line.  (Use sed instead of awk to
  simplify the code.)
 
  Without this patch, guilt fold and guilt push sometimes fails if
  guilt.diffstat is true.

Hrm, I just did a test and guilt-push seems to work with a patch such as:

aoeuaoeu
this is
---
not a patch!
---
 foo |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/foo b/foo
index e69de29..203a557 100644
--- a/foo
+++ b/foo
@@ -0,0 +1 @@
+aoue

What sort of thing breaks fold/push?

...
  diff --git a/guilt b/guilt
  index 8701481..c59cd0f 100755
  --- a/guilt
  +++ b/guilt
  @@ -332,12 +332,7 @@ do_make_header()
   # usage: do_get_patch patchfile
   do_get_patch()
   {
  - awk '
  -BEGIN{}
  -/^(diff |---$|--- )/ {patch = 1}
  -patch == 1 {print $0}
  -END{}
  -'  $1
  + sed -n '/^diff /,$p'  $1
 
  So, the thought behind this mess was to allow minimal patches to work.  The
  'diff' line is *not* required by patch(1)!
 
 I see. I can see that this is sometimes useful, even though I think
 it is more important that guilt actually works with the patches itself
 creates.

Fair enough.  I'm convinced that we should assume that all patches start
with 'diff '.

...
 I had to solve a similar problem when I wrote my utility for diffing two
 diff files (https://git.lysator.liu.se/pdiffdiff). Based on the experience
 I got doing that, I propose this new do_get_patch function:
...
 
 The idea is to collect lines that *might* start the patch in
 patchheader. Once we detect the start of the patch (via a
 line that matches ---  or any of the mode change lines)
 we print the patcheader and the current line. If we get a
 line that neither looks like a patchheader nor starts a patch,
 we discard the patcheader. This is the case of a commit
 message with a line that starts with diff .
 
 The function above solves the issue with lines that
 start with diff  in the commit message.  On the other
 hand, it introduces the same issue for lines that start
 with for instance old mode .

Hrm.  I'm open to revisiting the get-patch/get-header functions to address
the ambiguity issues in the future.

 Actually, a smaller change that probably fixes the
 issue with diffstat is to replace
 
 /^(diff |---$|--- )/ {patch = 1}
 
 witih
 
 /^(diff |--- )/ {patch = 1}
 
 as the problem with the original implementation is that
 the ---\n line that starts the diffstat should not start
 the patch.

(Thinking out loud...) I suppose there are three portions to a patch file...

(1) the description
(2) optional diffstat
(3) the patch

You just convinced me that the patch should start with '^diff '.  Currently,
the diffstat begins with '^---$'.  Sadly, the description can contain what
looks like the beginning of a diffstat or a patch.  In the case of
do_get_patch, we're only interested in the patch, so we can just look for
'^diff ' (because garbage before the diff itself seems to be ignored by
git).  (If we wanted to allow patches without the 'diff ' line, we'd need
'^(diff |--- )'.)

I feel like I'm missing something.

Jeff.

-- 
I'm somewhere between geek and normal.
- Linus Torvalds
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GUILT 06/28] Fix and simplify the do_get_patch function.

2014-03-23 Thread Jeff Sipek
On Fri, Mar 21, 2014 at 08:31:44AM +0100, Per Cederqvist wrote:
 When extracting the patch, we only want the actual patches.  We don't
 want the --- delimiter.  Simplify the extraction by simply deleting
 everything before the first diff  line.  (Use sed instead of awk to
 simplify the code.)
 
 Without this patch, guilt fold and guilt push sometimes fails if
 guilt.diffstat is true.
 
 Signed-off-by: Per Cederqvist ced...@opera.com
 ---
  guilt | 7 +--
  1 file changed, 1 insertion(+), 6 deletions(-)
 
 diff --git a/guilt b/guilt
 index 8701481..c59cd0f 100755
 --- a/guilt
 +++ b/guilt
 @@ -332,12 +332,7 @@ do_make_header()
  # usage: do_get_patch patchfile
  do_get_patch()
  {
 - awk '
 -BEGIN{}
 -/^(diff |---$|--- )/ {patch = 1}
 -patch == 1 {print $0}
 -END{}
 -'  $1
 + sed -n '/^diff /,$p'  $1

So, the thought behind this mess was to allow minimal patches to work.  The
'diff' line is *not* required by patch(1)!

Is it a problem if a patch description contains a line that starts with
'diff '?  (The current code has this issue as well.)

For the record, this ambiguity is what's on several occasions made me
consider splitting the header and the patch into separate files.  Yeah, it'd
be messier.  :/

  }
  
  # usage: do_get_header patchfile
 -- 
 1.8.3.1
 

-- 
Defenestration n. (formal or joc.):
  The act of removing Windows from your computer in disgust, usually
  followed by the installation of Linux or some other Unix-like operating
  system.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [GUILT 06/28] Fix and simplify the do_get_patch function.

2014-03-23 Thread Per Cederqvist
On Sun, Mar 23, 2014 at 6:09 PM, Jeff Sipek jef...@josefsipek.net wrote:

 On Fri, Mar 21, 2014 at 08:31:44AM +0100, Per Cederqvist wrote:
 When extracting the patch, we only want the actual patches.  We don't
 want the --- delimiter.  Simplify the extraction by simply deleting
 everything before the first diff  line.  (Use sed instead of awk to
 simplify the code.)

 Without this patch, guilt fold and guilt push sometimes fails if
 guilt.diffstat is true.

 Signed-off-by: Per Cederqvist ced...@opera.com
 ---
  guilt | 7 +--
  1 file changed, 1 insertion(+), 6 deletions(-)

 diff --git a/guilt b/guilt
 index 8701481..c59cd0f 100755
 --- a/guilt
 +++ b/guilt
 @@ -332,12 +332,7 @@ do_make_header()
  # usage: do_get_patch patchfile
  do_get_patch()
  {
 - awk '
 -BEGIN{}
 -/^(diff |---$|--- )/ {patch = 1}
 -patch == 1 {print $0}
 -END{}
 -'  $1
 + sed -n '/^diff /,$p'  $1

 So, the thought behind this mess was to allow minimal patches to work.  The
 'diff' line is *not* required by patch(1)!

I see. I can see that this is sometimes useful, even though I think
it is more important that guilt actually works with the patches itself
creates.

 Is it a problem if a patch description contains a line that starts with
 'diff '?  (The current code has this issue as well.)

Yes.

 For the record, this ambiguity is what's on several occasions made me
 consider splitting the header and the patch into separate files.  Yeah, it'd
 be messier.  :/

I really like having the header and the patches in the same file. I had
to solve a similar problem when I wrote my utility for diffing two diff
files (https://git.lysator.liu.se/pdiffdiff). Based on the experience I got
doing that, I propose this new do_get_patch function:

# usage: do_get_patch patchfile
do_get_patch()
{
awk '
BEGIN {
patchheader = 
patch = 0
}
patch == 1 {
print $0
next
}
/^(diff |index |=|RCS file:|retrieving revision )/ {
patchheader = patchheader $0 \n
next
}
/^(--- |((new|deleted) file|old) mode )/ {
printf %s, patchheader
  -/^(diff |---$|--- )/ {patch = 1}  print $0
patch = 1
}
{
patchheader = 
}'  $1

}

The idea is to collect lines that *might* start the patch in
patchheader. Once we detect the start of the patch (via a
line that matches ---  or any of the mode change lines)
we print the patcheader and the current line. If we get a
line that neither looks like a patchheader nor starts a patch,
we discard the patcheader. This is the case of a commit
message with a line that starts with diff .

The function above solves the issue with lines that
start with diff  in the commit message.  On the other
hand, it introduces the same issue for lines that start
with for instance old mode .

Actually, a smaller change that probably fixes the
issue with diffstat is to replace

/^(diff |---$|--- )/ {patch = 1}

witih

/^(diff |--- )/ {patch = 1}

as the problem with the original implementation is that
the ---\n line that starts the diffstat should not start
the patch.

/ceder

  }

  # usage: do_get_header patchfile
 --
 1.8.3.1


 --
 Defenestration n. (formal or joc.):
   The act of removing Windows from your computer in disgust, usually
   followed by the installation of Linux or some other Unix-like operating
   system.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html