Re: [PATCH v2 0/4] fix "rebase -i --root" corrupting root commit

2018-08-27 Thread Johannes Schindelin
Hi,

On Mon, 6 Aug 2018, Eric Sunshine wrote:

> On Mon, Aug 6, 2018 at 9:20 PM Hilco Wijbenga  
> wrote:
> > But your suggestion did make me think about what behaviour I would
> > like to see, exactly. I like that Git removes commits that no longer
> > serve any purpose (because I've included their changes in earlier
> > commits). So I would not want to keep commits that become empty during
> > the rebase. What I would like to see is that commits that _start out_
> > as empty, are retained. Would such behaviour make sense? Or would that
> > be considered surprising behaviour?
> 
> I, personally, have no opinion since I don't use empty commits.
> Perhaps someone more experienced and more long-sighted will chime in.

I got aware about two weeks ago that my mail provider fails to deliver all
the mails that are addressed to me. Two days ago, I managed to get this
mail (and the thread) via public-inbox (thanks, Eric!!!). Hence my late
reply.

Hilco: if you provide a patch to extend the test suite to demonstrate the
breakage (via `test_expect_failure`), I promise to try my best to provide
a fix on top.

Ciao,
Johannes


Re: [PATCH v2 0/4] fix "rebase -i --root" corrupting root commit

2018-08-07 Thread Junio C Hamano
Eric Sunshine  writes:

> On Mon, Aug 6, 2018 at 9:20 PM Hilco Wijbenga  
> wrote:
>> But your suggestion did make me think about what behaviour I would
>> like to see, exactly. I like that Git removes commits that no longer
>> serve any purpose (because I've included their changes in earlier
>> commits). So I would not want to keep commits that become empty during
>> the rebase. What I would like to see is that commits that _start out_
>> as empty, are retained. Would such behaviour make sense? Or would that
>> be considered surprising behaviour?
>
> I, personally, have no opinion since I don't use empty commits.
> Perhaps someone more experienced and more long-sighted will chime in.

0661e49a ("git-rebase.txt: document behavioral differences between
modes", 2018-06-27) added the following.  In short, "rebase -i"
should already behave that way.

+ * empty commits:
+
+am-based rebase will drop any "empty" commits, whether the
+commit started empty (had no changes relative to its parent to
+start with) or ended empty (all changes were already applied
+upstream in other commits).
+
+merge-based rebase does the same.
+
+interactive-based rebase will by default drop commits that
+started empty and halt if it hits a commit that ended up empty.
+The `--keep-empty` option exists for interactive rebases to allow
+it to keep commits that started empty.


Re: [PATCH v2 0/4] fix "rebase -i --root" corrupting root commit

2018-08-06 Thread Eric Sunshine
On Mon, Aug 6, 2018 at 9:20 PM Hilco Wijbenga  wrote:
> But your suggestion did make me think about what behaviour I would
> like to see, exactly. I like that Git removes commits that no longer
> serve any purpose (because I've included their changes in earlier
> commits). So I would not want to keep commits that become empty during
> the rebase. What I would like to see is that commits that _start out_
> as empty, are retained. Would such behaviour make sense? Or would that
> be considered surprising behaviour?

I, personally, have no opinion since I don't use empty commits.
Perhaps someone more experienced and more long-sighted will chime in.


Re: [PATCH v2 0/4] fix "rebase -i --root" corrupting root commit

2018-08-06 Thread Hilco Wijbenga
On Tue, Jul 31, 2018 at 11:22 PM, Eric Sunshine  wrote:
> On Tue, Jul 31, 2018 at 9:30 PM Hilco Wijbenga  
> wrote:
>> On Tue, Jul 31, 2018 at 12:33 AM, Eric Sunshine  
>> wrote:
>> > This is a re-roll of [1] which fixes sequencer bugs resulting in commit
>> > object corruption when "rebase -i --root" swaps in a new commit as root.
>> > Unfortunately, those bugs made it into v2.18.0 and have already
>> > corrupted at least one repository (a local project of mine). Patches 3/4
>> > and 4/4 are new.
>>
>> Does this also fix losing the initial commit if it is empty?
>>
>> git init ; git commit -m 'Initial commit' --allow-empty ; touch
>> file.txt ; git add file.txt ; git commit -m 'Add file.txt' ; git
>> rebase --root
>>
>> I would expect there to be 2 commits but the first one has
>> disappeared. (This usually happens with "git rebase -i --root" early
>> on in a new project.)
>
> This series should have no impact on that issue; it is fixing root
> commit object corruption, and does not touch or change how the commit
> graph is maintained or how the rebasing machinery itself works (and
> the --allow-empty stuff is part of that machinery).
>
> As Dscho is cc:'d already, perhaps he can chime in as a resident expert.
>
> By the way, what happens if you use --keep-empty while rebasing?

I was not aware of that flag. But, although I was expecting it to, if
I use it with the rebase, I don't see any different behaviour. I can't
really make out what its purpose is from "Keep the commits that do not
change anything from its parents in the result.".

But your suggestion did make me think about what behaviour I would
like to see, exactly. I like that Git removes commits that no longer
serve any purpose (because I've included their changes in earlier
commits). So I would not want to keep commits that become empty during
the rebase. What I would like to see is that commits that _start out_
as empty, are retained. Would such behaviour make sense? Or would that
be considered surprising behaviour?


Re: [PATCH v2 0/4] fix "rebase -i --root" corrupting root commit

2018-08-02 Thread Eric Sunshine
On Wed, Aug 1, 2018 at 7:25 PM brian m. carlson
 wrote:
> On Tue, Jul 31, 2018 at 03:33:27AM -0400, Eric Sunshine wrote:
> > This is a re-roll of [1] which fixes sequencer bugs resulting in commit
> > object corruption when "rebase -i --root" swaps in a new commit as root.
> > Unfortunately, those bugs made it into v2.18.0 and have already
> > corrupted at least one repository (a local project of mine). Patches 3/4
> > and 4/4 are new.
>
> I looked at this series and it seems sane and logical to me.  Thanks for
> acting quickly to fix the corruption.
>
> Reviewed-by: brian m. carlson 

Thanks for the review.


Re: [PATCH v2 0/4] fix "rebase -i --root" corrupting root commit

2018-08-01 Thread brian m. carlson
On Tue, Jul 31, 2018 at 03:33:27AM -0400, Eric Sunshine wrote:
> This is a re-roll of [1] which fixes sequencer bugs resulting in commit
> object corruption when "rebase -i --root" swaps in a new commit as root.
> Unfortunately, those bugs made it into v2.18.0 and have already
> corrupted at least one repository (a local project of mine). Patches 3/4
> and 4/4 are new.
> 
> v1 fixed these bugs:
> 
> * trailing garbage on the commit's "author" header
> 
> * extra trailing digit on "author" header's timezone (caused by two
>   separate bugs)
> 
> v2 fixes those same bugs, plus:
> 
> * eliminates a bogus "@" prepended to the "author" header timestamp
>   which renders the header corrupt
> 
> * takes care to validate information coming from
>   "rebase-merge/author-script" before incorporating it into the "author"
>   header since that file may be hand-edited, and bogus hand-edited
>   values could corrupt the commit object.

I looked at this series and it seems sane and logical to me.  Thanks for
acting quickly to fix the corruption.

Reviewed-by: brian m. carlson 
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [PATCH v2 0/4] fix "rebase -i --root" corrupting root commit

2018-08-01 Thread Eric Sunshine
On Tue, Jul 31, 2018 at 9:30 PM Hilco Wijbenga  wrote:
> On Tue, Jul 31, 2018 at 12:33 AM, Eric Sunshine  
> wrote:
> > This is a re-roll of [1] which fixes sequencer bugs resulting in commit
> > object corruption when "rebase -i --root" swaps in a new commit as root.
> > Unfortunately, those bugs made it into v2.18.0 and have already
> > corrupted at least one repository (a local project of mine). Patches 3/4
> > and 4/4 are new.
>
> Does this also fix losing the initial commit if it is empty?
>
> git init ; git commit -m 'Initial commit' --allow-empty ; touch
> file.txt ; git add file.txt ; git commit -m 'Add file.txt' ; git
> rebase --root
>
> I would expect there to be 2 commits but the first one has
> disappeared. (This usually happens with "git rebase -i --root" early
> on in a new project.)

This series should have no impact on that issue; it is fixing root
commit object corruption, and does not touch or change how the commit
graph is maintained or how the rebasing machinery itself works (and
the --allow-empty stuff is part of that machinery).

As Dscho is cc:'d already, perhaps he can chime in as a resident expert.

By the way, what happens if you use --keep-empty while rebasing?


Re: [PATCH v2 0/4] fix "rebase -i --root" corrupting root commit

2018-07-31 Thread Hilco Wijbenga
Hi Eric,

On Tue, Jul 31, 2018 at 12:33 AM, Eric Sunshine  wrote:
> This is a re-roll of [1] which fixes sequencer bugs resulting in commit
> object corruption when "rebase -i --root" swaps in a new commit as root.
> Unfortunately, those bugs made it into v2.18.0 and have already
> corrupted at least one repository (a local project of mine). Patches 3/4
> and 4/4 are new.
>
> v1 fixed these bugs:
>
> * trailing garbage on the commit's "author" header
>
> * extra trailing digit on "author" header's timezone (caused by two
>   separate bugs)
>
> v2 fixes those same bugs, plus:
>
> * eliminates a bogus "@" prepended to the "author" header timestamp
>   which renders the header corrupt
>
> * takes care to validate information coming from
>   "rebase-merge/author-script" before incorporating it into the "author"
>   header since that file may be hand-edited, and bogus hand-edited
>   values could corrupt the commit object.
>

Does this also fix losing the initial commit if it is empty?

Given

git init ; git commit -m 'Initial commit' --allow-empty ; touch
file.txt ; git add file.txt ; git commit -m 'Add file.txt' ; git
rebase --root

I would expect there to be 2 commits but the first one has
disappeared. (This usually happens with "git rebase -i --root" early
on in a new project.)

Cheers,
Hilco


Re: [PATCH v2 0/4] fix "rebase -i --root" corrupting root commit

2018-07-31 Thread Eric Sunshine
On Tue, Jul 31, 2018 at 6:46 AM Eric Sunshine  wrote:
> Anyhow, thanks for reading over the series. I appreciate it even if
> our "sense of priority" doesn't always align (as evidenced by your
> review comments and my responses).

To be clear, the changes you suggest all make sense, and would be
welcome (especially the bug fixes), but I consider them lower priority
than the fixes in this series, and here's why:

The commit object corruption caused by the bugs fixed by this series
are unavoidable. Anyone using "rebase -i --root" to swap in a new
commit as root is going to end up with a corrupt repository no matter
what. There's no way to side step it. And, most people won't know how
to fix the corruption, assuming they even notice it.

If I understand correctly, the issues you describe are unlikely to
come up in practice. The only way they can arise is if someone hand
edits the script (something only power users will do) _and_ botches
the edit in the process, or if the person's name contains an
apostrophe (possible, though perhaps uncommon?). Also, (if again I
understand correctly) they are only data "corruptions", not genuine
broken-repository corruptions, thus are more likely to be fixable by a
typical user.

So it's not that I think your proposed fixes and suggestions are
unimportant, I just don't think they belong in this series, and would
be happy to see them atop it.

Thanks.


Re: [PATCH v2 0/4] fix "rebase -i --root" corrupting root commit

2018-07-31 Thread Phillip Wood
Hi Eric

On 31/07/18 11:46, Eric Sunshine wrote:
> On Tue, Jul 31, 2018 at 6:06 AM Phillip Wood  
> wrote:
>> On 31/07/18 08:33, Eric Sunshine wrote:
>>> Patch 2/4 of this series conflicts with Akinori MUSHA's
>>> 'am/sequencer-author-script-fix' which takes a stab at fixing one of the
>>> four (or so) bugs fixed by this series (namely, adding a missing closing
>>> quote to GIT_AUTHOR_DATE in "rebase-merge/author-script"). That patch
>>> probably ought to be dropped (without prejudice) in favor of this series
>>> for the following reasons:
>>> [...]
>>> The test added by Akinori MUSHA's patch may still have value, and it may
>>> make sense to re-submit it, however, doing so need not hold up this
>>> (higher priority) series.
>>
>> Yes I think it does, also the patch that Johannes and I have on top of
>> it to fix the quoting of "'" in write_author_script() relies on fixing
>> the missing trailing quote and handling of "'" at the same time,
>> hopefully we can get that rebased on top of these asap.
> 
> I'm not sure if "Yes I think it does" means that Akinori's test has
> value or if it means that this series should be held back waiting for
> other changes built atop it.

It means we should use a test based on that with the quoting fixes on
top of this series and progress them together if possible. I think the
quoting patch I just sent is good now so hopefully there wont be any
issue with it holding up your fixes.

> Anyhow, thanks for reading over the series. I appreciate it even if
> our "sense of priority" doesn't always align (as evidenced by your
> review comments and my responses).

My "sense of priority" was informed by the hope that the quoting patch
wouldn't hold things up (let's hope I was right about that!).

Best Wishes

Phillip



Re: [PATCH v2 0/4] fix "rebase -i --root" corrupting root commit

2018-07-31 Thread Eric Sunshine
On Tue, Jul 31, 2018 at 6:06 AM Phillip Wood  wrote:
> On 31/07/18 08:33, Eric Sunshine wrote:
> > Patch 2/4 of this series conflicts with Akinori MUSHA's
> > 'am/sequencer-author-script-fix' which takes a stab at fixing one of the
> > four (or so) bugs fixed by this series (namely, adding a missing closing
> > quote to GIT_AUTHOR_DATE in "rebase-merge/author-script"). That patch
> > probably ought to be dropped (without prejudice) in favor of this series
> > for the following reasons:
> > [...]
> > The test added by Akinori MUSHA's patch may still have value, and it may
> > make sense to re-submit it, however, doing so need not hold up this
> > (higher priority) series.
>
> Yes I think it does, also the patch that Johannes and I have on top of
> it to fix the quoting of "'" in write_author_script() relies on fixing
> the missing trailing quote and handling of "'" at the same time,
> hopefully we can get that rebased on top of these asap.

I'm not sure if "Yes I think it does" means that Akinori's test has
value or if it means that this series should be held back waiting for
other changes built atop it.

Anyhow, thanks for reading over the series. I appreciate it even if
our "sense of priority" doesn't always align (as evidenced by your
review comments and my responses).


Re: [PATCH v2 0/4] fix "rebase -i --root" corrupting root commit

2018-07-31 Thread Phillip Wood

Hi Eric

On 31/07/18 08:33, Eric Sunshine wrote:

This is a re-roll of [1] which fixes sequencer bugs resulting in commit
object corruption when "rebase -i --root" swaps in a new commit as root.
Unfortunately, those bugs made it into v2.18.0 and have already
corrupted at least one repository (a local project of mine). Patches 3/4
and 4/4 are new.

v1 fixed these bugs:

* trailing garbage on the commit's "author" header

* extra trailing digit on "author" header's timezone (caused by two
   separate bugs)

v2 fixes those same bugs, plus:

* eliminates a bogus "@" prepended to the "author" header timestamp
   which renders the header corrupt

* takes care to validate information coming from
   "rebase-merge/author-script" before incorporating it into the "author"
   header since that file may be hand-edited, and bogus hand-edited
   values could corrupt the commit object.

Patch 2/4 of this series conflicts with Akinori MUSHA's
'am/sequencer-author-script-fix' which takes a stab at fixing one of the
four (or so) bugs fixed by this series (namely, adding a missing closing
quote to GIT_AUTHOR_DATE in "rebase-merge/author-script"). That patch
probably ought to be dropped (without prejudice) in favor of this series
for the following reasons:

* It has the undesirable property of adding an unwanted extra blank line
   to "rebase-merge/author-script"; this series doesn't make that
   mistake.

* The fix in this series (patch 2/4) is more complete, also ensuring
   that the return value of sq_dequote() is checked.

* And, most importantly, this series sells the change as a fix for a
   genuine serious bug (commit object corruption), whereas that patch
   talks only about adjusting the file content to make it parseable by
   the shell. It's important for future readers of this change to
   understand that the bug (missing closing quote) had much more dire
   consequences than merely being syntactically invalid as a shell
   script.

The test added by Akinori MUSHA's patch may still have value, and it may
make sense to re-submit it, however, doing so need not hold up this
(higher priority) series.


Yes I think it does, also the patch that Johannes and I have on top of 
it to fix the quoting of "'" in write_author_script() relies on fixing 
the missing trailing quote and handling of "'" at the same time, 
hopefully we can get that rebased on top of these asap.


Best Wishes

Phillip


Phillip's proposed[2] unification of parsers and other fixes and
cleanups can easily be built atop this series and, likewise, need not
prevent these (higher priority) patches from graduating independently.

[1]: 
https://public-inbox.org/git/20180730092929.71114-1-sunsh...@sunshineco.com/
[2]: 
https://public-inbox.org/git/1f172fc1-4b51-cdf7-e841-5ca41e209...@talktalk.net/

Eric Sunshine (4):
   sequencer: fix "rebase -i --root" corrupting author header
   sequencer: fix "rebase -i --root" corrupting author header timezone
   sequencer: fix "rebase -i --root" corrupting author header timestamp
   sequencer: don't die() on bogus user-edited timestamp

  sequencer.c   | 39 +--
  t/t3404-rebase-interactive.sh | 10 -
  2 files changed, 33 insertions(+), 16 deletions(-)

Range-diff against v1:
1:  ba10ae4e5d ! 1:  8c47555bcf sequencer: fix "rebase -i --root" corrupting 
author header
 @@ -11,8 +11,8 @@
  This is a result of read_author_ident() neglecting to NUL-terminate 
the
  buffer into which it composes the "author" header.
  
 -(Note that the extra "0" in the timezone is a separate bug which will be

 -fixed subsequently.)
 +(Note that the "@" preceding the timestamp and the extra "0" in the
 +timezone are separate bugs which will be fixed subsequently.)
  
  Security considerations: Construction of the "author" header by

  read_author_ident() happens in-place and in parallel with parsing the
 @@ -26,6 +26,7 @@
  inserted as part of the parsing process.
  
  Signed-off-by: Eric Sunshine 

 +Acked-by: Johannes Schindelin 
  
  diff --git a/sequencer.c b/sequencer.c

  --- a/sequencer.c
 @@ -61,7 +62,7 @@
  + set_fake_editor &&
  + FAKE_LINES="2 1" git rebase -i --root &&
  + git cat-file commit HEAD^ >out &&
 -+ grep "^author ..* @[0-9][0-9]* [-+][0-9][0-9]*$" out
 ++ grep "^author ..*> @[0-9][0-9]* [-+][0-9][0-9]*$" out
  +'
  +
   test_done
2:  c0400cda85 ! 2:  1d4064147e sequencer: fix "rebase -i --root" corrupting 
author header timezone
 @@ -22,6 +22,9 @@
  a NUL-terminator at the end of the shifted content, which explains the
  duplicated last digit in the timezone.
  
 +(Note that the "@" preceding the timestamp is a separate bug which

 +will be fixed subsequently.)
 +
  Signed-off-by: Eric Sunshine 
  
  diff --git a/sequencer.c b/sequencer.c

 @@ -56,8 

[PATCH v2 0/4] fix "rebase -i --root" corrupting root commit

2018-07-31 Thread Eric Sunshine
This is a re-roll of [1] which fixes sequencer bugs resulting in commit
object corruption when "rebase -i --root" swaps in a new commit as root.
Unfortunately, those bugs made it into v2.18.0 and have already
corrupted at least one repository (a local project of mine). Patches 3/4
and 4/4 are new.

v1 fixed these bugs:

* trailing garbage on the commit's "author" header

* extra trailing digit on "author" header's timezone (caused by two
  separate bugs)

v2 fixes those same bugs, plus:

* eliminates a bogus "@" prepended to the "author" header timestamp
  which renders the header corrupt

* takes care to validate information coming from
  "rebase-merge/author-script" before incorporating it into the "author"
  header since that file may be hand-edited, and bogus hand-edited
  values could corrupt the commit object.

Patch 2/4 of this series conflicts with Akinori MUSHA's
'am/sequencer-author-script-fix' which takes a stab at fixing one of the
four (or so) bugs fixed by this series (namely, adding a missing closing
quote to GIT_AUTHOR_DATE in "rebase-merge/author-script"). That patch
probably ought to be dropped (without prejudice) in favor of this series
for the following reasons:

* It has the undesirable property of adding an unwanted extra blank line
  to "rebase-merge/author-script"; this series doesn't make that
  mistake.

* The fix in this series (patch 2/4) is more complete, also ensuring
  that the return value of sq_dequote() is checked.

* And, most importantly, this series sells the change as a fix for a
  genuine serious bug (commit object corruption), whereas that patch
  talks only about adjusting the file content to make it parseable by
  the shell. It's important for future readers of this change to
  understand that the bug (missing closing quote) had much more dire
  consequences than merely being syntactically invalid as a shell
  script.

The test added by Akinori MUSHA's patch may still have value, and it may
make sense to re-submit it, however, doing so need not hold up this
(higher priority) series.

Phillip's proposed[2] unification of parsers and other fixes and
cleanups can easily be built atop this series and, likewise, need not
prevent these (higher priority) patches from graduating independently.

[1]: 
https://public-inbox.org/git/20180730092929.71114-1-sunsh...@sunshineco.com/
[2]: 
https://public-inbox.org/git/1f172fc1-4b51-cdf7-e841-5ca41e209...@talktalk.net/

Eric Sunshine (4):
  sequencer: fix "rebase -i --root" corrupting author header
  sequencer: fix "rebase -i --root" corrupting author header timezone
  sequencer: fix "rebase -i --root" corrupting author header timestamp
  sequencer: don't die() on bogus user-edited timestamp

 sequencer.c   | 39 +--
 t/t3404-rebase-interactive.sh | 10 -
 2 files changed, 33 insertions(+), 16 deletions(-)

Range-diff against v1:
1:  ba10ae4e5d ! 1:  8c47555bcf sequencer: fix "rebase -i --root" corrupting 
author header
@@ -11,8 +11,8 @@
 This is a result of read_author_ident() neglecting to NUL-terminate the
 buffer into which it composes the "author" header.
 
-(Note that the extra "0" in the timezone is a separate bug which will 
be
-fixed subsequently.)
+(Note that the "@" preceding the timestamp and the extra "0" in the
+timezone are separate bugs which will be fixed subsequently.)
 
 Security considerations: Construction of the "author" header by
 read_author_ident() happens in-place and in parallel with parsing the
@@ -26,6 +26,7 @@
 inserted as part of the parsing process.
 
 Signed-off-by: Eric Sunshine 
+Acked-by: Johannes Schindelin 
 
 diff --git a/sequencer.c b/sequencer.c
 --- a/sequencer.c
@@ -61,7 +62,7 @@
 +  set_fake_editor &&
 +  FAKE_LINES="2 1" git rebase -i --root &&
 +  git cat-file commit HEAD^ >out &&
-+  grep "^author ..* @[0-9][0-9]* [-+][0-9][0-9]*$" out
++  grep "^author ..*> @[0-9][0-9]* [-+][0-9][0-9]*$" out
 +'
 +
  test_done
2:  c0400cda85 ! 2:  1d4064147e sequencer: fix "rebase -i --root" corrupting 
author header timezone
@@ -22,6 +22,9 @@
 a NUL-terminator at the end of the shifted content, which explains the
 duplicated last digit in the timezone.
 
+(Note that the "@" preceding the timestamp is a separate bug which
+will be fixed subsequently.)
+
 Signed-off-by: Eric Sunshine 
 
 diff --git a/sequencer.c b/sequencer.c
@@ -56,8 +59,8 @@
set_fake_editor &&
FAKE_LINES="2 1" git rebase -i --root &&
git cat-file commit HEAD^ >out &&
--  grep "^author ..* @[0-9][0-9]* [-+][0-9][0-9]*$" out
-+  grep "^author ..* @[0-9][0-9]* [-+][0-9][0-9][0-9][0-9]$" out
+-  grep "^author ..*> @[0-9][0-9]* [-+][0-9][0-9]*$" out
++  grep "^author ..*> @[0-9][0-9]* [-+][0-9][0-9][0-9][0-9]$" out
  '