Re: [PATCH v2 0/4] fix "rebase -i --root" corrupting root commit
Hi, On Mon, 6 Aug 2018, Eric Sunshine wrote: > On Mon, Aug 6, 2018 at 9:20 PM Hilco Wijbenga > wrote: > > But your suggestion did make me think about what behaviour I would > > like to see, exactly. I like that Git removes commits that no longer > > serve any purpose (because I've included their changes in earlier > > commits). So I would not want to keep commits that become empty during > > the rebase. What I would like to see is that commits that _start out_ > > as empty, are retained. Would such behaviour make sense? Or would that > > be considered surprising behaviour? > > I, personally, have no opinion since I don't use empty commits. > Perhaps someone more experienced and more long-sighted will chime in. I got aware about two weeks ago that my mail provider fails to deliver all the mails that are addressed to me. Two days ago, I managed to get this mail (and the thread) via public-inbox (thanks, Eric!!!). Hence my late reply. Hilco: if you provide a patch to extend the test suite to demonstrate the breakage (via `test_expect_failure`), I promise to try my best to provide a fix on top. Ciao, Johannes
Re: [PATCH v2 0/4] fix "rebase -i --root" corrupting root commit
Eric Sunshine writes: > On Mon, Aug 6, 2018 at 9:20 PM Hilco Wijbenga > wrote: >> But your suggestion did make me think about what behaviour I would >> like to see, exactly. I like that Git removes commits that no longer >> serve any purpose (because I've included their changes in earlier >> commits). So I would not want to keep commits that become empty during >> the rebase. What I would like to see is that commits that _start out_ >> as empty, are retained. Would such behaviour make sense? Or would that >> be considered surprising behaviour? > > I, personally, have no opinion since I don't use empty commits. > Perhaps someone more experienced and more long-sighted will chime in. 0661e49a ("git-rebase.txt: document behavioral differences between modes", 2018-06-27) added the following. In short, "rebase -i" should already behave that way. + * empty commits: + +am-based rebase will drop any "empty" commits, whether the +commit started empty (had no changes relative to its parent to +start with) or ended empty (all changes were already applied +upstream in other commits). + +merge-based rebase does the same. + +interactive-based rebase will by default drop commits that +started empty and halt if it hits a commit that ended up empty. +The `--keep-empty` option exists for interactive rebases to allow +it to keep commits that started empty.
Re: [PATCH v2 0/4] fix "rebase -i --root" corrupting root commit
On Mon, Aug 6, 2018 at 9:20 PM Hilco Wijbenga wrote: > But your suggestion did make me think about what behaviour I would > like to see, exactly. I like that Git removes commits that no longer > serve any purpose (because I've included their changes in earlier > commits). So I would not want to keep commits that become empty during > the rebase. What I would like to see is that commits that _start out_ > as empty, are retained. Would such behaviour make sense? Or would that > be considered surprising behaviour? I, personally, have no opinion since I don't use empty commits. Perhaps someone more experienced and more long-sighted will chime in.
Re: [PATCH v2 0/4] fix "rebase -i --root" corrupting root commit
On Tue, Jul 31, 2018 at 11:22 PM, Eric Sunshine wrote: > On Tue, Jul 31, 2018 at 9:30 PM Hilco Wijbenga > wrote: >> On Tue, Jul 31, 2018 at 12:33 AM, Eric Sunshine >> wrote: >> > This is a re-roll of [1] which fixes sequencer bugs resulting in commit >> > object corruption when "rebase -i --root" swaps in a new commit as root. >> > Unfortunately, those bugs made it into v2.18.0 and have already >> > corrupted at least one repository (a local project of mine). Patches 3/4 >> > and 4/4 are new. >> >> Does this also fix losing the initial commit if it is empty? >> >> git init ; git commit -m 'Initial commit' --allow-empty ; touch >> file.txt ; git add file.txt ; git commit -m 'Add file.txt' ; git >> rebase --root >> >> I would expect there to be 2 commits but the first one has >> disappeared. (This usually happens with "git rebase -i --root" early >> on in a new project.) > > This series should have no impact on that issue; it is fixing root > commit object corruption, and does not touch or change how the commit > graph is maintained or how the rebasing machinery itself works (and > the --allow-empty stuff is part of that machinery). > > As Dscho is cc:'d already, perhaps he can chime in as a resident expert. > > By the way, what happens if you use --keep-empty while rebasing? I was not aware of that flag. But, although I was expecting it to, if I use it with the rebase, I don't see any different behaviour. I can't really make out what its purpose is from "Keep the commits that do not change anything from its parents in the result.". But your suggestion did make me think about what behaviour I would like to see, exactly. I like that Git removes commits that no longer serve any purpose (because I've included their changes in earlier commits). So I would not want to keep commits that become empty during the rebase. What I would like to see is that commits that _start out_ as empty, are retained. Would such behaviour make sense? Or would that be considered surprising behaviour?
Re: [PATCH v2 0/4] fix "rebase -i --root" corrupting root commit
On Wed, Aug 1, 2018 at 7:25 PM brian m. carlson wrote: > On Tue, Jul 31, 2018 at 03:33:27AM -0400, Eric Sunshine wrote: > > This is a re-roll of [1] which fixes sequencer bugs resulting in commit > > object corruption when "rebase -i --root" swaps in a new commit as root. > > Unfortunately, those bugs made it into v2.18.0 and have already > > corrupted at least one repository (a local project of mine). Patches 3/4 > > and 4/4 are new. > > I looked at this series and it seems sane and logical to me. Thanks for > acting quickly to fix the corruption. > > Reviewed-by: brian m. carlson Thanks for the review.
Re: [PATCH v2 0/4] fix "rebase -i --root" corrupting root commit
On Tue, Jul 31, 2018 at 03:33:27AM -0400, Eric Sunshine wrote: > This is a re-roll of [1] which fixes sequencer bugs resulting in commit > object corruption when "rebase -i --root" swaps in a new commit as root. > Unfortunately, those bugs made it into v2.18.0 and have already > corrupted at least one repository (a local project of mine). Patches 3/4 > and 4/4 are new. > > v1 fixed these bugs: > > * trailing garbage on the commit's "author" header > > * extra trailing digit on "author" header's timezone (caused by two > separate bugs) > > v2 fixes those same bugs, plus: > > * eliminates a bogus "@" prepended to the "author" header timestamp > which renders the header corrupt > > * takes care to validate information coming from > "rebase-merge/author-script" before incorporating it into the "author" > header since that file may be hand-edited, and bogus hand-edited > values could corrupt the commit object. I looked at this series and it seems sane and logical to me. Thanks for acting quickly to fix the corruption. Reviewed-by: brian m. carlson -- brian m. carlson: Houston, Texas, US OpenPGP: https://keybase.io/bk2204 signature.asc Description: PGP signature
Re: [PATCH v2 0/4] fix "rebase -i --root" corrupting root commit
On Tue, Jul 31, 2018 at 9:30 PM Hilco Wijbenga wrote: > On Tue, Jul 31, 2018 at 12:33 AM, Eric Sunshine > wrote: > > This is a re-roll of [1] which fixes sequencer bugs resulting in commit > > object corruption when "rebase -i --root" swaps in a new commit as root. > > Unfortunately, those bugs made it into v2.18.0 and have already > > corrupted at least one repository (a local project of mine). Patches 3/4 > > and 4/4 are new. > > Does this also fix losing the initial commit if it is empty? > > git init ; git commit -m 'Initial commit' --allow-empty ; touch > file.txt ; git add file.txt ; git commit -m 'Add file.txt' ; git > rebase --root > > I would expect there to be 2 commits but the first one has > disappeared. (This usually happens with "git rebase -i --root" early > on in a new project.) This series should have no impact on that issue; it is fixing root commit object corruption, and does not touch or change how the commit graph is maintained or how the rebasing machinery itself works (and the --allow-empty stuff is part of that machinery). As Dscho is cc:'d already, perhaps he can chime in as a resident expert. By the way, what happens if you use --keep-empty while rebasing?
Re: [PATCH v2 0/4] fix "rebase -i --root" corrupting root commit
Hi Eric, On Tue, Jul 31, 2018 at 12:33 AM, Eric Sunshine wrote: > This is a re-roll of [1] which fixes sequencer bugs resulting in commit > object corruption when "rebase -i --root" swaps in a new commit as root. > Unfortunately, those bugs made it into v2.18.0 and have already > corrupted at least one repository (a local project of mine). Patches 3/4 > and 4/4 are new. > > v1 fixed these bugs: > > * trailing garbage on the commit's "author" header > > * extra trailing digit on "author" header's timezone (caused by two > separate bugs) > > v2 fixes those same bugs, plus: > > * eliminates a bogus "@" prepended to the "author" header timestamp > which renders the header corrupt > > * takes care to validate information coming from > "rebase-merge/author-script" before incorporating it into the "author" > header since that file may be hand-edited, and bogus hand-edited > values could corrupt the commit object. > Does this also fix losing the initial commit if it is empty? Given git init ; git commit -m 'Initial commit' --allow-empty ; touch file.txt ; git add file.txt ; git commit -m 'Add file.txt' ; git rebase --root I would expect there to be 2 commits but the first one has disappeared. (This usually happens with "git rebase -i --root" early on in a new project.) Cheers, Hilco
Re: [PATCH v2 0/4] fix "rebase -i --root" corrupting root commit
On Tue, Jul 31, 2018 at 6:46 AM Eric Sunshine wrote: > Anyhow, thanks for reading over the series. I appreciate it even if > our "sense of priority" doesn't always align (as evidenced by your > review comments and my responses). To be clear, the changes you suggest all make sense, and would be welcome (especially the bug fixes), but I consider them lower priority than the fixes in this series, and here's why: The commit object corruption caused by the bugs fixed by this series are unavoidable. Anyone using "rebase -i --root" to swap in a new commit as root is going to end up with a corrupt repository no matter what. There's no way to side step it. And, most people won't know how to fix the corruption, assuming they even notice it. If I understand correctly, the issues you describe are unlikely to come up in practice. The only way they can arise is if someone hand edits the script (something only power users will do) _and_ botches the edit in the process, or if the person's name contains an apostrophe (possible, though perhaps uncommon?). Also, (if again I understand correctly) they are only data "corruptions", not genuine broken-repository corruptions, thus are more likely to be fixable by a typical user. So it's not that I think your proposed fixes and suggestions are unimportant, I just don't think they belong in this series, and would be happy to see them atop it. Thanks.
Re: [PATCH v2 0/4] fix "rebase -i --root" corrupting root commit
Hi Eric On 31/07/18 11:46, Eric Sunshine wrote: > On Tue, Jul 31, 2018 at 6:06 AM Phillip Wood > wrote: >> On 31/07/18 08:33, Eric Sunshine wrote: >>> Patch 2/4 of this series conflicts with Akinori MUSHA's >>> 'am/sequencer-author-script-fix' which takes a stab at fixing one of the >>> four (or so) bugs fixed by this series (namely, adding a missing closing >>> quote to GIT_AUTHOR_DATE in "rebase-merge/author-script"). That patch >>> probably ought to be dropped (without prejudice) in favor of this series >>> for the following reasons: >>> [...] >>> The test added by Akinori MUSHA's patch may still have value, and it may >>> make sense to re-submit it, however, doing so need not hold up this >>> (higher priority) series. >> >> Yes I think it does, also the patch that Johannes and I have on top of >> it to fix the quoting of "'" in write_author_script() relies on fixing >> the missing trailing quote and handling of "'" at the same time, >> hopefully we can get that rebased on top of these asap. > > I'm not sure if "Yes I think it does" means that Akinori's test has > value or if it means that this series should be held back waiting for > other changes built atop it. It means we should use a test based on that with the quoting fixes on top of this series and progress them together if possible. I think the quoting patch I just sent is good now so hopefully there wont be any issue with it holding up your fixes. > Anyhow, thanks for reading over the series. I appreciate it even if > our "sense of priority" doesn't always align (as evidenced by your > review comments and my responses). My "sense of priority" was informed by the hope that the quoting patch wouldn't hold things up (let's hope I was right about that!). Best Wishes Phillip
Re: [PATCH v2 0/4] fix "rebase -i --root" corrupting root commit
On Tue, Jul 31, 2018 at 6:06 AM Phillip Wood wrote: > On 31/07/18 08:33, Eric Sunshine wrote: > > Patch 2/4 of this series conflicts with Akinori MUSHA's > > 'am/sequencer-author-script-fix' which takes a stab at fixing one of the > > four (or so) bugs fixed by this series (namely, adding a missing closing > > quote to GIT_AUTHOR_DATE in "rebase-merge/author-script"). That patch > > probably ought to be dropped (without prejudice) in favor of this series > > for the following reasons: > > [...] > > The test added by Akinori MUSHA's patch may still have value, and it may > > make sense to re-submit it, however, doing so need not hold up this > > (higher priority) series. > > Yes I think it does, also the patch that Johannes and I have on top of > it to fix the quoting of "'" in write_author_script() relies on fixing > the missing trailing quote and handling of "'" at the same time, > hopefully we can get that rebased on top of these asap. I'm not sure if "Yes I think it does" means that Akinori's test has value or if it means that this series should be held back waiting for other changes built atop it. Anyhow, thanks for reading over the series. I appreciate it even if our "sense of priority" doesn't always align (as evidenced by your review comments and my responses).
Re: [PATCH v2 0/4] fix "rebase -i --root" corrupting root commit
Hi Eric On 31/07/18 08:33, Eric Sunshine wrote: This is a re-roll of [1] which fixes sequencer bugs resulting in commit object corruption when "rebase -i --root" swaps in a new commit as root. Unfortunately, those bugs made it into v2.18.0 and have already corrupted at least one repository (a local project of mine). Patches 3/4 and 4/4 are new. v1 fixed these bugs: * trailing garbage on the commit's "author" header * extra trailing digit on "author" header's timezone (caused by two separate bugs) v2 fixes those same bugs, plus: * eliminates a bogus "@" prepended to the "author" header timestamp which renders the header corrupt * takes care to validate information coming from "rebase-merge/author-script" before incorporating it into the "author" header since that file may be hand-edited, and bogus hand-edited values could corrupt the commit object. Patch 2/4 of this series conflicts with Akinori MUSHA's 'am/sequencer-author-script-fix' which takes a stab at fixing one of the four (or so) bugs fixed by this series (namely, adding a missing closing quote to GIT_AUTHOR_DATE in "rebase-merge/author-script"). That patch probably ought to be dropped (without prejudice) in favor of this series for the following reasons: * It has the undesirable property of adding an unwanted extra blank line to "rebase-merge/author-script"; this series doesn't make that mistake. * The fix in this series (patch 2/4) is more complete, also ensuring that the return value of sq_dequote() is checked. * And, most importantly, this series sells the change as a fix for a genuine serious bug (commit object corruption), whereas that patch talks only about adjusting the file content to make it parseable by the shell. It's important for future readers of this change to understand that the bug (missing closing quote) had much more dire consequences than merely being syntactically invalid as a shell script. The test added by Akinori MUSHA's patch may still have value, and it may make sense to re-submit it, however, doing so need not hold up this (higher priority) series. Yes I think it does, also the patch that Johannes and I have on top of it to fix the quoting of "'" in write_author_script() relies on fixing the missing trailing quote and handling of "'" at the same time, hopefully we can get that rebased on top of these asap. Best Wishes Phillip Phillip's proposed[2] unification of parsers and other fixes and cleanups can easily be built atop this series and, likewise, need not prevent these (higher priority) patches from graduating independently. [1]: https://public-inbox.org/git/20180730092929.71114-1-sunsh...@sunshineco.com/ [2]: https://public-inbox.org/git/1f172fc1-4b51-cdf7-e841-5ca41e209...@talktalk.net/ Eric Sunshine (4): sequencer: fix "rebase -i --root" corrupting author header sequencer: fix "rebase -i --root" corrupting author header timezone sequencer: fix "rebase -i --root" corrupting author header timestamp sequencer: don't die() on bogus user-edited timestamp sequencer.c | 39 +-- t/t3404-rebase-interactive.sh | 10 - 2 files changed, 33 insertions(+), 16 deletions(-) Range-diff against v1: 1: ba10ae4e5d ! 1: 8c47555bcf sequencer: fix "rebase -i --root" corrupting author header @@ -11,8 +11,8 @@ This is a result of read_author_ident() neglecting to NUL-terminate the buffer into which it composes the "author" header. -(Note that the extra "0" in the timezone is a separate bug which will be -fixed subsequently.) +(Note that the "@" preceding the timestamp and the extra "0" in the +timezone are separate bugs which will be fixed subsequently.) Security considerations: Construction of the "author" header by read_author_ident() happens in-place and in parallel with parsing the @@ -26,6 +26,7 @@ inserted as part of the parsing process. Signed-off-by: Eric Sunshine +Acked-by: Johannes Schindelin diff --git a/sequencer.c b/sequencer.c --- a/sequencer.c @@ -61,7 +62,7 @@ + set_fake_editor && + FAKE_LINES="2 1" git rebase -i --root && + git cat-file commit HEAD^ >out && -+ grep "^author ..* @[0-9][0-9]* [-+][0-9][0-9]*$" out ++ grep "^author ..*> @[0-9][0-9]* [-+][0-9][0-9]*$" out +' + test_done 2: c0400cda85 ! 2: 1d4064147e sequencer: fix "rebase -i --root" corrupting author header timezone @@ -22,6 +22,9 @@ a NUL-terminator at the end of the shifted content, which explains the duplicated last digit in the timezone. +(Note that the "@" preceding the timestamp is a separate bug which +will be fixed subsequently.) + Signed-off-by: Eric Sunshine diff --git a/sequencer.c b/sequencer.c @@ -56,8
[PATCH v2 0/4] fix "rebase -i --root" corrupting root commit
This is a re-roll of [1] which fixes sequencer bugs resulting in commit object corruption when "rebase -i --root" swaps in a new commit as root. Unfortunately, those bugs made it into v2.18.0 and have already corrupted at least one repository (a local project of mine). Patches 3/4 and 4/4 are new. v1 fixed these bugs: * trailing garbage on the commit's "author" header * extra trailing digit on "author" header's timezone (caused by two separate bugs) v2 fixes those same bugs, plus: * eliminates a bogus "@" prepended to the "author" header timestamp which renders the header corrupt * takes care to validate information coming from "rebase-merge/author-script" before incorporating it into the "author" header since that file may be hand-edited, and bogus hand-edited values could corrupt the commit object. Patch 2/4 of this series conflicts with Akinori MUSHA's 'am/sequencer-author-script-fix' which takes a stab at fixing one of the four (or so) bugs fixed by this series (namely, adding a missing closing quote to GIT_AUTHOR_DATE in "rebase-merge/author-script"). That patch probably ought to be dropped (without prejudice) in favor of this series for the following reasons: * It has the undesirable property of adding an unwanted extra blank line to "rebase-merge/author-script"; this series doesn't make that mistake. * The fix in this series (patch 2/4) is more complete, also ensuring that the return value of sq_dequote() is checked. * And, most importantly, this series sells the change as a fix for a genuine serious bug (commit object corruption), whereas that patch talks only about adjusting the file content to make it parseable by the shell. It's important for future readers of this change to understand that the bug (missing closing quote) had much more dire consequences than merely being syntactically invalid as a shell script. The test added by Akinori MUSHA's patch may still have value, and it may make sense to re-submit it, however, doing so need not hold up this (higher priority) series. Phillip's proposed[2] unification of parsers and other fixes and cleanups can easily be built atop this series and, likewise, need not prevent these (higher priority) patches from graduating independently. [1]: https://public-inbox.org/git/20180730092929.71114-1-sunsh...@sunshineco.com/ [2]: https://public-inbox.org/git/1f172fc1-4b51-cdf7-e841-5ca41e209...@talktalk.net/ Eric Sunshine (4): sequencer: fix "rebase -i --root" corrupting author header sequencer: fix "rebase -i --root" corrupting author header timezone sequencer: fix "rebase -i --root" corrupting author header timestamp sequencer: don't die() on bogus user-edited timestamp sequencer.c | 39 +-- t/t3404-rebase-interactive.sh | 10 - 2 files changed, 33 insertions(+), 16 deletions(-) Range-diff against v1: 1: ba10ae4e5d ! 1: 8c47555bcf sequencer: fix "rebase -i --root" corrupting author header @@ -11,8 +11,8 @@ This is a result of read_author_ident() neglecting to NUL-terminate the buffer into which it composes the "author" header. -(Note that the extra "0" in the timezone is a separate bug which will be -fixed subsequently.) +(Note that the "@" preceding the timestamp and the extra "0" in the +timezone are separate bugs which will be fixed subsequently.) Security considerations: Construction of the "author" header by read_author_ident() happens in-place and in parallel with parsing the @@ -26,6 +26,7 @@ inserted as part of the parsing process. Signed-off-by: Eric Sunshine +Acked-by: Johannes Schindelin diff --git a/sequencer.c b/sequencer.c --- a/sequencer.c @@ -61,7 +62,7 @@ + set_fake_editor && + FAKE_LINES="2 1" git rebase -i --root && + git cat-file commit HEAD^ >out && -+ grep "^author ..* @[0-9][0-9]* [-+][0-9][0-9]*$" out ++ grep "^author ..*> @[0-9][0-9]* [-+][0-9][0-9]*$" out +' + test_done 2: c0400cda85 ! 2: 1d4064147e sequencer: fix "rebase -i --root" corrupting author header timezone @@ -22,6 +22,9 @@ a NUL-terminator at the end of the shifted content, which explains the duplicated last digit in the timezone. +(Note that the "@" preceding the timestamp is a separate bug which +will be fixed subsequently.) + Signed-off-by: Eric Sunshine diff --git a/sequencer.c b/sequencer.c @@ -56,8 +59,8 @@ set_fake_editor && FAKE_LINES="2 1" git rebase -i --root && git cat-file commit HEAD^ >out && -- grep "^author ..* @[0-9][0-9]* [-+][0-9][0-9]*$" out -+ grep "^author ..* @[0-9][0-9]* [-+][0-9][0-9][0-9][0-9]$" out +- grep "^author ..*> @[0-9][0-9]* [-+][0-9][0-9]*$" out ++ grep "^author ..*> @[0-9][0-9]* [-+][0-9][0-9][0-9][0-9]$" out '