Re: [RFC PATCH] ident: don't cache default date
On 20/04/18 09:11, Johannes Schindelin wrote: > Hi Phillip, > > On Thu, 19 Apr 2018, Phillip Wood wrote: > >> On 18/04/18 19:15, Johannes Sixt wrote: >>> Am 18.04.2018 um 19:47 schrieb Phillip Wood: On 18/04/18 12:27, Ævar Arnfjörð Bjarmason wrote: > On Wed, Apr 18 2018, Phillip Wood wrote: >> From: Phillip Wood>> as it is created by running an separate instance of 'git commit'. If >> the reworded commit is follow by further picks, those later commits >> will have an earlier committer date than the reworded one. This is >> caused by git caching the default date used when GIT_COMMITTER_DATE is >> not set. Fix this by not caching the date. >> >> Users expect commits to have the same author and committer dates when >> the don't explicitly set them. As the date is now updated each time >> git_author_info() or git_committer_info() is run it is possible to end >> up with different author and committer dates. Fix this for >> 'commit-tree', 'notes' and 'merge' by using a single date in >> commit_tree_extended() and passing it explicitly to the new functions >> git_author_info_with_date() and git_committer_info_with_date() when >> neither the author date nor the committer date are explicitly >> set. 'commit' always passes the author date to commit_tree_extended() >> and relied on the date caching to have the same committer and author >> dates when neither was specified. Fix this by setting >> GIT_COMMITTER_DATE to be the same as the author date passed to >> commit_tree_extended(). >> >> Signed-off-by: Phillip Wood >> Reported-by: Johannes Sixt >> --- >> >> I'm slightly nervous that setting GIT_COMMITTER_DATE in >> builtin/commit.c will break someone's hook script. Maybe it would be >> better to add a committer parameter to commit_tree() and >> commit_tree_extended(). >>> >>> While I like the basic theme of your patch, I think we should fix this >>> case in a much simpler way, namely, use the infrastructure that was >>> introduced for git-am. >>> >>> I've shamelessly lifted the commit message from your patch. >> >> Thanks, that is a better way (I'm annoyed with myself for not having >> noticed reset_ident_date() when I edited the function above it) > > Don't be too annoyed. I did remember that The Linus had complained about > something similar and assumed that it had been fixed in the meantime, but > I failed to find it within 30 minutes where I tried to dig through > public-inbox and pu. Thanks, that makes we feel better about it (thanks for taking the time to try and find the original mail as well) Phillip > Thanks Hannes for remembering, and for coming up with the final form of > the patch! > > Ciao, > Dscho >
Re: [RFC PATCH] ident: don't cache default date
Hi Phillip, On Thu, 19 Apr 2018, Phillip Wood wrote: > On 18/04/18 19:15, Johannes Sixt wrote: > > Am 18.04.2018 um 19:47 schrieb Phillip Wood: > >> On 18/04/18 12:27, Ævar Arnfjörð Bjarmason wrote: > >>> On Wed, Apr 18 2018, Phillip Wood wrote: > From: Phillip Wood> as it is created by running an separate instance of 'git commit'. If > the reworded commit is follow by further picks, those later commits > will have an earlier committer date than the reworded one. This is > caused by git caching the default date used when GIT_COMMITTER_DATE is > not set. Fix this by not caching the date. > > Users expect commits to have the same author and committer dates when > the don't explicitly set them. As the date is now updated each time > git_author_info() or git_committer_info() is run it is possible to end > up with different author and committer dates. Fix this for > 'commit-tree', 'notes' and 'merge' by using a single date in > commit_tree_extended() and passing it explicitly to the new functions > git_author_info_with_date() and git_committer_info_with_date() when > neither the author date nor the committer date are explicitly > set. 'commit' always passes the author date to commit_tree_extended() > and relied on the date caching to have the same committer and author > dates when neither was specified. Fix this by setting > GIT_COMMITTER_DATE to be the same as the author date passed to > commit_tree_extended(). > > Signed-off-by: Phillip Wood > Reported-by: Johannes Sixt > --- > > I'm slightly nervous that setting GIT_COMMITTER_DATE in > builtin/commit.c will break someone's hook script. Maybe it would be > better to add a committer parameter to commit_tree() and > commit_tree_extended(). > > > > While I like the basic theme of your patch, I think we should fix this > > case in a much simpler way, namely, use the infrastructure that was > > introduced for git-am. > > > > I've shamelessly lifted the commit message from your patch. > > Thanks, that is a better way (I'm annoyed with myself for not having > noticed reset_ident_date() when I edited the function above it) Don't be too annoyed. I did remember that The Linus had complained about something similar and assumed that it had been fixed in the meantime, but I failed to find it within 30 minutes where I tried to dig through public-inbox and pu. Thanks Hannes for remembering, and for coming up with the final form of the patch! Ciao, Dscho
Re: [RFC PATCH] ident: don't cache default date
On 18/04/18 19:15, Johannes Sixt wrote: > Am 18.04.2018 um 19:47 schrieb Phillip Wood: >> On 18/04/18 12:27, Ævar Arnfjörð Bjarmason wrote: >>> On Wed, Apr 18 2018, Phillip Wood wrote: From: Phillip Woodas it is created by running an separate instance of 'git commit'. If the reworded commit is follow by further picks, those later commits will have an earlier committer date than the reworded one. This is caused by git caching the default date used when GIT_COMMITTER_DATE is not set. Fix this by not caching the date. Users expect commits to have the same author and committer dates when the don't explicitly set them. As the date is now updated each time git_author_info() or git_committer_info() is run it is possible to end up with different author and committer dates. Fix this for 'commit-tree', 'notes' and 'merge' by using a single date in commit_tree_extended() and passing it explicitly to the new functions git_author_info_with_date() and git_committer_info_with_date() when neither the author date nor the committer date are explicitly set. 'commit' always passes the author date to commit_tree_extended() and relied on the date caching to have the same committer and author dates when neither was specified. Fix this by setting GIT_COMMITTER_DATE to be the same as the author date passed to commit_tree_extended(). Signed-off-by: Phillip Wood Reported-by: Johannes Sixt --- I'm slightly nervous that setting GIT_COMMITTER_DATE in builtin/commit.c will break someone's hook script. Maybe it would be better to add a committer parameter to commit_tree() and commit_tree_extended(). > > While I like the basic theme of your patch, I think we should fix this > case in a much simpler way, namely, use the infrastructure that was > introduced for git-am. > > I've shamelessly lifted the commit message from your patch. Thanks, that is a better way (I'm annoyed with myself for not having noticed reset_ident_date() when I edited the function above it) Best Wishes Phillip > 8< > Subject: [PATCH] sequencer: reset the committer date before commits > > Now that the sequencer commits without forking when the commit message > isn't edited all the commits that are picked have the same committer > date. If a commit is reworded it's committer date will be a later time > as it is created by running an separate instance of 'git commit'. If > the reworded commit is follow by further picks, those later commits > will have an earlier committer date than the reworded one. This is > caused by git caching the default date used when GIT_COMMITTER_DATE is > not set. Reset the cached date before a commit is generated > in-process. > > Signed-off-by: Johannes Sixt > --- > sequencer.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/sequencer.c b/sequencer.c > index f9d1001dee..f0bac903a0 100644 > --- a/sequencer.c > +++ b/sequencer.c > @@ -1148,6 +1148,8 @@ static int try_to_commit(struct strbuf *msg, const char > *author, > goto out; > } > > + reset_ident_date(); > + > if (commit_tree_extended(msg->buf, msg->len, , parents, >oid, author, opts->gpg_sign, extra)) { > res = error(_("failed to write commit object")); >
Re: [RFC PATCH] ident: don't cache default date
Johannes Sixtwrites: > While I like the basic theme of your patch, I think we should fix this > case in a much simpler way, namely, use the infrastructure that was > introduced for git-am. Yup. reset_ident_date() was introduced by 4d9c7e6f ("am: reset cached ident date for each patch", 2016-08-01) and the commit explains very well why it is a good idea to have both the caching and also the strategic resetting it introduces. Thanks, all. > I've shamelessly lifted the commit message from your patch. > > 8< > Subject: [PATCH] sequencer: reset the committer date before commits > > Now that the sequencer commits without forking when the commit message > isn't edited all the commits that are picked have the same committer > date. If a commit is reworded it's committer date will be a later time > as it is created by running an separate instance of 'git commit'. If > the reworded commit is follow by further picks, those later commits > will have an earlier committer date than the reworded one. This is > caused by git caching the default date used when GIT_COMMITTER_DATE is > not set. Reset the cached date before a commit is generated > in-process. > > Signed-off-by: Johannes Sixt > --- > sequencer.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/sequencer.c b/sequencer.c > index f9d1001dee..f0bac903a0 100644 > --- a/sequencer.c > +++ b/sequencer.c > @@ -1148,6 +1148,8 @@ static int try_to_commit(struct strbuf *msg, const char > *author, > goto out; > } > > + reset_ident_date(); > + > if (commit_tree_extended(msg->buf, msg->len, , parents, >oid, author, opts->gpg_sign, extra)) { > res = error(_("failed to write commit object"));
Re: [RFC PATCH] ident: don't cache default date
Am 18.04.2018 um 19:47 schrieb Phillip Wood: > On 18/04/18 12:27, Ævar Arnfjörð Bjarmason wrote: >> On Wed, Apr 18 2018, Phillip Wood wrote: >>> From: Phillip Wood>>> as it is created by running an separate instance of 'git commit'. If >>> the reworded commit is follow by further picks, those later commits >>> will have an earlier committer date than the reworded one. This is >>> caused by git caching the default date used when GIT_COMMITTER_DATE is >>> not set. Fix this by not caching the date. >>> >>> Users expect commits to have the same author and committer dates when >>> the don't explicitly set them. As the date is now updated each time >>> git_author_info() or git_committer_info() is run it is possible to end >>> up with different author and committer dates. Fix this for >>> 'commit-tree', 'notes' and 'merge' by using a single date in >>> commit_tree_extended() and passing it explicitly to the new functions >>> git_author_info_with_date() and git_committer_info_with_date() when >>> neither the author date nor the committer date are explicitly >>> set. 'commit' always passes the author date to commit_tree_extended() >>> and relied on the date caching to have the same committer and author >>> dates when neither was specified. Fix this by setting >>> GIT_COMMITTER_DATE to be the same as the author date passed to >>> commit_tree_extended(). >>> >>> Signed-off-by: Phillip Wood >>> Reported-by: Johannes Sixt >>> --- >>> >>> I'm slightly nervous that setting GIT_COMMITTER_DATE in >>> builtin/commit.c will break someone's hook script. Maybe it would be >>> better to add a committer parameter to commit_tree() and >>> commit_tree_extended(). While I like the basic theme of your patch, I think we should fix this case in a much simpler way, namely, use the infrastructure that was introduced for git-am. I've shamelessly lifted the commit message from your patch. 8< Subject: [PATCH] sequencer: reset the committer date before commits Now that the sequencer commits without forking when the commit message isn't edited all the commits that are picked have the same committer date. If a commit is reworded it's committer date will be a later time as it is created by running an separate instance of 'git commit'. If the reworded commit is follow by further picks, those later commits will have an earlier committer date than the reworded one. This is caused by git caching the default date used when GIT_COMMITTER_DATE is not set. Reset the cached date before a commit is generated in-process. Signed-off-by: Johannes Sixt --- sequencer.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/sequencer.c b/sequencer.c index f9d1001dee..f0bac903a0 100644 --- a/sequencer.c +++ b/sequencer.c @@ -1148,6 +1148,8 @@ static int try_to_commit(struct strbuf *msg, const char *author, goto out; } + reset_ident_date(); + if (commit_tree_extended(msg->buf, msg->len, , parents, oid, author, opts->gpg_sign, extra)) { res = error(_("failed to write commit object")); -- 2.17.0.69.g0c1d01d9b6
Re: [RFC PATCH] ident: don't cache default date
Hi Ævar, thanks for your comments On 18/04/18 12:27, Ævar Arnfjörð Bjarmason wrote: > > On Wed, Apr 18 2018, Phillip Wood wrote: > >> From: Phillip Wood>> >> Now that the sequencer commits without forking when the commit message >> isn't edited all the commits that are picked have the same committer >> date. If a commit is reworded it's committer date will be a later time > > s/it's/its/ Thanks I'll change it >> as it is created by running an separate instance of 'git commit'. If >> the reworded commit is follow by further picks, those later commits >> will have an earlier committer date than the reworded one. This is >> caused by git caching the default date used when GIT_COMMITTER_DATE is >> not set. Fix this by not caching the date. >> >> Users expect commits to have the same author and committer dates when >> the don't explicitly set them. As the date is now updated each time >> git_author_info() or git_committer_info() is run it is possible to end >> up with different author and committer dates. Fix this for >> 'commit-tree', 'notes' and 'merge' by using a single date in >> commit_tree_extended() and passing it explicitly to the new functions >> git_author_info_with_date() and git_committer_info_with_date() when >> neither the author date nor the committer date are explicitly >> set. 'commit' always passes the author date to commit_tree_extended() >> and relied on the date caching to have the same committer and author >> dates when neither was specified. Fix this by setting >> GIT_COMMITTER_DATE to be the same as the author date passed to >> commit_tree_extended(). >> >> Signed-off-by: Phillip Wood >> Reported-by: Johannes Sixt >> --- >> >> I'm slightly nervous that setting GIT_COMMITTER_DATE in >> builtin/commit.c will break someone's hook script. Maybe it would be >> better to add a committer parameter to commit_tree() and >> commit_tree_extended(). >> >> This needs some tests. I think we could test that the sequencer gives >> each commit a new date with 'git rebase -i --exec="sleep 2" >> --force-rebase @~2' and test the committer dates of the rebased >> commits. I'm not sure if test -gt can cope with numbers that big >> though, maybe we'll have to be content with test !=. For 'git commit' >> I think doing GIT_EDITOR='sleep 2; echo message >"$1"' and checking >> the committer date and author date will work as the author date is set >> before the user edits the commit message. I'm not sure how to test >> callers of commit_tree() though (that's commit-tree, notes and merge) >> as I've not been able to come up with anything that will guarantee the >> author and committer dates are different if the code in >> commit_tree_extended() that ensures they are the same gets broken. > > The behavior you're describing where we end up with later commits in the > rebase with earlier CommitDate's definitely sounds like a minor > regression, and yeah we should have tests for this. > > My mental model of this is that we shouldn't be trying at all to adjust > the CommitDate in a sequence to be the same, and just make it be > whatever time() returns, except in the case where a date is passed > explicitly. > > My cursory reading of this RFC patch is that it's definitely an > improvement because we don't end up with things out of order, but is > there any reason for why we should be trying to aim to keep this > "consistent" within a single "git rebase" command by default, as opposed > to always just writing the current CommitDate at the time we make the > commit, that sounds like the most intuitive thing to me, and I can't see > any downsides with that. It's not trying to keep the date "consistent" within a single rebase command, each commit created by the sequencer gets a date stamp with the current time when the commit is created. What it is doing is keeping the author and committer dates the same for commit/commit-tree/merge/notes when the user does not specify either of them. While I don't think it particularly matters that they match (so long as the committer date is later than the author date), it is what git does at the moment and someone maybe relying on the current behavior. Best Wishes Phillip > > It leaks info about how long it took someone to do the rebase, but so > what? >
Re: [RFC PATCH] ident: don't cache default date
On Wed, Apr 18 2018, Phillip Wood wrote: > From: Phillip Wood> > Now that the sequencer commits without forking when the commit message > isn't edited all the commits that are picked have the same committer > date. If a commit is reworded it's committer date will be a later time s/it's/its/ > as it is created by running an separate instance of 'git commit'. If > the reworded commit is follow by further picks, those later commits > will have an earlier committer date than the reworded one. This is > caused by git caching the default date used when GIT_COMMITTER_DATE is > not set. Fix this by not caching the date. > > Users expect commits to have the same author and committer dates when > the don't explicitly set them. As the date is now updated each time > git_author_info() or git_committer_info() is run it is possible to end > up with different author and committer dates. Fix this for > 'commit-tree', 'notes' and 'merge' by using a single date in > commit_tree_extended() and passing it explicitly to the new functions > git_author_info_with_date() and git_committer_info_with_date() when > neither the author date nor the committer date are explicitly > set. 'commit' always passes the author date to commit_tree_extended() > and relied on the date caching to have the same committer and author > dates when neither was specified. Fix this by setting > GIT_COMMITTER_DATE to be the same as the author date passed to > commit_tree_extended(). > > Signed-off-by: Phillip Wood > Reported-by: Johannes Sixt > --- > > I'm slightly nervous that setting GIT_COMMITTER_DATE in > builtin/commit.c will break someone's hook script. Maybe it would be > better to add a committer parameter to commit_tree() and > commit_tree_extended(). > > This needs some tests. I think we could test that the sequencer gives > each commit a new date with 'git rebase -i --exec="sleep 2" > --force-rebase @~2' and test the committer dates of the rebased > commits. I'm not sure if test -gt can cope with numbers that big > though, maybe we'll have to be content with test !=. For 'git commit' > I think doing GIT_EDITOR='sleep 2; echo message >"$1"' and checking > the committer date and author date will work as the author date is set > before the user edits the commit message. I'm not sure how to test > callers of commit_tree() though (that's commit-tree, notes and merge) > as I've not been able to come up with anything that will guarantee the > author and committer dates are different if the code in > commit_tree_extended() that ensures they are the same gets broken. The behavior you're describing where we end up with later commits in the rebase with earlier CommitDate's definitely sounds like a minor regression, and yeah we should have tests for this. My mental model of this is that we shouldn't be trying at all to adjust the CommitDate in a sequence to be the same, and just make it be whatever time() returns, except in the case where a date is passed explicitly. My cursory reading of this RFC patch is that it's definitely an improvement because we don't end up with things out of order, but is there any reason for why we should be trying to aim to keep this "consistent" within a single "git rebase" command by default, as opposed to always just writing the current CommitDate at the time we make the commit, that sounds like the most intuitive thing to me, and I can't see any downsides with that. It leaks info about how long it took someone to do the rebase, but so what?
[RFC PATCH] ident: don't cache default date
From: Phillip WoodNow that the sequencer commits without forking when the commit message isn't edited all the commits that are picked have the same committer date. If a commit is reworded it's committer date will be a later time as it is created by running an separate instance of 'git commit'. If the reworded commit is follow by further picks, those later commits will have an earlier committer date than the reworded one. This is caused by git caching the default date used when GIT_COMMITTER_DATE is not set. Fix this by not caching the date. Users expect commits to have the same author and committer dates when the don't explicitly set them. As the date is now updated each time git_author_info() or git_committer_info() is run it is possible to end up with different author and committer dates. Fix this for 'commit-tree', 'notes' and 'merge' by using a single date in commit_tree_extended() and passing it explicitly to the new functions git_author_info_with_date() and git_committer_info_with_date() when neither the author date nor the committer date are explicitly set. 'commit' always passes the author date to commit_tree_extended() and relied on the date caching to have the same committer and author dates when neither was specified. Fix this by setting GIT_COMMITTER_DATE to be the same as the author date passed to commit_tree_extended(). Signed-off-by: Phillip Wood Reported-by: Johannes Sixt --- I'm slightly nervous that setting GIT_COMMITTER_DATE in builtin/commit.c will break someone's hook script. Maybe it would be better to add a committer parameter to commit_tree() and commit_tree_extended(). This needs some tests. I think we could test that the sequencer gives each commit a new date with 'git rebase -i --exec="sleep 2" --force-rebase @~2' and test the committer dates of the rebased commits. I'm not sure if test -gt can cope with numbers that big though, maybe we'll have to be content with test !=. For 'git commit' I think doing GIT_EDITOR='sleep 2; echo message >"$1"' and checking the committer date and author date will work as the author date is set before the user edits the commit message. I'm not sure how to test callers of commit_tree() though (that's commit-tree, notes and merge) as I've not been able to come up with anything that will guarantee the author and committer dates are different if the code in commit_tree_extended() that ensures they are the same gets broken. builtin/commit.c | 8 cache.h | 2 ++ commit.c | 22 +++--- ident.c | 24 ++-- 4 files changed, 47 insertions(+), 9 deletions(-) diff --git a/builtin/commit.c b/builtin/commit.c index 37fcb55ab0..4ad8317f88 100644 --- a/builtin/commit.c +++ b/builtin/commit.c @@ -584,6 +584,14 @@ static void determine_author_info(struct strbuf *author_ident) export_one("GIT_AUTHOR_NAME", author.name_begin, author.name_end, 0); export_one("GIT_AUTHOR_EMAIL", author.mail_begin, author.mail_end, 0); export_one("GIT_AUTHOR_DATE", author.date_begin, author.tz_end, '@'); + /* +* Ensure the author and committer dates are identical if nither is +* explicitly set +*/ + if ((!date || !*date) && (!getenv("GIT_COMMITTER_DATE") + || !*getenv("GIT_COMMITTER_DATE"))) + export_one("GIT_COMMITTER_DATE", author.date_begin, + author.tz_end, '@'); free(name); free(email); free(date); diff --git a/cache.h b/cache.h index a61b2d3f0d..9cde499507 100644 --- a/cache.h +++ b/cache.h @@ -1484,7 +1484,9 @@ int date_overflows(timestamp_t date); #define IDENT_NO_DATE 2 #define IDENT_NO_NAME 4 extern const char *git_author_info(int); +extern const char *git_author_info_with_date(int, const char*); extern const char *git_committer_info(int); +extern const char *git_committer_info_with_date(int, const char*); extern const char *fmt_ident(const char *name, const char *email, const char *date_str, int); extern const char *fmt_name(const char *name, const char *email); extern const char *ident_default_name(void); diff --git a/commit.c b/commit.c index 00c99c7272..457cc8b71b 100644 --- a/commit.c +++ b/commit.c @@ -1513,6 +1513,7 @@ int commit_tree_extended(const char *msg, size_t msg_len, const char *author, const char *sign_commit, struct commit_extra_header *extra) { + struct strbuf date_buf = STRBUF_INIT; int result; int encoding_is_utf8; struct strbuf buffer; @@ -1540,10 +1541,25 @@ int commit_tree_extended(const char *msg, size_t msg_len, } /* Person/date information */ - if (!author) - author = git_author_info(IDENT_STRICT); + if (!author) { + char *author_date = xstrdup_or_null(getenv("GIT_AUTHOR_DATE")); +