Re: [RFC PATCH] ident: don't cache default date

2018-04-20 Thread Phillip Wood
On 20/04/18 09:11, Johannes Schindelin wrote:
> Hi Phillip,
> 
> On Thu, 19 Apr 2018, Phillip Wood wrote:
> 
>> On 18/04/18 19:15, Johannes Sixt wrote:
>>> Am 18.04.2018 um 19:47 schrieb Phillip Wood:
 On 18/04/18 12:27, Ævar Arnfjörð Bjarmason wrote:
> On Wed, Apr 18 2018, Phillip Wood wrote:
>> From: Phillip Wood 
>> as it is created by running an separate instance of 'git commit'.  If
>> the reworded commit is follow by further picks, those later commits
>> will have an earlier committer date than the reworded one. This is
>> caused by git caching the default date used when GIT_COMMITTER_DATE is
>> not set. Fix this by not caching the date.
>>
>> Users expect commits to have the same author and committer dates when
>> the don't explicitly set them. As the date is now updated each time
>> git_author_info() or git_committer_info() is run it is possible to end
>> up with different author and committer dates. Fix this for
>> 'commit-tree', 'notes' and 'merge' by using a single date in
>> commit_tree_extended() and passing it explicitly to the new functions
>> git_author_info_with_date() and git_committer_info_with_date() when
>> neither the author date nor the committer date are explicitly
>> set. 'commit' always passes the author date to commit_tree_extended()
>> and relied on the date caching to have the same committer and author
>> dates when neither was specified. Fix this by setting
>> GIT_COMMITTER_DATE to be the same as the author date passed to
>> commit_tree_extended().
>>
>> Signed-off-by: Phillip Wood 
>> Reported-by: Johannes Sixt 
>> ---
>>
>> I'm slightly nervous that setting GIT_COMMITTER_DATE in
>> builtin/commit.c will break someone's hook script. Maybe it would be
>> better to add a committer parameter to commit_tree() and
>> commit_tree_extended().
>>>
>>> While I like the basic theme of your patch, I think we should fix this
>>> case in a much simpler way, namely, use the infrastructure that was
>>> introduced for git-am.
>>>
>>> I've shamelessly lifted the commit message from your patch.
>>
>> Thanks, that is a better way (I'm annoyed with myself for not having
>> noticed reset_ident_date() when I edited the function above it)
> 
> Don't be too annoyed. I did remember that The Linus had complained about
> something similar and assumed that it had been fixed in the meantime, but
> I failed to find it within 30 minutes where I tried to dig through
> public-inbox and pu.

Thanks, that makes we feel better about it (thanks for taking the time
to try and find the original mail as well)

Phillip

> Thanks Hannes for remembering, and for coming up with the final form of
> the patch!
> 
> Ciao,
> Dscho
> 



Re: [RFC PATCH] ident: don't cache default date

2018-04-20 Thread Johannes Schindelin
Hi Phillip,

On Thu, 19 Apr 2018, Phillip Wood wrote:

> On 18/04/18 19:15, Johannes Sixt wrote:
> > Am 18.04.2018 um 19:47 schrieb Phillip Wood:
> >> On 18/04/18 12:27, Ævar Arnfjörð Bjarmason wrote:
> >>> On Wed, Apr 18 2018, Phillip Wood wrote:
>  From: Phillip Wood 
>  as it is created by running an separate instance of 'git commit'.  If
>  the reworded commit is follow by further picks, those later commits
>  will have an earlier committer date than the reworded one. This is
>  caused by git caching the default date used when GIT_COMMITTER_DATE is
>  not set. Fix this by not caching the date.
> 
>  Users expect commits to have the same author and committer dates when
>  the don't explicitly set them. As the date is now updated each time
>  git_author_info() or git_committer_info() is run it is possible to end
>  up with different author and committer dates. Fix this for
>  'commit-tree', 'notes' and 'merge' by using a single date in
>  commit_tree_extended() and passing it explicitly to the new functions
>  git_author_info_with_date() and git_committer_info_with_date() when
>  neither the author date nor the committer date are explicitly
>  set. 'commit' always passes the author date to commit_tree_extended()
>  and relied on the date caching to have the same committer and author
>  dates when neither was specified. Fix this by setting
>  GIT_COMMITTER_DATE to be the same as the author date passed to
>  commit_tree_extended().
> 
>  Signed-off-by: Phillip Wood 
>  Reported-by: Johannes Sixt 
>  ---
> 
>  I'm slightly nervous that setting GIT_COMMITTER_DATE in
>  builtin/commit.c will break someone's hook script. Maybe it would be
>  better to add a committer parameter to commit_tree() and
>  commit_tree_extended().
> > 
> > While I like the basic theme of your patch, I think we should fix this
> > case in a much simpler way, namely, use the infrastructure that was
> > introduced for git-am.
> > 
> > I've shamelessly lifted the commit message from your patch.
> 
> Thanks, that is a better way (I'm annoyed with myself for not having
> noticed reset_ident_date() when I edited the function above it)

Don't be too annoyed. I did remember that The Linus had complained about
something similar and assumed that it had been fixed in the meantime, but
I failed to find it within 30 minutes where I tried to dig through
public-inbox and pu.

Thanks Hannes for remembering, and for coming up with the final form of
the patch!

Ciao,
Dscho

Re: [RFC PATCH] ident: don't cache default date

2018-04-19 Thread Phillip Wood
On 18/04/18 19:15, Johannes Sixt wrote:
> Am 18.04.2018 um 19:47 schrieb Phillip Wood:
>> On 18/04/18 12:27, Ævar Arnfjörð Bjarmason wrote:
>>> On Wed, Apr 18 2018, Phillip Wood wrote:
 From: Phillip Wood 
 as it is created by running an separate instance of 'git commit'.  If
 the reworded commit is follow by further picks, those later commits
 will have an earlier committer date than the reworded one. This is
 caused by git caching the default date used when GIT_COMMITTER_DATE is
 not set. Fix this by not caching the date.

 Users expect commits to have the same author and committer dates when
 the don't explicitly set them. As the date is now updated each time
 git_author_info() or git_committer_info() is run it is possible to end
 up with different author and committer dates. Fix this for
 'commit-tree', 'notes' and 'merge' by using a single date in
 commit_tree_extended() and passing it explicitly to the new functions
 git_author_info_with_date() and git_committer_info_with_date() when
 neither the author date nor the committer date are explicitly
 set. 'commit' always passes the author date to commit_tree_extended()
 and relied on the date caching to have the same committer and author
 dates when neither was specified. Fix this by setting
 GIT_COMMITTER_DATE to be the same as the author date passed to
 commit_tree_extended().

 Signed-off-by: Phillip Wood 
 Reported-by: Johannes Sixt 
 ---

 I'm slightly nervous that setting GIT_COMMITTER_DATE in
 builtin/commit.c will break someone's hook script. Maybe it would be
 better to add a committer parameter to commit_tree() and
 commit_tree_extended().
> 
> While I like the basic theme of your patch, I think we should fix this
> case in a much simpler way, namely, use the infrastructure that was
> introduced for git-am.
> 
> I've shamelessly lifted the commit message from your patch.

Thanks, that is a better way (I'm annoyed with myself for not having
noticed reset_ident_date() when I edited the function above it)

Best Wishes

Phillip

>  8< 
> Subject: [PATCH] sequencer: reset the committer date before commits
> 
> Now that the sequencer commits without forking when the commit message
> isn't edited all the commits that are picked have the same committer
> date. If a commit is reworded it's committer date will be a later time
> as it is created by running an separate instance of 'git commit'.  If
> the reworded commit is follow by further picks, those later commits
> will have an earlier committer date than the reworded one. This is
> caused by git caching the default date used when GIT_COMMITTER_DATE is
> not set. Reset the cached date before a commit is generated
> in-process.
> 
> Signed-off-by: Johannes Sixt 
> ---
>  sequencer.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/sequencer.c b/sequencer.c
> index f9d1001dee..f0bac903a0 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -1148,6 +1148,8 @@ static int try_to_commit(struct strbuf *msg, const char 
> *author,
>   goto out;
>   }
>  
> + reset_ident_date();
> +
>   if (commit_tree_extended(msg->buf, msg->len, , parents,
>oid, author, opts->gpg_sign, extra)) {
>   res = error(_("failed to write commit object"));
> 



Re: [RFC PATCH] ident: don't cache default date

2018-04-18 Thread Junio C Hamano
Johannes Sixt  writes:

> While I like the basic theme of your patch, I think we should fix this
> case in a much simpler way, namely, use the infrastructure that was
> introduced for git-am.

Yup.  reset_ident_date() was introduced by 4d9c7e6f ("am: reset
cached ident date for each patch", 2016-08-01) and the commit
explains very well why it is a good idea to have both the caching
and also the strategic resetting it introduces.

Thanks, all.

> I've shamelessly lifted the commit message from your patch.
>
>  8< 
> Subject: [PATCH] sequencer: reset the committer date before commits
>
> Now that the sequencer commits without forking when the commit message
> isn't edited all the commits that are picked have the same committer
> date. If a commit is reworded it's committer date will be a later time
> as it is created by running an separate instance of 'git commit'.  If
> the reworded commit is follow by further picks, those later commits
> will have an earlier committer date than the reworded one. This is
> caused by git caching the default date used when GIT_COMMITTER_DATE is
> not set. Reset the cached date before a commit is generated
> in-process.
>
> Signed-off-by: Johannes Sixt 
> ---
>  sequencer.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/sequencer.c b/sequencer.c
> index f9d1001dee..f0bac903a0 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -1148,6 +1148,8 @@ static int try_to_commit(struct strbuf *msg, const char 
> *author,
>   goto out;
>   }
>  
> + reset_ident_date();
> +
>   if (commit_tree_extended(msg->buf, msg->len, , parents,
>oid, author, opts->gpg_sign, extra)) {
>   res = error(_("failed to write commit object"));


Re: [RFC PATCH] ident: don't cache default date

2018-04-18 Thread Johannes Sixt
Am 18.04.2018 um 19:47 schrieb Phillip Wood:
> On 18/04/18 12:27, Ævar Arnfjörð Bjarmason wrote:
>> On Wed, Apr 18 2018, Phillip Wood wrote:
>>> From: Phillip Wood 
>>> as it is created by running an separate instance of 'git commit'.  If
>>> the reworded commit is follow by further picks, those later commits
>>> will have an earlier committer date than the reworded one. This is
>>> caused by git caching the default date used when GIT_COMMITTER_DATE is
>>> not set. Fix this by not caching the date.
>>>
>>> Users expect commits to have the same author and committer dates when
>>> the don't explicitly set them. As the date is now updated each time
>>> git_author_info() or git_committer_info() is run it is possible to end
>>> up with different author and committer dates. Fix this for
>>> 'commit-tree', 'notes' and 'merge' by using a single date in
>>> commit_tree_extended() and passing it explicitly to the new functions
>>> git_author_info_with_date() and git_committer_info_with_date() when
>>> neither the author date nor the committer date are explicitly
>>> set. 'commit' always passes the author date to commit_tree_extended()
>>> and relied on the date caching to have the same committer and author
>>> dates when neither was specified. Fix this by setting
>>> GIT_COMMITTER_DATE to be the same as the author date passed to
>>> commit_tree_extended().
>>>
>>> Signed-off-by: Phillip Wood 
>>> Reported-by: Johannes Sixt 
>>> ---
>>>
>>> I'm slightly nervous that setting GIT_COMMITTER_DATE in
>>> builtin/commit.c will break someone's hook script. Maybe it would be
>>> better to add a committer parameter to commit_tree() and
>>> commit_tree_extended().

While I like the basic theme of your patch, I think we should fix this
case in a much simpler way, namely, use the infrastructure that was
introduced for git-am.

I've shamelessly lifted the commit message from your patch.

 8< 
Subject: [PATCH] sequencer: reset the committer date before commits

Now that the sequencer commits without forking when the commit message
isn't edited all the commits that are picked have the same committer
date. If a commit is reworded it's committer date will be a later time
as it is created by running an separate instance of 'git commit'.  If
the reworded commit is follow by further picks, those later commits
will have an earlier committer date than the reworded one. This is
caused by git caching the default date used when GIT_COMMITTER_DATE is
not set. Reset the cached date before a commit is generated
in-process.

Signed-off-by: Johannes Sixt 
---
 sequencer.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/sequencer.c b/sequencer.c
index f9d1001dee..f0bac903a0 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -1148,6 +1148,8 @@ static int try_to_commit(struct strbuf *msg, const char 
*author,
goto out;
}
 
+   reset_ident_date();
+
if (commit_tree_extended(msg->buf, msg->len, , parents,
 oid, author, opts->gpg_sign, extra)) {
res = error(_("failed to write commit object"));
-- 
2.17.0.69.g0c1d01d9b6


Re: [RFC PATCH] ident: don't cache default date

2018-04-18 Thread Phillip Wood
Hi Ævar, thanks for your comments

On 18/04/18 12:27, Ævar Arnfjörð Bjarmason wrote:
> 
> On Wed, Apr 18 2018, Phillip Wood wrote:
> 
>> From: Phillip Wood 
>>
>> Now that the sequencer commits without forking when the commit message
>> isn't edited all the commits that are picked have the same committer
>> date. If a commit is reworded it's committer date will be a later time
> 
> s/it's/its/

Thanks I'll change it

>> as it is created by running an separate instance of 'git commit'.  If
>> the reworded commit is follow by further picks, those later commits
>> will have an earlier committer date than the reworded one. This is
>> caused by git caching the default date used when GIT_COMMITTER_DATE is
>> not set. Fix this by not caching the date.
>>
>> Users expect commits to have the same author and committer dates when
>> the don't explicitly set them. As the date is now updated each time
>> git_author_info() or git_committer_info() is run it is possible to end
>> up with different author and committer dates. Fix this for
>> 'commit-tree', 'notes' and 'merge' by using a single date in
>> commit_tree_extended() and passing it explicitly to the new functions
>> git_author_info_with_date() and git_committer_info_with_date() when
>> neither the author date nor the committer date are explicitly
>> set. 'commit' always passes the author date to commit_tree_extended()
>> and relied on the date caching to have the same committer and author
>> dates when neither was specified. Fix this by setting
>> GIT_COMMITTER_DATE to be the same as the author date passed to
>> commit_tree_extended().
>>
>> Signed-off-by: Phillip Wood 
>> Reported-by: Johannes Sixt 
>> ---
>>
>> I'm slightly nervous that setting GIT_COMMITTER_DATE in
>> builtin/commit.c will break someone's hook script. Maybe it would be
>> better to add a committer parameter to commit_tree() and
>> commit_tree_extended().
>>
>> This needs some tests. I think we could test that the sequencer gives
>> each commit a new date with 'git rebase -i --exec="sleep 2"
>> --force-rebase @~2' and test the committer dates of the rebased
>> commits. I'm not sure if test -gt can cope with numbers that big
>> though, maybe we'll have to be content with test !=. For 'git commit'
>> I think doing GIT_EDITOR='sleep 2; echo message >"$1"' and checking
>> the committer date and author date will work as the author date is set
>> before the user edits the commit message. I'm not sure how to test
>> callers of commit_tree() though (that's commit-tree, notes and merge)
>> as I've not been able to come up with anything that will guarantee the
>> author and committer dates are different if the code in
>> commit_tree_extended() that ensures they are the same gets broken.
> 
> The behavior you're describing where we end up with later commits in the
> rebase with earlier CommitDate's definitely sounds like a minor
> regression, and yeah we should have tests for this.
> 
> My mental model of this is that we shouldn't be trying at all to adjust
> the CommitDate in a sequence to be the same, and just make it be
> whatever time() returns, except in the case where a date is passed
> explicitly.
> 
> My cursory reading of this RFC patch is that it's definitely an
> improvement because we don't end up with things out of order, but is
> there any reason for why we should be trying to aim to keep this
> "consistent" within a single "git rebase" command by default, as opposed
> to always just writing the current CommitDate at the time we make the
> commit, that sounds like the most intuitive thing to me, and I can't see
> any downsides with that.

It's not trying to keep the date "consistent" within a single rebase
command, each commit created by the sequencer gets a date stamp with the
current time when the commit is created. What it is doing is keeping the
author and committer dates the same for commit/commit-tree/merge/notes
when the user does not specify either of them. While I don't think it
particularly matters that they match (so long as the committer date is
later than the author date), it is what git does at the moment and
someone maybe relying on the current behavior.

Best Wishes

Phillip

> 
> It leaks info about how long it took someone to do the rebase, but so
> what?
> 



Re: [RFC PATCH] ident: don't cache default date

2018-04-18 Thread Ævar Arnfjörð Bjarmason

On Wed, Apr 18 2018, Phillip Wood wrote:

> From: Phillip Wood 
>
> Now that the sequencer commits without forking when the commit message
> isn't edited all the commits that are picked have the same committer
> date. If a commit is reworded it's committer date will be a later time

s/it's/its/

> as it is created by running an separate instance of 'git commit'.  If
> the reworded commit is follow by further picks, those later commits
> will have an earlier committer date than the reworded one. This is
> caused by git caching the default date used when GIT_COMMITTER_DATE is
> not set. Fix this by not caching the date.
>
> Users expect commits to have the same author and committer dates when
> the don't explicitly set them. As the date is now updated each time
> git_author_info() or git_committer_info() is run it is possible to end
> up with different author and committer dates. Fix this for
> 'commit-tree', 'notes' and 'merge' by using a single date in
> commit_tree_extended() and passing it explicitly to the new functions
> git_author_info_with_date() and git_committer_info_with_date() when
> neither the author date nor the committer date are explicitly
> set. 'commit' always passes the author date to commit_tree_extended()
> and relied on the date caching to have the same committer and author
> dates when neither was specified. Fix this by setting
> GIT_COMMITTER_DATE to be the same as the author date passed to
> commit_tree_extended().
>
> Signed-off-by: Phillip Wood 
> Reported-by: Johannes Sixt 
> ---
>
> I'm slightly nervous that setting GIT_COMMITTER_DATE in
> builtin/commit.c will break someone's hook script. Maybe it would be
> better to add a committer parameter to commit_tree() and
> commit_tree_extended().
>
> This needs some tests. I think we could test that the sequencer gives
> each commit a new date with 'git rebase -i --exec="sleep 2"
> --force-rebase @~2' and test the committer dates of the rebased
> commits. I'm not sure if test -gt can cope with numbers that big
> though, maybe we'll have to be content with test !=. For 'git commit'
> I think doing GIT_EDITOR='sleep 2; echo message >"$1"' and checking
> the committer date and author date will work as the author date is set
> before the user edits the commit message. I'm not sure how to test
> callers of commit_tree() though (that's commit-tree, notes and merge)
> as I've not been able to come up with anything that will guarantee the
> author and committer dates are different if the code in
> commit_tree_extended() that ensures they are the same gets broken.

The behavior you're describing where we end up with later commits in the
rebase with earlier CommitDate's definitely sounds like a minor
regression, and yeah we should have tests for this.

My mental model of this is that we shouldn't be trying at all to adjust
the CommitDate in a sequence to be the same, and just make it be
whatever time() returns, except in the case where a date is passed
explicitly.

My cursory reading of this RFC patch is that it's definitely an
improvement because we don't end up with things out of order, but is
there any reason for why we should be trying to aim to keep this
"consistent" within a single "git rebase" command by default, as opposed
to always just writing the current CommitDate at the time we make the
commit, that sounds like the most intuitive thing to me, and I can't see
any downsides with that.

It leaks info about how long it took someone to do the rebase, but so
what?


[RFC PATCH] ident: don't cache default date

2018-04-18 Thread Phillip Wood
From: Phillip Wood 

Now that the sequencer commits without forking when the commit message
isn't edited all the commits that are picked have the same committer
date. If a commit is reworded it's committer date will be a later time
as it is created by running an separate instance of 'git commit'.  If
the reworded commit is follow by further picks, those later commits
will have an earlier committer date than the reworded one. This is
caused by git caching the default date used when GIT_COMMITTER_DATE is
not set. Fix this by not caching the date.

Users expect commits to have the same author and committer dates when
the don't explicitly set them. As the date is now updated each time
git_author_info() or git_committer_info() is run it is possible to end
up with different author and committer dates. Fix this for
'commit-tree', 'notes' and 'merge' by using a single date in
commit_tree_extended() and passing it explicitly to the new functions
git_author_info_with_date() and git_committer_info_with_date() when
neither the author date nor the committer date are explicitly
set. 'commit' always passes the author date to commit_tree_extended()
and relied on the date caching to have the same committer and author
dates when neither was specified. Fix this by setting
GIT_COMMITTER_DATE to be the same as the author date passed to
commit_tree_extended().

Signed-off-by: Phillip Wood 
Reported-by: Johannes Sixt 
---

I'm slightly nervous that setting GIT_COMMITTER_DATE in
builtin/commit.c will break someone's hook script. Maybe it would be
better to add a committer parameter to commit_tree() and
commit_tree_extended().

This needs some tests. I think we could test that the sequencer gives
each commit a new date with 'git rebase -i --exec="sleep 2"
--force-rebase @~2' and test the committer dates of the rebased
commits. I'm not sure if test -gt can cope with numbers that big
though, maybe we'll have to be content with test !=. For 'git commit'
I think doing GIT_EDITOR='sleep 2; echo message >"$1"' and checking
the committer date and author date will work as the author date is set
before the user edits the commit message. I'm not sure how to test
callers of commit_tree() though (that's commit-tree, notes and merge)
as I've not been able to come up with anything that will guarantee the
author and committer dates are different if the code in
commit_tree_extended() that ensures they are the same gets broken.

 builtin/commit.c |  8 
 cache.h  |  2 ++
 commit.c | 22 +++---
 ident.c  | 24 ++--
 4 files changed, 47 insertions(+), 9 deletions(-)

diff --git a/builtin/commit.c b/builtin/commit.c
index 37fcb55ab0..4ad8317f88 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -584,6 +584,14 @@ static void determine_author_info(struct strbuf 
*author_ident)
export_one("GIT_AUTHOR_NAME", author.name_begin, author.name_end, 0);
export_one("GIT_AUTHOR_EMAIL", author.mail_begin, author.mail_end, 0);
export_one("GIT_AUTHOR_DATE", author.date_begin, author.tz_end, '@');
+   /*
+* Ensure the author and committer dates are identical if nither is
+* explicitly set
+*/
+   if ((!date || !*date) && (!getenv("GIT_COMMITTER_DATE")
+ || !*getenv("GIT_COMMITTER_DATE")))
+   export_one("GIT_COMMITTER_DATE", author.date_begin,
+  author.tz_end, '@');
free(name);
free(email);
free(date);
diff --git a/cache.h b/cache.h
index a61b2d3f0d..9cde499507 100644
--- a/cache.h
+++ b/cache.h
@@ -1484,7 +1484,9 @@ int date_overflows(timestamp_t date);
 #define IDENT_NO_DATE 2
 #define IDENT_NO_NAME 4
 extern const char *git_author_info(int);
+extern const char *git_author_info_with_date(int, const char*);
 extern const char *git_committer_info(int);
+extern const char *git_committer_info_with_date(int, const char*);
 extern const char *fmt_ident(const char *name, const char *email, const char 
*date_str, int);
 extern const char *fmt_name(const char *name, const char *email);
 extern const char *ident_default_name(void);
diff --git a/commit.c b/commit.c
index 00c99c7272..457cc8b71b 100644
--- a/commit.c
+++ b/commit.c
@@ -1513,6 +1513,7 @@ int commit_tree_extended(const char *msg, size_t msg_len,
 const char *author, const char *sign_commit,
 struct commit_extra_header *extra)
 {
+   struct strbuf date_buf = STRBUF_INIT;
int result;
int encoding_is_utf8;
struct strbuf buffer;
@@ -1540,10 +1541,25 @@ int commit_tree_extended(const char *msg, size_t 
msg_len,
}
 
/* Person/date information */
-   if (!author)
-   author = git_author_info(IDENT_STRICT);
+   if (!author) {
+   char *author_date = xstrdup_or_null(getenv("GIT_AUTHOR_DATE"));
+