Re: Small trivial annoyance with the nice new builtin "git am"

2016-07-29 Thread Junio C Hamano
On Thu, Jul 28, 2016 at 5:37 PM, Linus Torvalds
 wrote:
> On Thu, Jul 28, 2016 at 5:29 PM, Jeff King  wrote:
>>
>> I guess we want something like:
>>
>> +void reset_ident_date(void)
>> +{
>> +   strbuf_reset(_default_date);
>> +}
>
> Looks sane.
>
>> and then to sprinkle calls liberally through builtin-ified programs when
>> they move from one unit of work to the next.
>
> Maybe we can just add it to the end of commit_tree_extended(), and
> just say "the cache is reset between commits".
>
> That way there is no sprinking in random places.

Hmph, wouldn't that equally work well if we cleared at the beginning of the
function, instead of at the end?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Small trivial annoyance with the nice new builtin "git am"

2016-07-29 Thread Junio C Hamano
Linus Torvalds  writes:

> So we do want to cache things for a single commit, it's just that for
> things like "git am" (or, like Junio wondered, "git rebase" - I didn't
> check) we probabyl just just flush the cache in between commits.

What I cautioned was indeed "git rebase", and the codepath that uses
"git am" internally would be fixed when "git am" is fixed, so it is
OK.

What I wondered was actually "git cherry-pick A..B", but I think it
runs "git commit" via the run_command() API as external process, so
it should also be safe.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Small trivial annoyance with the nice new builtin "git am"

2016-07-29 Thread Junio C Hamano
Jeff King  writes:

> On Thu, Jul 28, 2016 at 05:37:08PM -0700, Linus Torvalds wrote:
>
>> > and then to sprinkle calls liberally through builtin-ified programs when
>> > they move from one unit of work to the next.
>> 
>> Maybe we can just add it to the end of commit_tree_extended(), and
>> just say "the cache is reset between commits".
>> 
>> That way there is no sprinking in random places.
>
> Hmm, yeah, that might work. As you mentioned, there are cases where we
> really do want the timestamps to match (especially between author and
> committer). So we would not want this reset to kick in where callers
> would not want it.
>
> So I'm trying to play devil's advocate and think of a case where
> somebody would not want the time reset after creating a commit.
>
> One obvious impact would be reflog entries, since we would reset the
> time between the object creation and the ref write (so your reflog entry
> would sometimes be a second or more after the commit time it writes).
> I'm not sure how much anybody _cares_ about that; they're much less
> intimate than author/committer times.

As long as it is understood that a commit object is created and then
a ref is updated to point at it in this order, I do not think there
is any confusion on the party who reads the reflog, I would think.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Small trivial annoyance with the nice new builtin "git am"

2016-07-29 Thread Junio C Hamano
Jeff King  writes:

> On Thu, Jul 28, 2016 at 04:47:17PM -0700, Junio C Hamano wrote:
>
>> Also makes me wonder if "git cherry-pick A..B" shares the same
>> breakage.
>
> Probably.

It seems that "cherry-pick A..B" leads to sequencer.c::run_git_commit()
that uses run_command_v_opt() to drive "git commit", so we are safe.

> I guess we want something like:
>
> +void reset_ident_date(void)
> +{
> + strbuf_reset(_default_date);
> +}
> +
>
> and then to sprinkle calls liberally through builtin-ified programs when
> they move from one unit of work to the next.

ident_default_date() is currently the only one that sets this to be
cached, and that is to be used only when there is no user-specified
date.

When I saw the suggestion first time, I was worried if this had
interaction with things like GIT_COMMITTER_DATE environment (because
I didn't have easy access to the source) but it is not the case, so
the change looks very sensible.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Small trivial annoyance with the nice new builtin "git am"

2016-07-29 Thread Jeff King
On Thu, Jul 28, 2016 at 05:37:08PM -0700, Linus Torvalds wrote:

> > and then to sprinkle calls liberally through builtin-ified programs when
> > they move from one unit of work to the next.
> 
> Maybe we can just add it to the end of commit_tree_extended(), and
> just say "the cache is reset between commits".
> 
> That way there is no sprinking in random places.

Hmm, yeah, that might work. As you mentioned, there are cases where we
really do want the timestamps to match (especially between author and
committer). So we would not want this reset to kick in where callers
would not want it.

So I'm trying to play devil's advocate and think of a case where
somebody would not want the time reset after creating a commit.

One obvious impact would be reflog entries, since we would reset the
time between the object creation and the ref write (so your reflog entry
would sometimes be a second or more after the commit time it writes).
I'm not sure how much anybody _cares_ about that; they're much less
intimate than author/committer times.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Small trivial annoyance with the nice new builtin "git am"

2016-07-29 Thread Christian Couder
On Fri, Jul 29, 2016 at 2:32 AM, Linus Torvalds
 wrote:
> On Thu, Jul 28, 2016 at 5:21 PM, Jeff King  wrote:
>>
>> I do wonder if you would be happier giving each commit a "fake"
>> monotonically increasing time, so they are correctly ordered by commit
>> date.
>
> No, that would be nasty, partly for the corner case you mention, but
> partly because I just think it's wrong to try to lie about reality.
>
> The reason I noticed this in the first place was actually that I just
> wanted to take a look whether things had gotten slower or faster over
> time, and see how many patches per second I get from the patch-bombs
> Andrew sends me.
>
> So getting real time was what I was looking for.

Great!

> Also, before somebody asks: the reason git has always cached the
> "default time" string is because there's a reverse annoying thing,
> which is looking up time twice, and getting a difference of a second
> between author times and committer times just because of bad luck.
> That makes no sense either, and is actually why we have that
> "ident_default_date()" cache thing going on.
>
> So we do want to cache things for a single commit, it's just that for
> things like "git am" (or, like Junio wondered, "git rebase" - I didn't
> check) we probabyl just just flush the cache in between commits.

When Junio wrote:

> I have a suspicion that we would see more and more breakages like
> this in the near future, as there are folks trying to redo multi-commit
> commands in a single process.

he was maybe talking about my patch series to libify `git apply` and
use that in `git am` (which is itself used by `git rebase` when not in
interactive mode):

https://public-inbox.org/git/20160627182429.31550-1-chriscool%40tuxfamily.org/

that will likely mean more patches with the same real time anyway
especially with split-index on (see "Performance numbers" in that
email).
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Small trivial annoyance with the nice new builtin "git am"

2016-07-28 Thread Linus Torvalds
On Thu, Jul 28, 2016 at 5:29 PM, Jeff King  wrote:
>
> I guess we want something like:
>
> +void reset_ident_date(void)
> +{
> +   strbuf_reset(_default_date);
> +}

Looks sane.

> and then to sprinkle calls liberally through builtin-ified programs when
> they move from one unit of work to the next.

Maybe we can just add it to the end of commit_tree_extended(), and
just say "the cache is reset between commits".

That way there is no sprinking in random places.

   Linus
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Small trivial annoyance with the nice new builtin "git am"

2016-07-28 Thread Linus Torvalds
On Thu, Jul 28, 2016 at 5:21 PM, Jeff King  wrote:
>
> I do wonder if you would be happier giving each commit a "fake"
> monotonically increasing time, so they are correctly ordered by commit
> date.

No, that would be nasty, partly for the corner case you mention, but
partly because I just think it's wrong to try to lie about reality.

The reason I noticed this in the first place was actually that I just
wanted to take a look whether things had gotten slower or faster over
time, and see how many patches per second I get from the patch-bombs
Andrew sends me.

So getting real time was what I was looking for.

Also, before somebody asks: the reason git has always cached the
"default time" string is because there's a reverse annoying thing,
which is looking up time twice, and getting a difference of a second
between author times and committer times just because of bad luck.
That makes no sense either, and is actually why we have that
"ident_default_date()" cache thing going on.

So we do want to cache things for a single commit, it's just that for
things like "git am" (or, like Junio wondered, "git rebase" - I didn't
check) we probabyl just just flush the cache in between commits.

Linus
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Small trivial annoyance with the nice new builtin "git am"

2016-07-28 Thread Jeff King
On Thu, Jul 28, 2016 at 04:47:17PM -0700, Junio C Hamano wrote:

> Also makes me wonder if "git cherry-pick A..B" shares the same
> breakage.

Probably.

I guess we want something like:

diff --git a/ident.c b/ident.c
index 139c528..e20a772 100644
--- a/ident.c
+++ b/ident.c
@@ -184,6 +184,11 @@ static const char *ident_default_date(void)
return git_default_date.buf;
 }
 
+void reset_ident_date(void)
+{
+   strbuf_reset(_default_date);
+}
+
 static int crud(unsigned char c)
 {
return  c <= 32  ||

and then to sprinkle calls liberally through builtin-ified programs when
they move from one unit of work to the next.

We could also support resetting the whole ident string, but I don't
think there's any reason to (and unlike the datestamp, it has to do a
bit more looking around to come up with its values).

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Small trivial annoyance with the nice new builtin "git am"

2016-07-28 Thread Jeff King
On Thu, Jul 28, 2016 at 04:35:58PM -0700, Linus Torvalds wrote:

> Now, it would be lovely if the new builtin git-am really was *so* fast
> that it applies a 100+-patch series in under a second, but no, that's
> not it. It's just that it only looks up the current time once.
> 
> That seems entirely accidental, I think that what happened is that
> "ident_default_date()" just ends up initializing the default date
> string once, and then the date is cached there, because it's now run
> as a single process for the whole series.
> 
> I think I'd rather get the "real" commit dates, even if they show that
> git only does a handful of commits a second rather than hundreds of
> commits..
> 
> Something that just clears git_default_date in between "git am"
> iterations, perhaps?

Yeah, your analysis makes sense and I think clearing the date between
patches is a reasonable solution.

I do wonder if you would be happier giving each commit a "fake"
monotonically increasing time, so they are correctly ordered by commit
date. I think that runs into some bizarre corner cases, though, like
adding 100 patches in 5 seconds, and ending up with commit timestamps
just slightly in the future (which is fine if you're then quiet, but
skews if you then follow-up in the next 95 seconds with another commit).
Compared to skew, having chunks of 20 commits with identical time stamps
is probably slightly less bad. At least it reflects reality.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Small trivial annoyance with the nice new builtin "git am"

2016-07-28 Thread Junio C Hamano
On Thu, Jul 28, 2016 at 4:35 PM, Linus Torvalds
 wrote:
> Ok, it's no longer *that* new, but I only now noticed..
>
> So I noticed that when I applied the last patch-bomb series from
> Andrew, all the commit date-stamps are idential.
> ...
> That seems entirely accidental, I think that what happened is that
> "ident_default_date()" just ends up initializing the default date
> string once, and then the date is cached there, because it's now run
> as a single process for the whole series.

Ouch, sorry, that certainly sounds utterly broken. I have a suspicion
that we would see more and more breakages like this in the near
future, as there are folks trying to redo multi-commit commands in
a single process. We need to be very careful to avoid the same
mistake.

Also makes me wonder if "git cherry-pick A..B" shares the same
breakage.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html