Re: [PATCH] rebase: use reflog to find common base with upstream

2013-10-24 Thread John Keeping
On Mon, Oct 21, 2013 at 10:40:22PM -0700, Martin von Zweigbergk wrote:
> On Wed, Oct 16, 2013 at 11:53 AM, John Keeping  wrote:
> > Commit 15a147e (rebase: use @{upstream} if no upstream specified,
> > 2011-02-09) says:
> >
> > Make it default to 'git rebase @{upstream}'. That is also what
> > 'git pull [--rebase]' defaults to, so it only makes sense that
> > 'git rebase' defaults to the same thing.
> >
> > but that isn't actually the case.  Since commit d44e712 (pull: support
> > rebased upstream + fetch + pull --rebase, 2009-07-19), pull has actually
> > chosen the most recent reflog entry which is an ancestor of the current
> > branch if it can find one.
> >
> > Change rebase so that it uses the same logic.
> >
> > Signed-off-by: John Keeping 
> > ---
> >  git-rebase.sh | 8 
> >  t/t3400-rebase.sh | 6 --
> >  2 files changed, 12 insertions(+), 2 deletions(-)
> >
> > diff --git a/git-rebase.sh b/git-rebase.sh
> > index 226752f..fd36cf7 100755
> > --- a/git-rebase.sh
> > +++ b/git-rebase.sh
> > @@ -437,6 +437,14 @@ then
> > error_on_missing_default_upstream "rebase" "rebase" 
> > \
> > "against" "git rebase "
> > fi
> > +   for reflog in $(git rev-list -g "$upstream_name" 
> > 2>/dev/null)
> > +   do
> > +   if test "$reflog" = "$(git merge-base "$reflog" 
> > HEAD)"
> > +   then
> > +   upstream_name=$reflog
> > +   break
> > +   fi
> > +   done
> > ;;
> > *)  upstream_name="$1"
> > shift
> 
> A little later, "onto_name" gets assigned like so:
> 
>   onto_name=${onto-"$upstream_name"}
> 
> So if upstream_name was set above, then onto would get the same value,
> which is not what we want, right? It seems like this block of code
> should come a bit later.
> 
> I also think it not be run only when rebase was run without a given
> upstream. If the configured upstream is "origin/master", it seems like
> it would be surprising to get different behavior from "git rebase" and
> "git rebase origin/master".

Hmm... I think you're right.  I originally didn't want to change the
behaviour when the user specifies a branch, but in this case we want to
look for a common ancestor in the reflog of the upstream regardless of
where the ref came from.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] rebase: use reflog to find common base with upstream

2013-10-24 Thread Junio C Hamano
Martin von Zweigbergk  writes:

> I think
>
>   git merge-base HEAD $(git rev-list -g "$upstream_name")
>
> is roughly correct and hopefully fast enough. That can lead to too
> long a command line, so I was planning on teaching merge-base a
> --stdin option, but never got around to it.

Sorry for coming in late.

I think the above with s/HEAD/$curr_branch/ is a good way to compute
what the whole "for reflog in $(git rev-list -g $remoteref" loop
computes when one of the historic tips recorded in the reflog was
where $curr_branch forked from, i.e. the loop actually finds at
least one ancestor in the reflog and breaks out after setting
oldremoteref.  But it would give a completely different commit if
none of the reflog entries is a fork point.

A two patch series forthcoming.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] rebase: use reflog to find common base with upstream

2013-10-21 Thread Martin von Zweigbergk
On Mon, Oct 21, 2013 at 4:24 AM, John Keeping  wrote:
> On Sun, Oct 20, 2013 at 10:03:29PM -0700, Martin von Zweigbergk wrote:
>> On Wed, Oct 16, 2013 at 11:53 AM, John Keeping  wrote:
>> > Commit 15a147e (rebase: use @{upstream} if no upstream specified,
>> > 2011-02-09) says:
>> >
>> > Make it default to 'git rebase @{upstream}'. That is also what
>> > 'git pull [--rebase]' defaults to, so it only makes sense that
>> > 'git rebase' defaults to the same thing.
>> >
>> > but that isn't actually the case.  Since commit d44e712 (pull: support
>> > rebased upstream + fetch + pull --rebase, 2009-07-19), pull has actually
>> > chosen the most recent reflog entry which is an ancestor of the current
>> > branch if it can find one.
>>
>> It is exactly this inconsistency between "git rebase" and "git pull
>> --rebase" that confused me enough to make me send my first email to
>> this list almost 4 years ago [1], so thanks for working on this! I
>> finished that thread with:
>>
>>   Would it make sense to teach "git rebase" the same tricks as "git
>> pull --rebase"?
>>
>> Then it took me a year before I sent a patch not unlike this one [2].
>> To summarize, the patch did not get accepted then because it makes
>> rebase a little slower (or a lot slower in some cases). "git pull
>> --rebase" is of course at least as slow in the same cases, but because
>> it often involves connecting to a remote host, people would probably
>> blame the connection rather than git itself even in those rare (?)
>> cases.
>>
>> I think
>>
>>   git merge-base HEAD $(git rev-list -g "$upstream_name")
>>
>> is roughly correct and hopefully fast enough. That can lead to too
>> long a command line, so I was planning on teaching merge-base a
>> --stdin option, but never got around to it.
>
> I'm not sure we should worry about the additional overhead here.  In the
> common case, we should hit a common ancestor within the first couple of
> reflog entries; and in the case that will be slow, it's likely that
> there are a lot of differences between the branches so the cherry
> comparison phase will take a while anyway.

Perhaps true. I created a simple commit based on my origin/master@{1}
in git.git, which happened to be 136 commits behind origin/master.
Before (a modified version of) your patch, it took 0.756s to rebase it
(best of 5) and afterwards it took 0.720s.

And in a worse case: The same test with one commit off my
origin/master@{13}, 2910 behind origin/master, shows an increase from
2.75s to 4.04s.

And a degenerate case: I created a test branch (called u) with
1000-entry reflog from the output of "git rev-list --first-parent
origin/master | head -1000 | tac" and created the same simple commit
as before off of the end of this reflog (u@{999}). This ended up 3769
commits behind u@{0} (aka origin/master). In this case it went from
3.43s to 3m32s. Obviously, this was a degenerate case designed to be
slow, but I think it's still worth noting that one can get such O(n^2)
behavior e.g. if one lets a branch get out of sync with an upstream
that's very frequently fetches (I've heard of people running
short-interval cron jobs that fetch from a remote).

I do like the feature, but I'm still concerned about this last case.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] rebase: use reflog to find common base with upstream

2013-10-21 Thread Martin von Zweigbergk
On Wed, Oct 16, 2013 at 11:53 AM, John Keeping  wrote:
> Commit 15a147e (rebase: use @{upstream} if no upstream specified,
> 2011-02-09) says:
>
> Make it default to 'git rebase @{upstream}'. That is also what
> 'git pull [--rebase]' defaults to, so it only makes sense that
> 'git rebase' defaults to the same thing.
>
> but that isn't actually the case.  Since commit d44e712 (pull: support
> rebased upstream + fetch + pull --rebase, 2009-07-19), pull has actually
> chosen the most recent reflog entry which is an ancestor of the current
> branch if it can find one.
>
> Change rebase so that it uses the same logic.
>
> Signed-off-by: John Keeping 
> ---
>  git-rebase.sh | 8 
>  t/t3400-rebase.sh | 6 --
>  2 files changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/git-rebase.sh b/git-rebase.sh
> index 226752f..fd36cf7 100755
> --- a/git-rebase.sh
> +++ b/git-rebase.sh
> @@ -437,6 +437,14 @@ then
> error_on_missing_default_upstream "rebase" "rebase" \
> "against" "git rebase "
> fi
> +   for reflog in $(git rev-list -g "$upstream_name" 2>/dev/null)
> +   do
> +   if test "$reflog" = "$(git merge-base "$reflog" HEAD)"
> +   then
> +   upstream_name=$reflog
> +   break
> +   fi
> +   done
> ;;
> *)  upstream_name="$1"
> shift

A little later, "onto_name" gets assigned like so:

  onto_name=${onto-"$upstream_name"}

So if upstream_name was set above, then onto would get the same value,
which is not what we want, right? It seems like this block of code
should come a bit later.

I also think it not be run only when rebase was run without a given
upstream. If the configured upstream is "origin/master", it seems like
it would be surprising to get different behavior from "git rebase" and
"git rebase origin/master".
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] rebase: use reflog to find common base with upstream

2013-10-21 Thread John Keeping
On Sun, Oct 20, 2013 at 10:03:29PM -0700, Martin von Zweigbergk wrote:
> On Wed, Oct 16, 2013 at 11:53 AM, John Keeping  wrote:
> > Commit 15a147e (rebase: use @{upstream} if no upstream specified,
> > 2011-02-09) says:
> >
> > Make it default to 'git rebase @{upstream}'. That is also what
> > 'git pull [--rebase]' defaults to, so it only makes sense that
> > 'git rebase' defaults to the same thing.
> >
> > but that isn't actually the case.  Since commit d44e712 (pull: support
> > rebased upstream + fetch + pull --rebase, 2009-07-19), pull has actually
> > chosen the most recent reflog entry which is an ancestor of the current
> > branch if it can find one.
> 
> It is exactly this inconsistency between "git rebase" and "git pull
> --rebase" that confused me enough to make me send my first email to
> this list almost 4 years ago [1], so thanks for working on this! I
> finished that thread with:
> 
>   Would it make sense to teach "git rebase" the same tricks as "git
> pull --rebase"?
> 
> Then it took me a year before I sent a patch not unlike this one [2].
> To summarize, the patch did not get accepted then because it makes
> rebase a little slower (or a lot slower in some cases). "git pull
> --rebase" is of course at least as slow in the same cases, but because
> it often involves connecting to a remote host, people would probably
> blame the connection rather than git itself even in those rare (?)
> cases.
> 
> I think
> 
>   git merge-base HEAD $(git rev-list -g "$upstream_name")
> 
> is roughly correct and hopefully fast enough. That can lead to too
> long a command line, so I was planning on teaching merge-base a
> --stdin option, but never got around to it.

I'm not sure we should worry about the additional overhead here.  In the
common case, we should hit a common ancestor within the first couple of
reflog entries; and in the case that will be slow, it's likely that
there are a lot of differences between the branches so the cherry
comparison phase will take a while anyway.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] rebase: use reflog to find common base with upstream

2013-10-20 Thread Martin von Zweigbergk
On Wed, Oct 16, 2013 at 11:53 AM, John Keeping  wrote:
> Commit 15a147e (rebase: use @{upstream} if no upstream specified,
> 2011-02-09) says:
>
> Make it default to 'git rebase @{upstream}'. That is also what
> 'git pull [--rebase]' defaults to, so it only makes sense that
> 'git rebase' defaults to the same thing.
>
> but that isn't actually the case.  Since commit d44e712 (pull: support
> rebased upstream + fetch + pull --rebase, 2009-07-19), pull has actually
> chosen the most recent reflog entry which is an ancestor of the current
> branch if it can find one.

It is exactly this inconsistency between "git rebase" and "git pull
--rebase" that confused me enough to make me send my first email to
this list almost 4 years ago [1], so thanks for working on this! I
finished that thread with:

  Would it make sense to teach "git rebase" the same tricks as "git
pull --rebase"?

Then it took me a year before I sent a patch not unlike this one [2].
To summarize, the patch did not get accepted then because it makes
rebase a little slower (or a lot slower in some cases). "git pull
--rebase" is of course at least as slow in the same cases, but because
it often involves connecting to a remote host, people would probably
blame the connection rather than git itself even in those rare (?)
cases.

I think

  git merge-base HEAD $(git rev-list -g "$upstream_name")

is roughly correct and hopefully fast enough. That can lead to too
long a command line, so I was planning on teaching merge-base a
--stdin option, but never got around to it.

Martin


[1] http://thread.gmane.org/gmane.comp.version-control.git/136339
[2] http://thread.gmane.org/gmane.comp.version-control.git/166710
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] rebase: use reflog to find common base with upstream

2013-10-16 Thread John Keeping
On Wed, Oct 16, 2013 at 12:24:13PM -0700, Jonathan Nieder wrote:
> John Keeping wrote:
> 
> >Since commit d44e712 (pull: support
> > rebased upstream + fetch + pull --rebase, 2009-07-19), pull has actually
> > chosen the most recent reflog entry which is an ancestor of the current
> > branch if it can find one.
> >
> > Change rebase so that it uses the same logic.
> 
> Nice idea.
> 
> Could pull be made to rely on rebase for this as a follow-up?

It's not an obvious change to do that (at least to me) - pull does the
different steps of figuring out the base and then rebasing at different
points in the script.

> [...]
> > --- a/git-rebase.sh
> > +++ b/git-rebase.sh
> > @@ -437,6 +437,14 @@ then
> > error_on_missing_default_upstream "rebase" "rebase" \
> > "against" "git rebase "
> > fi
> > +   for reflog in $(git rev-list -g "$upstream_name" 2>/dev/null)
> > +   do
> > +   if test "$reflog" = "$(git merge-base "$reflog" HEAD)"
> 
> "git merge-base --is-ancestor" is faster.

We should probably change git-pull to use that as well.

> What should happen if HEAD is not a valid commit?  (Tested with:
> 
>   $ git checkout --orphan foo
>   $ cat >>.git/config <   [branch "foo"]
>   remote = origin
>   merge = refs/heads/master
>   EOF
>   $ bin-wrappers/git rebase 2>&1 | wc -l
>   83
> 
> ).

Adding "2>/dev/null" to the merge-base line silences most of that, all
we're left with is "fatal: Needed a single revision" which is the same
as I get from master's git-rebase.  I think silencing stderr is the
right thing to do here - in the worst case we just use the merge-base of
the two branches instead of the appropriate reflog entry.

> diff --git i/git-rebase.sh w/git-rebase.sh
> index fd36cf7..d2e2c2e 100755
> --- i/git-rebase.sh
> +++ w/git-rebase.sh
> @@ -439,7 +439,7 @@ then
>   fi
>   for reflog in $(git rev-list -g "$upstream_name" 2>/dev/null)
>   do
> - if test "$reflog" = "$(git merge-base "$reflog" HEAD)"
> + if git merge-base --is-ancestor "$reflog" HEAD
>   then
>   upstream_name=$reflog
>   break
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] rebase: use reflog to find common base with upstream

2013-10-16 Thread Jonathan Nieder
Hi,

John Keeping wrote:

>Since commit d44e712 (pull: support
> rebased upstream + fetch + pull --rebase, 2009-07-19), pull has actually
> chosen the most recent reflog entry which is an ancestor of the current
> branch if it can find one.
>
> Change rebase so that it uses the same logic.

Nice idea.

Could pull be made to rely on rebase for this as a follow-up?

[...]
> --- a/git-rebase.sh
> +++ b/git-rebase.sh
> @@ -437,6 +437,14 @@ then
>   error_on_missing_default_upstream "rebase" "rebase" \
>   "against" "git rebase "
>   fi
> + for reflog in $(git rev-list -g "$upstream_name" 2>/dev/null)
> + do
> + if test "$reflog" = "$(git merge-base "$reflog" HEAD)"

"git merge-base --is-ancestor" is faster.

What should happen if HEAD is not a valid commit?  (Tested with:

$ git checkout --orphan foo
$ cat >>.git/config <&1 | wc -l
83

).

diff --git i/git-rebase.sh w/git-rebase.sh
index fd36cf7..d2e2c2e 100755
--- i/git-rebase.sh
+++ w/git-rebase.sh
@@ -439,7 +439,7 @@ then
fi
for reflog in $(git rev-list -g "$upstream_name" 2>/dev/null)
do
-   if test "$reflog" = "$(git merge-base "$reflog" HEAD)"
+   if git merge-base --is-ancestor "$reflog" HEAD
then
upstream_name=$reflog
break
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html