Re: [git-users] an elementary question how to switch/checkout a remote branch

Konstantin Khomoutov Wed, 16 Nov 2022 03:56:04 -0800

On Tue, Nov 15, 2022 at 10:26:04PM +0100, Uwe Brauer wrote:

> >> I am sorry for such an elementary question, but using mainly hg, I found
> >> the following very confusing.
> 
> > No, the question is not elementary, and yes, the behavior is confusing.
> 
> First of all thanks very much for this very detailed answer, and the
> solution to my problem. 
> 
> ,----
> | git checkout feature
> | git push origin feature
> `----
> 
> However I still have a lack of understanding of some concepts.
> 
> Since I am basically a hg user, bear with me.
> 
> I thought the main difference between git and hg, is (besides the
> command syntax) the way branches are considered, a git branch is similar
> to a bookmark, while a named branch has no equivalent in git.


This is true, but the terminology, as usual, is open to interpretation
and leaves some wiggle room ;-)
When some text says Git branches are like bookmarks, and that Hg's named
branches have no equivalent in Git, that is true, but what it means is
that 1) Git commits do not embed the name of the branch which was checked out
when the commit was recorded; 2) Branches do not have any metadata except
the current name of the branch; 3) A branch records its tip commit; when a new
commit is recorded on a branch, this record is overwritten.

I find it easier to understand if you think of a Git branch as a text file (and
in the reference Git implementation a branch _is_ a text file) which contains
the cryptographic hash of the commit it points at.
When you record a new commit, the file is rewritten with the new hash.
When you rename a branch, the file is renamed.
You see, we can say it looks like a bookmark to a commit.
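
To make the "branch is a text file" picture concrete, here is a minimal sketch
you can run in a scratch directory. It assumes the default "files" ref storage
and a freshly created repository (refs can later be packed into
.git/packed-refs, in which case the per-branch file is not there). The
temporary directory and the "Demo" identity are made up for the example.

```shell
set -eu
# Identity for the demo commits only; the values are placeholders.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.invalid

repo=$(mktemp -d)                            # throwaway directory
cd "$repo"
git init -q .
git commit -q --allow-empty -m "first"

branch=$(git symbolic-ref --short HEAD)      # "master" or "main", per your config
file_hash=$(cat ".git/refs/heads/$branch")   # the branch is literally this file
tip_hash=$(git rev-parse HEAD)
echo "$branch -> $file_hash"

git commit -q --allow-empty -m "second"
new_file_hash=$(cat ".git/refs/heads/$branch")   # same file, rewritten

git branch -m "$branch" renamed              # renaming the branch renames the file
ls .git/refs/heads
```

The file holds exactly what `git rev-parse HEAD` reports, it changes when a
commit is recorded, and after the rename only a file called "renamed" is left.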

> However I now see that also the concept of remote branch tracking sets
> git apart from mercurial.

Yes.
Looks like I failed to properly communicate that crucial bit about the
asymmetry Git has with handling remote repositories and branches from them.
I think I know why, I'll try to provide another perspective, hold on.

> And I have to say that the confusion I have may be caused partially by
> the naming conventions.
> 
>     1. That concerns the name origin for the remote which is confusing.
>        If I clone a repository that might be justified, but if I
>        generate a repository that I later push for the first time,
>        origin is well confusing (fortunately that
>        can be changed by git remote rename origin github)

Well, let's have a bit more discussion on remotes.

First, "a remote" is purely a configuration thing, and it's there to
make it easier for a local Git repository to communicate with other Git
repositories (they need not necessarily be truly remote - a remote repo might
be a normal Git repository on a flash drive or sit in another directory on
your computer's filesystem).
A remote has a name, and may have other metadata configured.
At a minimum, it has a URL associated with it - to locate the repository.

Now, the name "origin" is nothing special - just like "master".
It is created automatically only in a single case - in the repositories
created by `git clone` run with default settings.
Why "origin"?
Because `git clone` makes a local copy of a single repository, and the source
repository is naturally the origin of the created one ;-)

When you create a new local repository from scratch - by running `git init` -
it won't have any remotes configured, and if/when you get around to adding
one, you'll be able to pick any name you wish.

A couple more facts:

 - There can be any number of remotes configured in a local repository,
   and you can fetch from/push to any of them at will.

 - You're not required to use remotes, they are just a convenience: Git is
   fine with fetching from/pushing to a repository for which you specify
   a URL directly.
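
The facts above can be sketched as follows. Everything stays on the local
filesystem: the "other" repository, the remote name "flashdrive" and the
"Demo" identity are made up for the example, and a plain path stands in for
the URL.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.invalid
work=$(mktemp -d)

# A second repository standing in for "the other side"; it sits in a
# sibling directory, so nothing here touches the network.
git init -q "$work/other"
git -C "$work/other" commit -q --allow-empty -m "commit in the other repo"

git init -q "$work/mine"
cd "$work/mine"
remotes_before=$(git remote)                 # empty: `git init` adds no remotes

git remote add flashdrive "$work/other"      # the name is entirely your choice
remote_url=$(git config --get remote.flashdrive.url)
git remote -v

# No remote needed at all: fetch straight from a URL (here, a path).
git fetch -q "$work/other" HEAD
fetched=$(git rev-parse FETCH_HEAD)
echo "fetched $fetched"
```

The remote is nothing more than a name with a URL stored in the repository's
configuration, and the last fetch shows that Git is just as happy without it.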

>     2. That concerns the prefix dropping  for local branches:
>        Is there a way to configure git that it does not drop the prefix
>        for local branches?
> 
> You mentioned 
> 
> git for-each-ref --format='%(refname)'
> 
> Well I presume I could use an alias but is there any setting that would
> allow me to see
> 
>  git branch -a 
> 
> Something like this 
> local/main

Please don't.
Right now you're falling into a common trap of trying to apply concepts you've
become familiar with while using system X to work with system Y which is based
on different concepts. This is never productive and can bite you in the rear
later on, so let's rather try to understand "the why" of all this stuff.

> Anyway now to my biggest problem. 
> I am not entirely sure I understand your explanations about local and
> remote branches.

I see, I'll have another go at it in a moment.

> If I clone a repository, why doesn't git,
> copy/clone  *all* branches? That is the way cloning works for mercurial,
> and I find that perfectly logical and convenient.

Git does copy all the branches, really.
You can defeat it (with `--single-branch`, or with "shallow cloning" via
`--depth`, which implies `--single-branch` and is routinely used for running
CI jobs, for instance) but this feature was not in Git from the beginning,
and you have to explicitly tell `git clone` you want it.

> You said 
> >  - "Remote" branches are sort of "bookmark" to the state of the branches
> >    in some named remote repository Git has contacted with - specifically,
> >    the last time it did so.
> 
> For example 
> 
>  git branch -a 
> 
> 
> 
> * main
>   remotes/origin/HEAD -> origin/main
>   remotes/origin/feature
>   remotes/origin/main
> 
> So there is no local copy of the feature branch on my local cloned 
> repository!?

There is a local copy of this branch in your local cloned repository.
You could easily verify this by running any tool to explore a branch's
history, for instance, a simple

  git log origin/feature

would do the trick, and so would do

  gitk origin/feature
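
If you'd like to try this without touching your real repository, here is a
small self-contained sketch. The "source" and "clone" directories, the commit
messages and the "Demo" identity are made up; the source repository is created
locally so that no network is involved.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.invalid
work=$(mktemp -d)

# Source repository with two branches, "main" and "feature".
git init -q "$work/source"
cd "$work/source"
git commit -q --allow-empty -m "base"
git branch -M main                   # make sure the first branch is called "main"
git checkout -q -b feature
git commit -q --allow-empty -m "feature work"
git checkout -q main

git clone -q "$work/source" "$work/clone"
cd "$work/clone"
git branch -a                        # one local branch plus the origin/* ones

# The whole history of "feature" is already here, under origin/feature.
feature_log=$(git log --format=%s origin/feature)
echo "$feature_log"
```

There is no local branch named "feature" in the clone, yet `git log` can walk
the full history of origin/feature without contacting anything.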

> In mercurial after I clone, I can checkout the branch I want work on it and
> then push (well there might be conflicts because somebody else pushed, but
> that is a general problem/feature of DVCS.)

Same with Git, but its asymmetry 
> 
> Then you write 
> 
> >   git checkout feature
> 
> > Git would notice there is no local branch named "feature" but there exists a
> > remote branch named "feature", and would create a local branch "feature"
> > pointing at the same commit the remote one points at, and tracking that
> > remote branch.
> 
> Does this mean that when I do that, then git downloads all commits of
> that branch? Because I would need them to make my changes?

Not quite. The commits are already there, so basically Git merely creates a
local branch and makes it point to the commit of the remote branch.

> It seems that you say 
> 
> git checkout feature 
> 
> Is a shortcut for the following commands
> 
> > Alternatively, you could do that explicitly without involving any magic
> > of the short-circuit command invocation:
> 
> >   git branch feature origin/feature # create local branch off a remote one
> >   git checkout feature              # check local branch out
> >   git branch -u origin/feature      # make current branch track origin/feature
> 
> Do any of these commands download the commits of the remote branch feature?

No. `git branch` reads the hash of the commit origin/feature points at
and creates another branch which points at the same commit (contains the same
hash).
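
One way to convince yourself that nothing is downloaded is to make the source
repository unreachable before creating the local branch. A rough sketch, with
made-up directory names and a made-up "Demo" identity, and assuming the
default behaviour where `git checkout <name>` guesses the remote branch:

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.invalid
work=$(mktemp -d)

git init -q "$work/source"
cd "$work/source"
git commit -q --allow-empty -m "base"
git branch -M main
git checkout -q -b feature
git commit -q --allow-empty -m "feature work"
git checkout -q main
git clone -q "$work/source" "$work/clone"

# Move the source away: whatever works from now on cannot be fetching from it.
mv "$work/source" "$work/source.offline"

cd "$work/clone"
git checkout -q feature              # creates local "feature" from origin/feature
local_tip=$(git rev-parse feature)
remote_tip=$(git rev-parse origin/feature)
upstream=$(git rev-parse --abbrev-ref 'feature@{upstream}')
echo "feature=$local_tip origin/feature=$remote_tip upstream=$upstream"
```

Both names resolve to the same hash, and the new local branch tracks
origin/feature, all without the source repository being available.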

> I think this crucial part I need to understand.

I think so, too. Let us try ;-)

First, just in case, I would reiterate that the term "remote branch" in Git
does not mean that the branch is not contained in a local repository.
It means that the branch records the state of a branch in a particular remote
repository. It has the full (the same) history of its "origin" branch as of
the time it was fetched from that repository.
So, to recap, after you cloned, you have all the branches from the source
repository, with their full history.

OK, now let's go back to our model example.

You have a repo somewhere, and it has two branches - "main" and "feature".
The branch "main" is marked there as the main one (a concept coincidentally
matching the name; a few years earlier it'd be named "master" in most cases).

You run `git clone` to clone the repository. This command initializes a new
empty local repository, creates a single remote named "origin", then fetches
all the branches from the source repository. *All these branches are created
as "remote branches" in the local, receiving repository* - and are named in a
way to make it clear that they "belong" to the remote named "origin".
So they are: origin/main and origin/feature.
Please contemplate this fact for some time ;-)
Then the command asks the source repository which branch it has designated as
the main one; the answer is that it's named "main", so the command takes the
corresponding locally created remote branch and effectively runs

  git branch --track main origin/main

which results in a single local branch named "main" created, pointing to the
same commit origin/main points at, and set to track origin/main.
The subgraphs of commits reachable from both branches are hence completely
identical at this point - since they point at the same commit.
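
The sequence described above can be approximated by hand. This is only a
sketch of the idea, not what `git clone` literally executes (the real command
does more, such as recording origin/HEAD). The directory names and the "Demo"
identity are made up, and the branch creation and check-out are folded into a
single `git checkout -b ... --track` call.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.invalid
work=$(mktemp -d)

# The source repository with "main" and "feature".
git init -q "$work/source"
cd "$work/source"
git commit -q --allow-empty -m "base"
git branch -M main
git checkout -q -b feature
git commit -q --allow-empty -m "feature work"
git checkout -q main

# A hand-made "clone".
git init -q "$work/manual"                   # 1) new empty local repository
cd "$work/manual"
git remote add origin "$work/source"         # 2) a single remote named "origin"
git fetch -q origin                          # 3) every branch arrives as origin/*
remote_branches=$(git for-each-ref --format='%(refname:short)' refs/remotes/origin)
local_before=$(git for-each-ref refs/heads)  # still no local branches at all

git checkout -q -b main --track origin/main  # 4) the one local branch
local_after=$(git for-each-ref --format='%(refname:short)' refs/heads)
printf 'remote:\n%s\nlocal:\n%s\n' "$remote_branches" "$local_after"
```

After step 3 the repository holds origin/main and origin/feature and nothing
else; the local "main" only appears in step 4.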

Maybe if `git clone` contained less "helpful magic" and did not bother to
automatically create a single local branch it would be easier to understand
conceptually, but it does.


Let's now move back to the asymmetry.
I think the key problem impeding understanding of this (you're certainly not
alone in this) is that, given how 99.9% of tutorial material on DVCSes is
written, many people assume that if there is a network of DVCS repos
communicating with each other, they all essentially are just replicas of
_logically the same thing._
This is where the problem lies, because Git does not make this assumption.

Any Git repository is able to communicate with any number of other Git
repositories maintaining completely disjoint projects with completely disjoint
histories. Again, please contemplate this fact for a while.

As I've demonstrated in my first response, you can create a repo for your
weekend toy project which you do not even push anywhere, and then fetch the
complete repository containing the Linux kernel source code into your local
repository. You can then fetch there, say, the source code of the LaTeX
project, and then the sources of some documentation project of a book.
The full histories of all the branches of all of these projects will be
contained in your repository - along with your own developments.
As you can see, all the projects you have fetched probably have their own
branch named "main" or "master"; some of them may have other clashing names -
say, it's not too far-fetched to expect two or more of them to contain a
branch named "feature".

Now, if you want to have a local branch named "feature", how do you go about
that? Should your local branch be made based on the branch "feature"
registered for remote "linux"? Or on the same-named branch registered on
remote "latex" or remote "book"? Or maybe it should start out based off some
already existing local branch of yours? Or maybe it should start completely
empty (possible in Git, too)?
Hopefully you can see by now that there cannot be any answer to this which can
be provided automatically - only you can decide.
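
Here is a scaled-down sketch of that situation. Two tiny local repositories
stand in for the "linux" and "latex" projects (the real ones are of course not
fetched), each with a branch named "feature"; the "Demo" identity is made up.
It assumes you have not set `checkout.defaultRemote`, which would let Git pick
one of the remotes for you.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.invalid
work=$(mktemp -d)

# Two unrelated projects, each with its own branch named "feature".
for name in linux latex; do
    git init -q "$work/$name"
    git -C "$work/$name" commit -q --allow-empty -m "$name: base"
    git -C "$work/$name" branch -M feature
done

# Your weekend toy project, which then fetches both of them.
git init -q "$work/mine"
cd "$work/mine"
git commit -q --allow-empty -m "my toy project"
git remote add linux "$work/linux"
git remote add latex "$work/latex"
git fetch -q --all

# Git cannot guess which "feature" is meant, so the short form is refused.
if git checkout feature 2>/dev/null; then guessed=yes; else guessed=no; fi
echo "git guessed a branch for us: $guessed"

# Naming the starting point explicitly removes the ambiguity.
git checkout -q -b feature latex/feature
```

The short `git checkout feature` only works while there is exactly one
candidate; as soon as there are two, you have to say which one you mean.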

Let's come up with another example which is a bit less extreme but tries to
demonstrate the point.

Suppose you have started development of some project, and pushed it to
Github. There are now two repos: your local one and its sort-of-copy on Github.
Now Alice comes along and starts helping you on the project.
But she despises Github for having sold itself to a Big Tech company, so she
pushes her stuff to Bitbucket (which is also owned by a Big Tech company,
just a less controversial one).
So, to communicate with her, you add another remote to your local repo, and
you now have two of them.
OK, so here comes Bob, another fellow dev. Bob is even more of a nut than
Alice, so he pushes his stuff to a rig located in his basement, on which he
self-hosts a bunch of Git repos. So you have the third remote.
Let's posit these remotes are named "github", "alice" and "bob".
Let's posit these remotes are named "github", "alice" and "bob".

Now suppose that both Alice and Bob started to work on different features,
but they are easygoing about naming so their branches ended up using the same
name, "feature". They periodically push their stuff to their respective
branches in their respective repos.
And you fetch from them - ending up with updated alice/feature and bob/feature
branches.
For more fun, suppose that by the time these folks started their feature gigs,
you yourself were already working on your own local branch "feature"
implementing something distinct from what Alice and Bob do.
Can you see that these remote branches are different from each other, and
that your local branch is different from both of them?
And still you can manipulate all these branches.
Say, nothing prevents you from merging alice/feature into main and pushing it
up to
Github. Or you might decide to rename your local "feature" to "my-own-feature"
and then fork the local "feature" from "alice/feature" to help Alice with
development. You can then push your developments anywhere under any name,
delete the local "feature" and fork a local branch with the same name off
bob/feature to help Bob. Or, in the last two cases you could have named the
local branches "alice-stuff" and "bob-crazy-idea".
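
A condensed sketch of one of these scenarios, with local directories standing
in for Github and for Alice's repository (Bob is left out to keep it short).
All paths and the "Demo" identity are made up, and in this toy setup the two
histories are unrelated, which Git does not mind.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.invalid
work=$(mktemp -d)

git init -q --bare "$work/github.git"        # stands in for the Github repo
git init -q "$work/alice"                    # stands in for Alice's repo
git -C "$work/alice" commit -q --allow-empty -m "alice: feature work"
git -C "$work/alice" branch -M feature

git init -q "$work/mine"
cd "$work/mine"
git commit -q --allow-empty -m "my own feature work"
git branch -M feature                        # your pre-existing local "feature"
git remote add github "$work/github.git"
git remote add alice "$work/alice"
git fetch -q alice                           # creates alice/feature

git branch -m feature my-own-feature         # free up the name
git checkout -q -b feature alice/feature     # new local "feature" off Alice's
git commit -q --allow-empty -m "help Alice"
git push -q github feature:alice-stuff       # publish under yet another name
git branch -a
```

The same commits are known as "feature" locally and as "alice-stuff" on the
"github" remote, while your original work lives on under "my-own-feature".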

This also reiterates the "Git branches are like bookmarks" idea: you can have
any names for any stuff, names can change at will, any time, and are
absolutely not sacred.

Note that, continuing the example, you are not required to push your stuff to
"your remote repository" either. For instance, you could record a bunch of
commits on top of Alice's "feature" and push a branch with them to Alice's
Bitbucket repo and ask her to review them and merge. Or you could push it to
your Github repo and ask her to fetch it from there.

Basically, that's the reason why all branches of a cloned repo become remote
branches in your local repo: because Git cannot read your mind and somehow
infer which model of working with branches you're about to implement here.
Also, Git branches are very lightweight and simple. They do not have multiple
heads or something, so if Git were to turn all the branches from a source repo
straight into local branches in the local repo, you would lose the ability to
compare the state of these branches as they are (were) in the source repo with
their state in the local replica. (Well, technically it _is_ possible to
create a local repo in a way that there are no remote branches and local
branches are straight "mirrors" of the branches in the source repo but you
will then have a problem with updating stuff from the remote repo: if a branch
there contains any commits your "mirror" branch doesn't have, where would you
store that new conflicting line of history?)
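
The ability to compare the two states is easy to see in a small experiment.
In this sketch (made-up directories and "Demo" identity, everything local)
one commit is recorded in the source repository and another one in the clone,
and then the two lines of history are compared.

```shell
set -eu
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.invalid
work=$(mktemp -d)

git init -q "$work/source"
git -C "$work/source" commit -q --allow-empty -m "base"
git -C "$work/source" branch -M main
git clone -q "$work/source" "$work/clone"

# Somebody records a commit in the source repo; you record one locally.
git -C "$work/source" commit -q --allow-empty -m "upstream change"
cd "$work/clone"
git commit -q --allow-empty -m "local change"

git fetch -q origin                  # moves origin/main, leaves local "main" alone

# First number: commits only on main; second: commits only on origin/main.
counts=$(git rev-list --left-right --count main...origin/main)
echo "$counts"
git log --oneline main..origin/main  # what the other side has that you don't
```

Because origin/main and main are separate names, both lines of history can
coexist after the fetch, and you get to decide whether to merge or rebase.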

[...]
> > as you would end in that detached HEAD state.
> 
> I am not sure I understand this, what is the workflow? What is the
> benefit of this approach?

I think it's better to not digress there for now.
Let's try to make the asymmetry concept sink in first ;-)
