On Aug 30, 3:57 pm, Haasip Satang <haasip.sat...@googlemail.com>
wrote:
> I found something that might be interesting. To me the problem seems to be
> the way the pack files are constructed.
>
> I checked what happened with git log and what I get is the following:
>
> /lhome/gitadmin/repo/main> git log
>
> > commit 214baf2cea19d66e3a1817e8e6aa4883294be05f
> > Merge: ac974b0 8ad7c91
> > Author: gitadmin <gitadmin@gitrepo>
> > Date:   Wed Aug 29 13:02:42 2012 +0200
> >     Split 'xyz/' into commit '*8ad7c91aef6a4814fce80ab6e092fe7eeedc8090*'
> >     git-subtree-dir: xyz
> >     git-subtree-mainline: *ac974b0c9ac110a85c6f58fb460ee54a64992bda*
> >     git-subtree-split: *8ad7c91aef6a4814fce80ab6e092fe7eeedc8090*
> > commit *ac974b0c9ac110a85c6f58fb460ee54a64992bda*
> > Author: gitadmin <gitadmin@gitrepo>
> > Date:   Wed Aug 29 13:02:07 2012 +0200
> >     initial commit importing everything
> > commit 8ad7c91aef6a4814fce80ab6e092fe7eeedc8090
> > Author: gitadmin <gitadmin@gitrepo>
> > Date:   Wed Aug 29 13:02:07 2012 +0200
> >     [xyz] initial commit importing everything
>
> The interesting parts are the highlighted SHAs: the subtree command
> actually does create a synthetic commit (8ad7c91).
>
> I checked the contents of this commit with:
>
> git cat-file -p 8ad7c91aef6a4814fce80ab6e092fe7eeedc8090
>
> and saw that it only points to one tree object --> 0f201238. This tree
> object is the folder of only the files / subtree I need. So the commit
> contains EXACTLY what I want to have.
>
> So why does it not work? Here is the interesting part (that I still don't
> get):
>
> When I clone that subtree branch using file:// git is smart enough to
> repack and create a separate pack file that only contains what I need /
> what the commit is pointing to.
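
The check described above can be reproduced in a throwaway repository.
This is only a rough sketch: the directory names xyz/ and other/ and the
branch name subtrees/xyz are placeholders, and the "split" branch is built
by hand as an orphan branch (an assumption made so the example does not
depend on git subtree being installed). It prints the commit, lists its
tree, and counts the objects reachable from each commit.

```shell
#!/bin/sh
set -e
# Throwaway repo; every name in here is made up for illustration.
work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo
git config user.name "Example"
git config user.email "example@example.invalid"

# "Mainline": two directories, only one of which we care about.
mkdir xyz other
echo "needed" > xyz/a.txt
echo "not needed" > other/b.txt
git add .
git commit -q -m "initial commit importing everything"
main_commit=$(git rev-parse HEAD)

# Hand-made stand-in for the split branch: an orphan branch that holds
# only the content of xyz/ at its top level.
git checkout -q --orphan subtrees/xyz
git rm -rf -q .
echo "needed" > a.txt
git add a.txt
git commit -q -m "[xyz] initial commit importing everything"
split_commit=$(git rev-parse HEAD)

# -p pretty-prints the commit object, including its single "tree" line.
git cat-file -p "$split_commit"
# List the files that tree contains.
git ls-tree -r "$split_commit"

# Count every object (commits, trees, blobs) reachable from each commit.
split_objects=$(git rev-list --objects "$split_commit" | wc -l | tr -d ' ')
main_objects=$(git rev-list --objects "$main_commit" | wc -l | tr -d ' ')
echo "reachable from split: $split_objects, from mainline: $main_objects"
```

If the split commit really reaches far fewer objects than the mainline
commit, then the extra data in a big clone is not something the branch
itself refers to.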

When you clone over ssh you are asking the far-away server to do the
work, and it has only certain options available as to what it can do.

When you use file:// you are using your own local code and repo
knowledge, so git can be more efficient about selecting only the bits
you require.

The clone manual page covers the two styles, /path/to/repo.git/ and
file:///path/to/repo.git/, and says they are mostly equivalent except
that the first implies --local. With --local git bypasses the normal
transport and copies (or hardlinks) the object store; with file:// it
goes through the regular transport, so only the objects actually
requested end up in the new pack.
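
The difference between the two styles can be seen in a small experiment.
This is a sketch under assumptions, not a statement about your setup: the
repository "src" and the branches "big" and "small" are invented, and
--single-branch is used instead of --depth so that both clones accept the
same options. It clones one branch twice, once by plain path and once by
file:// URL, and counts the objects that end up in each clone.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"
git init -q src
(
  cd src
  git config user.name "Example"
  git config user.email "example@example.invalid"
  # Branch "big" stands in for the full project history.
  git checkout -q -b big
  for i in 1 2 3 4 5; do echo "big file $i" > "big$i.txt"; done
  git add .
  git commit -q -m "big content"
  # Orphan branch "small" shares no objects with "big".
  git checkout -q --orphan small
  git rm -rf -q .
  echo "small" > small.txt
  git add small.txt
  git commit -q -m "small content"
)

# Sum the loose and packed object counts reported by git count-objects -v.
count_objects() {
  (cd "$1" && git count-objects -v) |
    awk '/^(count|in-pack):/ { n += $2 } END { print n }'
}

# Plain path: implies --local, so the object store is copied or hardlinked.
git clone -q --single-branch -b small "$work/src" by-path
# file:// URL: the regular transport is used and a pack is built.
git clone -q --single-branch -b small "file://$work/src" by-url

path_objects=$(count_objects by-path)
url_objects=$(count_objects by-url)
echo "plain path: $path_objects objects, file://: $url_objects objects"
```

Running `git count-objects -v` in your own clones should show in the same
way which of them carries the extra objects.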

It may be worth reporting to the sub-tree developers, either for a
'fix' or for a documentation clarification.

>
> When I clone from remote via ssh though I get the big pack file that
> contains everything. Interestingly, when cloning locally via ssh git again
> repacks and gives me a new pack file.
>
> Any ideas how I can force the repacking when cloning?
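
One possible workaround, offered only as a sketch: publish the split
branch into its own bare repository, since a push only transfers objects
reachable from the pushed ref. The names bigrepo, xyz-only.git and the
branch contents below are hypothetical, and the split branch is again
faked with an orphan branch rather than produced by git subtree.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"
git init -q bigrepo
(
  cd bigrepo
  git config user.name "Example"
  git config user.email "example@example.invalid"
  # "everything" stands in for the big multi-project history.
  git checkout -q -b everything
  for i in 1 2 3 4 5; do echo "project file $i" > "file$i.txt"; done
  git add .
  git commit -q -m "initial commit importing everything"
  # Hand-made stand-in for the branch created by the subtree split.
  git checkout -q --orphan subtrees/xyz
  git rm -rf -q .
  echo "xyz only" > a.txt
  git add a.txt
  git commit -q -m "[xyz] initial commit importing everything"
)

# A fresh bare repository that will only ever hold the split branch.
git init -q --bare xyz-only.git
# Only objects reachable from subtrees/xyz are sent by this push.
(cd bigrepo && git push -q "$work/xyz-only.git" subtrees/xyz)

# Sum the loose and packed object counts reported by git count-objects -v.
count_objects() {
  (cd "$1" && git count-objects -v) |
    awk '/^(count|in-pack):/ { n += $2 } END { print n }'
}

source_objects=$(count_objects bigrepo)
published_objects=$(count_objects xyz-only.git)
echo "bigrepo: $source_objects objects, xyz-only.git: $published_objects objects"
```

A remote machine would then clone xyz-only.git with -b subtrees/xyz, and
whatever transport is used has nothing else in that repository to send.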
>
> On Thursday, 30 August 2012 14:18:51 UTC+2, Philip Oakley wrote:
>
> > On Aug 30, 11:34 am, Haasip Satang <haasip.sat...@googlemail.com>
> > wrote:
> > > ;-) That's what I did, as you can see in my explanation above ;-) The
> > > problem still seems to be that it only works locally on the same Linux
> > > machine. When I try to clone from any remote machine (no matter which
> > > OS) I end up getting the huge .git folder.
>
> > > So the question actually is why does
>
> > > git clone --depth 1 --no-hardlinks file:///home/me/gitTests/subtreeRepo -b subtrees/xyz xyz
>
> > > give me a small clone (but only locally), while cloning from remote I
> > > get a big one.
>
> > I'm going to 'guess' that it is one of two things.
>
> > One is the pack protocol, which could mean that in one case you get a
> > compressed pack (though that doesn't sound like your case ;-).
>
> > The other is that --no-hardlinks, in conjunction with the other
> > options, has severely limited the number of branches that are cloned
> > (copied locally), while --depth=1 will (could) still have pulled down
> > the lead commit for every branch and hidden them under /remotes/.
>
> > > And as mentioned earlier as well, when cloning the small xyz from
> > > remote then I end up with what I wanna have: a small xyz project on
> > > a remote machine.
>
> > > Why can I not directly clone xyz remotely and get the same result as
> > > with the local clone?
>
> > > On Thursday, 30 August 2012 09:00:04 UTC+2, Philip Oakley wrote:
>
> > > > Isn't a shallow clone a good use case for this? You only need the
> > > > latest commit of each project you want to build and then it either
> > > > works or it doesn't, and the clone is then deleted.
>
> > > > So is 'git clone --depth <depth>' what you need?
> > > > Use <depth> := 1
>
> > > > Just a thought
>
> > > > Philip
>
> > > > ----- Original Message -----
>
> > > > *From:* Haasip Satang
> > > > *To:* git-...@googlegroups.com
> > > > *Sent:* Thursday, August 30, 2012 1:21 AM
> > > > *Subject:* [git-users] Size of cloned git subtrees - only history /
> > > > files for subtree needed
>
> > > > Hi all,
>
> > > > in short, the question behind the lengthy explanation below is: how
> > > > can I create a clone of a subtree that only contains the data needed
> > > > for that subtree in the .git folder?
>
> > > > In detail, here is what I have tried already and what my setup looks
> > > > like: we have a big repository containing multiple projects
> > > > (political reasons, cannot avoid having that... at least for now).
> > > > While this works fine for all the developers (they just clone the big
> > > > repo and get all the projects they need), we are facing problems with
> > > > our continuous build system (Jenkins).
>
> > > > Here we would like to have a job for each single project; of course
> > > > WITHOUT having to clone the whole big repo for every job, as this
> > > > would lead to a significant overhead on disk.
>
> > > > After searching around for some time I basically came across four
> > > > potential solutions:
>
> > > > 1. Sparse Checkout
> > > > 2. Submodules
> > > > 3. Individual Repos with a manager script like repo, mr, git-status,
> > > >    and all the others that exist to tackle that problem
> > > > 4. Subtrees
>
> > > > The problem with 1 is, you still get to clone the whole repo
> > > > (including all history), only to then check out a part of it -->
> > > > still disk overhead.
> > > > As for submodules, I personally don't really like them and don't think
> > > > they should be used in this case; they are kinda difficult to handle
> > > > and can be fragile anyway.
> > > > The additional script based solution seems kinda hacky as well, so I
> > > > didn't really follow up on that too much.
>
> > > > So my favorite solution so far is actually using git subtree, which is
> > > > more or less easy (especially since the subtree branches are only used
> > > > for the CI builds / in a read only way, nothing needs to be pushed
> > > > back to the bigrepo).
>
> > > > The problem is, however, when I clone the bare repo and then create
> > > > the subtree branches in the cloned working copy and then try to clone
> > > > these subtree branches only, I still seem to get the whole big
> > > > history, including all the stuff outside the tree.
>
> > > > Is there any way to avoid that and create a synthetic project history
> > > > containing only data relevant for the subtree?
>
> > > > What I did to kinda get there is more of a hacky way. I create the
> > > > subtree branch using:
>
> > > >  git subtree split --prefix=xyz --annotate="[xy] " --rejoin -b subtrees/xyz
>
> > > > Then I clone that with:
>
> > > > git clone --depth 1 --no-hardlinks file:///home/me/gitTests/subtreeRepo -b subtrees/xyz xyz
>
> > > > So creating a shallow clone (depth 1) seems to be the only way, and
> > > > that also only works on the local Linux machine. If I clone the same
> > > > subtreeRepo branch on a remote machine I actually get the whole big
> > > > pack / history with it (which I of course don't want).
>
> > > > So what I did is I cloned the subtree branch locally and then cloned
> > > > that repo from my remote Jenkins machine. While this seems to work (I
> > > > haven't looked into whether I'm getting the necessary change sets to
> > > > send out the emails yet) it seems both unnecessarily complicated and
> > > > very hacky.
>
> > > > To sum up, let me conclude with the question from the beginning: how
> > > > can I create a clone of a subtree that only contains the data needed
> > > > for that subtree in the .git folder?
>
> > > > Looking forward to your comments and ideas :)
>
> > > > Thanks, Haasip
>
> > > > --
> > > > You received this message because you are subscribed to the Google
> > > > Groups "Git for human beings" group.
> > > > To view this discussion on the web visit
> > > > https://groups.google.com/d/msg/git-users/-/n5ZPYpDf4EIJ.
> > > > To post to this group, send email to git-...@googlegroups.com.
> > > > To unsubscribe from this group, send email to
> > > > git-users+...@googlegroups.com.
> > > > For more options, visit this group at
> > > > http://groups.google.com/group/git-users?hl=en.

